"Today, with Proxem, we have analyzed nearly 400,000 reviews linked to breakdowns or maintenance reports."
Pierre Jallais, Search & NLP project manager
I am Pierre Jallais. I work within the Technology Group of Total, attached to the Holding and more specifically to Strategy & Innovation, where I am in charge of all subjects related to research, innovation and NLP.
Total is a multi-energy group present in 130 countries, with 100,000 employees. We operate throughout the energy chain, from production to processing, and across many forms of energy: traditional ones such as gas and oil, and more recent renewable ones such as wind and solar power.
Our main mission is to lead the networks of the CTG (Group Technology Committee). Within Total, the CTG is a set of technical business networks common to all branches of the Group. It has existed for more than 20 years and has always been the privileged place for professionals to share their skills and experiences, implement innovations and bring them together.
We have two main objectives on the CTG side:
In order to contribute to our knowledge management (KM) challenges, we have set up a thesaurus. NLP allows us to enrich it and, combined with our tools, it facilitates access to knowledge. This also improves the quality of our search engine and the relevance of results for the words searched, and allows more efficient navigation between all the documents made available.
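As an illustration of how a thesaurus can improve search relevance, here is a minimal Python sketch of query expansion. It is not the Proxem tool or the search engine used at Total: the thesaurus entries and the function name expand_query are hypothetical, chosen only to show the idea of adding related terms to a query.

```python
# Hypothetical mini-thesaurus: preferred concept -> alternative terms.
# These entries are illustrative and are not taken from the real thesaurus.
THESAURUS = {
    "corrosion": ["rust", "oxidation", "pitting"],
    "leak": ["leakage", "seepage"],
}


def expand_query(query, thesaurus=THESAURUS):
    """Return the query terms plus their thesaurus equivalents, without duplicates."""
    # Reverse index: every term (concept or synonym) points to its whole group,
    # so searching for a synonym also retrieves the preferred concept.
    reverse = {}
    for concept, synonyms in thesaurus.items():
        group = [concept] + synonyms
        for term in group:
            reverse[term] = group

    expanded = []
    for word in query.lower().split():
        # Words unknown to the thesaurus are kept unchanged.
        for term in reverse.get(word, [word]):
            if term not in expanded:
                expanded.append(term)
    return expanded


print(expand_query("pipe rust"))
# ['pipe', 'corrosion', 'rust', 'oxidation', 'pitting']
```

With such an expansion, a document that mentions only "oxidation" can still be returned for a search on "rust", which is one way an enriched vocabulary can contribute to relevance.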
Concerning the second project, “SIL” (Safety Integrity Level): it is financed and managed by the CTG. It consists of analyzing the breakdown reports for equipment at certain industrial sites in order to derive maximum value from them. The objective of the project is to respond to safety issues, particularly at the Normandy site (Refining Chemicals branch). This involves analyzing the reports written by operators working on instrumentation equipment (equipment with safety functions), which guarantees the safety of our facilities.
The whole point is to use NLP to analyze all these unstructured mini-reports and check whether the equipment is working properly, which is a very important piece of information. Previously, the analysis was done manually on a very small sample of data.
Today, with Proxem, we have analyzed nearly 400,000 reviews linked to breakdowns or maintenance reports.
The two projects have very distinct objectives, while being consistent with the missions of the CTG. They are nevertheless linked, because both use the part of the thesaurus devoted to phenomena that may have an impact on our facilities. The vocabulary related to breakdowns used on SIL (for example, corrosion) is also found there.
As you can see, the semantic analysis solution we implemented allows us to enrich the vocabulary, to combine it with our search tools and thus to improve our results.
In addition, we want this vocabulary to be accessible to everyone in the company (Total group) who works on NLP use cases. Internally, we try to promote these uses, namely the use of semantic resources, so that they can be reused in as many different contexts as possible.
We will now publish this vocabulary in another tool, so that data scientists can extend it to build their own thesauruses.
The “Vocabulary Thesaurus” project started more than two years ago, after the thesaurus had first been built manually.
We had not seen many other solutions suitable for this use. The major advantage of the tool is that it does not require a very technical background, so the teams, in particular our part-time librarian, were able to use it quickly after a short training.
We had 400 keywords in the existing thesaurus. Today, with Proxem, we have nearly 6,000 concepts, which is quite substantial. This plays an important role in improving the relevance of our search engine's results. The vocabulary had an impact, but other related developments also helped improve relevance.
The main objective of the SIL project is not ROI but the safety of our installations. Thanks to all these results, we were able to calculate the failure rates of our equipment. Our results are consistent with what had been evaluated in the past. These good results allow us to validate the method officially, and we hope to be able to automate it and, above all, to deepen its use in order to provide even more detail.
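To make the idea of a failure rate derived from free-text reports more concrete, here is a minimal Python sketch. It is an assumption-laden illustration and not the method validated in the SIL project: the failure vocabulary, the sample reports, the equipment count and the observation period are all invented, and the rate is the simple textbook estimate of failures divided by cumulative operating time.

```python
# Illustrative failure vocabulary; a real thesaurus would be far larger.
FAILURE_TERMS = {"corrosion", "leak", "stuck", "drift"}


def is_failure(report_text, terms=FAILURE_TERMS):
    """Flag a report as a failure if it mentions any failure term."""
    words = {w.strip(".,;:").lower() for w in report_text.split()}
    return bool(words & terms)


def failure_rate(reports, equipment_count, observation_hours):
    """Failures per equipment-hour: failures / (units * hours observed)."""
    failures = sum(1 for r in reports if is_failure(r))
    return failures / (equipment_count * observation_hours)


# Invented sample reports, for illustration only.
reports = [
    "Transmitter drift observed, recalibrated.",
    "Routine inspection, nothing to report.",
    "Valve stuck in open position.",
]

# Hypothetical population: 10 instruments observed for 1,000 hours each.
rate = failure_rate(reports, equipment_count=10, observation_hours=1000)
print(rate)  # 2 failures over 10,000 equipment-hours
```

A simple keyword match like this would miss negations and paraphrases, which is one reason a dedicated semantic analysis tool is used instead of plain string matching.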
Regarding the SIL project, we realize that this use case meets essential safety objectives, but we can also apply it to address other issues. We are in the value search phase concerning:
We realize that what we have put in place within the framework of the SIL is interesting when we have to assess the quality of an industrial asset.
Evaluating the quality and safety of our equipment on a large scale is the goal.