The biggest challenge in natural language processing (NLP) is understanding the true meaning behind the words within a sentence. This cannot be achieved through a strong tree parser alone, as it requires human knowledge. The author of any document assumes that the reader possesses a basic understanding of the world that enables them to group together sequences of words and grasp the general relationship between actors and actions. A complete understanding of any text demands acknowledgement that words and phrases are only meaningful when considered in their proper context.
Expert.ai Studio puts these two (very) different parts of the equation together by providing users with both a parser and a knowledge graph. When combined, these forces can analyze a document and return vast amounts of information about every single element within the text.
Users can then pick the data points they need to deliver their NLP application, shaping the information in the unique way that effectively addresses their specific problem. We call this our NLU (natural language understanding) approach.
Our NLU process is made of a series of analytical techniques:
- Parser: groups words, phrases and proper nouns.
- Sentence analysis: isolates the sentences and different types of clauses.
- POS tagger: assigns grammar types to all the fundamental atoms of the document.
- Logical analysis: identifies the actions as well as their associated subjects and objects.
- Semantic disambiguation: transforms fundamental atoms into concepts.
This final piece of the puzzle leverages all the information stored by our knowledge graph about most concepts belonging to the real world. Consider these examples:
- An apple is a type of fruit, fruit is a type of food, and food can be eaten. Therefore, an apple can be eaten.
- To say something is “beautiful” denotes a positive connotation in terms of sentiment. We call these concepts syncons (all their properties come with them), and we have hundreds of thousands of them connected within a network of millions of relations and two dimensions (e.g., “is a” and “is part of”).
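The apple example above can be sketched as a walk up a chain of “is a” links, collecting properties along the way. This is only a toy illustration: the dictionaries and function names below are assumptions made for the example and do not reflect how the expert.ai knowledge graph is stored or queried.

```python
from typing import Dict, List, Set

# Hypothetical "is a" links (child -> parent), mirroring the apple example.
IS_A: Dict[str, str] = {"apple": "fruit", "fruit": "food"}

# Hypothetical properties attached directly to a concept.
PROPERTIES: Dict[str, Set[str]] = {"food": {"can be eaten"}}


def ancestors(concept: str) -> List[str]:
    """Return the chain of parents reached by following "is a" links."""
    chain: List[str] = []
    seen = {concept}
    while concept in IS_A:
        concept = IS_A[concept]
        if concept in seen:  # guard against accidental cycles in the toy data
            break
        seen.add(concept)
        chain.append(concept)
    return chain


def inherited_properties(concept: str) -> Set[str]:
    """Collect the concept's own properties plus those of every ancestor."""
    props = set(PROPERTIES.get(concept, set()))
    for parent in ancestors(concept):
        props |= PROPERTIES.get(parent, set())
    return props


apple_chain = ancestors("apple")
apple_props = inherited_properties("apple")
```

Here `apple_chain` lists the concepts an apple inherits from, and `apple_props` contains the property that was declared only on "food".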
In other words, Studio comes out of the box with our expert.ai technology, which takes care of all the heavy lifting involved in understanding a document at a deeper level. But in most real-life scenarios this isn’t enough. For example, if you’re analyzing a claim related to a car accident, it’s not enough to understand that John Smith is a person; you want to know whether he’s the claimant, a witness, a first responder, etc. To put it differently, you need to shape all that information in a way that’s useful to your particular case. That’s when you leverage the other side of Studio: the ability to use all of those data points to create a linguistic solution that outputs only what’s relevant about your content, in a way that’s effective for the final user of your application.
To achieve this, Studio offers:
- a low-code IDE (Integrated Development Environment) optimized to design linguistic applications for Categorization and Entity Extraction
- an inline knowledge graph navigator to easily integrate a unique network of concepts into your project
- a library manager to ingest corpora and annotate documents quickly (only necessary if you want to use our quality measurement tools)
- quality measurement tools to keep track of precision and recall values of your application, and their evolution over time
- an AI project builder for entity extraction and document classification
- tight integration with the natural language understanding engine
- a powerful editor with syntax highlighting and autocomplete with IntelliSense
- a quality dashboard for measuring precision and recall with respect to a golden corpus
- version control support with native SVN and Git connectors
- a fully scriptable and extensible workflow
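To make the precision and recall figures mentioned in the list concrete, here is a minimal sketch of how the two values are computed against a golden corpus. It does not use Studio's own quality tools; the function name and the sample annotations are hypothetical and only illustrate the standard definitions.

```python
from typing import Set, Tuple


def precision_recall(predicted: Set[str], gold: Set[str]) -> Tuple[float, float]:
    """Compare extracted items with the golden annotations.

    precision = correct predictions / all predictions
    recall    = correct predictions / all golden annotations
    """
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical annotations for a single document (illustrative data only).
gold = {"claimant:John Smith", "witness:Jane Doe", "vehicle:BMW"}
predicted = {
    "claimant:John Smith",
    "vehicle:BMW",
    "vehicle:Mercedes",
    "witness:Bob",
}

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this sample two of the four predictions are correct and two of the three golden annotations are found, so precision is 0.50 and recall is about 0.67. Tracking both values over successive versions of a project shows whether a change made the output more accurate, more complete, or neither.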
Expert.ai Edge NL API
Once you’re satisfied with the development of your project, it’s time to package it into a service that can be deployed in a production environment and become part of your larger workflow. This service can be accessed directly through our API or integrated with other parts of your architecture. To do this you’ll need expert.ai Edge NL API.
Edge is expert.ai’s NL API server, which can be deployed on premises or in your private cloud. It runs your Studio project and makes it available through our REST API.
Expert.ai Technology Applied
This is a good example of the importance of context. Our knowledge graph knows that Washington can be a city, a state, and many other things, but it is clear to any human reader that in this text we’re talking about a person since it is clearly the name of the writer’s wife. As you can see in the right pane at the bottom, this inference has been executed correctly.
In the second sentence we have less direct evidence that Washington is a person, but the information is inherited from the previous sentence, since the sentence structure allows it.
Finally, the third sentence contains an example of anaphora: the subject “She” refers to Washington. Once again, the engine resolves it not through any female entity it knows in advance, but through the inference made in the first sentence.
In this next example we are able to infer that the sentiment expressed (“a good car”) refers to the BMW at the beginning of the sentence, not the Mercedes close to the verb (which also presents a grammatical structure that could trick the engine: “my Mercedes is a good car”). Sentence analysis here was directly responsible for isolating the correct context for the positive statement, since, as the colors underneath every word show, “The BMW is a good car” is the main clause (blue underline) while “[that] I bought to replace my Mercedes” is a relative clause. Not shown here but worth mentioning: the engine knows that the relative clause is actually made of two parts: “I bought” (as the main relative) and “to replace my Mercedes” (as the purpose, that is, the reason for having bought something).
The level of detail in a noun group is also relevant in sentiment analysis, as in many other NLP applications. In this example we know that the display is part of the phone, but we also realize that the positive statement is only about “the display”, not the phone as a whole.
The screenshot above and the one below highlight the subtleties of the disambiguation process mentioned before, and how it can only happen with a strong connection to context awareness. The verb “to play” has many meanings, and in the screenshot above the engine correctly picked the one that relates to sporting activities.
In this screenshot, instead, we observe a sentence that is almost identical, and in fact entirely identical from a grammatical standpoint (and from any other strictly linguistic one, besides the actual meaning of the words). This time the engine picked a different meaning of the verb “to play”, correctly once again: the one related to music.
In the first screenshot we have a pitcher playing, while in the second we have a pianist playing. In other words, the meaning of “playing” changed because a completely different word in the sentence changed. But we know that that particular word was the subject of the play predicate, and therefore an awareness of context delivered the right answer both times.
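The idea that the subject of a predicate selects the meaning of the verb can be illustrated with a deliberately tiny sketch. The lookup tables and the `disambiguate_play` function below are assumptions made up for this example; the actual engine relies on the full knowledge graph and sentence analysis rather than a hand-written dictionary.

```python
from typing import Dict, Optional

# Hypothetical mapping from a subject to the domain it belongs to.
SUBJECT_DOMAIN: Dict[str, str] = {"pitcher": "sport", "pianist": "music"}

# Hypothetical senses of the verb "to play", keyed by domain.
PLAY_SENSES: Dict[str, str] = {
    "sport": "play (take part in a sporting activity)",
    "music": "play (perform music)",
}


def disambiguate_play(subject: str) -> Optional[str]:
    """Pick a sense of "to play" from the subject of the predicate.

    Returns None when the subject gives no evidence for any sense.
    """
    domain = SUBJECT_DOMAIN.get(subject.lower())
    if domain is None:
        return None
    return PLAY_SENSES.get(domain)


sense_for_pitcher = disambiguate_play("pitcher")
sense_for_pianist = disambiguate_play("pianist")
```

The two calls return different senses for the same verb, because only the subject changed. A subject that is missing from the toy table returns `None`, which is where a real system would fall back on the rest of the context.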
Finally, the following two screenshots offer a sneak peek at our knowledge graph: one syncon related to the word “peach” (as a fruit) and one syncon related to the word “asset” (as a possession). Above and below the selected syncons we see some of the other syncons linked to the selected one in a type of relation commonly known as “is a” (which in expert.ai we call supernomen/subnomen). For instance, based on what we see here: a peach is a fruit, a nectarine is a peach.
Computational linguistics practitioners, be they data scientists, taxonomists, or librarians, have used Studio to successfully build NLP solutions for the most diverse business cases, from chatbots in Customer Care departments to email routing in phone companies, from Risk Engineering in Insurance to asset management reports in Financial Services.
Fil Emanuele, 2020