To read and understand a word or phrase, not only must we know the language it’s written in (with all its grammatical rules, structures and conventions), we also need some knowledge about the topic the text is dealing with. For example, while we can all read the following sentence, it won’t make much sense if we don’t know anything about American football: “In 1998, a Hail Mary pass by the Vikings’ QB helped them defeat their arch-rival, the Bears.”
It should come as no surprise, therefore, that when researchers started to develop software to read text, they sought to replicate the same approach: an engine must embed as many of the rules and structures of the language as possible, and it should have access to the knowledge of the domain covered by the text. Advanced research continued in this direction for many years, despite the limitations imposed by slow and expensive computers.
However, once computers became more powerful and relatively cheap to operate, much of the attention shifted away from replicating human knowledge and toward machine learning and deep learning, the techniques that make up what we now call artificial intelligence. Many data scientists and researchers saw the exponential growth in computing power as an opportunity to bypass the complexity of replicating the process the human mind follows when it interprets a text. With this approach, words are transformed into numbers and, by applying statistics and mathematics in a brute-force fashion, they hoped to “teach” a computer to read and understand a document in a way that outperformed existing methods.
While great strides have been made in fields like image recognition, we still don’t have comparable results for text understanding. And while there is still some consensus that “it’s just a matter of time,” skepticism is growing, including among some who originally supported this approach (Geoff Hinton, Riza Berkan). Beyond scenarios characterized by a narrow domain, abundant sample data and the automation of simple tasks, it has been nearly impossible for machine learning to obtain relevant results. In the meantime, the information overload phenomenon has made the benefits of machine reading and understanding increasingly visible and relevant in every field (business, government, daily life, the military, etc.). As a result, we are seeing renewed interest in a more traditional approach based on a human-like understanding of language, leveraging a rich knowledge base.
It is clear that machine learning techniques can, in certain situations, add value. However, if we care about progress in AI-based natural language processing, I believe it is imperative that we start with an honest discussion about the evident limitations of this approach, and that we make sure we aren’t running away from complexity when facing tasks like reading and (at least partially) understanding text, which are inherently complicated.
Tasks that are complex, yet essential for humans to understand language—building a reliable knowledge base, leveraging that knowledge to understand the meaning of a text—are just as complex and essential for the software that is meant to replicate this activity. Similarly, investing time and effort to build a vast and deep knowledge base will improve the ability to perform, whether for humans or for software.
In the software world, when we talk about knowledge, we tend to refer to a knowledge graph: a representation of the real world in which concepts are defined and connected to one another by different kinds of relationships. Knowledge graphs can be wide or narrow, depending on the breadth of the domains they cover, and deep or shallow, depending on how fully they represent the knowledge of a domain, from its core to its most specific elements. In addition, knowledge graphs are usually open and explicit, not a black box: their content and structure can be understood and directly manipulated by humans.
There is no such thing as a standard structure for a knowledge graph, and knowledge graphs may be used in many different ways and scenarios. For the purposes of this post, I will focus on the structure and content of a knowledge graph that can be used to understand text of any type and domain, and that is flexible enough to keep growing in tandem with its usage. The more it expands in width and depth, the better it becomes at understanding the language. The better it understands the language, the more useful it becomes.
In a knowledge graph, each item is linked to one or more other items (the links represent the relations between them), and each item has a set of attributes that describes the characteristics of words and concepts. The links and attributes of a concept (e.g., a motor vehicle has wheels) are transferred effortlessly to the more specific concepts in the chain (e.g., a car), just as they are when humans grasp the meaning of a concept. This simple feature ensures that the software’s knowledge can be extended with limited effort, because there is no need to re-enter this information every time a new concept is added to the knowledge graph.
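To make the inheritance idea concrete, here is a minimal sketch in Python of how attributes asserted on a general concept can flow down an “is-a” chain. The class name, concept names and attributes are illustrative assumptions, not taken from any particular knowledge-graph product:

```python
class Concept:
    """A node in a toy knowledge graph (illustrative sketch only)."""

    def __init__(self, name, parent=None, **attributes):
        self.name = name
        self.parent = parent          # "is-a" link to a more general concept
        self.attributes = attributes  # facts asserted directly on this node

    def get(self, key):
        """Look up an attribute, walking up the is-a chain if not found here."""
        node = self
        while node is not None:
            if key in node.attributes:
                return node.attributes[key]
            node = node.parent
        return None


# "motor vehicle" asserts has_wheels once...
motor_vehicle = Concept("motor vehicle", has_wheels=True)
# ...and "car" inherits it without restating it.
car = Concept("car", parent=motor_vehicle, doors=4)

print(car.get("has_wheels"))  # True, inherited from "motor vehicle"
print(car.get("doors"))       # 4, asserted directly on "car"
```

A real knowledge graph would of course support many relation types beyond “is-a,” but even this sketch shows why new concepts are cheap to add: “car” only needs to state what distinguishes it from its parent.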
If you’re still with me up to this point, hopefully it’s clear that the wider, deeper and richer the knowledge graph concepts are in terms of attributes and relations, the better the software understands the meaning of any text out of the box.
In a second article, I will make the case for why this (really) matters.
Executive Vice President Strategy and Business Development at expert.ai