Achieve Your Business Goals with a Hybrid AI Approach to Natural Language

NLP is no longer considered experimental, but a crucial technology for creating tangible ROI and achieving competitive advantage. A hybrid solution that combines both symbolic and machine learning techniques helps automate language-intensive business processes, improve knowledge discovery and accelerate risk evaluation.

Watch this livestream to learn how to:
• Develop successful and value-oriented natural language projects
• Work faster with classification and extraction tools
• Reduce technical debt and increase business value

Transcript:

Luca Scagliarini:

Good day, everybody. Welcome to a new episode of NLP Stream. We launched our platform to design, develop, test, and put NLP solutions into production a couple of years ago now. Over the course of the last couple of years, we have continued to expand the core set of functionalities, features, and capabilities of the platform.

We have learned from what our customers have been saying, and we have kept developing while always keeping attention on our core principle: the platform has to be something for designing practical solutions that can be applied in the real world, leveraging as many different techniques as possible. That’s why we named it hybrid. And we keep it open to make sure that the best possible tool for creating a practical application that leverages the understanding of language can be put in production.

So for example, I think everybody has probably heard in the last few weeks about ChatGPT and large language models. We have also opened up our platform to integrate large language models such as GPT-3, to make sure that data scientists, practitioners, or whoever is engaged in developing the solution can leverage the buzz around ChatGPT.

Next week, we’re going to have specific sessions around that for people who are interested. But today, Lorenzo is leading our academy. We’ll use the next 40 minutes to go through both the core functionality of the platform, how it can be used and leveraged to develop this language-based solution and also take the opportunity for people who have followed us in previous sessions to take a look at the new features that we just released last week.

So as this intro is probably less interesting than what Lorenzo is going to present, I’ll let Lorenzo go ahead and start with the presentation. If you have questions, make sure that you put them in the comments, and we’ll try to keep five minutes at the end to answer all or the most relevant questions. In any case, we’ll answer any questions over the course of the next couple of days. Thank you.

Lorenzo Musetti:

Thanks, Luca. I am very excited to be here because today I’d like to present a selection of features that are out in the latest version of the platform, which was recently released as its winter release. Specifically, what I’m going to be focusing on are the categorization or text classification capabilities and projects that you find in the platform, as well as the extraction ones.

So why is that? That is because basically most of the… Let’s say, that most of the features that you can implement with classification and data extraction are probably the most critical to automating processes for enterprises that really do scale up the enterprises and drive decision making. Categorization or text classification is not just about identifying topics in text, it can be used also for intent detection in emails and also for conversational AI.

It is also useful for identifying types of documents, as well as, for instance, classify claims and actually even contract clauses. Extraction, on the other hand, it’s extremely useful for different things, like for instance, collecting specific key data from contracts like personal data, like customers’ personal data, customer codes, or even draw complex relationships between actors as well as even perform event mining type of data mining tasks. And what I’m going to focus on, it’s these two type of capabilities. We’re going to dive deeper into the hybrid aspect of them and then also discuss about runtime environment and pipelines.

I’m just about to be sharing my screen. Okay, so this is the Expert.ai platform. This is basically what you would see if you had been working with the platform for a while. And the green and purple ones are the categorization and extraction projects. We’re going to start from categorization projects, and for today I chose specifically a use case that we sampled around email routing and email management. One of the first things that I’d like to mention is that we’re going to focus on specific classes in that we uploaded a dataset that is specific to some email interactions that are very much around HR as well as specific industry operations.

The first thing I’d like to show you is one of the most important things about a platform is that it integrates multiple tools that are critical and key for the whole process of creating and developing models and NLP solutions from the ground up. So the platform contains multiple tools and components that make it possible for you to accelerate these processes.

They are not that trivial. Yet, it resolves everything to simple UI operations. So basically it’s a low-code environment, meaning that you don’t have necessarily to develop code. And so, why is it hybrid? It’s because it combines multiple approaches, whether it is machine learning or symbolic AI. It combines them together with the objective of being very effective in the tasks it performs.

Let’s say, one of the areas where we’re leveraging symbolic AI a lot is from the very beginning in the data management part. I don’t know if it ever happened to you, but many times you have to deal with a lot of data, and sometimes it’s hard to really grasp what your documents are about.

Now, every time you upload your dataset to the platform, you’re not just uploading your documents, you’re not just creating your database or collection of datasets. The symbolic core technology behind the curtains of the platform is also processing each one of the documents and pulling out automatically a wide variety of complex and structured linguistic data that you can leverage to quickly get a rough idea of the contents of the documents.

Now, this is not just something that goes in support of knowledge discovery tasks, for instance, try and get in depth your dataset before you start training your models. It’s something that you can also leverage, for instance, performing semantic searches, so leveraging NLU in your model development cycle to really focus on those documents that you need to further explore or you need to annotate.

For instance, interview seems to be one of the main, most relevant subjects in this dataset. For instance, we could just simply double click on this item and say we want to identify all those documents that can possibly be related to one of the classes that I have in my project, which is job recruitment. So for instance, I may be selecting all documents where interview is a landmark that is very important, as well as, for instance, resume.

Now, I could perform this search as it is, but there is one problem, which is polysemy. Now, if you know Expert.ai, one of our specialties is NLU, so natural language understanding. The core technology really understands text and the meaning of words. And so we can leverage this capability in this search: instead of searching for the lemma resume, so leveraging lemmatization, which would possibly match both resume, the verb, and résumé, the document, we’re going to be looking for all documents where interview or the actual concept, the representation or definition, of résumé is present. This way we have no possibility of also selecting those documents that contain resume as a verb. And so we’re just focusing on those documents that are very likely going to be about job recruitment.
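The difference between a lemma search and a concept search can be sketched with toy data. This is only an illustration under stated assumptions: the documents, the concept identifiers ("cv_document", "continue_action") and both function names are hypothetical, and the platform's real disambiguation is not shown here.

```python
# Hypothetical toy corpus: every token is stored as (lemma, concept_id).
# The concept ids are invented for illustration only.
DOCS = {
    "doc1": [("send", "send_action"), ("resume", "cv_document"),
             ("interview", "job_interview")],
    "doc2": [("resume", "continue_action"), ("drilling", "drilling_operation")],
}

def search_by_lemma(docs, lemma):
    """Return ids of documents containing the lemma, whatever its meaning."""
    return sorted(d for d, toks in docs.items()
                  if any(tok_lemma == lemma for tok_lemma, _ in toks))

def search_by_concept(docs, concept):
    """Return ids of documents containing the disambiguated concept."""
    return sorted(d for d, toks in docs.items()
                  if any(tok_concept == concept for _, tok_concept in toks))

lemma_hits = search_by_lemma(DOCS, "resume")          # matches both senses
concept_hits = search_by_concept(DOCS, "cv_document")  # matches only the CV sense
```

In this toy corpus the lemma search returns both documents, while the concept search keeps only the one that talks about a résumé as a document.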

This is going to be extremely effective in selecting only those documents that I maybe want to annotate as job recruitment, and the annotation is just as easy as drag and drop. So this is one way we’re using symbolic to support the most typical machine learning tasks like, for instance, data annotation.

There is another area where machine learning and symbolic get combined or hybridized for the best results and that is when you launch experiments. Launching experiments means that you’re literally launching the machine learning algorithm to learn from your training set. Well just to make a first distinction, so training sets are that part of the data set that you’re going to use to be training the machine learning model and then test sets are the gold standard that you’re going to be using for testing the quality of your models.

Of course you need both training sets and test sets to be manually annotated. So when you launch an experiment, you’re literally choosing a specific machine learning algorithm to learn from your training set, creating a model, and then testing the model against the test set to measure the quality of the actual model’s capability to generate accurate results in terms of, in this case, categorization.

There are multiple algorithms you can leverage in the platform. In this case we’re going to dive deeper into Auto-ML. Auto-ML basically covers most mainstream machine learning algorithms. So these are all very popular mainstream machine learning algorithms.

There is a new addition to the families of machine learning algorithms, and that is Passive Aggressive. Passive Aggressive was released in the latest version of the platform, and it’s an interesting algorithm because it learns by checking, for each one of the samples, which category the model predicts for a specific document. If the category that the model outputs is correct, then it will have a passive approach, meaning that it will not learn more from that document because the prediction, so the category, is correct. In aggressive mode, the machine learning will identify those documents where the category that is being predicted is at [inaudible 00:13:20] wrong and then it will use that data to learn how best to predict that category.
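A minimal sketch of the passive and aggressive steps, assuming the textbook binary Passive-Aggressive update (the PA-I variant) rather than whatever the platform runs internally. The function names, the toy samples and the cap C are assumptions for illustration. Note that in the textbook version the passive case requires the prediction to be correct with a margin, not just correct.

```python
def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))

def pa_update(w, x, y, C=1.0):
    """One PA-I step for a label y in {-1, +1}. Returns (weights, updated?)."""
    loss = max(0.0, 1.0 - y * dot(w, x))   # hinge loss on this sample
    if loss == 0.0:
        return w, False                     # passive: correct with margin, no change
    norm_sq = dot(x, x)
    if norm_sq == 0.0:
        return w, False                     # nothing to learn from an all-zero vector
    tau = min(C, loss / norm_sq)            # aggressive: step size, capped by C
    return [wi + tau * y * xi for wi, xi in zip(w, x)], True

# Toy run: two samples force an update, the third is already handled.
w = [0.0, 0.0]
samples = [([1.0, 0.0], 1), ([0.0, 1.0], -1), ([1.0, 0.0], 1)]
updates = []
for x, y in samples:
    w, changed = pa_update(w, x, y)
    updates.append(changed)
```

On this toy run the first two samples trigger aggressive updates and the third one is passive, because the model already classifies it with enough margin.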

So the experiments that you can launch are as… Launching an experiment is actually as easy as following a wizard, the same wizard that you see over here. Every time that you choose one algorithm, you actually have settings for the problem definition. For instance, if you are working on a binary classification task, you can prompt a single label so that you’re not going to be multi label or if you’re working on a multi-class categorization model, then you can still have one label generated for each one of the documents.

You have this feature space. So feature space is really where you go and dive deeper into the hybridization between symbolic and machine learning. What’s special about hybrid ML, which is our formula for hybridized machine learning, is that basically what we do is that we engineer linguistic features using natural language understanding so that machine learning learns from higher quality data and this has been proven to be critical especially for carbon footprint. So for instance, the computing power… Let’s say the computational cost of the machine learning in this case is decreased and as well it was proved to be much more efficient and accurate with smaller data sets compared to transformers or other technologies.

And this is very important, I think, because if you’ve been working in the industry with NLP trying to automate processes and scaling up enterprises, you know that many times it’s hard to retrieve enough data for machine learning. Now I think that one of the best advantages of Auto-ML is that it really does learn from really rich linguistic data and so it doesn’t require as many documents as you would need if you didn’t have this type of linguistic and features engineering.

And the feature space area in the wizard is really where you go down and choose exactly what type of linguistic features the machine learning needs to learn from. Of course, if you keep it active, the platform will choose it for you, so it will find its perfect recipe and then go ahead and train the model. But then if you want to customize what the machine learning should learn from in terms of linguistics, then you can really narrow down your, let’s say, features extraction in depth.

So then of course you have your hyperparameters area, you have F-beta optimization if you need… Depending on the algorithm you also have parameters, and then the summary section is where you just see the collection of settings that you may have changed and you’re ready to start the experiment.
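For reference, the F-beta score that this optimization setting presumably refers to is the standard weighted harmonic mean of precision and recall. The sketch below is the general formula only; how the platform uses it during optimization is not described in the talk, and the function name is an assumption.

```python
def f_beta(precision, recall, beta=1.0):
    """Standard F-beta: beta > 1 favors recall, beta < 1 favors precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Same model, three different emphases (precision 0.5, recall 1.0).
f_half = f_beta(0.5, 1.0, beta=0.5)
f_one = f_beta(0.5, 1.0, beta=1.0)
f_two = f_beta(0.5, 1.0, beta=2.0)
```

With recall higher than precision, the score grows as beta grows, which is why the choice of beta changes which model looks best.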

So basically training a model was literally transformed into simple operations where you have everything at hand and you just need to modify the way the machine learning learns or shapes the model and click start. There are multiple types of algorithms. We’ve seen this one. There are others like, for instance, online machine learning. Now, online machine learning is a type of machine learning that, instead of going through the whole training set in one pass as usual machine learning would, will literally split the training set into multiple pieces and learn from each batch of data in turn. This is very important when you’re working with very large data sets.

It’s going to make a huge difference, I think, both in the learning process but also in terms of performance. In this case, online machine learning was released in the latest version of the platform, again, the latest winter release. It was out with Passive Aggressive online and stochastic gradient descent online. So these two algorithms are already available and usable.
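The batch-by-batch idea can be sketched with a very small learner. This uses a plain perceptron as a stand-in, not the Passive Aggressive or SGD implementations in the platform; the class name, the `partial_fit` method name and the toy data are assumptions chosen for illustration.

```python
def batches(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

class OnlinePerceptron:
    """Tiny linear classifier that can be trained one batch at a time."""

    def __init__(self, n_features):
        self.w = [0.0] * n_features
        self.seen = 0

    def predict(self, x):
        score = sum(wi * xi for wi, xi in zip(self.w, x))
        return 1 if score >= 0 else -1

    def partial_fit(self, batch):
        # Only the current batch is needed in memory; weights carry over.
        for x, y in batch:
            if self.predict(x) != y:
                self.w = [wi + y * xi for wi, xi in zip(self.w, x)]
            self.seen += 1

training_set = [([1.0, 0.0], 1), ([0.0, 1.0], -1),
                ([2.0, 0.5], 1), ([0.5, 2.0], -1)]
model = OnlinePerceptron(n_features=2)
for batch in batches(training_set, size=2):
    model.partial_fit(batch)
```

The point of the design is that the model state is the only thing kept between batches, so the full data set never has to be loaded at once.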

Now the other two types of algorithms are very interesting. These are explainable algorithms. What does it mean? Basically in this case, the hybridization is performed by still doing the linguistic features extraction, but then in this case we’re not generating a machine learning model, we’re using machine learning in combination with the symbolic NL analysis to generate a model that is made of rules. So let’s say it’s a 100% symbolic model that is fully explainable. It’s actually editable, which is perfect for data shifts, and you can create it in the platform, download it, and then use a special tool that belongs to the platform, called Studio, to open up the model and either modify it or integrate it with more rules. If you know how to create these linguistic rules, basically you have a wide variety of options there.

Bootstrap follows the same exact criteria. The only difference is that you would choose Bootstrap if you have an already annotated data set and your idea is that you want to speed up the process of developing models, so you use this approach to quickly create a rules model with the help of machine learning and then use that as a baseline where you add more rules on top, as if you wanted to speed up the early stages of building the rules, and then you’re off fine-tuning your model.

We’re not going to wait for the model to finish the creation. So just to show you what the final outcomes would be, basically you land on this page where you have multiple standpoints in terms of quality: you have micro average; you have macro average, which is perfect when you’re dealing with imbalanced data, especially for spotting those classes that are not necessarily performing too well; sample average; and weighted average. Then you have a closer look at how each one of the classes is performing, and if you want to dive deeper and see each one of the documents, then it’s just the next step.
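To see why the macro average is useful on imbalanced data, here is a small sketch that computes micro, macro and weighted F1 for a single-label task using the standard definitions. The label names and the toy predictions are made up, and the sample average, which applies to multi-label output, is left out.

```python
def per_class_counts(gold, pred, labels):
    """Count (true positives, false positives, false negatives) per label."""
    counts = {}
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        counts[label] = (tp, fp, fn)
    return counts

def f1(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def averages(gold, pred):
    labels = sorted(set(gold) | set(pred))
    counts = per_class_counts(gold, pred, labels)
    per_class = {l: f1(*counts[l]) for l in labels}
    # Macro: every class weighs the same, so weak rare classes stand out.
    macro = sum(per_class.values()) / len(labels)
    # Weighted: each class weighs as much as its share of gold documents.
    weighted = sum(per_class[l] * gold.count(l) for l in labels) / len(gold)
    # Micro: pool all decisions first, then compute one score.
    tp, fp, fn = (sum(c[i] for c in counts.values()) for i in range(3))
    return {"per_class": per_class, "micro": f1(tp, fp, fn),
            "macro": macro, "weighted": weighted}

# Imbalanced toy set: 8 "hr" emails, 2 "ops" emails, one "ops" misclassified.
gold = ["hr"] * 8 + ["ops"] * 2
pred = ["hr"] * 8 + ["hr", "ops"]
scores = averages(gold, pred)
```

Here the micro average stays high because the large class dominates, while the macro average drops and exposes the weaker minority class.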

So usually when you’re building these models, at some point you reach a level where you think you’ve reached the best quality possible, or eventually you’re just satisfied with the quality of the model that you generated.

So what you would usually do in this case, you would switch over to the model tab and then publish the model. Publishing the model is literally deploying the model to a production environment, a runtime environment where you can pipeline it with additional models, combine it and then transform that combination of capabilities and models into an API, so into a service that you can invoke from any type of client service product outside of the platform.

We’re going to see the runtime environment part later. One thing that I’d like to do before we move on is do a little bit of an overview of what we have now in extraction because we have some interesting additions. So regardless of the fact that as you see the UI, it’s basically the same, which is, I think, a great thing because this way you don’t always have to learn everything from scratch, so it makes user experience incredibly smooth and so it’s very easy to use and you don’t have to learn new things.

The approach is identical. Of course, what changes is the way you annotate. So in the platform, the extraction annotation part is made easy by a series of tools that you have available in your dashboard. One is fast mode, in which you can simply click on the class that you want to annotate, you select the text and the item gets annotated right away. In this case I annotated the sender email, but then what if you really wanted to boost this process?

If you’ve been working with extraction projects, you know that the annotation part is extremely slow, extremely complex. You really need subject matter experts in this case, and it may also become a little bit dull. So in the latest version of the platform, we employed something that is called active learning, which is basically weak supervision. What we did is we embedded a machine learning model behind the curtains so that the more you annotate, the more the machine learning learns from your annotations and suggests possible annotations that you may want to apply. So while you are annotating, the system is suggesting your next annotations, and the only thing that you need to do is confirm the annotations that you think are correct and eventually check if there is any other annotation that was missed by the suggestions and that is worth annotating.

For instance, basically you would be here working on our extraction project, selecting the sender email and then the recipient and then maybe the candidate name. In this case, this is an HR recruitment email. As you see, if the candidate name is mentioned multiple times, it will propagate the extraction and then when you reach the annotation of at least 10 documents and at least 50 annotations, the machine learning behind is ready to learn from your annotations. What you do is you click on start now and then what you get at the end of the process is that in the next documents that are not annotated, you get these new annotations that are suggested and so what you are supposed to do is eventually just clicking on them if they are correct. In this case this is correctly a recipient and I simply want to annotate it.

So this makes the process of annotating a cycle. So you’re basically looping through suggestions and the more suggestions you generate and you confirm, the more annotations the machine learning will use to be precise and accurate in providing new suggestions. So basically instead of going through each one of the documents looking for the correct classes and the correct words that you want to annotate entities or other things that you need to annotate, basically you are going to go through this loop that is going to remarkably accelerate your annotation process.
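Two pieces of this loop are easy to sketch: the readiness check (at least 10 annotated documents and at least 50 annotations, as quoted above) and the propagation of an annotation to repeated mentions. The function names and the exact-match propagation are assumptions for illustration; the platform's actual suggestion model is not shown.

```python
import re

MIN_DOCS = 10          # thresholds quoted in the talk
MIN_ANNOTATIONS = 50

def ready_for_suggestions(annotations_per_doc):
    """True once enough documents and annotations exist to train suggestions."""
    annotated_docs = sum(1 for n in annotations_per_doc.values() if n > 0)
    total = sum(annotations_per_doc.values())
    return annotated_docs >= MIN_DOCS and total >= MIN_ANNOTATIONS

def propagate(text, mention, label):
    """Return (start, end, label) for every exact occurrence of `mention`."""
    return [(m.start(), m.end(), label)
            for m in re.finditer(re.escape(mention), text)]

email = "Jane Doe applied. We will interview Jane Doe on Monday."
spans = propagate(email, "Jane Doe", "candidate_name")
```

A real suggestion step would then run a trained model over the unannotated documents and present its predictions for confirmation, which is the part this sketch deliberately leaves out.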

In terms of experiments, we have some new additions in the extraction projects as well. So we have Auto-ML in extraction too. Extraction now counts up to four algorithms. We have CRF, so conditional random field, we have Passive Aggressive sliding window, we have [inaudible 00:24:56] vector machines sliding window, and then stochastic gradient descent sliding window. The SGD sliding window and Passive Aggressive are new to the, let’s say, to the collection. All these algorithms, again, just like in categorization, have their feature space area. You have the machine learning hyperparameters, you have F-beta optimization, specific Auto-ML parameters, and then you’re ready to launch your experiment.
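One common reading of "sliding window" in token-level extraction is that each token is classified from features of itself and its neighbors. The sketch below shows that reading only; the window size, the feature names and the padding token are assumptions, and the talk does not say how the platform builds its windows.

```python
def window_features(tokens, index, size=2, pad="<PAD>"):
    """Features for tokens[index] built from `size` neighbors on each side."""
    feats = {}
    for offset in range(-size, size + 1):
        pos = index + offset
        inside = 0 <= pos < len(tokens)
        feats[f"tok[{offset:+d}]"] = tokens[pos].lower() if inside else pad
    return feats

tokens = ["From", ":", "anna@example.com", "To", ":"]
center = window_features(tokens, 2, size=1)   # window around the email address
edge = window_features(tokens, 0, size=1)     # window at the start of the text
```

A linear classifier such as SGD or Passive Aggressive could then be trained on one such feature dictionary per token, with the class label of the center token as the target.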

The other types of machine learning algorithms also cover the online machine learning. In this case we have Passive Aggressive sliding window in its online version and then stochastic gradient descent sliding window in its online version.

Again, online machine learning is that machine learning that would split large data sets so that it’s just easier for machine learning to learn from multiple batches instead of just going through a huge amount of data, which could possibly be not very efficient in terms of performance but also in terms of learning process. And again, the online type of machine learning also has its own feature space, hyperparameters, F-beta optimization, and then you launch the experiment.

Last but not least, extraction also has its own explainable extraction algorithm and, again, explainable extraction, just like explainable categorization, basically will leverage the symbolic features extraction with machine learning but with the objective of creating a model that is completely explainable because it’s symbolic. So it’s made of rules. So again, this is perfect if you’re kickstarting a project and you want to make it explainable. So one of the KPIs is that the model needs to be explainable or maybe that you are going to be integrating more rules, then this is the perfect place to start. And it’s got, again, the rules generation configurations, features option, specific rules settings, and then you are ready to launch your experiment.

As I previously said, there is no difference in terms of UI between categorization and extraction. And I think it’s a great value [inaudible 00:27:27]. As you see, the quality measurement part of the extraction also looks almost identical to the categorization part, which, again, makes it just easier to evaluate the quality of your models. We have the same metrics, we have the classes down here. They are organized in exactly the same way. You can really narrow down to the classes that are, let’s say, suffering a little bit. So it’s just so much easier to just say, “Okay, I realize that I need to go back and do something for these classes.” And given that the platform is hybrid, there are multiple ways you can do this. You can either work within the machine learning framework, maybe add more data, increase the variety of data that the machine learning can cover to improve the class’s extraction, or eventually you could think of creating a symbolic model to combine with this extraction model so that basically you fill the gap where machine learning falls short.

And so when combining these two models in production with a combined AI, composite AI philosophy, basically you get the best of both worlds. In this case I have already deployed this model to production. So what we can do is go back to the dashboard and then click on workflow. The runtime environment is where you deploy your models. So let’s say you promote them to production, and you can pipeline multiple models together to compose multiple features regardless of the approach or the methodology you developed them with, again, so that you can leverage both machine learning approaches and symbolic approaches. You create specific features that you’re going to be using for processing your data and then eventually integrate this all-around, 360-degree service in your end product, application or client and automate specific processes in an enterprise.

In this case, as I said, we’re focusing on an email routing type of project. So in this case, the classifications that are coming out of this explainable symbolic model for categorization are going to be critical for understanding the intent of the emails and, based on that, routing the emails that are, for instance, about job recruitment to the HR department. Or eventually those that are about oil and gas, pipeline type of operations, route them to the technical team and maybe even provide a specific level of priority. The models I used for this are, again, an explainable categorization model and a machine learning model for extraction. In this case we chose the CRF model that we’ve seen just a few minutes earlier. And then I chose this model for emotional information detection.
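The routing step that sits downstream of the categorization can be as simple as a lookup table. This is a sketch with invented names: the category labels, the team names and the fallback queue are all hypothetical and would come from the real project taxonomy.

```python
# Hypothetical mapping from predicted category to destination team.
ROUTES = {
    "job_recruitment": "hr",
    "pipeline_operations": "technical",
}

def route(categories, default="triage"):
    """Pick the destination for the first category that has a known route."""
    for category in categories:
        if category in ROUTES:
            return ROUTES[category]
    return default   # nothing matched: leave the email for manual triage

hr_target = route(["job_recruitment"])
unknown_target = route(["newsletter"])
```

A priority level could be added the same way, for example by mapping each category to a (team, priority) pair instead of a team name.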

This model here is a really interesting addition to the family of features that are available in the platform. This is called knowledge model. Knowledge models are already developed models inside the platform that focus on specific capability. In this case it’s basically a classification task that is going to be collecting emotional evidence from text and then prompting a category type of output where it just explains what is the emotion that was identified in the input document. These models, so the knowledge models, are already available in the platform. They are out of the box and I think it’s a critical addition as I didn’t have to build this on my own. I didn’t have to create a new project and do it. I didn’t have to choose sentiment analysis if I didn’t want to and I wanted something more sophisticated. All I did is simply select this from the list of knowledge models, insert it in my pipeline and use it.

These knowledge models can even be downloaded and further customized. So this is something really powerful and you would do that with that tool that I mentioned earlier that is called Studio. And that is basically a client application where you can use the symbolic approach to create your own models.

So basically what this workflow is going to do is going to be taking in an email text and then shooting this text to these three models in parallel and then the output is going to be joined together and then passed through this filter that is going to just select the categories from the categorization model, the extractions from the extraction model, the emotions from the emotional traits knowledge model, and then the input email text.
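The fan-out, join and filter shape of this workflow can be sketched with three stub functions standing in for the deployed models. Everything here is an assumption made for illustration: the stubs, their output keys and the `run_workflow` function are invented, and a real workflow would call the published models instead.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Stub "models"; each returns its result plus an extra field to be filtered out.
def categorize(text):
    hit = "interview" in text.lower()
    return {"categories": ["job_recruitment"] if hit else [], "debug": "rules"}

def extract(text):
    emails = re.findall(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", text)
    return {"extractions": emails, "debug": "regex"}

def emotions(text):
    hit = "happy" in text.lower()
    return {"emotions": ["happiness"] if hit else [], "debug": "lexicon"}

KEEP = ("categories", "extractions", "emotions", "text")

def run_workflow(text):
    models = (categorize, extract, emotions)
    # Fan out: send the same text to all three models in parallel.
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        outputs = list(pool.map(lambda model: model(text), models))
    # Join: merge the three outputs together with the input text.
    joined = {"text": text}
    for output in outputs:
        joined.update(output)
    # Filter: keep only the fields the client needs.
    return {key: joined[key] for key in KEEP}

sample = "I am happy to confirm your interview. Please reply to hr@example.com."
response = run_workflow(sample)
```

The filter step matters because each model may return far more than the client needs; keeping an explicit list of fields makes the response stable even if a model's raw output changes.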

So what we can do just simply to test this one out is we head over to the test area. We simply put sample text in here and then click on test workflow, and this is the response that we get. So basically we get that the model that was responsible for the categorization part is generating a job recruitment category. This would be simply the email text, and then we have the emotions detection part that shows happiness, and that is because we have happy inside the email, and then we have multiple extractions like the sender email, the recipient email, as well as the candidate name, the job title, the institutions mentioned in the email and so on.

So this is crucial, as in this case we are really leveraging the best of the available techniques to develop models and deploy them into production. We’re using symbolic models, we’re using machine learning models, we’re using machine learning operations to deploy the models in production, and we’re using a low-code approach to make everything extremely simple, which means that possibly even subject matter experts could start using this technology, the Expert.ai platform and every component that is available, and they can easily adopt the software to create pipelines and models that are tailored and specific to automate processes that can drive decision making and also scale up enterprise processes.

One of the interesting things, and this has been mentioned by Luca and is going to be on the table also in the next meeting in early March, is the possibility of using this hybrid approach of combining multiple models also with technologies like, for instance, GPT-3, so using GPT-3 in conjunction with symbolic models so that you get, again, the best of both worlds in terms of approaches and technologies.

Okay, Luca, I don’t know if there are any questions. Sally, I can’t see if…

Luca Scagliarini:

There were a couple of questions. Thank you. Thank you very much Lorenzo. I hope it was informative. Obviously the set of functionalities of the platform is very rich and it’s built to address practical use cases, so not focusing only on one single frequent use case, right? Customers are using it to deploy 10, 15 different use cases where language understanding is a piece of the workflow.

The two questions that were asked live were both around the integration of additional large language models. So Lorenzo just said it. Please join next week, March 2nd, I think, is the date, when in the next episode of NLP Stream we’re going to look in a very practical and pragmatic way at a workflow that includes different techniques, including leveraging large language models. The integration through our workflow management is already there. So for customers who want to test it, we’ll make it available. But to see some examples, I think the best thing would be to join next week.

So if there aren’t any additional questions, I want to thank everybody who has watched the livestream and also the people, who are becoming more and more every week, who are maybe looking at the recording. If you have questions, you can reach out directly to us, and I’m looking forward to seeing all of you next week. And thank you, Lorenzo, for your long, thorough presentation around the features of the platform, with a focus on the new features just released. Thank you very much.

Lorenzo Musetti:

Thanks for having me.