Turn Language into Action: A Natural Language Hackathon for Good Kickoff Event

In this livestream, Brian Munz kicks off expert.ai’s hackathon “Turn Language into Action: A Natural Language Hackathon for Good”. Join in for a full presentation of the hackathon!

Learn more about the potential of NLP and test your development skills by building or updating a functioning application that leverages one or several of the three knowledge models embedded in expert.ai’s NL API and brings positive change. Here are a few sample use cases for how to use our knowledge models for the good:

  • Analyze reactions on social media to detect cyber bullying
  • Evaluate content to identify misinformation and fake news
  • Monitor company reputation on social media
  • Improve due diligence and mitigate risks

To participate in the hackathon click here!

Transcript:

Brian Munz:

Hey everybody. Welcome to the NLP Stream, which is our weekly live stream about all things NLP, NLU. And my name is Brian Munz. As usual, I’ll be your host for this one. And actually I’m going to also be presenting again, but before you log out and avoid another session by me, I think you may be interested in the topic I’m going to talk about, which is going to be a little bit different this time where I want to dive into a hackathon that Expert.AI, which is of course the company for whom I work, is running. This is the kickoff, but hopefully it will also be interesting enough to people that aren’t participating or may not be interested in giving it a try at all. It should be interesting to see how NLP and NLU AI might be applicable and interesting to projects where you’re given a lot of room for creativity like this. What I figure I’ll do is just dive right in and share my screen. Hopefully this works.

Brian Munz:

I think this is the right one. Sorry, one second. My trademark fumbling around with technology as a person that works for a technology company. There we go. As I mentioned, today is our official kickoff of our most recent hackathon, which we’re calling Turn Language Into Action and part of this hackathon that we thought would be interesting is there’s lots of different reasons that people hold hackathon… People, companies usually hold hackathons. One of course is being completely open, of course, you want people to use your technology to try out your technology. It’s of course good for exposure and marketing, but it’s also very fun to interact with developers, see how creative they can get, that kind of thing. But one thing we thought was pretty interesting is in the past we had seen with some of our other hackathons that there were projects that were geared towards positive impacts. Not talking about some huge world altering thing, but it was interesting to see how NLP is well suited for improving things or having a positive impact.

Brian Munz:

Nowadays language I think more than ever has been at the center of some bad stuff to put it lightly. Whether it’s disinformation, misinformation, fake news, or just a rise in extremism and just navigating the online landscape of being able to be anonymous online and trolling, all of this stuff, cyber bullying is just all a part of I think our world moving towards this or continuing to move towards this online society and also just other things in the world. I think it actually makes a lot of sense too that NLP NLU can have a huge impact in trying to provide positive things and find ways to prevent some of the worst of that we have to offer from benefiting from language. We decided that it would be good to actually focus a hackathon around this concept of doing a project for good.

Brian Munz:

And what I’m going to do here is I want to first introduce the hackathon itself, what the parameters are, things like that. But I also want to get into what we mean by a solution that can provide good things, give some examples, stuff like that. But as I said, I think some of these concepts are pretty easy to identify, which is one of the reasons we wanted to do this of course. As I said, Turn Language Into Action, what we’ve described it as is to create or enhance an existing web application. Existing? Actually it should be new or existing web application. Sorry about that. With the purpose of highlighting or enabling ways to improve the world using deep natural language processing. That’s just a more complicated way of saying we’d like you to build something using NLP that has some benefit to society or highlights some positive thing or an insight that can be used in this way.

Brian Munz:

As I mentioned starting today, hackathons have morphed into being this online thing. When I first started doing them a long time ago, they were more something where you’d show up on a Friday and college students would stay up for two days and then present something on Sunday. Luckily, we’re not doing that. I’m too old for that and I would not do it. But we’re giving you all about almost two months. And of course it’s a virtual hackathon, but in a lot of other ways I want to retain that spirit, which I’ll talk about later. In terms of requirements, one thing we wanted to focus on is four particular what we call knowledge models, which is basically a language model that Expert offers. And the four we wanted to focus on seem to make sense within the context of this challenge. First one is hate speech detection.

Brian Munz:

This is one where you can go back and watch a live stream that we did on it maybe about a month ago. ESG sentiment, which we also did another live stream on. And these two are relatively new so when you go into our demo site, they’re not even there. And that’s because they’re brand new and we’re interested to see what you can make of them as you’re test driving. But all of the documentation is there, don’t worry about that. Later on I’ll of course show how to reach these APIs. But hate speech detection of course makes a lot of sense in that within the problems that I mentioned earlier, detecting hate speech can prevent all kinds of problems beyond just cyber bullying and things like that and can keep online communities a healthy place. ESG sentiment is also interesting because it’s a very hot topic in some ways where what it does is analyze text to find concepts that are around ESG, which is environmental, social and governance.

Brian Munz:

Sorry, I didn’t get much sleep last night. What it does is also determines the sentiment on each of these topics because these are topics that have become of more interest to investors to ensure that they are investing in companies that are going to be somewhat free from controversy for one thing, but also that they want to know that their money is going towards sustainable companies or ones that are socially conscious, et cetera. But in this way, not to jump ahead, one thing I think would be interesting with any of these projects is ESG is also something where a lot of people are saying it’s a scam or it’s overblown, it’s overused. If that’s the case, let’s see a project where you evaluate the efficacy of it or you evaluate if it’s truly having the impact that it should be. Are companies actually behaving themselves? Et cetera.

Brian Munz:

Within that context, there’s lots of ways you can go with it besides just analyzing text and saying, “We found, the model found these particular elements.” Emotional traits is one that we’ve had for a while. And what that does is it identifies emotions contained within text. It’s different from sentiment in that sentiment, of course, is trying to give a positive or negative sentiment. Emotional traits is going to actually try to find emotions in that text.

Brian Munz:

And then of course we have sentiment. I do want to hop over really quickly to show a little more detail in what I’m talking about. Here we go. We’ve got the recursion thing happening. One very useful place that I wanted to highlight is the try site. As I mentioned, hate speech and ESG are not on here. But what I wanted to point out is two things. One is the way we have our models organized is we have these categories of document analysis, classification and information detection. And within document analysis, you’ll find all of the core capabilities of NLP so just disambiguating the text, finding key elements, entities, et cetera. You’ll also find sentiment in here, it’s very straightforward in that it provides you with a positive or negative value based on the text.

Brian Munz:

What you’ll find within classification, and this is the reason I mention that, is emotional traits. What it’s doing is it’s classifying the text. It’s taking the text, evaluating it, and finding these emotions within the text. You can see here within this sample it’s found satisfaction in the text, excitement, amusement, joy, things like that. And this is obviously a common thing within NLP, it’s just classifying the text. One difference that I wanted to mention because of the hate speech detection and ESG is we also have this concept of just information detection. What this does, and I’ll show an example using our PII model, is it not only will classify the text, it will extract elements from the text. And within the JSON, which you should become very comfortable with, this JSON structure, it’s all within our documentation, which I’ll point to later, but it’s also helpful on the try site to be able to see it.

Brian Munz:

But you’ll see here if I collapse some of this stuff, but also take note that this is all… In all of the calls, this disambiguation piece and the identifying of tokens, entities, that is coming back from the call. But you can see here within extractions. The two main things to look for when you’re working with the APIs is, as I mentioned, it’ll have categories, but also in detectors it’ll have extractions where you can see here it actually found… It’s not just classifying, it’s finding the instance of a telephone number in the text. Telling you where it is. And within… I’ll show hate speech with this really dumb bit of hate speech, I’ll show you what I mean and this is within the platform. This isn’t anything you can try yourself, I just wanted to show it. What this does is, as you can see, it identifies the categories of the text, classism it found because it has the word poor in it, but also the extractions that it found.

Brian Munz:

As I mentioned here, it has the fields and it says here actually it’s cyber-bullying. And why did they identify it? Why was this extracted as cyber-bullying? Because of this text and within the detection models you actually find a bit more useful insights within ESG as well, which has interesting implications for if you were wanting to build for ESG, a brand management thing where a company wants to see how it’s being spoken of in the context of ESG sentiment, you may be able to connect those dots. Could be an interesting project. Let me hop back to the presentation because I feel like I’m going to go long on this one.
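To make the categories and extractions structure described above more concrete, here is a minimal sketch of reading both out of a detector response. The response below is hypothetical and hand-written for illustration; the key names (`data`, `categories`, `extractions`, `fields`, `positions`) are assumptions based on the description in the talk, so check them against the official documentation before relying on them.

```python
from typing import Any

# Hypothetical response, hand-written to mirror the structure described in
# the talk. Key names and values are assumptions, not real API output.
SAMPLE_RESPONSE: dict[str, Any] = {
    "data": {
        "content": "You are poor and worthless.",
        "categories": [
            {"id": "3000", "label": "Classism", "hierarchy": ["Classism"]},
        ],
        "extractions": [
            {
                "template": "Hate_speech_detection",
                "fields": [
                    {
                        "name": "cyber_bullying",
                        "value": "poor and worthless",
                        "positions": [{"start": 8, "end": 26}],
                    }
                ],
            }
        ],
    }
}


def category_labels(response: dict[str, Any]) -> list[str]:
    """Return the label of every category the model assigned to the text."""
    return [c["label"] for c in response.get("data", {}).get("categories", [])]


def extracted_fields(response: dict[str, Any]) -> list[tuple[str, str, str]]:
    """Return (field name, field value, matching source text) triples.

    The source text is sliced out of the original content using the
    character offsets, which shows *where* the detector found the instance.
    """
    data = response.get("data", {})
    content = data.get("content", "")
    found = []
    for extraction in data.get("extractions", []):
        for field in extraction.get("fields", []):
            for pos in field.get("positions", []):
                snippet = content[pos["start"]:pos["end"]]
                found.append((field["name"], field["value"], snippet))
    return found


print(category_labels(SAMPLE_RESPONSE))
print(extracted_fields(SAMPLE_RESPONSE))
```

Keeping the two helpers separate reflects the distinction made in the talk: categories say what kind of text it is, while extractions point at the exact span that triggered the detection.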

Brian Munz:

The next requirement is when the project is submitted, there’s a demo video. This is to take the place of… In most hackathons there’s a big presentation at the end where people show their solution. This is pretty much to take the place of that, it doesn’t have to be some Tarantino epic. Just record a screen share, something like that. Also, we’d like to see the code, and the mechanics of that we’ll share at a later time. But we want to see how within the code you leverage our technology and other technologies, and that’ll be shared through GitHub or whatever repository you use. And then a write up, and the parameters of that will also be shared. It’ll pretty much just be the details of what you’ve shown. And then the last, this is optional, where if you have made the project live somewhere, if you decided to host it through something, a URL to that project. Of course that’s not mandatory but it’s definitely a bonus.

Brian Munz:

And hopefully this isn’t the only reason people would be participating, but we are providing some pretty nice prizes. First place is 5,000 US dollars and promotion of your project within a press release, and you have a meeting with the Expert.AI team. We will feature you within one of these live streams where if you want, you can present your solution and promote yourself however you want. And also on LinkedIn and our YouTube channel, which is where the live stream goes. And then lastly, a blog post featuring your project. Some of these go across first, second and third. Second place is $2,500. You also get to promote within a press release, you can also talk on the livestream and get a blog post. And then third place is $1,000, which is still not bad for third place and you get to promote your project in a blog post. One thing I also wanted to point out was these honorable mentions. What we wanted to do here is call out projects that specifically excelled in particular categories.

Brian Munz:

Even if you didn’t win first, second or third, you may be able to get one of these honorable mentions so this is best projects. We have three. Best project using hate speech detection, best using ESG and then the best using emotional traits and sentiment analysis. And what this could mean, of course, you may be wondering, if it’s already the first place project, is it automatically going to be one of these honorable mentions? And I’d say not necessarily because a first place project could be one where the hate speech detector, for example, was used in a very out-of-the-box way. It was connected as part of a larger solution but the solution itself is incredible. It seems like a fully fleshed out idea and project and is the most impressive one.

Brian Munz:

If there’s then a project though that is very specific to hate speech detection and found something very interesting, we may reward that with this $500 in the honorable mentions. And that is going to be the judging criteria, which I’ll post later in the group that I’ll mention, is going to not just be of course how impressive the project is, but it’ll be a combination of basically creativity, wow factor, the project itself just overall of how well executed it was and if it did what it said basically and how well it did it. And then also lastly would be how you use the APIs. Those are the three factors that we’ll use in judging.

Brian Munz:

What I wanted to do is point out a few things in terms of example projects, because some things come to mind pretty quickly but I see two different types of projects in our previous hackathons and hackathons in general and that’s tools and insights. And I’ll explain what I mean by insights in a minute, but tools, what I mean is an application so you’re providing a service of sorts where for example, if it’s a bot, that’s a tool that is used to identify hate speech for example. It’s basically like an application or project website. Something that is used. Some examples of that are manipulation, deception detection and company reputation monitoring as I mentioned with ESG. One interesting thing would be predicting and identifying disaster. And in previous hackathons that I did years ago, we worked with some charities where they wanted to be able to basically listen on some social media or communications and as certain words or certain sentiment things changed to identify if something bad was happening in a certain nation to be able to respond to it the fastest.

Brian Munz:

And determining what ways this could happen and what would trigger these ways of identifying it, those kinds of things would be interesting. And along those lines too would be diplomacy and national relations. Another interesting need I had heard of from a charity before was, as weird as it sounds, there’s many nations on this planet and they see the communications the way a lot of us do and miscommunications happen all the time. It can cause different problems. And determining it could be what is the focus of these nations in their public communications? Are they focused on climate change? Do they not care? Does there seem to be radicalism bubbling up, et cetera? And then of course a very easy one to wrap your head around, which is prevention of violence and cyber-bullying. But what I want to point out with this is don’t feel the need to reinvent the wheel or come up with some crazy idea that’s going to blow everybody’s minds.

Brian Munz:

That would be amazing but you could just create something similar to what I showed in my hate speech presentation that I did with Valentina where I had a bot on a Discord server where when somebody would say hate speech, it would just pop up a message saying don’t do that basically. You can take a pretty simple concept and maybe you just do it better than we’ve seen before or maybe it works better in some way, maybe you tweak it so you don’t have to feel the need to come up with some completely unique idea. Just applying these things to different interesting use cases is going to be valuable in and of itself. Quickly, what I wanted to point out though was this idea of insights, and I’ll show examples of both of these in a minute, but I think it also could be especially useful just using these models and using NLP to do basically some journalism or data analysis.
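The bot idea described above boils down to one decision: given a chat message and whatever categories a hate speech detector returns for it, reply with a warning or stay silent. The sketch below isolates that decision so it can be wired into any chat platform. The `keyword_stub_detector` is a hypothetical stand-in for a real API call and only matches a single keyword; it is not how the actual model works.

```python
from typing import Callable, Optional

# A detector takes message text and returns the hate speech category labels
# it found (an empty list when the message is fine).
Detector = Callable[[str], list[str]]


def moderation_reply(message: str, detect: Detector) -> Optional[str]:
    """Return a warning to post in the channel, or None if nothing was flagged."""
    categories = detect(message)
    if not categories:
        return None
    return "Please don't do that. Flagged for: " + ", ".join(sorted(categories))


def keyword_stub_detector(message: str) -> list[str]:
    """Hypothetical stand-in for a real hate speech API call (keyword match only)."""
    flagged = {"poor": "Classism"}
    words = {w.strip(".,!?").lower() for w in message.split()}
    return [label for word, label in flagged.items() if word in words]


print(moderation_reply("Nice game!", keyword_stub_detector))
print(moderation_reply("You are poor.", keyword_stub_detector))
```

Passing the detector in as a function keeps the reply logic testable without network access, and swapping the stub for a real API client should not require touching `moderation_reply`.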

Brian Munz:

And for example, as we know, there are disinformation campaigns that are done by different groups that surround certain events. When elections come up, on Twitter you see this huge influx of misinformation, disinformation. And identifying trends or different things that you may be able to find using NLP and data analytics could be interesting just to… Again, you’re not providing a tool or an application, but the project could be some insight that you found that is interesting. And I just want to encourage everyone, don’t be afraid to take a risk on an idea you might have of a way that NLP could be useful in things like this. As I mentioned, you could identify trends, analyzing language on social media. You could have something where it’s just, this is what we found about the way people speak on Reddit or these subreddits or in regard to this particular thing, which could just be interesting in and of itself, almost like a journalistic activity.
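For an insight-style project like the one described above, much of the work is aggregation: classify many posts, then look at how the categories shift over time. A minimal sketch, assuming each post has already been classified and reduced to a date string plus a list of category labels (the input format and the sample data are invented for illustration):

```python
from collections import Counter, defaultdict
from typing import Iterable


def category_trend(
    labeled_posts: Iterable[tuple[str, list[str]]],
) -> dict[str, dict[str, int]]:
    """Count category labels per day.

    labeled_posts yields (date as 'YYYY-MM-DD', category labels) pairs.
    The result maps each date, in sorted order, to its label counts.
    """
    per_day: dict[str, Counter] = defaultdict(Counter)
    for day, labels in labeled_posts:
        per_day[day].update(labels)
    return {day: dict(counts) for day, counts in sorted(per_day.items())}


# Invented sample data, only to show the shape of the input and output.
posts = [
    ("2022-05-02", ["Classism", "Cyberbullying"]),
    ("2022-05-01", ["Classism"]),
    ("2022-05-02", ["Cyberbullying"]),
]
print(category_trend(posts))
```

ISO-formatted dates sort correctly as plain strings, which is why the sketch can order the days without parsing them.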

Brian Munz:

What I want to do quickly here is show some examples because as I mentioned, we’ve done hackathons in the past, and of course I’m going to highlight ones where they had a beneficial aspect to it, a social justice related thing perhaps. But one very cool project from our very first hackathon was called Legal Expert. And this I thought was interesting because what it does is it addresses this problem that this group identified back in their home country of India, there were these farmers being taken advantage of by these other companies where they would pretty much get these farmers to sign on for terms that were not beneficial to them.

Brian Munz:

And part of it was there’s a language barrier and they were putting pressure, et cetera. What they wanted to do was build a solution where the contracts would come in and through a combination of our technology, other technologies, translation API from Azure, they would identify the concepts within these contracts so that people can know, they would translate it and they would use the Expert.AI API to pull out these entities so that they can categorize them, basically manage this process better and hopefully prevent these farmers from being defrauded.

Brian Munz:

Something like that is very specific and a very specific problem to yourself is very interesting and I think a really good use case for this. Of course, this doesn’t have any application to the four models that we’ve talked about, but I do want to make sure to be very clear that while we did call out those four knowledge models, you can feel free to use any of the other ones as well. It’s just that we want to make sure that those are included. That’s all. Another very interesting project that came out of our last hackathon is there’s this company called Bywire who is an independent news network and what they really wanted to focus on just in their company, this is a live site, their existing company, they have a focus on trying to identify trustworthy news and verified news sources.

Brian Munz:

And what came out of this hackathon actually is… If I open up one of these articles, you can see where it says verified here. And what they actually have done is they use our technology now based on some of their proof of concepts and things that they did for the last hackathon, but they use it as part of a larger algorithm to determine the trustworthiness of the news source. And this is of course a very interesting use case and applies as well to this hackathon and, again, you don’t have to come up with a new concept. You could still try to do fake news identification but maybe you have an interesting twist on it, maybe you can apply it to a place that it hasn’t been applied before. And lastly, in terms of the insight aspect of it, where like I said it could be for doing research or something like that. This project in our last hackathon evaluated sentiment and opinion mining as it related to the 2021 Israel Palestine crisis to see what changes there were in the language as the situation escalated.

Brian Munz:

And focusing on an event like that or a use case where language was at the center of how things maybe escalated. There’s of course been some controversial stuff with media companies where they’ve let certain rhetoric go on for too long that have resulted in some situations that were very bad. Those are opportunities as well and while I’m on this site, I’m going to share all these links in a second, but this DevPost site when you’re on here for the landing of this hackathon, make sure to check out our old hackathons if you want to get ideas and inspiration. A lot of these projects even share their GitHub repository so you could even potentially use one as a starting point. It’s The Open Source Way. Definitely go there. And on that note, I do want to jump back to this really quickly.

Brian Munz:

I don’t want to go too long so I’m trying to keep it short. What I want to point out lastly is some very important links and to use this as a way to give a little bit of guidance and advice but one of the big ones of course is the DevPost landing page. This is the home base for the hackathon itself. If this is annoying for you to write down, we’ll be sharing all of these links on this community group. Not to jump down, but this is something I really want to harp on, is within the Expert.AI community, we have a group set up for this particular hackathon. Please join that because I will post all of these links. I will also be posting helpful bits of code, pointing you to different places you can go to find frameworks, things that we offer.

Brian Munz:

And of course you’ll be able to ask any question that you have there. But perhaps second most important is this, the developer portal. This is where you’re going to have to go to actually register to use the APIs. You definitely need this and go in there and get an account and then you’ll be good to go. And in order to get started you can go to our documentation, definitely go there. As I mentioned, hate speech and ESG are brand new and we do have documentation for them, which will be very important in navigating all the aspects of it. We will usually have the taxonomy of the categorizations in there so you can get an eye on that. Again, I want to point out the community group, definitely join that because that’s where you’ll be able to find us in asking questions.

Brian Munz:

And very importantly too is the Expert.AI GitHub area. Let me just head over there quickly because what we have here is SDKs for Python, for Node.js and for Java where this will be an easy way where if you decide to build a project in Python, this makes it very easy to get up and running quickly with our APIs. Everything is documented on here so that’ll be good. And if you find… I want to point out also within this hackathon that one of the objectives too is to gain feedback. Feel free if you are struggling or something’s not working, you can’t find something, to reach out to us through the community, through email, whatever it might be. On the DevPost landing, you can find all this stuff. Don’t worry about that one. And lastly, just a quick little helpful thing will be, I wanted to talk about the data aspect of it. If we’re doing NLP projects, there’s always the question of where you’re going to get the text.

Brian Munz:

Of course most people will use APIs if they’re connecting to Reddit for example. They have an API that you can use to actually bring the text in and ingest it. But if you’re doing something that is not in those particular use cases, you can find a lot of sample data sets on Kaggle. If you’re at all knowledgeable within the NLP world, you probably already know about this, but you can find a lot of interesting data sets and a lot of times they already have, if it’s about sentiment, they have scores that you could compare your project to, things like that. It’s definitely a good resource. But I also want to encourage everyone to, even though saying all of that to be creative and if you don’t have the particular data, if it’s for a more pie in the sky or just vague use case of, for example, communications or where you’re just assuming that it’s happening between two people, you can come up with this yourself if you want just to prove out.
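The point above about data sets that already carry sentiment scores suggests a simple sanity check: map your model's numeric sentiment to a polarity label and measure how often it agrees with the data set. The sketch below assumes a signed score where positive numbers mean positive sentiment, and a data set labelled with the strings "positive", "negative" and "neutral"; both are assumptions, so adjust them to the API response and the data set you actually use.

```python
from typing import Sequence


def polarity(score: float, neutral_band: float = 0.0) -> str:
    """Map a signed sentiment score to a label.

    Scores within +/- neutral_band of zero count as neutral.
    """
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


def agreement(scores: Sequence[float], labels: Sequence[str]) -> float:
    """Fraction of items where the score's polarity matches the data set label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        return 0.0
    matches = sum(polarity(s) == label for s, label in zip(scores, labels))
    return matches / len(scores)


# Invented scores and labels, only to show usage.
model_scores = [12.5, -40.0, 3.0, -1.0]
dataset_labels = ["positive", "negative", "negative", "negative"]
print(agreement(model_scores, dataset_labels))
```

Plain agreement is a blunt measure; if the data set is heavily skewed toward one label, a per-label breakdown will tell you more.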

Brian Munz:

Of course, if you use it for real life text, that works as well. But there are definitely cases where it’s going to be hard to find data. Feel free to be creative. You’re not building a product that you’re just going to put out at the end of this. It should be a place where you can really try things out, explore different possibilities and just push the limits however you need to do that. And however complete it ends up being, as long as you’re showing the purpose behind it and that the idea is good and it’s executed well then you’ll have as good of a chance as somebody who has a full complete project.

Brian Munz:

With that, I’m very excited to start this. I look forward to talking to you over in the community group or you can reach me through email, which is… I should have had one here, but I don’t. [email protected]. But I’m very excited to see what everyone comes up with and if you are not going to participate, I know that you’ll at least be interested to see what everyone comes up with. It’s always very fun to see what these hackathons come up with. And I would encourage you to go to the project pages of our previous hackathons too. It’s always cool to see how people are pushing limits and coming up with new interesting ideas with that. I’m going to stop sharing and oh wow, we have a lot of questions.

Brian Munz:

Team sizes. That’s a really good question. I didn’t call that out. I’m actually trying to remember if we defined a limit for this hackathon. I don’t believe we did. I can actually go over here and see. This is very professional of me, but I don’t believe there’s limitations on team sizes. I know in the past we’ve definitely had several people. Of course the money is not a per person thing.

Brian Munz:

But I will definitely post somewhere if I find out otherwise. But I don’t believe there’s any size limits on teams. We encourage teams and again, if you would like a team and you don’t have one, I would suggest going to our community site and just putting it out there saying, “I’d like to find someone to work on this with.” I think that’s actually all the questions because they were addressed while I was talking. Great. Again, thanks for watching. Next week we’re going to be talking about WebCrow, the multilingual crossword solver, which we had a live stream about that a little while ago, but we’re going to dive into it a bit more and see what’s happening with it. I definitely encourage you to join and hopefully you all have a good week and I will see you next week. Thanks.

 
