
Words Matter: Using NLP to Weaken the Power of Hate

The internet has been beyond transformative in its ability to connect people to information and to each other. In the workplace and in our personal lives, we can share our thoughts and experiences with anyone in the world, at any time. Unfortunately, a byproduct of this freedom is the ability for people to dispense hate, bully, and even threaten violence (often anonymously).

At this especially tense time in our history, it’s more important than ever to fight against these destructive forces in our online communities for the good of ourselves, our families and our society. In this presentation, Brian Munz approaches hate speech from an NLP perspective by exploring what constitutes hateful and violent speech and how it can be detected. Then, he demonstrates how hate speech can be monitored on a social platform, using Discord and an expert.ai knowledge model.

Video Transcript:

Brian:

Hey, everyone. Welcome again to the NLP Stream. I am Brian Munz and hopefully my connection isn’t too slow. Okay. I might be having connectivity problems, but I hope not.

Brian:

But I am the product manager at expert.ai, and this is a weekly live stream we have where we talk about all things NLP, just to show you interesting things in the world, and this week is especially interesting. We have Valentina Nieddu, who is going to talk to us about some stuff that we’ve been working on at expert.ai and a project she led. But right off the top, I wanted to quickly mention, as you may have noticed when you clicked on this, that the topic of this week is hate speech. We of course try to steer away from anything too outrageous, but just fair warning that we are talking about something that is terrible by its very nature and offensive. But we wanted to talk about this in terms of NLP and how these kinds of things are detected, especially as it’s become more and more of a problem in the world today. Without further ado, Valentina, take it away.

Valentina:

Hello, and thank you for joining us. As Brian said, my name is Valentina, and I’m one of the R&D knowledge engineers who developed the hate speech knowledge model. Today I’m going to provide you with some information about this project. Now let me share my screen so that we can start.

Brian:

There we go.

Valentina:

Yes. Perfect. So first things first, I would like to go through one of the many definitions of hate speech that you can find on institutional websites. This one in particular comes from the United Nations Strategy and Plan of Action on Hate Speech, which basically defines it as any kind of communication that uses abusive or discriminatory language with reference to an individual or a social group on the basis of some identity factors like ethnicity, religion, sexual orientation, gender, and so on and so forth.

Valentina:

And of course this kind of communication comes from and generates intolerance and hatred, and can also be demeaning and divisive. Despite the lack of accurate figures about hate speech incidents, many institutions like the EU Parliament are reporting a sharp increase in this phenomenon. That’s why we were particularly motivated to address this issue using expert.ai language technology.

Valentina:

But hate speech is not only one of the main problems of today’s digital world, it is also difficult to tackle, for several reasons. One of these is that different countries and international institutions have adopted different definitions and enforced different solutions for tackling hate speech. For this reason we adopted a linguistic approach for our knowledge model. More specifically, we conducted discourse analysis research, and for those who are not familiar with linguistics or the humanities in general, discourse analysis is the study of language in use and its social context. It is usually referred to as the study of language beyond the sentence.

Valentina:

So focusing on the structure of the hate speech knowledge model, we can say that it includes two complementary modules. The first one is for document classification and relies on the hate speech taxonomy that we will see in a minute, while the second one extracts full instances of hate speech contained in the text together with their target. In order to build a comprehensive taxonomy for the classification model, during our research we focused on the main purposes of hate speech, and we could identify these three.

Valentina:

The first one is to insult someone personally. This kind of abusive language usually contains ad hominem attacks, and it is extremely context related. In theory, it is addressed to a specific individual or a small group of people with no significant discrimination history, for example, fans of a public personality or supporters of an association. But of course the real world is way more complex than theory, and there can be exceptions, as many [inaudible 00:05:34] have a discriminatory root. Although most of the time users are not aware of that, in these cases, on top of insulting someone, hate speech may have a further aim, which is to discriminate.

Valentina:

Discrimination, which is the second main purpose of hate speech, aims at targeting social groups at different escalating levels. We can quickly go through them thanks to the pyramid of hate by the Anti-Defamation League, which you can see in the picture that I’m showing. As you can see, hate speech is featured at different levels, starting from the bottom, biased attitudes, where we have stereotypes, to acts of bias, where we can see, for example, non-inclusive language, insensitive remarks, name-calling, ridicule, bullying and slurs, up to threats and [inaudible 00:06:26] motivated violence.

Valentina:

Threatening and inciting violence is the third purpose of hate speech, and it is likely to lead to violent actions in the offline world. This is one of the reasons why it is important to prevent it, or at least detect it as soon as possible.

Valentina:

So on top of these three main purposes, which are the main nodes, the hate speech taxonomy also includes the seven most common kinds of discrimination as subcategories: racism, sexism, ableism, religious hatred, homophobia, classism, and body shaming. In order to classify text as such, we developed symbolic rules based on relevant linguistic patterns. Besides customization, another advantage of the symbolic approach is its explainability, because, for example, I can show you some of the linguistic features that we took into consideration when coding these rules.
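To make the structure concrete, the taxonomy described here can be pictured as a small two-level tree: three main purposes, with discrimination broken down into seven subcategories. The sketch below is only an illustrative restatement in Python, not the knowledge model’s internal representation.

```python
# Illustrative restatement of the hate speech taxonomy described above;
# this is not the knowledge model's actual internal representation.
HATE_SPEECH_TAXONOMY = {
    "personal_insult": [],
    "discrimination": [
        "racism",
        "sexism",
        "ableism",
        "religious_hatred",
        "homophobia",
        "classism",
        "body_shaming",
    ],
    "threatening_and_inciting_violence": [],
}
```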

Valentina:

During the discourse analysis, we identified some recurrent characteristics, and here you can find just a few of them, starting from an informal register made of slang and misspelled words, like frequently used verbs, but especially slurs, because people posting hate speech don’t want it to be detected by social media algorithms.

Valentina:

So the main slurs are often disguised on purpose, by replacing a letter with a number, for example. But we also found many cases in which violent verbs or offensive nouns and adjectives were replaced by emojis. So we used these special characters in place of some parts of speech when writing dedicated symbolic rules.
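As a rough illustration of what handling those disguises can involve, here is a minimal Python sketch that undoes common letter-to-number substitutions and maps a couple of emojis back to words before matching. The substitution table, the emoji mapping, and the normalize function are illustrative assumptions, not part of the expert.ai model.

```python
# A minimal sketch (not the expert.ai implementation) of normalizing
# deliberately disguised words before matching them against a lexicon.
# The substitution table and emoji mapping below are illustrative only.

LEET_MAP = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s"}
)

# Hypothetical mapping of emojis that stand in for violent verbs or offensive nouns.
EMOJI_MAP = {
    "\U0001F52A": "stab",  # knife emoji used in place of a violent verb
    "\U0001F4A3": "bomb",
}

def normalize(text: str) -> str:
    """Undo common letter-to-number substitutions and replace known emojis."""
    text = text.lower().translate(LEET_MAP)
    for emoji, word in EMOJI_MAP.items():
        text = text.replace(emoji, f" {word} ")
    return " ".join(text.split())

print(normalize("I will \U0001F52A you"))  # -> "i will stab you"
```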

Valentina:

Then, widening our scope, we also found full sentences that often lead to discrimination, such as generalizations and stereotypes, as we could see in the pyramid of hate. Thanks to expert.ai’s symbolic language, we could reproduce and collect these patterns, patterns like, for example, “a woman should always” plus some words, or “people with disabilities are usually” plus an adjective, and combine them with specific conditions for detecting more subtle discriminatory content.
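To give a flavor of what such generalization triggers can look like in code, here is a small, hedged sketch using plain regular expressions in Python. These two patterns and the helper function are illustrative assumptions; expert.ai’s symbolic rules are richer and combine triggers like these with further conditions before classifying anything.

```python
import re

# Illustrative generalization triggers only; not expert.ai's rule language.
GENERALIZATION_PATTERNS = [
    re.compile(r"\bwom[ae]n\s+should\s+always\b", re.IGNORECASE),
    re.compile(r"\bpeople\s+with\s+disabilities\s+are\s+usually\b", re.IGNORECASE),
]

def looks_like_generalization(sentence: str) -> bool:
    """Flag sentences containing a stereotyping generalization trigger."""
    return any(pattern.search(sentence) for pattern in GENERALIZATION_PATTERNS)

print(looks_like_generalization("A woman should always stay quiet"))  # True
print(looks_like_generalization("The weather is nice today"))         # False
```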

Valentina:

These symbolic rules also define the extraction features that I was mentioning earlier, for detecting and extracting full instances of discriminatory hate speech, but also, for example, cyberbullying, when the statement contains a personal insult or body shaming against an individual, violent messages, including threats, and sexual harassment. Usually these instances are extracted together with the target when it is explicitly mentioned. Based on the classification output, the standard targets can be an individual, the LGBT community, women, men, religious groups, people with disabilities, and ethnic groups.
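As a sketch of what such an extraction record might contain, the following Python dataclass pairs the offending span with the kind of abuse and the target when one is explicit. The field names and example values are assumptions for illustration, not the model’s actual output format.

```python
from dataclasses import dataclass

@dataclass
class HateSpeechInstance:
    text: str                   # the offending span extracted from the document
    kind: str                   # e.g. "cyberbullying", "violent_message", "sexual_harassment"
    target: str | None = None   # e.g. "women", "ethnic_groups"; None when only implicit

# Hypothetical example of what one extraction might look like.
example = HateSpeechInstance(
    text="go back to your country",
    kind="discriminatory_hate_speech",
    target="ethnic_groups",
)
```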

Valentina:

So, hoping that this short presentation was helpful for understanding the main features of this knowledge model, I leave the floor to Brian, who will show you a practical use case for it.

Brian:

Thanks. Yeah. So I’ll try to share my screen. I think an interesting part of all this is that the use case I want to show is sort of around community moderation. It’s a pretty obvious use case: you’re running an online community of some sort, and you just want to keep everything friendly, without hate, of course, or even negativity. And I think it’s interesting because when I’ve used tools like that in the past, a lot of times they just identify specific words, where of course, anytime a certain word is used it’s going to be bad. But something interesting that Valentina touched on is that within the world of NLP you have more ability to determine context and find linguistic patterns, so it’s not just about a specific word appearing; it’s also about identifying hate speech through the context.

Brian:

So hopefully everyone can see my screen. Is everyone able to see it? It looks like it. Okay. What I wanted to use is Discord, a very common platform used especially among young people to communicate about just about anything. I know it’s very big in the gaming community. If you’re not familiar with it, they and Slack would argue it’s not true, but it’s pretty much Slack. It’s sort of your own personal server where you can communicate, create channels, and things like that. One of the interesting things with Slack and Discord is that they let you create bots, which can be used for all kinds of things, whether that’s adding games into the site itself, funny GIFs, whatever it might be, but they can also offer tools to monitor communities.

Brian:

And so what I did, and I don’t want to get too much into the technical side of it, but it’s fairly easy to build a bot if you have development experience, even if you’re pretty rusty like me. And Discord has these helpful tools where you can install… let’s go back to that… where you can install this bot pretty quickly and get it up and running. Here, I basically copied and pasted code, but essentially it sits here. And when the bot is on, you of course have to name the bot. I named it something very stupid called Civil Drivel. But you name it, and then once the bot is active and alive, it sits there and waits for each message.

Brian:

And when a message comes in, you can see it takes it here and sends it through to the expert.ai platform to run it through the model that Valentina talked about. And once it sees that the response comes back as not empty, it creates a message basically indicating why the message was flagged.
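For readers who want to picture the flow being described, here is a rough sketch of such a bot using the discord.py library. The detect_hate_speech helper is a stand-in for the call to the expert.ai hate speech model; its name, its return shape, and the environment variable used for the bot token are assumptions, not the actual code from the demo.

```python
# A rough sketch of a moderation bot along the lines described above, using
# discord.py. detect_hate_speech() is a placeholder for the call to the
# hate speech model; wire it up to your own NLP backend.
import os
import discord

intents = discord.Intents.default()
intents.message_content = True  # needed to read message text
client = discord.Client(intents=intents)

def detect_hate_speech(text: str) -> list[str]:
    """Placeholder: send `text` to the hate speech model and return the
    list of categories it was flagged for (empty list if it is clean)."""
    raise NotImplementedError("connect this to your hate speech model")

@client.event
async def on_message(message: discord.Message):
    if message.author == client.user:
        return  # ignore the bot's own messages
    categories = detect_hate_speech(message.content)
    if categories:
        await message.channel.send(
            "This comment has been flagged for the following types of "
            "hateful speech: " + ", ".join(categories)
        )

client.run(os.environ["DISCORD_BOT_TOKEN"])  # token variable name is an assumption
```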

Brian:

And then in terms of getting it up and running, it’s very easy as well: you have this little bit to run the code, it’s connected to Discord, and I’ll jump over here. Actually, I don’t know why I did that. We’ll jump over here. So of course, imagine this is kids talking to each other; another very interesting use case is within schools, because I know my kids communicate with their teachers as well as each other on platforms like this for school. So I’ll just put in a few examples.

Brian:

First of all, you can see that if you just say something normal, nothing happens, which is good. But if a person, and I’m trying to find ones that are again more around context, said something that is unfortunately common, telling people to go back to their home country, you can see that it comes back and says this comment has been flagged for the following types of hateful speech: racism.

Brian:

And so that’s a pretty powerful way to actually identify something, where I believe actually… I shouldn’t do this on the fly, but I think… you can even see, based on the context, that telling someone to go back to Mexico isn’t always going to be a bad thing if they’re from there. But within the larger context, because we have more context clues within the linguistic model, we can better determine whether it’s hate speech or not. Something to keep in mind here is that on the expert.ai platform, when you are creating models like this, there’s always going to be a need to tweak it. There may be cases where there are certain things people call each other, all in good fun, I guess, that if your platform wants to allow, you should be able to allow by tweaking it. The main purpose, of course, is for you to monitor and keep everything civil.

Brian:

Another one I wanted to sort of point out goes above and beyond the obvious stuff around racism and things like that: coarse personal insults. So here I’ll put something that people have said about me. I have a very big head. So it picks up personal insults, of course. That’s another thing: it’s easier to detect certain types of hate speech, especially around racism, because there are certain words, of course, that would tip anything off, but stopping something like cyberbullying, which is a big problem, is a very, very powerful use case for this as well. This is a model that we focused particularly on hate speech and cyberbullying, but of course there is violence and things like that. We didn’t make this specifically to be overly sophisticated in detecting threats of violence, but that of course is something that’s very needed nowadays. And you can see, this can also pick up that kind of thing.

Brian:

So if a bully is saying that tomorrow they’re going to basically attack this person and stuff them in a locker… Yeah, that didn’t work. Oh, yeah. So it knows that it’s obviously not a good thing to be locked inside a locker. It didn’t even need me to say anything like “and I hope you can’t breathe” and that kind of stuff.

Brian:

And then another example that Valentina mentioned, which I had nothing to do with: again, some of the linguistic cues of “a woman should always,” or “a woman should” basically anything… It can of course pick up sexism and things like that.

Brian:

So this is a very simple and basic example, but you can imagine that while this is within Discord, if you have a more traditional community, you could scan the text there as well. You could also leave some of it up to users: if something gets flagged, it gets sent back and you can determine whether the person should be penalized in some way on the platform.
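As a closing illustration of that review step, here is a tiny sketch of queuing flagged messages for a human moderator instead of acting on them automatically. The class, the queue, and the handler below are hypothetical names for illustration only.

```python
# Minimal sketch of a human-in-the-loop review queue for flagged messages;
# all names and fields here are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class FlaggedMessage:
    author: str
    text: str
    categories: list[str]

review_queue: list[FlaggedMessage] = []

def handle_flag(author: str, text: str, categories: list[str]) -> None:
    """Queue a flagged message so a moderator can decide on any penalty."""
    review_queue.append(FlaggedMessage(author, text, categories))
```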

Brian:

So I thought that was an interesting use case, but overall… Yeah. Thanks, Valentina. It was very interesting to see how the taxonomy is created. It’s unfortunately a very complex field, because people are always figuring out new ways to hate each other, but using things like this could hopefully protect kids especially from this kind of thing, and maybe help monitor our online communications to keep things focused on what we want them to be. But thanks for sharing your work. It was really interesting.

Valentina:

Thank you.

Brian:

Great. That was a short but sweet one, but hopefully you found it interesting. Next week, you’re going to see more of me because I am presenting next week also, and then I promise I will just be hosting for a while. Next week I’m going to be talking about making the world smaller with NLP and linked data. So if you’ve ever heard of JSON-LD or linked data in general and wondered why it’s anything anyone should care about, I’m going to be talking about that next week. I’ll see you then. And again, thanks Valentina, and I hope everyone has a good week.

Valentina:

Thank you.

 
