Hate is in the air! But where? Introducing an algorithm to detect hate speech in digital microenvironments

metadata, cyber place, hate speech, Twitter, random forest

Journal article

Fernando Miró-Llinares (Crimina Center for the Study and Prevention of Crime at Miguel Hernández University), Asier Moneva (Crimina Center for the Study and Prevention of Crime at Miguel Hernández University), Miriam Esteve (Center of Operations Research at Miguel Hernández University)


To ease and reduce the analysis workload of law enforcement agencies and service providers, the present study introduces a new algorithm designed to detect hate speech messages in cyberspace, using a sample of digital messages (i.e., tweets) sent via Twitter following the June 2017 London Bridge terror attack (N = 200,880). Unlike traditional designs based on semantic and syntactic approaches, the algorithm relies solely on metadata and achieves a high level of precision. Applying the machine learning classification technique Random Forests, our analysis indicates that metadata associated with the interaction and structure of tweets are especially relevant for identifying the content they contain, whereas metadata of Twitter accounts are less useful in the classification process. Collectively, the findings demonstrate how digital microenvironment patterns defined by metadata can be used to create a computer algorithm capable of detecting online hate speech. The application of the algorithm and directions for future research in this area are discussed.
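
To make the idea of classifying tweets from metadata alone more concrete, the following is a minimal sketch of a random forest (bootstrap samples plus a random subset of candidate features at every split) written in plain Python. It is not the authors' implementation. The feature names, the synthetic data, and all parameter values are assumptions chosen for illustration, and the label is tied to one interaction feature by construction, so the importance ranking it prints illustrates the mechanism only and does not reproduce the study's results.

```python
import random

# Hypothetical metadata fields; the study's actual variable list is not given here.
FEATURES = ["n_mentions", "n_hashtags", "n_urls", "account_age_years"]

def make_synthetic(n, rng):
    """Synthetic tweets: the label depends only on n_mentions (an assumption)."""
    X = [[rng.randint(0, 5), rng.randint(0, 5), rng.randint(0, 3), rng.randint(0, 10)]
         for _ in range(n)]
    y = [int(row[0] >= 3) for row in X]
    return X, y

def gini(labels):
    """Gini impurity of a non-empty or empty list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(X, y, feats):
    """Return (gain, feature index, threshold) for the best split, or None."""
    best, parent = None, gini(y)
    for f in feats:
        # Dropping the largest value guarantees both children are non-empty.
        for t in sorted({row[f] for row in X})[:-1]:
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            child = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or parent - child > best[0]:
                best = (parent - child, f, t)
    return best

def build_tree(X, y, rng, depth, importance, max_depth=5, k=2):
    """Grow one tree, sampling k candidate features at every node."""
    leaf = int(2 * sum(y) >= len(y))  # majority label
    if depth == max_depth or len(set(y)) == 1:
        return leaf
    split = best_split(X, y, rng.sample(range(len(FEATURES)), k))
    if split is None:
        return leaf
    gain, f, t = split
    importance[f] += gain * len(y)  # impurity decrease weighted by node size
    left = [i for i, row in enumerate(X) if row[f] <= t]
    right = [i for i, row in enumerate(X) if row[f] > t]
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left],
                       rng, depth + 1, importance, max_depth, k),
            build_tree([X[i] for i in right], [y[i] for i in right],
                       rng, depth + 1, importance, max_depth, k))

def train_forest(X, y, n_trees=25, seed=0):
    """Train trees on bootstrap samples; return the forest and normalized importances."""
    rng = random.Random(seed)
    importance = [0.0] * len(FEATURES)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # bootstrap sample with replacement
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 rng, 0, importance))
    total = sum(importance) or 1.0
    return forest, [v / total for v in importance]

def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are tuples, leaves are ints."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def predict_forest(forest, row):
    """Majority vote over all trees."""
    votes = sum(predict_tree(tree, row) for tree in forest)
    return int(2 * votes > len(forest))

data_rng = random.Random(42)
X_train, y_train = make_synthetic(300, data_rng)
X_test, y_test = make_synthetic(100, data_rng)
forest, importance = train_forest(X_train, y_train)
predictions = [predict_forest(forest, row) for row in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)

print(f"held-out accuracy: {accuracy:.2f}")
for name, value in zip(FEATURES, importance):
    print(f"{name}: {value:.3f}")
```

The per-feature importance values are what allow a statement such as "interaction and structure metadata matter more than account metadata" to be read off a trained forest. A real analysis would more likely rely on an established library (for example, scikit-learn's RandomForestClassifier) and on labeled tweets rather than synthetic rows.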



Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".