Multilingual Hate Speech Modeling by Leveraging Inter-Annotator Disagreement

Patricia-Carla Grigor, Petra Kralj Novak and Bojan Evkoski

Abstract
As social media usage increases, so does the volume of toxic content on these platforms, motivating the Machine Learning (ML) community to focus on automating hate speech detection. While modern ML algorithms are known to provide nearly human-like results on a variety of downstream Natural Language Processing (NLP) tasks, the classification of hate speech remains an open challenge, partly due to its subjective annotation, which often leads to disagreement between annotators. This paper adopts a perspectivist approach that embraces subjectivity, leveraging conflicting annotations to enhance model performance in real-world scenarios. A state-of-the-art multilingual language model for hate speech detection is introduced, trained, and evaluated on diamond standard data with metrics that account for disagreement. Various strategies for incorporating disagreement are compared in the process. The results demonstrate that the model performs as well as or better than the respective monolingual models on all evaluated languages and drastically outperforms them on multilingual data. This highlights the effectiveness of multilingual and perspectivist methods in addressing the complexities of hate speech detection. The presented multilingual hate speech detection model is available at: https://huggingface.co/IMSyPP/hate_speech_multilingual.
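For readers who want to try the released checkpoint, the following is a minimal sketch of loading it with the Hugging Face transformers library. The use of AutoTokenizer and AutoModelForSequenceClassification, the classify helper, and the label handling are assumptions about the published model rather than details stated in the abstract.

# Minimal sketch: loading the released multilingual hate speech model.
# Assumes the checkpoint is a standard sequence-classification model
# compatible with the Hugging Face transformers Auto* classes.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "IMSyPP/hate_speech_multilingual"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

def classify(texts):
    """Return per-class probabilities for a batch of texts."""
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)

# Example usage; class names are taken from the model config if provided.
probs = classify(["An example social media comment."])
labels = model.config.id2label
for idx, p in enumerate(probs[0].tolist()):
    print(f"{labels.get(idx, idx)}: {p:.3f}")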