Fraunhofer-Gesellschaft

Publica

Here you will find scientific publications from the Fraunhofer Institutes.

FraunhoferSIT at GermEval 2019: Can Machines Distinguish Between Offensive Language and Hate Speech? Towards a Fine-Grained Classification

 
Authors: Vogel, Inna; Regev, Roey

Gesellschaft für Sprachtechnologie & Computerlinguistik -GSCL-:
15th Conference on Natural Language Processing, KONVENS 2019. Proceedings. Online resource : October 9-11, 2019, Erlangen
Erlangen, 2019
https://corpora.linguistik.uni-erlangen.de/data/konvens/proceedings/
pp. 377-381
Conference on Natural Language Processing (KONVENS) <15, 2019, Erlangen>
English
Conference paper, electronic publication
Fraunhofer SIT

Abstract
In this paper, we describe the Fraunhofer-SIT submission for the “GermEval 2019 – Shared Task on the Identification of Offensive Language”. We participated in two subtasks. Task 1 is a binary classification of German tweets as offensive or not offensive. Task 2 is a fine-grained classification that distinguishes between three subcategories of offensive language. Our best model is an SVM classifier based on TF-IDF character n-gram features. Our submitted runs in the shared task are Fraunhofer-SIT coarse [1-3].txt for task 1 and FraunhoferSIT fine [1-3].txt for task 2. Our final system reaches a macro-average F1-score of 0.70 for the binary classification and an F1-score of 0.46 for the fine-grained classification. These results show that the problem of automatically distinguishing between offensive language and “Hate Speech” is far from solved.
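
The macro-average F1-score reported in the abstract can be illustrated with a minimal sketch. It assumes the common definition, the unweighted mean of the per-class F1 scores; the shared task's official scorer may define the average differently. The function name `macro_f1`, the label strings, and the toy predictions are invented for illustration and are not taken from the paper or its data.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all classes seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the true class but was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels for a binary (coarse) setting
gold = ["OFFENSE", "OTHER", "OTHER", "OFFENSE", "OTHER"]
pred = ["OFFENSE", "OTHER", "OFFENSE", "OTHER", "OTHER"]
score = macro_f1(gold, pred)
print(round(score, 4))
```

Because every class contributes equally to the mean, a rare class that is classified poorly lowers the score as much as a frequent one would, which is one reason a fine-grained task with small subcategories tends to score lower than the binary task.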

URL: http://publica.fraunhofer.de/dokumente/N-572417.html