2025
Conference Paper
Title
HarmLLaMA: Harmful Language Detection with Large Language Models
Abstract
Online platforms are complex systems that shape the commercial, social, and political environment, hosting debates on important real-life topics, e.g., health, emigration, elections, and climate change. These online environments offer users freedom of expression through anonymous posting. Alongside its obvious advantages, some users abuse this freedom to spread harmful content, e.g., misinformation, propaganda, harmful conspiracy theories, or abusive, aggressive, and offensive speech. Automated detection techniques can effectively reduce the negative influence of the antisocial behavior employed by these malicious actors. In this article, we propose HarmLLaMA, a LLaMA 2 model fine-tuned with LoRA. The experimental results on two real-world datasets show that HarmLLaMA outperforms current state-of-the-art models in terms of Accuracy, Precision, Recall, and F1-Score.
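For context, the LoRA technique named in the abstract keeps each pretrained weight matrix W frozen and learns only a low-rank correction BA, so the adapted layer computes y = Wx + (α/r)BAx. The following NumPy sketch illustrates the idea with hypothetical dimensions; it is a conceptual illustration, not the authors' implementation or training code:

```python
# Conceptual sketch of a LoRA (Low-Rank Adaptation) layer update.
# All dimensions and hyperparameter values here are illustrative.
import numpy as np

rng = np.random.default_rng(0)

d, r = 512, 8    # hidden size d, low rank r (r << d)
alpha = 16       # LoRA scaling hyperparameter

W = rng.standard_normal((d, d))          # frozen pretrained weight
A = rng.standard_normal((r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection (zero-init)

def lora_forward(x):
    # y = W x + (alpha / r) * B (A x): base output plus low-rank correction
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d)
# With B zero-initialized, the adapted layer matches the frozen base layer.
assert np.allclose(lora_forward(x), W @ x)

# Only A and B are trained: 2*d*r parameters instead of d*d per matrix.
full_params = d * d
lora_params = A.size + B.size
print(f"trainable fraction: {lora_params / full_params:.3f}")
```

Because only the small A and B matrices receive gradients, fine-tuning a large model such as LLaMA 2 becomes feasible with a fraction of the memory a full fine-tune would require.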
Author(s)