Fraunhofer-Gesellschaft
January 6, 2026
Paper (Preprint, Research Paper, Review Paper, White Paper, etc.)
Title

ToxiGAN: Toxic Data Augmentation via LLM-Guided Directional Adversarial Generation

Title Supplement
Published on arXiv
Abstract
Augmenting toxic language data in a controllable and class-specific manner is crucial for improving robustness in toxicity classification, yet remains challenging due to limited supervision and distributional skew. We propose ToxiGAN, a class-aware text augmentation framework that combines adversarial generation with semantic guidance from large language models (LLMs). To address common issues in GAN-based augmentation such as mode collapse and semantic drift, ToxiGAN introduces a two-step directional training strategy and leverages LLM-generated neutral texts as semantic ballast. Unlike prior work that treats LLMs as static generators, our approach dynamically selects neutral exemplars to provide balanced guidance. Toxic samples are explicitly optimized to diverge from these exemplars, reinforcing class-specific contrastive signals. Experiments on four hate speech benchmarks show that ToxiGAN achieves the strongest average performance in both macro-F1 and hate-F1, consistently outperforming traditional and LLM-based augmentation methods. Ablation and sensitivity analyses further confirm the benefits of semantic ballast and directional training in enhancing classifier robustness.
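The abstract reports results in macro-F1 and hate-F1. As a reading aid, the following is a minimal sketch of how these two metrics are commonly computed for a binary hate/non-hate classifier. It assumes that "hate-F1" means the F1 score of the hate class alone; the abstract does not define it. The function names, the label encoding (1 = hate, 0 = non-hate), and the toy labels are illustrative assumptions and are not taken from the paper.

```python
from typing import Sequence

HATE = 1      # assumed label encoding, not from the paper
NON_HATE = 0  # assumed label encoding, not from the paper


def f1_for_class(y_true: Sequence[int], y_pred: Sequence[int], cls: int) -> float:
    """F1 score obtained by treating `cls` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    # Return 0.0 when the class never occurs in labels or predictions.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy labels, made up purely for illustration.
y_true = [1, 0, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 0, 0, 1]

hate_f1 = f1_for_class(y_true, y_pred, HATE)
overall_macro_f1 = macro_f1(y_true, y_pred)
print(f"hate-F1:  {hate_f1:.3f}")
print(f"macro-F1: {overall_macro_f1:.3f}")
```

Because macro-F1 weights every class equally, it is sensitive to minority-class performance under the kind of distributional skew the abstract mentions, while the per-class score isolates how well the hate class itself is recognized.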
Author(s)
Li, Peiran
Freie Universität Berlin  
Fillies, Jan
Freie Universität Berlin
Paschke, Adrian  
Freie Universität Berlin  
Open Access
Rights
CC BY 4.0: Creative Commons Attribution
DOI
10.48550/arXiv.2601.03121
10.24406/publica-7194
Language
English
Fraunhofer-Institut für Offene Kommunikationssysteme FOKUS  