2025
Conference Paper
Title
Improving Language Model Performance by Training on Prototypical Contradictions
Abstract
We present an informed approach to augmenting existing contradiction detection datasets with prototypical examples for language model training. The samples are created by combining linguistic knowledge with the generative capabilities of current large language models. Specifically, we investigate three approaches: rule-based augmentation, data generation using GPT models with few-shot prompting, and a combination of both. We find that adding prototypical samples to the training data helps to significantly reduce the training set size while maintaining or even improving performance on the downstream task.
Author(s)
Freischlad, Marie-Christin