Shallow Context Analysis for German Idiom Detection
Paper presented at KONVENS 2021, Session Shared Task on the Disambiguation of German Verbal Idioms, Düsseldorf, Germany, 06-09 September 2021, published on Zenodo
To distinguish between figurative and literal uses of verb-noun combinations in the Shared Task on the Disambiguation of German Verbal Idioms at KONVENS 2021, we apply and extend an approach originally developed for detecting idioms in a dataset of random n-gram samples. Classification is performed by a relatively shallow, statistics-based pipeline that requires neither intensive preprocessing nor analysis at the morphosyntactic or semantic level. We describe the overall approach and the differences between the original dataset and the dataset of the KONVENS task, report experimental classification results, and analyse the individual contributions of our feature sets.