
Publica
Token level code-switching detection using Wikipedia as a lexical resource
Authors: Claeser, Daniel; Felske, Dennis; Kent, Samantha
In: Rehm, G.: Language Technologies for the Challenges of the Digital Age. 27th International Conference, GSCL 2017, Berlin, Germany, September 13-14, 2017, Proceedings. Cham: Springer International Publishing, 2017 (Lecture Notes in Computer Science 10713), pp. 192-198. ISBN: 978-3-319-73705-8 (Print), ISBN: 978-3-319-73706-5 (Online), ISBN: 3-319-73705-8
Conference: International Conference on Language Technologies for the Challenges of the Digital Age (GSCL), 27th, 2017, Berlin
Language: English
Publication type: Conference paper, electronic publication
Institute: Fraunhofer FKIE
Abstract
We present a novel lexicon-based classification approach for code-switching detection on Twitter. The main aim is to develop a simple lexical look-up classifier based on frequency information retrieved from Wikipedia. We evaluate the classifier using three different language pairs: Spanish-English, Dutch-English, and German-Turkish. The results indicate that our figures for Spanish-English are competitive with current state-of-the-art classifiers, even though the approach is simplistic and based solely on word frequency information.
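
To illustrate the general idea of a frequency-based lexical look-up, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the decision rule, so the sketch assumes each token is assigned to the language in which its relative frequency is highest. The function names (`relative_frequencies`, `classify_token`, `classify_tweet`) and the tiny word lists standing in for Wikipedia-derived frequency lexicons are hypothetical and chosen only for illustration.

```python
from collections import Counter


def relative_frequencies(tokens):
    """Turn a list of tokens into a word -> relative frequency mapping."""
    counts = Counter(t.lower() for t in tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def classify_token(token, lexicons, default="unknown"):
    """Return the language whose lexicon gives the token the highest
    relative frequency (assumed decision rule, not taken from the paper).
    Tokens found in no lexicon get the default label."""
    token = token.lower()
    best_lang, best_freq = default, 0.0
    for lang, freqs in lexicons.items():
        freq = freqs.get(token, 0.0)
        if freq > best_freq:
            best_lang, best_freq = lang, freq
    return best_lang


def classify_tweet(text, lexicons):
    """Label every whitespace-separated token of a tweet."""
    return [(tok, classify_token(tok, lexicons)) for tok in text.split()]


# Toy stand-ins for Wikipedia-derived frequency lists (made-up data).
LEXICONS = {
    "en": relative_frequencies("the party is at my house the party".split()),
    "es": relative_frequencies("la fiesta es en mi casa la fiesta".split()),
}

labels = classify_tweet("la party es en my house", LEXICONS)
for token, lang in labels:
    print(token, lang)
```

In a realistic setting the lexicons would be built from full Wikipedia dumps for each language in the pair, and tokens that appear in both languages with similar frequency would likely need extra handling, which this sketch leaves out.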