2013
Conference Paper
Title
Detecting natural disaster events on twitter across languages
Abstract
Social media such as Twitter can act as a human sensor network for real-time event detection and has recently been exploited extensively for crisis management. However, little attention has been paid to applying text mining and NLP techniques to monitor events in a multilingual setting, and most work focuses on a single language. This paper investigates a unified framework for detecting natural disaster events on Twitter across a variety of languages; the framework is embedded in a larger system for real-time decision support in natural crisis management. We present the results achieved in classifying tweets in various languages. Among the four machine learning (ML) classifiers we evaluated for each language, the Random Forest classifier produces the best results, achieving 85.02% accuracy on average for known languages. We also bootstrapped classifiers for unknown languages by cross-language text classification; in this case, the average accuracy of the Random Forest classifier drops to 66.64%. Our results, based on a specific test scenario, indicate that for timely detection of an earthquake event it is important to consider the distribution of languages spoken at the location of the event.
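To illustrate why the language distribution at the event location matters, the following minimal sketch weights the two average Random Forest accuracies reported in the abstract by the share of tweets in each language. It is not the paper's method: the function name `expected_accuracy`, the language codes, and the tweet shares are hypothetical, and treating the reported averages as per-language accuracies is a simplifying assumption.

```python
# Average Random Forest accuracies reported in the abstract (percent).
ACC_KNOWN = 85.02    # languages with their own trained classifier
ACC_UNKNOWN = 66.64  # languages covered only by cross-language bootstrapping


def expected_accuracy(language_shares, known_languages):
    """Weight the two reported averages by the share of tweets per language.

    language_shares: dict mapping language code -> share of tweets
                     (hypothetical values, normalized here to sum to 1).
    known_languages: set of language codes assumed to have a trained classifier.
    """
    total = sum(language_shares.values())
    if total <= 0:
        raise ValueError("language shares must sum to a positive value")
    weighted = 0.0
    for lang, share in language_shares.items():
        # Assumption: every known language performs at the reported average,
        # and every unknown language at the cross-language average.
        acc = ACC_KNOWN if lang in known_languages else ACC_UNKNOWN
        weighted += (share / total) * acc
    return weighted


# Hypothetical language mix at an event location.
shares = {"en": 0.2, "it": 0.5, "tr": 0.3}
print(round(expected_accuracy(shares, known_languages={"en", "it"}), 2))
```

Under these assumptions, the larger the share of tweets written in languages without a dedicated classifier, the closer the expected accuracy moves toward the cross-language figure of 66.64%.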