EULAide: Interpretation of end-user license agreements using ontology-based information extraction
Most users of online and mobile services ignore End-User License Agreements (EULAs) because of their length and complexity, and thereby take on a risk. This paper presents an Ontology-Based Information Extraction (OBIE) method that extracts terms and phrases from EULAs to make them easier for humans to understand. We developed an ontology that captures the important terms and relationships and used it to guide the OBIE process. Through a feedback cycle, we improved its domain-specific coverage by identifying additional concepts. Detection and extraction focus on three key rights and conditions: permission, prohibition and duty. We present the EULAide system, which comprises a custom information extraction pipeline and a number of custom extraction rules tailored for EULA processing. To evaluate our approach, we created and manually annotated a corpus of 20 well-known licenses. The resulting gold standard, for which we achieved an Inter-Annotator Agreement (IAA) of 90%, contains 193 permissions, 185 prohibitions and 168 duties. Evaluating the OBIE pipeline against this gold standard yielded an F-measure of 70-74%, which, in the context of the IAA, demonstrates the feasibility of the approach.
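As an illustration of the general idea only, the following minimal Python sketch labels EULA sentences as permission, prohibition or duty using a few cue phrases, and scores the predictions against a gold standard with an F-measure. The cue phrases, the `RULES` table and the functions `classify` and `f_measure` are assumptions made for this example; they are not the ontology, the extraction rules or the evaluation setup of EULAide.

```python
import re

# Hypothetical cue phrases, chosen for illustration only.
# Order matters: prohibitions are checked first so that "may not"
# is not mistaken for the permission cue "may".
RULES = [
    ("prohibition", re.compile(
        r"\b(?:may not|must not|shall not|not permitted to|prohibited from)\b",
        re.IGNORECASE)),
    ("duty", re.compile(
        r"\b(?:must|shall|agree to|required to|responsible for)\b",
        re.IGNORECASE)),
    ("permission", re.compile(
        r"\b(?:may|permitted to|allowed to|grants? you)\b",
        re.IGNORECASE)),
]


def classify(sentence):
    """Return the first matching label for a sentence, or None."""
    for label, pattern in RULES:
        if pattern.search(sentence):
            return label
    return None


def f_measure(predicted, gold):
    """Return (precision, recall, F1) for two sets of (sentence_id, label) pairs."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    sentences = [
        "You may install one copy of the software.",
        "You may not reverse engineer the software.",
        "You must comply with all applicable laws.",
    ]
    for sentence in sentences:
        print(classify(sentence), "<-", sentence)
```

A keyword table like this is far cruder than an ontology-guided pipeline; it is shown only to make the three target categories and the precision/recall style of evaluation concrete.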