2018
Conference Paper
Title

Cross-domain authorship attribution based on compression

Title Supplement
Notebook for PAN at CLEF 2018
Abstract
Authorship attribution (AA) is a well-studied research subject and the most prominent subtask of authorship analysis. The goal of AA is to identify the most likely author of an anonymous document among a set of known candidate authors for whom sample documents exist. Even after more than a century of intensive research, AA is still far from being solved. One open question, for example, is whether the goal of AA can be achieved when the anonymous document and the known sample documents come from different domains, such as genre or topic. We present a lightweight authorship attribution approach named COBAA ("Compression-Based Authorship Attribution"), which is an attempt to answer this question. COBAA is based solely on a compression algorithm and a simple similarity measure and does not involve a training procedure. Therefore, the method can be used out of the box, even in real-world scenarios where no training data is available. COBAA was evaluated at the PAN 2018 Author Identification shared task and was ranked third among 11 participating approaches. The method achieved a Mean Macro-F1 of 0.629 on a corpus of attribution problems distributed across five languages (English, French, Italian, Polish, and Spanish).
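The general idea of training-free, compression-based attribution can be illustrated with a minimal sketch. The abstract does not say which compressor or similarity measure COBAA uses, so the sketch swaps in assumptions: Python's standard zlib compressor and the normalized compression distance (NCD). The function names (compressed_size, ncd, attribute) are hypothetical and chosen for illustration only; this is not the authors' implementation.

```python
import zlib
from typing import Mapping, Sequence


def compressed_size(text: str) -> int:
    """Length in bytes of the zlib-compressed UTF-8 encoding of text."""
    # Assumption: zlib at maximum compression level stands in for
    # whatever compressor the actual method uses.
    return len(zlib.compress(text.encode("utf-8"), 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance; smaller means more similar."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def attribute(unknown: str, candidates: Mapping[str, Sequence[str]]) -> str:
    """Return the candidate whose concatenated samples are closest to unknown.

    No training step is involved: each candidate's sample documents are
    joined into one text and compared directly with the unknown document.
    """
    scores = {
        author: ncd(unknown, " ".join(samples))
        for author, samples in candidates.items()
    }
    return min(scores, key=scores.get)
```

Calling attribute(unknown_text, {"author_a": [...], "author_b": [...]}) returns the key of the closest candidate. Concatenating each candidate's samples is one simple design choice among several; comparing against each sample separately and aggregating the distances would be an equally plausible variant.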
Author(s)
Halvani, Oren  
Fraunhofer-Institut für Sichere Informationstechnologie SIT  
Graner, Lukas
Fraunhofer-Institut für Sichere Informationstechnologie SIT  
Mainwork
CLEF 2018, Conference and Labs of the Evaluation Forum. Working Notes. Online resource  
Conference
Conference and Labs of the Evaluation Forum (CLEF) 2018  
Language
English