TAVeer: An interpretable topic-agnostic authorship verification method

Halvani, Oren; Graner, Lukas; Regev, Roey

doi:10.1145/3407023.3409194

2020

Conference Paper

Abstract

A central problem in digital text forensics, studied for many years, is the question of whether two documents were written by the same author. Authorship verification (AV) is the research branch of this field that addresses this question. Over the years, research activity in AV has steadily increased, producing a variety of approaches to the problem. Many of these approaches, however, rely on features that are related to or influenced by the topic of the documents. As a result, their verification results may be based not on writing style (the actual focus of AV) but on the topic of the documents. To address this problem, we propose an alternative AV approach that considers only topic-agnostic features in its classification decision. In addition, we present a post-hoc interpretation method that makes it possible to understand which particular features contributed to the prediction of the proposed AV method. To evaluate the performance of our AV method, we compared it with eight competing baselines (including the current state of the art) on four challenging data sets. The results show that our approach outperforms all baselines in two cases (with a maximum accuracy of 84%), while in the other two cases it performs as well as the strongest baseline.

Author(s)

Halvani, Oren

Graner, Lukas

Regev, Roey

Mainwork

ARES 2020, 15th International Conference on Availability, Reliability and Security

Conference

International Conference on Availability, Reliability and Security (ARES) 2020
