Two Attempts to Predict Author Gender in Cross-Genre Settings in Dutch

Brito, Eduardo; Sifa, Rafet; Bauckhage, Christian

doi:10.24406/publica-fhg-408345

2019

Conference Paper

Abstract

This paper describes the systems designed by the FraunhoferIAIS team for the CLIN29 shared task on cross-genre gender detection in Dutch. We present two alternative classification approaches: a fairly standard one combining feature engineering with a random forest classifier, and one based on an LSTM classifier. Both are enhanced by an LDA model trained on stems. We considered various features, including the frequency of function words, parts of speech, and sentiment. We achieved 53.77% average accuracy in the cross-genre settings.

Author(s)

Brito, Eduardo

Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS

Sifa, Rafet

Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS

Bauckhage, Christian

Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS

Mainwork

Proceedings of the Shared Task on Cross-Genre Gender Prediction in Dutch at GxG@CLIN29. Online resource

Conference

Shared Task on Cross-Genre Gender Prediction in Dutch (GxG-CLIN29) 2019

Conference on Computational Linguistics in The Netherlands (CLIN) 2019
