2021
Conference Paper
Title

Objective Functions to Determine the Number of Topics for Topic Modeling

Abstract
Topic modeling is a well-known task in unsupervised machine learning in which clustering algorithms are used to find latent topics. Several algorithms have been presented in the literature, but the best known of them require extensive hyperparameter tuning to achieve good results. In particular, the number of latent topics or clusters (k) must be known in advance. This paper therefore analyses objective functions that help evaluate topic models in order to determine optimal hyperparameters. An empirical qualitative study was conducted using the non-negative matrix factorization (NMF) algorithm on different datasets to determine experimentally which numerical properties of topic models indicate an optimal k. Based on this study, we propose objective functions for selecting optimal topic models and discuss their results on different datasets.
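The general idea of fitting NMF for several candidate values of k and scoring each model can be illustrated with a small, dependency-free sketch. This is not the paper's method: the objective functions it proposes are not given in this record, so the Frobenius reconstruction error is used here as an assumed stand-in score. The toy document-term matrix `V` and the helper names (`matmul`, `transpose`, `frobenius_error`, `nmf`) are hypothetical, and the factorisation uses the standard multiplicative update rules.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - WH (assumed stand-in objective, not the paper's)."""
    wh = matmul(w, h)
    return sum(
        (v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0]))
    ) ** 0.5


def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Factorise V (docs x terms) into W (docs x k) and H (k x terms).

    Uses multiplicative updates, which keep all entries non-negative.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # Update H: H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)] for a in range(k)]
        # Update W: W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)] for i in range(n)]
    return w, h


# Hypothetical toy document-term matrix with two obvious topics:
# documents 0-2 use only terms 0-1, documents 3-5 use only terms 2-3.
V = [
    [2, 1, 0, 0],
    [4, 2, 0, 0],
    [2, 1, 0, 0],
    [0, 0, 1, 3],
    [0, 0, 2, 6],
    [0, 0, 1, 3],
]

# Sweep candidate values of k and record the score of each fitted model.
errors = {}
for k in range(1, 5):
    w, h = nmf(V, k)
    errors[k] = frobenius_error(V, w, h)
    print(f"k={k}: reconstruction error = {errors[k]:.4f}")
```

Reconstruction error typically keeps shrinking as k grows, so on its own it cannot select k; one would look for the point where the improvement levels off. That limitation is one reason to study objective functions designed specifically for choosing k.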
Author(s)
Peikert, Silvio
Fraunhofer-Institut für Offene Kommunikationssysteme FOKUS  
Kubach, Clemens
Fraunhofer-Institut für Offene Kommunikationssysteme FOKUS  
Al Qundus, Jamal
Middle East University Amman, Jordan
Vu, Le Duyen Sandra
Fraunhofer-Institut für Offene Kommunikationssysteme FOKUS  
Paschke, Adrian  
Fraunhofer-Institut für Offene Kommunikationssysteme FOKUS  
Mainwork
iiWAS2021 & MoMM 2021. Proceedings  
Project(s)
Qurator
Funder
Bundesministerium für Bildung und Forschung BMBF (Deutschland)  
Conference
International Conference on Information Integration and Web Intelligence (iiWAS) 2021  
International Conference on Advances in Mobile Computing & Multimedia Intelligence (MoMM) 2021  
DOI
10.1145/3487664.3487710
Language
English
Keyword(s)
  • topic model evaluation
  • topic modeling
  • hyperparameter tuning
  • topic model coherence
  • non-negative matrix factorization
  • latent Dirichlet allocation
