2021
Conference Paper
Title
Objective Functions to Determine the Number of Topics for Topic Modeling
Abstract
Topic modeling is a well-known task in unsupervised machine learning in which clustering algorithms are used to find latent topics. Several algorithms have been presented in the literature, but the best-known ones require extensive hyperparameter tuning to achieve good results. In particular, the number of latent topics or clusters (k) must be known in advance. This paper therefore analyses objective functions that help evaluate topic models in order to determine optimal hyperparameters. An empirical qualitative study was conducted using the non-negative matrix factorization (NMF) algorithm on different datasets to determine numerical properties of topic models that indicate an optimal k. Based on this study, we propose objective functions for selecting optimal topic models and discuss their results on different datasets.
Author(s)