2021
Journal Article
Title

Interpretable Topic Extraction and Word Embedding Learning Using Non-Negative Tensor DEDICOM

Abstract
Unsupervised topic extraction is a vital step in automatically extracting concise content information from large text corpora. Existing topic extraction methods lack the ability to model relations between these topics, which would further aid text understanding. We therefore propose using the Decomposition into Directional Components (DEDICOM) algorithm, which provides a uniquely interpretable matrix factorization for symmetric and asymmetric square matrices and tensors. We constrain DEDICOM to row-stochasticity and non-negativity in order to factorize pointwise mutual information matrices and tensors of text corpora. This lets us identify latent topic clusters and their relations within the vocabulary while simultaneously learning interpretable word embeddings. Further, we introduce multiple methods based on alternating gradient descent to efficiently train constrained DEDICOM algorithms. We qualitatively evaluate the topic modeling and word embedding performance of our proposed methods on several datasets, including a novel New York Times news dataset, and demonstrate how the DEDICOM algorithm provides deeper text analysis than competing matrix factorization approaches.
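To make the factorization described in the abstract more concrete, the following is a minimal NumPy sketch of a constrained matrix DEDICOM fit, S ≈ A R Aᵀ, trained by alternating gradient steps on A and R. Several choices here are assumptions made for illustration and are not taken from the abstract: the row-stochastic, non-negative constraint is placed on A through a row-wise softmax, R is kept non-negative by clipping, the step size is chosen by simple backtracking, and the function names (`softmax_rows`, `backtracking_step`, `fit_dedicom`) are hypothetical. The input S stands in for a square pointwise mutual information matrix.

```python
import numpy as np


def softmax_rows(Z):
    """Map each row of Z to a non-negative vector that sums to 1."""
    Z = Z - Z.max(axis=1, keepdims=True)  # shift for numerical stability
    exp_z = np.exp(Z)
    return exp_z / exp_z.sum(axis=1, keepdims=True)


def reconstruction_loss(S, A, R):
    """Squared Frobenius norm of the residual S - A R A^T."""
    return float(np.sum((S - A @ R @ A.T) ** 2))


def backtracking_step(objective, x, grad, project, step=1.0, max_halvings=30):
    """Take one (projected) gradient step, halving the step until the loss drops.

    Returns x unchanged if no tried step size lowers the objective.
    """
    current = objective(x)
    for _ in range(max_halvings):
        candidate = project(x - step * grad)
        if objective(candidate) < current:
            return candidate
        step *= 0.5
    return x


def fit_dedicom(S, k, n_iter=100, seed=0):
    """Illustrative constrained DEDICOM: S (n x n) ~ A (n x k) R (k x k) A^T.

    Assumption: A = softmax_rows(Z) is row-stochastic, R is clipped at zero.
    """
    rng = np.random.default_rng(seed)
    n = S.shape[0]
    Z = rng.normal(scale=0.1, size=(n, k))  # unconstrained parameters behind A
    R = rng.uniform(0.0, 1.0, size=(k, k))
    history = [reconstruction_loss(S, softmax_rows(Z), R)]

    for _ in range(n_iter):
        # Step 1: update A (through Z) with R held fixed.
        A = softmax_rows(Z)
        E = A @ R @ A.T - S
        grad_A = 2.0 * (E @ A @ R.T + E.T @ A @ R)
        # Chain rule through the row-wise softmax.
        grad_Z = A * (grad_A - np.sum(grad_A * A, axis=1, keepdims=True))
        Z = backtracking_step(
            lambda Zc: reconstruction_loss(S, softmax_rows(Zc), R),
            Z, grad_Z, project=lambda x: x,
        )

        # Step 2: update R with A held fixed, projecting onto R >= 0.
        A = softmax_rows(Z)
        E = A @ R @ A.T - S
        grad_R = 2.0 * (A.T @ E @ A)
        R = backtracking_step(
            lambda Rc: reconstruction_loss(S, A, Rc),
            R, grad_R, project=lambda x: np.maximum(x, 0.0),
        )
        history.append(reconstruction_loss(S, A, R))

    return softmax_rows(Z), R, history
```

Under these assumptions, each row of A can be read as a word's distribution over k latent topics, and R[i, j] as the strength of the directed relation from topic i to topic j. Because R need not be symmetric, the same sketch applies to asymmetric input matrices.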
Author(s)
Hillebrand, Lars Patrick  
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Biesner, David  
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Bauckhage, Christian  
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Sifa, Rafet  
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Journal
Machine Learning and Knowledge Extraction
Funder
Bundesministerium für Bildung und Forschung  
Open Access
DOI
10.3390/make3010007
Language
English
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Keyword(s)
  • matrix factorization
  • NLP
  • tensor factorization
  • topic modeling
  • word embeddings
