2024
Conference Paper
Title

PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics

Abstract
Despite tremendous advancements in large language models (LLMs) over recent years, a notably urgent challenge for their practical deployment is the phenomenon of “hallucination”, where the model fabricates facts and produces non-factual statements. In response, we propose PoLLMgraph, a Polygraph for LLMs, as an effective model-based white-box detection and forecasting approach. PoLLMgraph distinctly differs from the large body of existing research that concentrates on addressing such challenges through black-box evaluations. In particular, we demonstrate that hallucination can be effectively detected by analyzing the LLM's internal state transition dynamics during generation via tractable probabilistic models. Experimental results on various open-source LLMs confirm the efficacy of PoLLMgraph, outperforming state-of-the-art methods by a considerable margin, evidenced by over 20% improvement in AUC-ROC on common benchmarking datasets like TruthfulQA. Our work paves a new way for model-based white-box analysis of LLMs, motivating the research community to further explore, understand, and refine the intricate dynamics of LLM behaviors.
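The idea of scoring a generation by its internal state transitions can be illustrated with a toy sketch. This is not the authors' implementation: it assumes the LLM's hidden states have already been mapped to a small set of discrete state IDs (the mapping itself is not shown), and it uses a first-order Markov chain as one possible example of a tractable probabilistic model. All function names and the toy traces are hypothetical.

```python
import math

def fit_markov_chain(sequences, n_states, alpha=1.0):
    """Estimate Laplace-smoothed initial and transition probabilities.

    `sequences` are lists of discrete state IDs (assumed to come from some
    abstraction of hidden states; that step is outside this sketch).
    """
    init = [alpha] * n_states
    trans = [[alpha] * n_states for _ in range(n_states)]
    for seq in sequences:
        init[seq[0]] += 1
        for a, b in zip(seq, seq[1:]):
            trans[a][b] += 1
    init_total = sum(init)
    init = [c / init_total for c in init]
    trans = [[c / sum(row) for c in row] for row in trans]
    return init, trans

def log_likelihood(seq, model):
    """Log-probability of a state trace under a fitted Markov chain."""
    init, trans = model
    ll = math.log(init[seq[0]])
    for a, b in zip(seq, seq[1:]):
        ll += math.log(trans[a][b])
    return ll

def hallucination_score(seq, truthful_model, hallucinated_model):
    """Log-likelihood ratio; higher means closer to the hallucinated traces."""
    return log_likelihood(seq, hallucinated_model) - log_likelihood(seq, truthful_model)

def auc_roc(scores, labels):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy, made-up state traces (label 0 = truthful, 1 = hallucinated).
truthful_traces = [[0, 1, 2, 1, 2], [0, 1, 1, 2, 2], [0, 1, 2, 2, 1]]
hallucinated_traces = [[0, 2, 0, 2, 0], [2, 0, 2, 0, 0], [0, 0, 2, 0, 2]]
truthful_model = fit_markov_chain(truthful_traces, n_states=3)
hallucinated_model = fit_markov_chain(hallucinated_traces, n_states=3)

test_traces = [[0, 1, 2, 1, 1], [0, 2, 0, 0, 2]]
test_labels = [0, 1]
test_scores = [
    hallucination_score(t, truthful_model, hallucinated_model) for t in test_traces
]
```

In this sketch, `auc_roc(test_scores, test_labels)` summarizes how well the scores separate the two classes, which is the kind of metric the abstract reports. The abstraction of hidden states into discrete IDs and the choice of probabilistic model are where a real system would differ most from this toy version.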
Author(s)
Zhu, Derui
Technische Universität München
Chen, Dingfan
CISPA - Helmholtz Center for Information Security
Li, Qing
Universitetet i Stavanger
Chen, Zongxiong
Fraunhofer-Institut für Offene Kommunikationssysteme FOKUS  
Ma, Lei
The University of Tokyo
Grossklags, Jens
Technische Universität München
Fritz, Mario
CISPA - Helmholtz Center for Information Security
Mainwork
Findings of the Association for Computational Linguistics: NAACL 2024
Funder
Bundesministerium für Bildung und Forschung  
Conference
2024 Findings of the Association for Computational Linguistics: NAACL 2024
Open Access
DOI
10.18653/v1/2024.findings-naacl.294
Language
English