2024
Conference Paper
Title

Towards Reducing Latency Using Beam Search in an Interactive Conversational Speech Agent

Abstract
The rapid advancement of generative artificial intelligence (AI) has led to groundbreaking developments in large language models. As large language models generate textual sequences autoregressively, mitigating latency becomes imperative for providing a highly immersive interaction experience within a real-time conversation, for example, providing fast and accurate responses to users' questions. Current efforts focus on accelerating inference processes, yet often at the expense of model architecture alterations, leading to compromised quality. In this paper, we explore latency reduction in the case of speech-based conversational agents. We leverage mathematical functions based on Beam Search to analyze autoregressive textual sequences, enabling a nuanced evaluation of semantic quality during auditory interaction, for example, for use within interactive web podcasts. We implemented our concepts and used the software to evaluate the concepts within (1) an automated evaluation of 1000 question-answer pairs and (2) a user survey. The results show that the semantic quality of autoregressive textual sequences could be assessed successfully by our proposed mathematical terms.
Author(s)
Ott, Nikolas
Hochschule RheinMain
Horst, Robin
Fraunhofer-Institut für Graphische Datenverarbeitung IGD  
Dörner, Ralf
Hochschule RheinMain
Mainwork
IEEE Gaming, Entertainment, and Media Conference, GEM 2024  
Project(s)
Kooperative Rekrutierungs- und Qualifizierungslinien, Vorhaben RheinMain  
Funder
Bundesministerium für Bildung und Forschung -BMBF-  
Conference
Gaming, Entertainment, and Media Conference 2024  
DOI
10.1109/GEM61861.2024.10585772
Language
English
Fraunhofer-Institut für Graphische Datenverarbeitung IGD  
Keyword(s)
  • Human-Computer Interaction

  • Interactive Podcasts

  • Large language models (LLM)

  • Latency Reduction

  • Branche: Information Technology

  • Branche: Cultural and Creative Economy

  • Research Line: Human computer interaction (HCI)

  • Research Line: Machine learning (ML)

  • LTA: Machine intelligence, algorithms, and data structures (incl. semantics)

  • Conversational user interfaces

  • Human-computer interaction (HCI)

  • Beam Search
