Fraunhofer-Gesellschaft
March 2025
Conference Paper
Title

Speech-to-Text in Upper Sorbian: Current State

Abstract
This study presents recent advancements in Upper Sorbian Speech-to-Text (STT) technology. We provide an overview of the Sorbian languages, the available speech and language resources, and the development of an STT system based on a traditional approach that combines acoustic, pronunciation, and language modeling. Because resources for the Sorbian languages are scarce, our approach leverages sub-word and word-class modeling techniques. The word-class modeling is based on Finite-State Transducer definitions, which can be used both for offline text parsing and for integration into the decoding graph of the STT system. Word-class parsing is performed on the speech corpus and used for language modeling with complete words, sub-word units, or both. The same definitions can also be applied to Named Entity Recognition during post-processing of recognized transcriptions. This approach significantly reduces out-of-vocabulary words and enables greater customization of the recognizer for domain-specific applications. The system was implemented for real-time transcription of church sermon broadcasts in Upper Sorbian. The domain-specific system achieved performance comparable to fine-tuned OpenAI Whisper models developed by other initiatives, while also providing a resource-efficient solution with semantically tagged recognition results.
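The effect of word-class parsing on out-of-vocabulary (OOV) words can be illustrated with a minimal sketch. This is not the authors' implementation: regular expressions stand in for the Finite-State Transducer definitions, and the class tags, patterns, vocabulary, and the English placeholder tokens are all hypothetical, chosen only to show how replacing tokens with class tags can lower the OOV rate.

```python
import re

# Hypothetical word-class definitions (class tag -> pattern).
# The paper uses Finite-State Transducer definitions; regular
# expressions are used here only as a simple stand-in.
WORD_CLASSES = {
    "<NUMBER>": re.compile(r"\d+"),
    "<CITY>": re.compile(r"Bautzen|Cottbus"),
}


def tag_word_classes(tokens):
    """Replace each token that fully matches a class pattern with its tag."""
    tagged = []
    for token in tokens:
        for tag, pattern in WORD_CLASSES.items():
            if pattern.fullmatch(token):
                tagged.append(tag)
                break
        else:
            # No class matched: keep the token unchanged.
            tagged.append(token)
    return tagged


def oov_rate(tokens, vocabulary):
    """Return the fraction of tokens that are not in the vocabulary."""
    if not tokens:
        return 0.0
    return sum(token not in vocabulary for token in tokens) / len(tokens)


# Placeholder data (assumed, not taken from the paper's corpus).
vocabulary = {"service", "at", "in", "<NUMBER>", "<CITY>"}
tokens = ["service", "at", "10", "in", "Bautzen"]

raw_rate = oov_rate(tokens, vocabulary)
tagged_rate = oov_rate(tag_word_classes(tokens), vocabulary)
```

In this toy setting the language model only needs one vocabulary entry per class, so any number or listed city name is covered without being seen in training text; the same tags could then be carried through to the recognition output as semantic labels.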
Author(s)
Kraljevski, Ivan  
Fraunhofer-Institut für Keramische Technologien und Systeme IKTS  
Duckhorn, Frank
Fraunhofer-Institut für Keramische Technologien und Systeme IKTS  
Sobe, Daniel
Foundation for the Sorbian People
Tschöpe, Constanze  
Fraunhofer-Institut für Keramische Technologien und Systeme IKTS  
Wolff, Matthias
Brandenburg University of Technology Cottbus-Senftenberg
Mainwork
Elektronische Sprachsignalverarbeitung 2025  
Conference
Konferenz Elektronische Sprachsignalverarbeitung 2025  
Rights
Use according to copyright law
DOI
10.24406/publica-4711
Language
English
Keyword(s)
  • Speech signal processing (Sprachsignalverarbeitung)
