August 12, 2024
Journal Article
Title

Refining maritime Automatic Speech Recognition by leveraging synthetic speech

Abstract
Maritime transport serves as a critical component of global trade and logistics, enabling the movement of goods and resources across oceans and waterways. Especially in busy waterways and ports, effective and accurate communication is essential, as it ensures the seamless exchange of information and the coordinated execution of port activities. However, comprehensibility is often hindered by factors such as poor audio quality, background noise, and diverse languages and accents. Automatic Speech Recognition (ASR) systems can mitigate these issues by providing real-time transcription and enabling the implementation of automated, value-adding services to enhance situational awareness. While pre-trained ASR models excel on general speech, maritime ASR faces unique challenges due to a lack of annotated data, diverse accents, and specialized terminology.
To this end, we aim to improve the transcription quality of pre-trained ASR models for maritime communication, with a particular focus on accurately recognizing maritime-specific terminology such as vessel and location names. Because transcribed maritime communication is scarce, we create a synthetic training dataset tailored to regional maritime terminology. The synthetic audio is augmented with general human speech and used to fine-tune an end-to-end ASR model under various settings. The models are evaluated on a proprietary dataset of regional maritime radio communication from the port of Hamburg.
The experimental results demonstrate a notable enhancement in ASR performance. Specifically, our approach yields an absolute improvement of 13.46 percentage points in Word Error Rate (WER) over the pre-trained baseline, along with recall increases of 41.57% for vessel names and 38.65% for locations. Our findings underscore the efficacy of integrating synthetic training data to address the challenges encountered in maritime ASR, paving the way for more robust and accurate speech recognition systems tailored to maritime applications.
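The abstract reports results as Word Error Rate and as recall on vessel and location names but does not spell out how these were scored. The following is a minimal sketch assuming the standard definitions: WER as word-level edit distance divided by reference length, and term recall as the share of domain terms in the reference transcript that also appear in the ASR output. The function names `word_error_rate` and `term_recall`, the lowercase whitespace tokenization, and the example utterance are illustrative assumptions, not the authors' evaluation code or data.

```python
from typing import Set


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def _contains(text: str, term: str) -> bool:
    """Whole-word match of a (possibly multi-word) term in whitespace-normalized text."""
    padded = " " + " ".join(text.lower().split()) + " "
    return " " + " ".join(term.lower().split()) + " " in padded


def term_recall(reference: str, hypothesis: str, terms: Set[str]) -> float:
    """Share of domain terms present in the reference that the hypothesis also contains."""
    expected = [t for t in terms if _contains(reference, t)]
    if not expected:
        raise ValueError("no domain terms occur in the reference")
    found = [t for t in expected if _contains(hypothesis, t)]
    return len(found) / len(expected)


if __name__ == "__main__":
    # Illustrative utterance only; not taken from the paper's dataset.
    ref = "cap san diego proceeding to burchardkai"
    hyp = "cap san diego proceeding to burchard quay"
    print(word_error_rate(ref, hyp))
    print(term_recall(ref, hyp, {"cap san diego", "burchardkai"}))
```

In this example the location name is split into two wrong words, which costs one substitution and one insertion against six reference words, while the vessel name is recognized intact. Scoring entity recall separately from WER makes such terminology errors visible even when the overall word error is small.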
Author(s)
Martius, Christoph Georg Rudolf
Fraunhofer-Institut für Materialfluss und Logistik IML  
Nakilcioglu, Emin Cagatay
Fraunhofer-Institut für Materialfluss und Logistik IML  
Reimann, Maximilian
Fraunhofer-Institut für Materialfluss und Logistik IML  
John, Ole
Fraunhofer-Institut für Materialfluss und Logistik IML  
Journal
Maritime Transport Research  
Open Access
DOI
10.1016/j.martra.2024.100114
Language
English
Fraunhofer-Institut für Materialfluss und Logistik IML  
Keyword(s)
  • Automatic speech recognition
  • Domain Adaptation
  • VHF Radio
  • Maritime Communication
