Potentials and Challenges of AI-based Audio Analysis in Industrial Sound Analysis
Rapid advances in artificial intelligence (AI)-based algorithms offer great potential for a variety of domestic and industrial applications. AI-driven analysis algorithms are already integrated into everyday applications such as face recognition, object recognition, and voice assistance systems. However, several challenges must be addressed before such algorithms can be deployed in industrial sound analysis (ISA) scenarios such as the monitoring of production processes. First, training AI models requires a large quantity of diverse and well-balanced data with consistent, accurate annotations. Imbalanced and mislabelled datasets can bias models, which limits their performance. Second, when deployed in real-world application scenarios, these models need to be robust to domain shifts caused by changing acoustic conditions, such as different recording devices and ambient background noise. In this paper, we discuss effective countermeasures to these challenges. These include data balancing techniques to reduce data sparsity, data augmentation techniques to increase the variety of the available training data, and data normalization techniques to reduce domain shift. Furthermore, we present how transfer learning and semi-supervised learning can help to create robust models for different ISA application scenarios when only a few annotated examples are available for training.