2025
Conference Paper
Title
Dataset for Industrial Question Answering with Explanation and Scalable Ensemble Generation
Abstract
The digital and green transition under Industry 4.0 has accelerated the adoption of AI in industries such as manufacturing, energy, and mining. Question Answering with Explanation (QAE), a mode of human-AI interaction, is crucial for enhancing transparency and trust in high-stakes industrial applications. However, industrial QAE remains underexplored because publicly available, high-quality datasets are lacking: building them requires expert effort, and releasing them is limited by corporate restrictions. To address this gap, we introduce PANDAX (https://doi.org/10.5281/zenodo.14510798), the first open-source industrial QAE dataset, and SEG, a scalable method for generating high-quality QAE datasets using LLMs. PANDAX covers three key topics of industrial system information (partonomy, functionality, and parameters) across critical domains such as green technology and cooling systems. SEG ensures scalability and quality through ensemble generation, majority voting, and expert ranking, among other steps. Human evaluation confirms PANDAX's high quality, positioning it as a valuable resource for advancing QAE techniques, benchmarking language technologies, and supporting research in explainable AI for industrial systems.
Author(s)
Conference