
Identification of Spurious Labels in Machine Learning Data Sets using N-Version Validation

Mues, M.; Gerard, S.; Howar, F.


Institute of Electrical and Electronics Engineers -IEEE-:
IEEE 23rd International Conference on Intelligent Transportation Systems, ITSC 2020 : September 20 - 23, 2020, Virtual Conference
Piscataway, NJ: IEEE, 2020
ISBN: 978-1-7281-4150-3
ISBN: 978-1-7281-4149-7
International Conference on Intelligent Transportation Systems (ITSC) <23, 2020, Online>
Fraunhofer ISST

Machine learning components are becoming popular in the automotive industry, and more and more data sets are becoming available for training them. All of these data sets provide ground truth labels for images. The labeling process is expensive and potentially error-prone. At the same time, label correctness defines the business value of a data set. In this paper, we use an N-version approach to assess the label quality in a data set. The approach combines N state-of-the-art neural networks and aggregates their results into a single verdict using majority voting. We compare this majority vote against the ground truth label and compute the percentage of disagreeing pixels along with other metrics, enabling automated and detailed analysis of label quality in data sets. We evaluate our methodology by classifying the BDD100K drivable area data set. The evaluation shows that the approach identifies misclassified scenes and inconsistencies between label semantics for similar scenes.
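
The voting and comparison steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the integer class ids, and the tie-breaking rule (ties go to the label seen first, i.e. from the earliest model in the list) are assumptions made for illustration, and small nested lists stand in for the per-pixel outputs of the N neural networks.

```python
from collections import Counter


def majority_vote(predictions):
    """Pixel-wise majority vote over N equally shaped label masks.

    `predictions` is a list of N masks, each a list of rows of class ids.
    """
    height = len(predictions[0])
    width = len(predictions[0][0])
    voted = []
    for r in range(height):
        row = []
        for c in range(width):
            votes = Counter(mask[r][c] for mask in predictions)
            # Assumption: on a tie, most_common keeps first-seen order,
            # so the label from the earliest model wins.
            row.append(votes.most_common(1)[0][0])
        voted.append(row)
    return voted


def disagreement_percentage(voted, ground_truth):
    """Percentage of pixels where the vote differs from the ground truth."""
    total = sum(len(row) for row in ground_truth)
    differing = sum(
        v != g
        for voted_row, truth_row in zip(voted, ground_truth)
        for v, g in zip(voted_row, truth_row)
    )
    return 100.0 * differing / total


# Hypothetical class ids: 0 = background, 1 and 2 = two drivable classes.
model_outputs = [
    [[0, 1, 1], [0, 1, 2]],
    [[0, 1, 0], [0, 1, 2]],
    [[0, 1, 1], [1, 1, 0]],
]
ground_truth = [[0, 1, 1], [0, 0, 2]]

vote = majority_vote(model_outputs)
score = disagreement_percentage(vote, ground_truth)
```

A scene with a high disagreement percentage would be a candidate for manual review of its label; the threshold for "high" is not given in the abstract and would have to be chosen per data set.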