2024
Conference Paper
Title
Can you trust your ML metrics? Using Subjective Logic to determine the true contribution of ML metrics for safety
Abstract
Metrics such as accuracy, precision, recall, and F1 score are generally used to assess the performance of machine learning (ML) models. From a safety perspective, relying on such single-point estimates to evaluate safety requirements is problematic, since they provide only a partial and indirect evaluation of the true safety risk associated with the model and its potential errors. To obtain a better understanding of the performance insufficiencies of a model, factors that influence the quantitative evaluation of safety requirements, such as test sample size, dataset size, and model calibration, need to be taken into account. In safety assurance, arguments typically combine complementary and diverse evidence to strengthen confidence in the safety claims. In this paper, we take a first step towards a more formal treatment of uncertainty in ML metrics by proposing a framework based on Subjective Logic that allows for modelling the relationship between primary and secondary pieces of evidence and for quantifying the resulting uncertainty. Based on experiments, we show that single-point estimates for common ML metrics tend to overestimate model performance and that a probabilistic treatment using the proposed framework can help to evaluate the probable bounds of the actual performance.
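As a rough illustration of the core idea (a minimal sketch using the standard Subjective Logic binomial opinion due to Jøsang, not the authors' actual framework or experiments): evidence counts from a test set can be mapped to belief, disbelief, and an explicit uncertainty mass, so that a metric like accuracy carries its own sample-size-dependent uncertainty rather than being a bare point estimate.

```python
# Sketch only: a Subjective Logic binomial opinion derived from test
# outcomes, using the standard mapping from evidence counts to
# (belief, disbelief, uncertainty). Names here are illustrative.

from dataclasses import dataclass


@dataclass
class BinomialOpinion:
    belief: float       # b: evidence-backed support for the claim
    disbelief: float    # d: evidence-backed support against the claim
    uncertainty: float  # u: mass not committed either way
    base_rate: float    # a: prior probability absent any evidence

    @classmethod
    def from_evidence(cls, r: int, s: int,
                      base_rate: float = 0.5, W: float = 2.0):
        """Map r positive and s negative observations to an opinion.

        W is the non-informative prior weight (W = 2 in standard
        Subjective Logic, corresponding to a uniform Beta prior).
        """
        total = r + s + W
        return cls(r / total, s / total, W / total, base_rate)

    def projected_probability(self) -> float:
        """Point estimate P = b + a * u (uncertainty resolved by the prior)."""
        return self.belief + self.base_rate * self.uncertainty


# Example: 95 correct predictions out of 100 test samples. The naive
# accuracy estimate is 0.95, but with only 100 samples a non-trivial
# uncertainty mass remains, and the projected probability (~0.941)
# sits below the single-point estimate.
op = BinomialOpinion.from_evidence(r=95, s=5)
print(f"belief={op.belief:.3f}, uncertainty={op.uncertainty:.3f}, "
      f"P={op.projected_probability():.3f}")
```

With more test samples at the same observed accuracy, the uncertainty mass shrinks and the projected probability approaches the point estimate, which matches the abstract's observation that small test sets make single-point metrics optimistic.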
Conference
Open Access
Rights
CC BY 4.0: Creative Commons Attribution
Language
English