  • Publication
    Statistical Property Testing for Generative Models
    (2023)
    Seferis, Emmanouil
    Generative models that produce images, text, or other types of data have recently been equipped with increasingly powerful capabilities. Nevertheless, in some use cases of the generated data (e.g., using it for model training), one must ensure that the synthetic data points satisfy certain properties that make them suitable for the intended use. Towards this goal, we present a simple framework to statistically check whether the data produced by a generative model satisfy a property at a given confidence level. We apply our methodology to standard image and text-to-image generative models.
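    As a rough illustration of the statistical check described above (not the paper's exact procedure), the sketch below samples generator outputs, evaluates a user-supplied boolean property, and lower-bounds the true satisfaction rate with a one-sided Hoeffding confidence bound; the generate and satisfies_property callables are hypothetical placeholders.

        # Illustrative sketch only: statistically checking a property of generator
        # outputs with a one-sided Hoeffding confidence bound. Not the paper's code.
        import math
        from typing import Any, Callable

        def check_property(generate: Callable[[], Any],
                           satisfies_property: Callable[[Any], bool],
                           n_samples: int = 1000,
                           target_rate: float = 0.95,
                           delta: float = 0.05) -> bool:
            """True if, with confidence 1 - delta, the generator's outputs satisfy
            the property with probability at least target_rate."""
            hits = sum(satisfies_property(generate()) for _ in range(n_samples))
            p_hat = hits / n_samples
            # One-sided Hoeffding lower confidence bound on the true satisfaction rate.
            lower_bound = p_hat - math.sqrt(math.log(1.0 / delta) / (2.0 * n_samples))
            return lower_bound >= target_rate

    For example, with n_samples = 1000 and delta = 0.05 the bound subtracts roughly 0.039 from the observed rate, so an observed 99% satisfaction rate would certify a 95% target.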
  • Publication
    Statistical Guarantees for Safe 2D Object Detection Post-processing
    (2023)
    Seferis, Emmanouil
    ;
    Kollias, Stefanos
    Safe and reliable object detection is essential for safety-critical applications of machine learning, such as autonomous driving. However, standard object detection methods cannot guarantee their performance during operation. In this work, we leverage conformal prediction to provide statistical guarantees for black-box object detection models. Extending prior work, we present a post-processing methodology that covers the entire object detection problem (localization, classification, false negatives, detection in videos, etc.) while offering sound safety guarantees on its error rates. We apply our method to state-of-the-art 2D object detection models and measure its efficacy in practice. Moreover, we investigate what happens as the acceptable error rates are pushed towards high safety levels. Overall, the presented methodology offers a practical approach towards safety-aware object detection, and we hope it can pave the way for further research in this area.
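    The following is a minimal split-conformal sketch of the localization part only (the paper's post-processing also covers classification, false negatives, and video detection): it calibrates a box-inflation margin so that the inflated predicted box contains the ground-truth box with probability at least 1 - alpha. The (x1, y1, x2, y2) box format and the matching of predictions to ground truth are assumptions handled upstream.

        # Illustrative sketch: split-conformal calibration of a box-inflation margin.
        import numpy as np

        def calibrate_margin(pred_boxes: np.ndarray, gt_boxes: np.ndarray,
                             alpha: float = 0.1) -> float:
            """pred_boxes, gt_boxes: (n, 4) arrays of matched (x1, y1, x2, y2) pairs."""
            # Nonconformity score: how far the ground truth sticks out of the prediction.
            scores = np.max(np.stack([
                pred_boxes[:, 0] - gt_boxes[:, 0],   # left side
                pred_boxes[:, 1] - gt_boxes[:, 1],   # top side
                gt_boxes[:, 2] - pred_boxes[:, 2],   # right side
                gt_boxes[:, 3] - pred_boxes[:, 3],   # bottom side
            ]), axis=0)
            n = len(scores)
            # Finite-sample-corrected (1 - alpha) empirical quantile of the scores.
            k = min(int(np.ceil((n + 1) * (1.0 - alpha))) - 1, n - 1)
            return float(max(np.sort(scores)[k], 0.0))

        def inflate(pred_boxes: np.ndarray, margin: float) -> np.ndarray:
            """Boxes inflated by the calibrated margin carry the coverage guarantee."""
            return pred_boxes + np.array([-margin, -margin, margin, margin])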
  • Publication
    Can Conformal Prediction Obtain Meaningful Safety Guarantees for ML Models?
    (2023)
    Seferis, Emmanouil
    Conformal Prediction (CP) has recently been proposed as a methodology to calibrate the predictions of Machine Learning (ML) models so that they output a rigorous quantification of their uncertainty. For example, one can calibrate the predictions of an ML model into prediction sets that are guaranteed to cover the ground-truth class with a probability larger than a specified threshold. In this paper, we study whether CP can provide the strong statistical guarantees that would be required in safety-critical applications. Our evaluation on ImageNet demonstrates that applying CP over state-of-the-art models fails to deliver the required guarantees. We corroborate our results by deriving a simple connection between CP prediction sets and top-k accuracy.
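    One way to see the connection the abstract mentions (an assumption about the derivation, not a quote from the paper): if the nonconformity score of a sample is the rank of its true class under the model's softmax ordering, split conformal prediction returns top-k sets, where k is the smallest size whose calibration top-k accuracy reaches the target coverage. The sketch below computes that k from assumed calibration logits and labels.

        # Illustrative sketch: rank-based conformal prediction reduces to top-k sets.
        import numpy as np

        def conformal_topk(cal_logits: np.ndarray, cal_labels: np.ndarray,
                           alpha: float = 0.05) -> int:
            """Return k such that top-k prediction sets cover the true class
            with probability >= 1 - alpha (split conformal guarantee)."""
            # Rank of the true class per calibration sample (1 = model's top choice).
            order = np.argsort(-cal_logits, axis=1)
            ranks = np.array([int(np.where(order[i] == cal_labels[i])[0][0]) + 1
                              for i in range(len(cal_labels))])
            n = len(ranks)
            # Finite-sample-corrected (1 - alpha) quantile of the rank scores.
            k = int(np.sort(ranks)[min(int(np.ceil((n + 1) * (1.0 - alpha))) - 1, n - 1)])
            return k  # prediction set at test time = the model's top-k classes

    Driving alpha towards very small values pushes k (and hence the prediction-set size) up, which is one way such guarantees can become impractical in safety-critical settings.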
  • Publication
    Intelligent Testing for Autonomous Vehicles - Methods and Tools
    In this talk, I first give a tutorial on fundamental AI testing methods, covering their strengths and weaknesses. For testing complex autonomous driving systems, an intelligent combination of these basic techniques makes it possible to generate highly diversified test cases while enabling efficient bug hunting.
  • Publication
    Selected Challenges in ML Safety for Railway
    Neural networks (NNs) have been introduced in safety-critical applications from autonomous driving to train inspection. I argue that to close the demo-to-product gap, we need scientifically rooted engineering methods that can efficiently improve the quality of NNs. In particular, I consider a structural approach (via GSN) to argue the quality of neural networks with NN-specific dependability metrics. A systematic analysis considering the quality of data collection, training, testing, and operation allows us to identify many unsolved research questions: (1) solve the denominator/edge-case problem with synthetic data, with quantifiable argumentation; (2) reach the performance target by combining classical methods and data-based methods in vision; (3) decide the threshold (for OoD or any kind) based on the risk appetite (societally accepted risk).
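    As a small illustration of item (3) above (the validation data and variable names are assumptions, not content from the abstract): given monitor scores on a held-out validation set, the alarm threshold can be read off as the quantile that keeps the missed-OoD rate within the accepted residual risk, with the induced false-alarm rate reported alongside.

        # Illustrative sketch: choosing an OoD-alarm threshold from a stated risk appetite.
        import numpy as np

        def threshold_from_risk(ood_scores: np.ndarray, id_scores: np.ndarray,
                                accepted_miss_rate: float) -> tuple[float, float]:
            """Alarm when score > threshold. The threshold is set so that at most
            accepted_miss_rate of validation OoD samples would be missed; the
            resulting false-alarm rate on in-distribution data is returned too."""
            threshold = float(np.quantile(ood_scores, accepted_miss_rate))
            false_alarm_rate = float(np.mean(id_scores > threshold))
            return threshold, false_alarm_rate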