Title: Towards an Evaluation Methodology of ML Systems from the Perspective of Robustness and Data Quality
Authors: Gala, Viraj Rohit; Schneider, Martin; Vogt, Marvin
Type: conference paper
Language: English
Date issued: 2024-10-29
Date available: 2024-11-04
Handle: https://publica.fraunhofer.de/handle/publica/478264
DOI: 10.1109/QRS-C63300.2024.00016
Keywords: Technological innovation; Data integrity; Software algorithms; Nearest neighbor methods; Robustness; Data models

Abstract: A significant surge of innovations and new implementations now hinges on advanced AI-based systems. To foster trust in artificial intelligence systems, it is imperative to address the current lack of a structured approach for assessing them. An evaluation methodology for AI is of paramount importance, especially for deployment in safety-critical applications. This paper is an initial step toward establishing a framework for the evaluation methodology of ML systems. We propose incorporating a multi-property assessment of an ML model and describe the building blocks that can facilitate compliance of AI systems for developers as well as certification authorities. We demonstrate the implementation of the proposed framework in two ways: first, by assessing the robustness property of an ML model, and second, by assessing the quality of the dataset used to train it. To assess the robustness of the ML model against adversarial attacks, we apply the Carlini-Wagner (CW) attack to an LSTM model trained on the OpenSky dataset. For data quality assessment, we evaluate data consistency through outlier detection algorithms. We illustrate our results on the OpenSky dataset and highlight the challenges involved in assessing the robustness of deep neural networks.
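
The CW attack referenced in the abstract searches for a small perturbation, measured by its L2 norm, that pushes the model toward a misclassification. Below is a minimal sketch of an untargeted CW-style L2 attack in PyTorch; the model interface, the (batch, sequence, features) input shape assumed for the LSTM, and all hyperparameters are illustrative assumptions rather than the paper's actual implementation, and the original attack's tanh box-constraint reparameterization and binary search over c are omitted for brevity.

import torch

def cw_l2_attack(model, x, y, c=1.0, kappa=0.0, steps=200, lr=0.01):
    # x: (batch, seq_len, features) input windows; y: (batch,) true labels.
    # Hyperparameter values here are placeholders, not the paper's settings.
    model.eval()
    delta = torch.zeros_like(x, requires_grad=True)   # perturbation variable
    optimizer = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        logits = model(x + delta)
        true_logit = logits.gather(1, y.unsqueeze(1)).squeeze(1)
        # Highest logit among all wrong classes (true class masked out)
        other_logit = logits.scatter(1, y.unsqueeze(1), float("-inf")).max(dim=1).values
        # CW margin term: positive while the true class still wins by kappa
        margin = torch.clamp(true_logit - other_logit + kappa, min=0.0)
        l2 = delta.pow(2).flatten(1).sum(dim=1)       # squared L2 norm per sample
        loss = (l2 + c * margin).sum()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return (x + delta).detach()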
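
For the data consistency check, the abstract names outlier detection algorithms and the keywords mention nearest neighbor methods; a minimal sketch using scikit-learn's nearest-neighbor-based LocalOutlierFactor is shown below. The column names, neighborhood size, and contamination rate are hypothetical placeholders, not the paper's configuration or its OpenSky preprocessing.

import pandas as pd
from sklearn.neighbors import LocalOutlierFactor
from sklearn.preprocessing import StandardScaler

def flag_inconsistent_rows(df, feature_cols, n_neighbors=20, contamination=0.01):
    # Standardize features so no single column dominates the distance metric
    X = StandardScaler().fit_transform(df[feature_cols])
    lof = LocalOutlierFactor(n_neighbors=n_neighbors, contamination=contamination)
    labels = lof.fit_predict(X)          # -1 marks outliers, 1 marks inliers
    out = df.copy()
    out["is_outlier"] = labels == -1
    return out

# Hypothetical usage on OpenSky-style state vectors:
# flagged = flag_inconsistent_rows(df, ["latitude", "longitude", "baroaltitude", "velocity"])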