2024
Conference Paper
Title
Ensuring FAIRness in Machine Learning Projects
Abstract
Subsymbolic approaches like machine learning (ML), deep learning, and Large Language Models (LLMs) have significantly advanced Artificial Intelligence, excelling in tasks such as question answering and ontology matching. Despite their success, the lack of openness in LLMs’ training datasets and source code poses challenges. For instance, some ML-based models do not share their training data, limiting transparency. Current standards like schema.org provide a framework for dataset and software metadata but lack ML-specific guidelines. This position paper addresses this gap by proposing a comprehensive schema for ML model metadata aligned with the FAIR (Findability, Accessibility, Interoperability, Reusability) principles. We aim to motivate the need for an essential metadata format for ML models, demonstrate its integration into ML repository platforms, and show how this schema, combined with dataset metadata, can be used to evaluate an ML model’s adherence to the FAIR principles, fostering FAIRness in ML development.
Author(s)
Mainwork
CEUR Workshop Proceedings
Conference
4th International Workshop on Scientific Knowledge: Representation, Discovery, and Assessment, Sci-K 2024
Language
English
Keyword(s)