2025
Conference Paper
Title
An Evaluation of Open Source Data Anonymisation Tools for Medical Data
Abstract
Medical data is inherently multimedia in nature. A single patient record typically contains diverse data types, including multimodal medical images, sensor signals, and Electronic Health Records (EHRs). Given the recent revolution in Artificial Intelligence (AI), these medical datasets are crucial for a wide range of AI-based clinical applications. However, sharing this data presents several significant challenges, such as data privacy. Data privacy tools enable data sharing and utility while preserving the confidentiality of such sensitive information. Data anonymisation allows an anonymised version of a dataset to be shared, which may protect the privacy of the individuals and organisations involved. This study provides a comprehensive evaluation of popular open-source anonymisation tools by assessing their practical performance and compliance when applied to sensitive medical datasets that must adhere to legal frameworks. It compares these tools across 12 key dimensions on three datasets of different sizes. Our findings show a trade-off: feature-rich tools such as ARX and sdcMicro have a steep learning curve and require specialist technical knowledge. Conversely, other tools, such as GreenMask, Neosync, and ClustEm4Ano, are more user-friendly but lack important capabilities. The study highlights key gaps across the multimedia domain, especially in the medical domain, e.g. weak support for healthcare ontologies and the absence of standardised evaluation benchmarks.
Author(s)
Open Access
Rights
CC BY 4.0: Creative Commons Attribution
Language
English