2025
Conference Paper
Title
Tackling Data Sparsity and Combinatorial Challenges in Rare Disease Matching with Medical Informed Machine Learning
Abstract
With over 7,000 known rare diseases and a prevalence of less than one in a thousand, rare diseases pose substantial challenges to advanced medical support networks. This study investigates the efficacy of Unrare.me, a novel social networking platform designed for individuals affected by rare diseases, including patients, their family members, and medical professionals, addressing data sparsity and combinatorial complexities in user matching. We demonstrate that simple matching heuristics already serve as a decent basis for collecting user feedback on match quality. Leveraging over 10,000 user matching feedback scores from more than 2,000 active users, we evaluate algorithms including collaborative filtering and user embedding similarity with state-of-the-art Large Language Models (LLMs). With top-10 and top-5 hit rates of 55% and 37%, respectively, we show that a combination of medical data augmentation and embeddings significantly enhances performance beyond the initial heuristic baseline.
Author(s)
Bell Felix de Oliveira, Thiago
Conference