2025
Conference Paper
Title
Joint image clustering and self-supervised representation learning through debiased contrastive loss
Abstract
Joint self-supervised representation learning and image clustering has emerged as one of the most effective approaches to visual representation learning. However, existing methods often rely on artificially balanced datasets, raising concerns about their performance on imbalanced and long-tail data distributions. To address this challenge, we propose a novel framework that combines debiased self-supervised representation learning with joint clustering. By adapting the debiased contrastive loss, our approach mitigates the under-clustering of minority classes in imbalanced datasets. Furthermore, integrating the debiased contrastive loss with a divergence clustering loss significantly improves the quality of the learned representations. We conducted extensive experiments on diverse datasets, including CIFAR-10, CIFAR-100, iNaturalist-2018, ISIC-2018 (skin lesions), and two ophthalmic retina fundus glaucoma datasets. We compared our framework against state-of-the-art methods such as SimCLR, SimSiam, Debiased, and BNN, as well as other self-supervised and clustering algorithms. The results demonstrate that our method outperforms existing deep clustering, self-supervised, and semi-supervised techniques across various classification and clustering tasks, particularly on imbalanced and clinical datasets. These findings establish the effectiveness of our framework for representation learning under challenging data distributions and offer new insights into handling class imbalance in real-world applications.
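For readers unfamiliar with the debiased contrastive loss mentioned in the abstract, the following is a minimal sketch of its commonly used general form: the negative term of a SimCLR-style loss is corrected by an assumed class prior so that same-class samples drawn as "negatives" are discounted. It is not the adaptation proposed in this paper, and it omits the divergence clustering loss, whose form the abstract does not specify. The function name, the parameters `temperature` and `tau_plus`, and their default values are illustrative assumptions.

```python
import math


def _normalize(v):
    # Scale a vector to unit length so dot products are cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def debiased_contrastive_loss(view1, view2, temperature=0.5, tau_plus=0.1):
    """Sketch of a debiased contrastive loss over two augmented views.

    view1[i] and view2[i] are embeddings of the same image (a positive pair).
    tau_plus is an ASSUMED prior probability that a sampled "negative"
    actually shares the anchor's class; tau_plus = 0 gives the usual
    biased (SimCLR-style) estimate.
    """
    z = [_normalize(v) for v in view1] + [_normalize(v) for v in view2]
    b = len(view1)
    n = 2 * b - 2  # number of negatives per anchor
    # Lower bound on the corrected negative term; keeps the log well defined.
    floor = n * math.exp(-1.0 / temperature)
    total = 0.0
    for i in range(2 * b):
        j = (i + b) % (2 * b)  # index of the positive partner of anchor i
        pos = math.exp(_dot(z[i], z[j]) / temperature)
        neg = sum(
            math.exp(_dot(z[i], z[k]) / temperature)
            for k in range(2 * b)
            if k != i and k != j
        )
        # Subtract the expected contribution of false negatives, then rescale.
        corrected = (neg - tau_plus * n * pos) / (1.0 - tau_plus)
        ng = max(corrected, floor)
        total += -math.log(pos / (pos + ng))
    return total / (2 * b)


aligned = debiased_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                    [[1.0, 0.0], [0.0, 1.0]])
swapped = debiased_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                    [[0.0, 1.0], [1.0, 0.0]])
```

In this toy example, `aligned` pairs each embedding with an identical positive and yields a lower loss than `swapped`, where each positive is orthogonal to its anchor. In an imbalanced setting a single global `tau_plus` is itself a simplification, since the chance of drawing a same-class negative differs between majority and minority classes.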
Author(s)