2025
Conference Paper
Title
Towards Practical Audio Phylogeny: Multi-Transformation Detection in Incomplete Trees
Abstract
Audio phylogeny aims to reconstruct the transformation history of near-duplicate audio files by identifying parent-child relationships and tracing their modification paths. A key challenge is accurately estimating the transformations between file pairs, particularly when sequences of edits (e.g., compression, trimming, fading) or missing intermediate files are involved. We propose a deep learning-based method that formulates transformation detection as a multi-label classification task, enabling the identification of multiple transformations per audio pair. Unlike existing approaches, which are limited to detecting one or two transformations, our method achieves notable improvements in accuracy when reconstructing sparse and incomplete trees. To address the variability of real-world scenarios, which include both full and sparse trees, we further introduce a hybrid strategy that combines our model with the current state-of-the-art approach, balancing precision in dense trees with robustness in sparse conditions.
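As an illustration of what a multi-label formulation means in practice, the following minimal sketch contrasts multi-label decoding (each transformation is scored independently and several may be reported for one audio pair) with single-label decoding (exactly one transformation is reported). It is not the paper's model: the label set, the logit values, the 0.5 threshold, and the function names are all assumptions made for this example, and the logits stand in for the output of some classifier applied to a parent-child candidate pair.

```python
import math

# Hypothetical label set; the abstract names these three edits only as examples.
TRANSFORMATIONS = ["compression", "trimming", "fading"]


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_multilabel(logits, threshold=0.5):
    """Multi-label decoding: each transformation gets its own independent
    probability, so zero, one, or several labels can be returned."""
    if len(logits) != len(TRANSFORMATIONS):
        raise ValueError("expected one logit per transformation")
    return [
        name
        for name, z in zip(TRANSFORMATIONS, logits)
        if sigmoid(z) >= threshold  # assumed threshold, not from the paper
    ]


def decode_single_label(logits):
    """Single-label decoding for contrast: only the top-scoring
    transformation is returned, even if several edits were applied."""
    if len(logits) != len(TRANSFORMATIONS):
        raise ValueError("expected one logit per transformation")
    best = max(range(len(logits)), key=lambda i: logits[i])
    return TRANSFORMATIONS[best]


# Made-up logits for a pair that was both compressed and faded.
example_logits = [2.0, -1.0, 0.3]
print(decode_multilabel(example_logits))    # several labels may be reported
print(decode_single_label(example_logits))  # exactly one label is reported
```

With independent per-label decisions, a pair separated by a chain of edits (for instance, because an intermediate file is missing from the tree) can be described by the full set of edits rather than being forced into a single class.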