A video similarity measure combining alignment, graphical and speech features
A large volume of video content is available on the web today, and it demands efficient management. Similarity measures play a critical role in managing, searching and retrieving videos and in detecting copies. In this paper, a novel video similarity measure using visual features, alignment distances and speech transcripts is proposed. Each video file is represented by a sequence of segments, where each segment contains color histograms, a start time and a set of syllables extracted from the speech in the audio track. In a first step, textual, alignment and visual features are extracted. They complement each other and are combined to boost the segment similarity. In a second step, Maximum Bipartite Matching is applied to find segment correspondences, and statistical features are used to calculate a global similarity value. Experiments on video similarity were performed on a dataset, and the promising results demonstrate the effectiveness of this method.
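The two-step pipeline described above can be illustrated with a minimal Python sketch. It is not the implementation from the paper. Everything specific in it is an assumption made for illustration: the `Segment` fields (a single histogram per segment), histogram intersection as the visual score, Jaccard overlap for syllables, relative start-time closeness as the alignment score, the weights, the threshold that turns segment similarities into an unweighted bipartite graph, and the mean-times-coverage statistic used as the global value.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Segment:
    start: float          # start time in seconds
    hist: List[float]     # normalized color histogram (entries sum to 1)
    syllables: Set[str]   # syllables from the speech transcript


def segment_similarity(a, b, dur_a, dur_b, weights=(0.4, 0.3, 0.3)):
    """Combine visual, textual and alignment scores (weights are assumed)."""
    # Visual: histogram intersection of two normalized histograms, in [0, 1].
    visual = sum(min(x, y) for x, y in zip(a.hist, b.hist))
    # Textual: Jaccard overlap of the two syllable sets.
    union = a.syllables | b.syllables
    textual = len(a.syllables & b.syllables) / len(union) if union else 0.0
    # Alignment: closeness of the relative start positions in each video.
    align = 1.0 - abs(a.start / dur_a - b.start / dur_b)
    w_v, w_t, w_a = weights
    return w_v * visual + w_t * textual + w_a * align


def max_bipartite_matching(adj: List[List[int]], n_right: int) -> List[Tuple[int, int]]:
    """Maximum-cardinality matching via augmenting paths (Kuhn's algorithm)."""
    match_right = [None] * n_right  # match_right[v] = left node matched to v

    def try_assign(u, seen):
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            # Take v if it is free, or if its current partner can move elsewhere.
            if match_right[v] is None or try_assign(match_right[v], seen):
                match_right[v] = u
                return True
        return False

    for u in range(len(adj)):
        try_assign(u, set())
    return [(u, v) for v, u in enumerate(match_right) if u is not None]


def video_similarity(va, vb, dur_a, dur_b, threshold=0.5):
    """Global similarity: mean matched-segment similarity times coverage."""
    sim = [[segment_similarity(a, b, dur_a, dur_b) for b in vb] for a in va]
    # Keep only segment pairs that are similar enough to be candidates.
    adj = [[j for j, s in enumerate(row) if s >= threshold] for row in sim]
    pairs = max_bipartite_matching(adj, len(vb))
    if not pairs:
        return 0.0
    mean_sim = sum(sim[i][j] for i, j in pairs) / len(pairs)
    coverage = len(pairs) / max(len(va), len(vb))
    return mean_sim * coverage


video_a = [Segment(0.0, [1.0, 0.0], {"a", "b"}), Segment(5.0, [0.0, 1.0], {"c", "d"})]
video_c = [Segment(9.0, [0.5, 0.5], {"x"})]
print(video_similarity(video_a, video_a, 10.0, 10.0))
print(video_similarity(video_a, video_c, 10.0, 10.0))
```

Thresholding before matching keeps the matching step unweighted. A weighted assignment over the full similarity matrix would be a plausible alternative, but the abstract does not say which variant the method uses.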