Impact of Fused Visible-Infrared Video Streams on Visual Tracking
Current state-of-the-art trackers rely on fully convolutional neural networks (FCNNs) to extract salient features and build an appearance representation of the target. Most of them are designed for input streams from the visible spectrum. How, then, do an infrared input stream and a fused visible-infrared stream affect their performance, and where do these streams help or hurt? To answer this, we compare the performance of several reference trackers that use FCNN-based feature extraction on visible, infrared, and fused spectra. The evaluation uses a carefully processed, publicly available data set of paired visible-infrared sequences, which ensures that the same attributes occur synchronously and at the same locations in each spectrum, so that only the impact of the spectral change is studied. By analyzing the quantitative results, we identify the visual attributes that benefit or suffer from a fused approach in typical visual tracking scenarios.