CNN-based thermal infrared person detection by domain adaptation
Imaging sensors that capture the surroundings of an autonomous vehicle are vital for its understanding of the environment. While thermal infrared (IR) cameras promise improved robustness in bad weather and at night compared with standard RGB cameras, detecting objects such as persons in thermal infrared imagery is a hard problem because image resolution and quality are typically far lower, especially for low-cost sensors. Current deep-learning-based object detection frameworks offer impressive performance on high-quality images. However, applying them to low-quality data in a different spectral range causes significant performance drops. This work proposes a strategy for making use of elaborate CNN-based object detection frameworks that are pre-trained on visual RGB images. Two key steps are undertaken. First, an appropriate preprocessing strategy for the IR data is suggested, which transforms the IR data to be as close as possible to the RGB domain. This allows the pre-trained RGB features to be effective in the new domain. Second, the remaining domain gap is addressed by fine-tuning the pre-trained CNN on a limited set of thermal IR data. Different IR preprocessing options are explored, each addressing a different aspect of the domain gap between thermal IR and RGB data, such as dynamic range, blur, or contrast. Because no single preprocessing option can cover all aspects, providing combinations of preprocessing options to the CNN addresses more than one aspect at once and further improves the results. Experiments indicate significant person detection improvements on the public KAIST dataset with the optimized preprocessing strategy.
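The idea of providing preprocessing combinations to an RGB-pre-trained CNN can be illustrated with a minimal sketch. It assumes, purely for illustration, that three differently preprocessed versions of one single-channel IR frame are stacked into the three input channels that an RGB network expects. The abstract does not specify which operations are used or how they are combined, so the function names (`normalize_range`, `equalize_histogram`, `box_blur`, `preprocess_combination`) and the choice of operations are hypothetical stand-ins for the dynamic range, contrast, and blur aspects.

```python
from typing import List

Image = List[List[int]]  # 8-bit grayscale image as rows of pixel values


def normalize_range(raw: List[List[float]]) -> Image:
    """Linearly map raw sensor values to [0, 255] (dynamic range aspect)."""
    flat = [v for row in raw for v in row]
    lo, hi = min(flat), max(flat)
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    return [[int(round((v - lo) * scale)) for v in row] for row in raw]


def equalize_histogram(img: Image) -> Image:
    """Spread intensities via the cumulative histogram (contrast aspect)."""
    flat = [v for row in img for v in row]
    hist = [0] * 256
    for v in flat:
        hist[v] += 1
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    denom = len(flat) - cdf_min
    if denom == 0:  # constant image: nothing to equalize
        return [row[:] for row in img]
    return [[round((cdf[v] - cdf_min) * 255 / denom) for v in row]
            for row in img]


def box_blur(img: Image) -> Image:
    """3x3 mean filter with clamped borders (blur aspect)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += img[yy][xx]
            row.append(round(acc / 9))
        out.append(row)
    return out


def preprocess_combination(raw: List[List[float]]) -> List[List[List[int]]]:
    """Stack three preprocessed views into one H x W x 3 image.

    Assumption: channel stacking is one possible way to hand several
    preprocessing results to a network pre-trained on 3-channel RGB input.
    """
    base = normalize_range(raw)
    views = (base, equalize_histogram(base), box_blur(base))
    return [[list(px) for px in zip(*rows)] for rows in zip(*views)]


if __name__ == "__main__":
    # Hypothetical raw IR values with a range wider than 8 bits.
    frame = [[100.0, 200.0], [300.0, 400.0]]
    stacked = preprocess_combination(frame)
    print(len(stacked), len(stacked[0]), len(stacked[0][0]))
```

The resulting three-channel image has the same layout as an RGB image, so under this assumption it could be passed to a pre-trained detector without changing its first layer. The subsequent fine-tuning step on thermal IR data is not shown here.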