Automated face detection for occurrence and occupancy estimation in chimpanzees
Surveying endangered species is necessary to evaluate conservation effectiveness. Camera trapping and biometric computer vision are recent technological advances that have influenced the methods applicable to field surveys and have gained significant momentum over the last decade. Yet most researchers still inspect footage manually, and few studies have used automated semantic processing of camera trap video data from the field. The aim of this study is to evaluate methods that incorporate automated face detection technology as an aid to estimating site use by two chimpanzee communities based on camera trapping. As a comparative baseline we employ traditional manual inspection of footage. Our analysis focuses specifically on the basic parameter of occurrence, for which we assess the performance and practical value of chimpanzee face detection software. We found that semi-automated data processing required only 2-4% of the time needed for purely manual analysis. This is a substantial increase in efficiency, which is critical when assessing the feasibility of camera trap occupancy surveys. Our evaluations suggest that our methodology estimates the proportion of sites used relatively reliably. Chimpanzees are mostly detected when they are present and when videos are filmed in high resolution: the highest recall rate was 77%, at a false alarm rate of 2.8%, for videos containing only frontal views of chimpanzee faces. Admittedly, our study is only a first step towards transferring face detection software from the lab into field application. Nevertheless, our results are promising and indicate that the current limitation in detecting chimpanzees in camera trap footage, namely the lack of suitable face views, can be easily overcome at the level of field data collection, that is, by the combined placement of multiple high-resolution cameras facing opposite directions. 
This will make it possible to routinely conduct chimpanzee occupancy surveys based on camera trapping and semi-automated processing of footage.
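
To make the reported quantities concrete, the following is a minimal sketch of how a recall rate, a false alarm rate, and a naive proportion of sites used could be computed from per-video labels. It is an illustration only and not the procedure used in this study: the `Video` record, the function names, the video-level definition of the false alarm rate (false positives divided by all videos without chimpanzees), and the toy data are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Video:
    """Hypothetical per-video record (not the study's data format)."""
    site: str              # camera trap site identifier
    chimp_present: bool    # ground truth from manual inspection
    detector_fired: bool   # at least one automated face detection


def recall(videos):
    """Share of videos with chimpanzees in which the detector fired."""
    positives = [v for v in videos if v.chimp_present]
    if not positives:
        raise ValueError("no videos with chimpanzees present")
    return sum(v.detector_fired for v in positives) / len(positives)


def false_alarm_rate(videos):
    """Share of videos without chimpanzees in which the detector fired.

    Assumed definition: FP / (FP + TN) at the video level.
    """
    negatives = [v for v in videos if not v.chimp_present]
    if not negatives:
        raise ValueError("no videos without chimpanzees")
    return sum(v.detector_fired for v in negatives) / len(negatives)


def naive_site_use(videos, use_detector):
    """Proportion of sites with at least one positive video.

    With use_detector=False the manual labels are used (baseline);
    with use_detector=True the automated detections are used.
    """
    sites = {v.site for v in videos}
    if not sites:
        raise ValueError("no videos supplied")
    used = {
        v.site
        for v in videos
        if (v.detector_fired if use_detector else v.chimp_present)
    }
    return len(used) / len(sites)


# Toy data, invented purely for illustration.
toy_videos = [
    Video("A", True, True),
    Video("A", True, False),
    Video("B", False, False),
    Video("B", False, True),
    Video("C", True, True),
    Video("D", False, False),
]

if __name__ == "__main__":
    print("recall:", recall(toy_videos))
    print("false alarm rate:", false_alarm_rate(toy_videos))
    print("site use (manual):", naive_site_use(toy_videos, use_detector=False))
    print("site use (detector):", naive_site_use(toy_videos, use_detector=True))
```

The naive proportion shown here ignores imperfect detection; a full occupancy analysis would model detection probability separately across repeated survey occasions.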