October 31, 2024
Master Thesis
Title
Pixel-wise Out-of-Distribution Detection in Semantic Segmentation for Aerial Imagery using Normalising Flow Networks for Negative Feature Synthesis
Abstract
In a constantly evolving world where autonomous systems must navigate unpredictable environments, detecting suspicious and unknown objects in their surroundings has never been more critical. Although State-of-the-Art (SOTA) models produce almost flawless results in tasks such as object detection and segmentation, they struggle to recognize unknown data. In safety-critical applications such as autonomous driving and autonomous flight, even a small misinterpretation can lead to fatal accidents.
The issue with deep learning models is that they often operate under a closed-world assumption: the model assumes that the data encountered during inference always belongs to the set of classes it has seen during training. In reality, the setting is open-world, because some of the data will not belong to the trained set of classes. Detecting data in the scene that is not part of the training distribution, known as Out-of-Distribution (OoD) data, is therefore crucial.
This master’s thesis is part of a larger project that focuses on trustworthy environmental perception. In this work, a framework has been developed to train an energy module that indicates whether a pixel is an OoD pixel or an In-Distribution (ID) pixel. The energy module is trained using the inlier features generated by the main network (UPerNet) and negative features generated by a normalizing flow network, which is itself trained on the features produced by UPerNet. The implementation is first tested with the Cityscapes dataset as the inlier dataset and is then extended to aerial imagery, with the UAVid dataset as the inlier dataset. In both cases, the models are evaluated on outlier-exposed validation sets of the respective datasets. Images illustrating the model’s performance and limitations are included.
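The idea of training an energy module on inlier and negative features can be illustrated with a minimal sketch. It is not the thesis implementation: the linear energy head, the logistic loss, the feature dimension `C`, the zero threshold, and the Gaussian samples standing in for UPerNet features and for normalizing-flow negatives are all assumptions made purely for illustration.

```python
import numpy as np

def train_energy_head(inlier, negative, lr=0.1, steps=500):
    """Fit a linear energy E(f) = f @ w + b with a logistic loss.

    Hypothetical stand-in for the energy module: inlier features are
    labelled 0 (low energy), negative features 1 (high energy).
    """
    x = np.concatenate([inlier, negative])
    y = np.concatenate([np.zeros(len(inlier)), np.ones(len(negative))])
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # sigmoid of the energy
        grad = p - y                             # gradient of the logistic loss
        w -= lr * (x.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def energy_map(features, w, b):
    """Map per-pixel features of shape (H, W, C) to an energy map (H, W)."""
    return features @ w + b

rng = np.random.default_rng(0)
C = 8  # assumed feature dimension

# Assumption: Gaussian samples stand in for segmentation-network features
# (inliers) and for features sampled from a normalizing flow (negatives).
inlier = rng.normal(0.0, 1.0, size=(2000, C))
negative = rng.normal(3.0, 1.0, size=(2000, C))
w, b = train_energy_head(inlier, negative)

# Toy feature map with a 4x4 patch of outlier-like features.
feature_map = rng.normal(0.0, 1.0, size=(16, 16, C))
feature_map[4:8, 4:8] = rng.normal(3.0, 1.0, size=(4, 4, C))

# Pixels with positive energy are flagged as OoD (assumed threshold).
ood_mask = energy_map(feature_map, w, b) > 0.0
```

In this toy setting, `ood_mask` is a boolean map with the same spatial size as the feature map. In practice the threshold would be chosen on a validation set rather than fixed at zero.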
Thesis Note
Siegen, Univ., Master Thesis, 2024
Author(s)
Advisor(s)
Language
English