Fundamental Research on Convolutional Neural Network Architectures for Extracting Spectral Features from Hyperspectral Data - An Approach of Adapting Pre-trained Models
Hyperspectral image classification is a powerful technique for gaining knowledge about recorded objects in many fields. No publicly available spaceborne imaging spectrometer currently exists for acquiring remotely sensed hyperspectral imagery, although several sensors are planned for launch in the near future. Spaceborne data usually has a medium spatial resolution, which impedes the use of the spatial domain for classification because of mixed pixels and the absence of clear edges in the data. Hence, it is important to be able to perform a meaningful classification that relies only on spectral information. The choice of classifier can also strongly influence the classification result. For image classification, convolutional neural networks (CNNs) have recently caught the attention of researchers because they can outperform traditional methods and are capable of transfer learning from a source task to a related target task. In this thesis, fundamental research is conducted on extracting spectral features from hyperspectral imagery with CNNs. For this purpose, hyperspectral data of various object types (nuts, beans, peas and dried fruits) is acquired in a field campaign. By strategically optimising a CNN, a suitable model architecture for classifying the field campaign data is derived. The resulting pre-trained CNN is then used to evaluate the model's capability for transfer learning: three airborne hyperspectral benchmark datasets are classified using the pre-trained model. The results of this study show that spectral feature extraction for the field campaign data is successful, as a classification accuracy of 77.13% is achieved. For the benchmark data, classification accuracies differ substantially (from 71.77% to 99.79%). The differences are mainly caused by variations in the number of samples, the spectral separability of the classes, and the presence of mixed pixels in the dataset with low spatial resolution.
For the dataset that is classified least accurately, pre-training yields the greatest improvement (a difference of 3.25% in accuracy compared to the same model without pre-training). For the benchmark dataset that is classified with the highest accuracy, no benefit from transfer learning is observed. Hence, pre-training is especially beneficial for datasets that are difficult to classify. As spectral feature detection for the field campaign data is successful, the pre-trained CNN developed in this study can be recommended for the spectral classification of hyperspectral data. The CNN's knowledge of spectral feature detection from the Cubert data is transferable to other data and classes. Thus, the pre-trained model created in this work is recommended for classifying other hyperspectral data, especially when limited accuracies are expected due to restricted spectral separability of the classes, when only little labelled data is available, or when extensive labelling is not possible due to time constraints.
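The idea of reusing spectral feature detectors for a new dataset can be illustrated with a minimal sketch. It is not the architecture derived in the thesis: the class `SpectralCNN`, the helper `transfer`, the band count, the kernel sizes and the class counts are all hypothetical, the weights are random, and no training is performed. The sketch only shows a 1D convolution along the spectral axis of a single pixel, followed by a classification head, and how the convolutional kernels of a source model could be copied into a target model with a different number of classes.

```python
import math
import random


def conv1d(spectrum, kernel):
    """Valid 1D cross-correlation of one pixel spectrum with one kernel."""
    k = len(kernel)
    return [
        sum(spectrum[i + j] * kernel[j] for j in range(k))
        for i in range(len(spectrum) - k + 1)
    ]


def extract_features(spectrum, kernels):
    """One feature per kernel: ReLU, then global max pooling over the bands."""
    return [max(max(v, 0.0) for v in conv1d(spectrum, k)) for k in kernels]


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class SpectralCNN:
    """Hypothetical toy model: spectral conv kernels plus a linear head."""

    def __init__(self, n_kernels, kernel_size, n_classes, rng):
        # Random weights stand in for weights that would be learned in training.
        self.kernels = [
            [rng.uniform(-1, 1) for _ in range(kernel_size)]
            for _ in range(n_kernels)
        ]
        self.head = [
            [rng.uniform(-1, 1) for _ in range(n_kernels)]
            for _ in range(n_classes)
        ]

    def predict_proba(self, spectrum):
        feats = extract_features(spectrum, self.kernels)
        scores = [sum(w * f for w, f in zip(row, feats)) for row in self.head]
        return softmax(scores)


def transfer(source, n_target_classes, rng):
    """Copy the spectral kernels of `source`; create a fresh head for the target."""
    target = SpectralCNN(
        len(source.kernels), len(source.kernels[0]), n_target_classes, rng
    )
    target.kernels = [list(k) for k in source.kernels]
    return target


rng = random.Random(0)
# Assumed sizes: 4 source classes, 9 target classes, 100 spectral bands.
source_model = SpectralCNN(n_kernels=4, kernel_size=5, n_classes=4, rng=rng)
target_model = transfer(source_model, n_target_classes=9, rng=rng)
pixel = [rng.random() for _ in range(100)]
probs = target_model.predict_proba(pixel)
```

In a real workflow, the kernels would first be trained on the source data and the new head (and possibly the kernels) would then be fine-tuned on the target data; a deep learning framework would normally replace the hand-written loops.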
Karlsruhe, Karlsruher Institut für Technologie, Master's Thesis, 2019