Investigation of the Wavelet Transformation as a Feature Enhancement Tool for Machine Learning
Time series forecasting is one of the most important areas of data science. Traditionally, a time series is first decomposed into separate component signals, and the prediction problem is then solved on these components. Decomposition, however, removes most of the time-dependent components contained in the original series and can be very difficult to handle. Data from the process industry is strongly time-dependent. Many researchers and programmers use LSTM networks to handle the time-dependent components within a time series. However, this handling can be limited by the lags used and by the storage capabilities of the LSTM. This thesis addresses time series prediction by taking the time components from the wavelet coefficients of the signal. For this purpose, data from the chemical process industry was collected and then adjusted by means of initial data analysis (IDA) and exploratory data analysis (EDA). After removing unneeded data, the samples are processed in MATLAB to obtain the wavelet coefficients of the process signal along with the process signal samples. Each wavelet coefficient contains specific information about the respective process signal. Our goal is to predict the time series from the wavelet coefficients using a neural network architecture. Typically, the predictive analysis of an artificial neural network is not combined with wavelet coefficients. For the prediction analysis, the collected process signal data and the information from its wavelet coefficients serve as input to the neural network. An LSTM architecture is used for the predictive analysis because it can retain large amounts of historical data samples. The parameter configurations are chosen with respect to the process industry signal used and will be discussed below in detail.
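The idea of feeding both signal samples and wavelet coefficients to a network can be illustrated with a minimal Python sketch. It is not the thesis implementation, which computes the coefficients in MATLAB. The single-level Haar wavelet, the window length `lags`, and the helper names `haar_dwt` and `wavelet_feature_windows` are all assumptions chosen for illustration; the abstract does not state which wavelet family or decomposition level was used.

```python
import numpy as np


def haar_dwt(x):
    """One level of the orthonormal Haar DWT; returns (approximation, detail)."""
    x = np.asarray(x, dtype=float)
    if x.size % 2:
        # Pad odd-length input by repeating the last sample.
        x = np.append(x, x[-1])
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)
    detail = (even - odd) / np.sqrt(2.0)
    return approx, detail


def wavelet_feature_windows(signal, lags):
    """Build inputs and targets for one-step-ahead prediction.

    Each input row holds `lags` past samples followed by the Haar
    coefficients of that same window, so no future sample leaks in.
    """
    signal = np.asarray(signal, dtype=float)
    inputs, targets = [], []
    for start in range(len(signal) - lags):
        window = signal[start:start + lags]
        approx, detail = haar_dwt(window)
        inputs.append(np.concatenate([window, approx, detail]))
        targets.append(signal[start + lags])
    return np.array(inputs), np.array(targets)


# Hypothetical stand-in for a process signal: a sine with a 16-sample period.
demo = np.sin(2.0 * np.pi * np.arange(64) / 16.0)
X, y = wavelet_feature_windows(demo, lags=8)
print(X.shape, y.shape)
```

Each row of `X` could then be reshaped into the (samples, time steps, features) layout that most LSTM implementations expect. How the coefficients are best aligned with the samples is one of the integration challenges the thesis investigates, so this layout is only one possible choice.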
The main configuration parameters for the LSTM are found by examining synthetic signals such as sine wave, chirp, frequency-modulated (FM) and amplitude-modulated (AM) signals. From these synthetic data, the model parameters for the given process signal can be estimated by observing characteristic results such as changes in amplitude and frequency. Integrating the coefficients and their observed samples into the machine-learning part poses several challenges, which are addressed by different approaches. For each approach, the challenges and the problems encountered are discussed in detail. Overall, the thesis shows whether the approaches used for the prediction analysis are viable.
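The four kinds of synthetic test signals named above can be generated as in the following Python sketch. The sampling rate, duration, carrier and modulation frequencies, modulation index and modulation depth are all assumed values for illustration; the abstract does not give the settings used in the thesis, and `synthetic_signals` is a hypothetical helper name.

```python
import numpy as np


def synthetic_signals(duration=1.0, fs=1000.0):
    """Return a time axis and a dict of sine, chirp, FM and AM test signals."""
    n = int(round(duration * fs))
    t = np.arange(n) / fs

    # Plain sine wave at an assumed 5 Hz.
    sine = np.sin(2.0 * np.pi * 5.0 * t)

    # Linear chirp: instantaneous frequency rises from f0 to f1 over the duration.
    f0, f1 = 2.0, 20.0
    rate = (f1 - f0) / duration
    chirp = np.sin(2.0 * np.pi * (f0 * t + 0.5 * rate * t**2))

    # FM: carrier fc whose phase is modulated by a sine at fm with index beta.
    fc, fm, beta = 50.0, 2.0, 5.0
    fm_signal = np.sin(2.0 * np.pi * fc * t + beta * np.sin(2.0 * np.pi * fm * t))

    # AM: carrier fc whose envelope varies as 1 + depth * cos(2*pi*fm*t).
    depth = 0.5
    am_signal = (1.0 + depth * np.cos(2.0 * np.pi * fm * t)) * np.sin(2.0 * np.pi * fc * t)

    return t, {"sine": sine, "chirp": chirp, "fm": fm_signal, "am": am_signal}


t, signals = synthetic_signals()
for name, values in signals.items():
    print(name, values.shape, float(np.max(np.abs(values))))
```

Because the amplitude and frequency behaviour of each signal is known in advance, a model trained on them makes it easier to see how a given parameter setting responds to amplitude changes (AM) or frequency changes (chirp, FM).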
Magdeburg, Univ., Master Thesis, 2019