Dimensionality reduction of high-dimensional data with a nonlinear principal component aligned generative topographic mapping
Most high-dimensional real-life data exhibit dependencies such that the data points do not populate the whole data space but lie approximately on a lower-dimensional manifold. A major problem in many data mining applications is the detection of such a manifold and the expression of the given data in terms of a moderate number of latent variables. We present a method which is derived from the generative topographic mapping (GTM) and can be seen as a nonlinear generalization of principal component analysis (PCA). It can detect certain nonlinearities in the data but, unlike the original GTM, does not suffer from the curse of dimensionality with respect to the latent-space dimension, and thus allows for higher embedding dimensions. Our experiments show that this approach yields improved data reconstruction compared with purely linear PCA and that it can furthermore be used for classification.
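As a point of reference for the linear baseline mentioned in the abstract, the sketch below illustrates PCA-based dimensionality reduction and reconstruction on synthetic data lying near a low-dimensional subspace. This is not the paper's GTM-derived method, only the standard PCA comparison point; the data dimensions and noise level are illustrative assumptions.

```python
import numpy as np

# Illustrative linear baseline only: the paper's method generalizes PCA
# nonlinearly via a GTM-style mapping. Here we sketch plain PCA
# reconstruction, the comparison point named in the abstract.
rng = np.random.default_rng(0)

# Synthetic data lying near a 2-D plane embedded in 10-D space
# (dimensions chosen for illustration, not taken from the paper).
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.01 * rng.normal(size=(500, 10))

# PCA via SVD of the centered data matrix.
mean = X.mean(axis=0)
Xc = X - mean
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 2                      # latent-space dimension
Z = Xc @ Vt[:k].T          # latent coordinates of each data point
X_hat = Z @ Vt[:k] + mean  # linear reconstruction from k components

err = np.mean((X - X_hat) ** 2)
print(f"mean squared reconstruction error: {err:.6f}")
```

Because the synthetic data is essentially linear, PCA reconstructs it almost perfectly; the abstract's claim is that a nonlinear, GTM-derived mapping improves on this baseline when the manifold is curved.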