On the information and representation of non-Euclidean pairwise data
Two common data representations are mostly used in intelligent data analysis, namely the vectorial and the pairwise representation. Pairwise data which satisfy the restrictive conditions of Euclidean spaces can be faithfully translated into a Euclidean vectorial representation by embedding. Non-metric pairwise data with violations of symmetry, reflexivity or triangle inequality pose a substantial conceptual problem for pattern recognition since the amount of predictive structural information beyond what can be measured by embeddings is unclear. We show by systematic modeling of non-Euclidean pairwise data that there exists metric violations which can carry valuable problem specific information. Furthermore, Euclidean and non-metric data can be unified on the level of structural information contained in the data. Stable component analysis selects linear subspaces which are particularly insensitive to data fluctuations. Experimental results from different domains support our pattern recognition strategy.