Publica
Here you will find scientific publications from the Fraunhofer Institutes.
Simplex distributions for embedding data matrices over time
 Society for Industrial and Applied Mathematics (SIAM), Philadelphia/Pa.: 12th SIAM International Conference on Data Mining. Proceedings: Anaheim, California, April 26-28, 2012. Madison, Wisconsin: Omnipress, 2012. ISBN: 9781611972320, pp. 295-306
 International Conference on Data Mining <12, 2012, Anaheim/Calif.> 

 English
 Conference paper
 Fraunhofer IAIS
Abstract
Early stress recognition is of great relevance in precision plant protection. Presymptomatic water stress detection is of particular interest, ultimately helping to meet the challenge of "How to feed a hungry world?". Due to climate change, this is of considerable political and public interest. Because of its large-scale and temporal nature, e.g., when monitoring plants using hyperspectral imaging, and the demand for physically meaningful results, the task presents unique computational problems in scale and interpretability. However, big data matrices over time also arise in several other real-life applications, such as stock market monitoring, where a business sector is characterized by the ups and downs of each of its companies per year, or topic monitoring of document collections. Therefore, we consider the general problem of embedding data matrices into Euclidean space over time without making any assumption on the generating distribution of each matrix. To do so, we represent all data samples by means of convex combinations of only a few extreme ones, computable in linear time. On the simplex spanned by the extremes, there are then natural candidates for distributions inducing distances between, and in turn embeddings of, the data matrices. We evaluate our method across several domains, including synthetic, text, and financial data as well as a large-scale dataset on water stress detection in plants with more than 3 billion matrix entries. The results demonstrate that the embeddings are meaningful and fast to compute. The stress detection results were validated by a domain expert and conform to existing plant physiological knowledge.
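
The idea of representing samples as convex combinations of a few extremes can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes the extremes are already given (the abstract does not say how they are selected), it finds the simplex coefficients by plain projected gradient descent, and it summarizes each data matrix by its mean coefficient vector as a crude stand-in for the simplex distributions mentioned above. All function names are hypothetical.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u)
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u * idx > css - 1)[0][-1]
    theta = (css[rho] - 1) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def convex_coefficients(X, E, steps=500):
    """Rows of X (n x d) as convex combinations of the rows of E (k x d).

    Assumption: E holds the extreme samples and is not all zeros.
    Minimizes ||W E - X||^2 row by row via projected gradient descent.
    """
    n, k = X.shape[0], E.shape[0]
    W = np.full((n, k), 1.0 / k)              # start at the simplex center
    lipschitz = np.linalg.norm(E, 2) ** 2     # squared spectral norm of E
    for _ in range(steps):
        grad = (W @ E - X) @ E.T
        W = np.apply_along_axis(project_to_simplex, 1, W - grad / lipschitz)
    return W

def simplex_summary(X, E):
    """Hypothetical summary: mean coefficient vector of one data matrix."""
    return convex_coefficients(X, E).mean(axis=0)

def summary_distance(X_a, X_b, E):
    """Euclidean distance between the summaries of two data matrices."""
    return float(np.linalg.norm(simplex_summary(X_a, E) - simplex_summary(X_b, E)))

# Toy example: three extremes spanning a triangle in the plane.
E_demo = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
X_demo = np.array([[0.25, 0.25], [1.0, 0.0]])
W_demo = convex_coefficients(X_demo, E_demo)
```

In this toy case the coefficients are unique because the three extremes are affinely independent. A pairwise matrix of `summary_distance` values over a sequence of data matrices could then be passed to any standard embedding method, such as multidimensional scaling.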