  • Publication
    Plant phenotyping using probabilistic topic models: Uncovering the hyperspectral language of plants
    (2016)
    Mahlein, A.-K.; Steiner, U.; Oerke, E.-C.
    Modern phenotyping and plant disease detection methods, based on optical sensors and information technology, provide promising approaches to plant research and precision farming. In particular, hyperspectral imaging has been found to reveal physiological and structural characteristics in plants and to allow for tracking physiological dynamics due to environmental effects. In this work, we present an approach to plant phenotyping that integrates non-invasive sensors, computer vision, and data mining techniques and allows for monitoring how plants respond to stress. To uncover latent hyperspectral characteristics of diseased plants reliably and in an easy-to-understand way, we "wordify" the hyperspectral images, i.e., we turn the images into a corpus of text documents. Then, we apply probabilistic topic models, a well-established natural language processing technique that identifies content and topics of documents. Based on recent regularized topic models, we demonstrate that one can automatically track the development of three foliar diseases of barley. We also present a visualization of the topics that provides plant scientists with an intuitive tool for hyperspectral imaging. In short, our analysis and visualization of characteristic topics found during symptom development and disease progress reveal the hyperspectral language of plant diseases.
  • Publication
    Early drought stress detection in cereals: Simplex volume maximization for hyperspectral image analysis
    (2012)
    Römer, Christoph; Ballvora, Agim; Pinto, Francisco; Rossini, Micol; Panigada, Cinzia; Behmann, Jan; Léon, Jens; Rascher, Uwe; Plümer, Lutz
    Early water stress recognition is of great relevance in precision plant breeding and production. Hyperspectral imaging sensors can be a valuable tool for early stress detection with high spatio-temporal resolution. They gather large, high-dimensional data cubes, posing a significant challenge to data analysis. Classical supervised learning algorithms often fail in applied plant sciences due to their need for labelled datasets, which are difficult to obtain. Therefore, new approaches for unsupervised learning of relevant patterns are needed. We apply for the first time a recent matrix factorisation technique, simplex volume maximisation (SiVM), to hyperspectral data. It is an unsupervised classification approach, optimised for fast computation on massive datasets. It allows calculation of how similar each spectrum is to observed typical spectra. This provides the means to express how likely it is that a plant is suffering from stress. The method was tested for drought stress, applied to potted barley plants in a controlled rain-out shelter experiment and to agricultural corn plots subjected to a two-factorial field setup altering water and nutrient availability. Both experiments were conducted at the canopy level. SiVM performed significantly better than a combination of established vegetation indices. In the corn plots, SiVM clearly separated the different treatments, even though the effects on leaf and canopy traits were subtle.
  • Publication
    More influence means less work: Fast latent Dirichlet allocation by influence scheduling
    Name ambiguity arises from the polysemy of names and causes uncertainty about the true identity of entities referenced in unstructured text. This is a major problem in areas like information retrieval or knowledge management, for example when searching for a specific entity or updating an existing knowledge base. We approach this problem of named entity disambiguation (NED) using thematic information derived from Latent Dirichlet Allocation (LDA) to compare the entity mention's context with candidate entities in Wikipedia represented by their respective articles. We evaluate various distances over topic distributions in a supervised classification setting to find the best-suited candidate entity, which is either covered in Wikipedia or unknown. We compare our approach to a state-of-the-art method and show that it achieves significantly better predictive performance, both for entities covered in Wikipedia and for uncovered entities. We show that our approach is in general language-independent, as we obtain equally good results for named entity disambiguation using the English, the German and the French Wikipedia.
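
The abstract above compares the topic distribution of a mention's context with the topic distributions of candidate Wikipedia articles. As a rough illustration only, the following Python sketch computes two common distances over discrete distributions (Hellinger distance and Jensen-Shannon divergence) and picks the nearest candidate. The function names, the example distributions, and the nearest-candidate rule are assumptions made for this sketch. The abstract instead describes feeding such distances into a supervised classifier that can also answer "unknown", and it does not say which distances were evaluated.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (range 0..1)."""
    return math.sqrt(
        sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)) / 2.0
    )


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q); terms with p_i = 0 contribute 0."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


def jensen_shannon(p, q):
    """Jensen-Shannon divergence (natural log, range 0..ln 2)."""
    m = [(a + b) / 2.0 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def closest_candidate(mention_topics, candidates, distance=hellinger):
    """Return the candidate whose topic distribution is nearest to the mention.

    Hypothetical simplification: a plain nearest-neighbour rule, not the
    supervised classification setting described in the abstract.
    """
    return min(
        candidates,
        key=lambda name: distance(mention_topics, candidates[name]),
    )


# Made-up topic distributions over three topics, for illustration only.
mention = [0.7, 0.2, 0.1]
candidates = {
    "Candidate A": [0.6, 0.3, 0.1],
    "Candidate B": [0.1, 0.1, 0.8],
}
best = closest_candidate(mention, candidates)
```

In a supervised setting such as the one the abstract mentions, the distance values themselves would more plausibly serve as input features, with a threshold or a separate class covering entities that have no Wikipedia article.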