Here you will find scientific publications from the Fraunhofer Institutes.

Towards large-scale Gaussian process models for efficient Bayesian machine learning

Berns, F.; Beecks, C.


Hammoudi, Slimane (Ed.) ; Institute for Systems and Technologies of Information, Control and Communication -INSTICC-, Setubal:
9th International Conference on Data Science, Technology and Applications 2020. Proceedings : 7 - 9 July, 2020, web-based event
Setúbal: INSTICC, 2020
ISBN: 978-989-758-440-4
International Conference on Data Science, Technology and Applications (DATA) <9, 2020, Online>
Conference Paper
Fraunhofer FIT

Gaussian Process Models (GPMs) are applicable to a large variety of data analysis tasks, such as time series interpolation, regression, and classification. Frequently, these models of Bayesian machine learning instantiate a Gaussian Process with a zero-mean function and the well-known Gaussian kernel. While these default instantiations yield acceptable analytical quality for many use cases, GPM retrieval algorithms make it possible to search automatically for an application-specific model suited to a particular dataset. State-of-the-art GPM retrieval algorithms have only been applied to small datasets, as their cubic runtime complexity impedes the analysis of datasets beyond a few thousand data records. Even though global approximations of Gaussian Processes extend the applicability of those models to medium-sized datasets, sets of millions of data records are still far beyond their reach. Therefore, we develop a new large-scale GPM structure, which incorporates a divide-and-conquer paradigm and thus enables efficient GPM retrieval for large-scale data. We outline the challenges of this newly developed GPM structure regarding its algorithmic retrieval, its integration with existing data platforms and technologies, and its cross-model comparability and interpretability.
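
To make the default instantiation and the cost argument in the abstract concrete, the following is a minimal sketch in Python with NumPy. It is not the GPM structure proposed in the paper: the function names, the fixed hyperparameters, the synthetic data, and the simple partitioning into contiguous blocks are all assumptions chosen for illustration. The first part computes the posterior mean of a zero-mean Gaussian Process with a Gaussian kernel, where the Cholesky factorization is the cubic-cost step. The second part shows one naive divide-and-conquer variant that fits an independent local model per block.

```python
import numpy as np


def gaussian_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Gaussian (squared-exponential) kernel between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)


def gp_predict_mean(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean of a zero-mean GP; factorizing K costs O(n^3)."""
    K = gaussian_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    L = np.linalg.cholesky(K)
    # Solve K @ alpha = y through the Cholesky factors L and L^T.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    return gaussian_kernel(x_test, x_train) @ alpha


def local_gp_predict_mean(x_train, y_train, x_test, n_blocks=4, noise=1e-2):
    """Hypothetical divide-and-conquer variant: one independent GP per block."""
    order = np.argsort(x_train)
    x_blocks = np.array_split(x_train[order], n_blocks)
    y_blocks = np.array_split(y_train[order], n_blocks)
    # Largest input of every block except the last, used to route test points.
    edges = np.array([xb[-1] for xb in x_blocks[:-1]])
    block_of = np.searchsorted(edges, x_test)
    pred = np.empty(len(x_test), dtype=float)
    for i, (xb, yb) in enumerate(zip(x_blocks, y_blocks)):
        mask = block_of == i
        if mask.any():
            pred[mask] = gp_predict_mean(xb, yb, x_test[mask], noise)
    return pred


# Synthetic example data (assumed for illustration only).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 20.0, size=400)
y = np.sin(x) + 0.1 * rng.standard_normal(400)
x_new = np.linspace(0.5, 19.5, 50)

full = gp_predict_mean(x, y, x_new)         # one factorization of a 400 x 400 matrix
local = local_gp_predict_mean(x, y, x_new)  # four factorizations of 100 x 100 matrices
```

With n records split into k equally sized blocks, each local factorization costs O((n/k)^3), so the total drops from O(n^3) to roughly O(n^3 / k^2). The price in this naive sketch is that each local model ignores data outside its block, which can reduce accuracy near block boundaries.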