Here you will find scientific publications from the Fraunhofer Institutes.

Non-parametric policy gradients

A unified treatment of propositional and relational domains
Kersting, K.; Driessens, K.

McCallum, A.; University of Helsinki:
Twenty-Fifth International Conference on Machine Learning, ICML 2008. Proceedings : Held July 5 - 9 at the University of Helsinki, in Helsinki, Finland. Co-located with COLT-2008, the 21st Annual Conference on Computational Learning Theory, and UAI-2008, the 24th Conference on Uncertainty in Artificial Intelligence, workshops organized jointly
Helsinki, 2008
ISBN: 978-1-605-58205-4
International Conference on Machine Learning (ICML) <25, 2008, Helsinki>
Fraunhofer IAIS

Policy gradient approaches are a powerful instrument for learning how to interact with the environment. Existing approaches have focused on propositional and continuous domains only. Without extensive feature engineering, it is difficult, if not impossible, to apply them within structured domains, in which, for example, the number of objects and the relations among them vary. In this paper, we describe a non-parametric policy gradient approach, called NPPG, that overcomes this limitation. The key idea is to apply Friedman's gradient boosting: policies are represented as a weighted sum of regression models grown in a stage-wise optimization. Employing off-the-shelf regression learners, NPPG can deal with propositional, continuous, and relational domains in a unified way. Our experimental results show that it can even improve on established results.
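
The key idea in the abstract, a policy represented as a weighted sum of regression models grown stage by stage, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy one-step task with two actions, a Gibbs (softmax) policy over a potential function Psi, and a hand-written regression stump standing in for an off-the-shelf regression learner. It also evaluates the gradient exactly on a fixed grid of states rather than estimating it from sampled trajectories. All names (`reward`, `fit_stump`, `policy`, `train`, `expected_reward`) and the settings `stages=30`, `eta=2.0` are hypothetical choices made for illustration.

```python
import math

ACTIONS = (0, 1)

def reward(x, a):
    """Toy one-step task (assumption): action 1 is correct when x > 0.5, else action 0."""
    return 1.0 if a == int(x > 0.5) else 0.0

def fit_stump(xs, ys):
    """Least-squares regression stump on a 1-D input; returns a callable model."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            thr = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (sse, thr, lm, rm)
    _, thr, lm, rm = best
    return lambda x: lm if x <= thr else rm

def policy(models, x):
    """Gibbs policy pi(a|x) proportional to exp(Psi(x, a)), Psi being a weighted sum of stumps."""
    psi = [sum(eta * h(x) for eta, h in models[a]) for a in ACTIONS]
    top = max(psi)  # subtract the maximum for numerical stability
    weights = [math.exp(p - top) for p in psi]
    total = sum(weights)
    return [w / total for w in weights]

def train(xs, stages=30, eta=2.0):
    """Stage-wise boosting: each stage fits one stump per action to the functional gradient."""
    models = {a: [] for a in ACTIONS}
    for _ in range(stages):
        targets = {a: [] for a in ACTIONS}
        for x in xs:
            pi = policy(models, x)
            baseline = sum(pi[b] * reward(x, b) for b in ACTIONS)
            for a in ACTIONS:
                # Derivative of the expected reward at x with respect to Psi(x, a).
                targets[a].append(pi[a] * (reward(x, a) - baseline))
        for a in ACTIONS:
            models[a].append((eta, fit_stump(xs, targets[a])))
    return models

def expected_reward(models, xs):
    """Average expected reward of the current policy over the given states."""
    return sum(
        sum(p * reward(x, a) for a, p in zip(ACTIONS, policy(models, x)))
        for x in xs
    ) / len(xs)

STATES = [(i + 0.5) / 20 for i in range(20)]
MODELS = train(STATES)
```

Because each stage only needs a regressor that maps (state, action) descriptions to gradient values, the stump could in principle be swapped for any other regression learner, which is the property the abstract relies on for handling propositional, continuous, and relational domains in a unified way.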