Publica
Adaptive state space quantisation for reinforcement learning agents
 Nahavandi, S.; Natural and Artificial Intelligence Systems Organization NAISO: ICAIS 2002. CD-ROM: First International NAISO Congress on Autonomous Intelligent Systems, Geelong, Australia, 12-15 February 2002. Canada / The Netherlands: ICSC-NAISO Academic Press, 2002. ISBN: 3906454304
 International NAISO Congress on Autonomous Intelligent Systems (ICAIS) <1, 2002, Geelong, Australia> 

 English
 Conference paper
 Fraunhofer AIS (IAIS)
 reinforcement learning; vector quantizer 
Abstract
Autonomous agents should not only be able to learn how to behave efficiently in any predefined internal or external state, but also to construct an internal state representation optimized with respect to the agent's goal. Predefining all important states the agent should be able to distinguish is almost impossible. To discretize the state space of a given low-dimensional environment, it seems promising to simply fill this space with state-representing prototypes of a vector quantizer. In the following investigation these prototypes form the basis for a reinforcement Q-learning algorithm. By combining this algorithm with a new generalizing action selection, solutions could be found much faster than with standard Q-learning. In addition, this action selection makes the learning speed nearly independent of the number of state-representing prototypes an agent has, provided that this number is much higher than a minimal sufficient quantity. On the other hand, the dimensionality of an agent's state space may be too high to fill the space efficiently with more prototypes than are necessary to approximate an acceptable problem solution. For such complex environments with high-dimensional state spaces, we found a way to adjust the local resolution of the state space representation automatically, by minimizing the variance of the Q-values and by optimizing the positions of the prototypes. These learning algorithms are verified in a squash game simulation.
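The core idea of the abstract — mapping a continuous state onto the nearest prototype of a vector quantizer and running tabular Q-learning over the prototype indices — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the prototype count, action count, learning rate, discount factor, and the uniform prototype placement are all assumptions chosen for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumptions (not from the paper): 32 prototypes uniformly
# scattered over a 2-D unit-square state space, 4 discrete actions.
N_PROTOTYPES = 32
N_ACTIONS = 4
prototypes = rng.uniform(0.0, 1.0, size=(N_PROTOTYPES, 2))
Q = np.zeros((N_PROTOTYPES, N_ACTIONS))  # one Q-row per prototype

def quantize(state):
    """Map a continuous state to the index of its nearest prototype."""
    return int(np.argmin(np.linalg.norm(prototypes - state, axis=1)))

def q_update(state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """Standard Q-learning update applied to the quantized state indices."""
    s, s2 = quantize(state), quantize(next_state)
    td_target = reward + gamma * Q[s2].max()
    Q[s, action] += alpha * (td_target - Q[s, action])
    return s
```

With this discretization in place, the adaptive-resolution step described in the abstract would amount to monitoring the variance of Q-values around each prototype and refining the quantizer locally where that variance stays high.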