
Adaptive state space quantisation for reinforcement learning agents

Breithaupt, R.; Fischer, J.; Bode, M.

Nahavandi, S.; Natural and Artificial Intelligence Systems Organization -NAISO-:
ICAIS 2002. CD-ROM : First International NAISO Congress on Autonomous Intelligent Systems, Geelong, Australia, 12-15 February 2002
Canada / The Netherlands: ICSC-NAISO Academic Press, 2002
ISBN: 3-906454-30-4
International NAISO Congress on Autonomous Intelligent Systems (ICAIS) <1, 2002, Geelong, Australia>
Fraunhofer AIS (IAIS)
reinforcement learning; vector quantizer

Autonomous automata should not only be able to learn how to behave efficiently in any predefined internal or external state, but also to construct an internal state representation optimized with respect to the agent's goal, since predefining all important states the agent should be able to distinguish is almost impossible. To discretize the state space of a given low-dimensional environment, it seems promising to simply fill this space with state-representing prototypes of a vector quantizer. In the following investigation these prototypes form the basis for a reinforcement Q-learning algorithm. By combining this algorithm with a new generalizing action selection, solutions could be found much faster than with standard Q-learning. In addition, this action selection makes the learning speed nearly independent of the number of state-representing prototypes an agent has, provided that the number of prototypes is well above a minimal sufficient quantity. On the other hand, the dimensionality of an agent's state space may be too high to fill this space efficiently with more prototypes than necessary to approximate an acceptable problem solution. For such complex environments with high-dimensional state spaces we found a way to adjust the local resolution of the state-space representation automatically, by minimizing the variance of the Q-values and by optimizing the positions of the prototypes. These approaches to learning algorithms are verified in a squash game simulation.
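The core idea of the abstract, a tabular Q-learner whose "states" are the indices of the nearest vector-quantizer prototypes, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the class name, parameters, and the Euclidean nearest-neighbour quantization are all assumptions, and the paper's generalizing action selection and adaptive resolution adjustment are not reproduced here.

```python
class VQQLearner:
    """Sketch: Q-learning over a state space discretized by
    nearest-neighbour prototypes (a simple vector quantizer).
    Names and hyperparameters are illustrative assumptions."""

    def __init__(self, prototypes, n_actions, alpha=0.1, gamma=0.9):
        self.prototypes = prototypes              # list of state-space points (tuples)
        self.q = [[0.0] * n_actions for _ in prototypes]  # one Q-row per prototype
        self.alpha, self.gamma = alpha, gamma

    def quantize(self, state):
        # Index of the closest prototype (squared Euclidean distance).
        return min(range(len(self.prototypes)),
                   key=lambda i: sum((p - s) ** 2
                                     for p, s in zip(self.prototypes[i], state)))

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update on the quantized states.
        i, j = self.quantize(state), self.quantize(next_state)
        target = reward + self.gamma * max(self.q[j])
        self.q[i][action] += self.alpha * (target - self.q[i][action])

    def greedy(self, state):
        # Greedy action at the prototype representing this state.
        row = self.q[self.quantize(state)]
        return row.index(max(row))


# Toy usage on a 1-D state space with three prototypes and two actions.
learner = VQQLearner([(0.0,), (0.5,), (1.0,)], n_actions=2)
learner.update(state=(0.9,), action=1, reward=1.0, next_state=(1.0,))
```

The adaptive part of the paper would then, roughly, insert or reposition prototypes where the Q-values within a prototype's cell show high variance, refining the resolution only where the value function demands it.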