DIVA: A self organizing adaptive world model for reinforcement learning

Fischer, J.; Breithaupt, R.; Bode, M.; Hertzberg, J.

2002

Conference Paper

Abstract

Reinforcement learning algorithms without an internal world model often suffer from overly long convergence times. The agent usually has to succeed several hundred times before it learns how to behave in even simple environments. In this case, a world model can reduce the number of real-world trials by performing actions virtually in the world model. This may help to propagate the reinforcement Q- or V-values much faster through the state (action) space and can be interpreted as a simple form of planning. In the following investigation we introduce a self-organizing deterministic world model, "DIVA" ("Discretization Improvement by Variance reduction"), with an adaptive discretization, which can speed up learning by using common methods like Sutton's Dyna-Q. The "DIVA" model proposed in this article is implemented in a six-legged walking robot, which learns to walk in a minimum of time and with a minimum of real-world movement trials.
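
The idea of replaying actions virtually in a learned model can be illustrated with a minimal sketch of generic tabular Dyna-Q. This is not the DIVA model from the paper: it has no adaptive discretization, and the chain environment, the function name `dyna_q`, and all parameter values are assumptions chosen only for illustration.

```python
import random


def dyna_q(n_states=6, episodes=20, planning_steps=10,
           alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Dyna-Q on a hypothetical deterministic chain 0..n_states-1.

    Actions: 0 = left, 1 = right. Entering the last state yields reward 1
    and ends the episode. Returns (Q table, number of real steps taken).
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    model = {}  # learned deterministic model: (state, action) -> (reward, next_state)
    real_steps = 0

    def env_step(s, a):
        # Illustrative "real world": move one cell left or right on the chain.
        s2 = max(0, s - 1) if a == 0 else min(goal, s + 1)
        return (1.0 if s2 == goal else 0.0), s2

    def update(s, a, r, s2):
        # One-step Q-learning backup; the goal state is terminal.
        target = r if s2 == goal else r + gamma * max(q[s2])
        q[s][a] += alpha * (target - q[s][a])

    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy action choice with random tie-breaking.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            r, s2 = env_step(s, a)       # one real-world trial
            real_steps += 1
            update(s, a, r, s2)          # learn from real experience
            model[(s, a)] = (r, s2)      # remember the observed transition
            # Planning: replay remembered transitions virtually in the model.
            for _ in range(planning_steps):
                ps, pa = rng.choice(list(model))
                pr, ps2 = model[(ps, pa)]
                update(ps, pa, pr, ps2)
            s = s2
    return q, real_steps
```

Calling `dyna_q(planning_steps=0)` reduces the sketch to plain Q-learning, so the two settings can be compared by the number of real steps each needs on this toy chain.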

Author(s)

Fischer, J.

Breithaupt, R.

Bode, M.

Hertzberg, J.

Host publication

ICAIS 2002. CD-ROM

Conference

International NAISO Congress on Autonomous Intelligent Systems (ICAIS)

Language

English

AIS
