2025
Conference Paper
Title
Diversity in Reinforcement Learning Through the Occupancy Measure
Abstract
Quality-Diversity algorithms search for a set of diverse, high-performing solutions to optimization problems, including reinforcement learning problems. In reinforcement learning, Quality-Diversity algorithms foster diversity by differentiating solutions using behaviour descriptors. We introduce a straightforward, powerful approach to generically characterise behaviour using the so-called occupancy measure. Our approach avoids the manual definition of behaviour descriptors and does not rely on further black-box learning. We investigate four established benchmark problems inspired by robotics, concerning locomotion and maze navigation. To measure the ability to overcome local optima, we consider the number of solved configurations and the maximum average score. The use of the occupancy measure is competitive with problem-specific, custom behaviour descriptors and superior to an established generic behaviour descriptor. Our work contributes to the establishment of MAP-Elites as a versatile, robust, out-of-the-box solver for complex non-convex reinforcement learning scenarios.
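The abstract does not say how the occupancy measure is estimated or compared, so the following is only an illustrative sketch and not the authors' method. It assumes discrete, hashable states, estimates a normalised discounted state-visitation distribution from sampled trajectories, and compares two such estimates by total variation distance. The function names `empirical_occupancy` and `occupancy_distance`, the discount `gamma`, and the choice of distance are all assumptions made for this example.

```python
from collections import Counter


def empirical_occupancy(trajectories, gamma=0.99):
    """Estimate a normalised discounted state occupancy from trajectories.

    trajectories: iterable of state sequences; states must be hashable.
    Returns a dict mapping each visited state to its share of the
    discounted visitation mass (values sum to 1), or {} if no states
    were visited. This is an assumed estimator, not the paper's.
    """
    weights = Counter()
    for states in trajectories:
        for t, state in enumerate(states):
            # A visit at time step t contributes gamma**t.
            weights[state] += gamma ** t
    total = sum(weights.values())
    if total == 0:
        return {}
    return {state: w / total for state, w in weights.items()}


def occupancy_distance(p, q):
    """Total variation distance between two occupancy estimates."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in support)


# Hypothetical maze cells visited by two different policies.
left_policy = [[(0, 0), (0, 1), (0, 2)]]
right_policy = [[(0, 0), (1, 0), (2, 0)]]

occ_left = empirical_occupancy(left_policy, gamma=0.9)
occ_right = empirical_occupancy(right_policy, gamma=0.9)
print(occupancy_distance(occ_left, occ_right))
```

In a Quality-Diversity setting, a distance of this kind could serve to tell solutions apart without a hand-crafted behaviour descriptor; how the paper actually integrates the occupancy measure into MAP-Elites is not stated in this record.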
Author(s)
Open Access
Rights
CC BY 4.0: Creative Commons Attribution
Language
English