Diversity in Reinforcement Learning Through the Occupancy Measure

Feiden, Arno; Garcke, Jochen

doi:10.1145/3712256.3726337

2025

Conference Paper

Abstract

Quality-Diversity algorithms search for a set of diverse, high-performing solutions to optimization problems, including reinforcement learning problems. In the case of reinforcement learning problems, Quality-Diversity algorithms foster diversity by differentiating solutions using behaviour descriptors. We introduce a straightforward, powerful approach to generically characterise behaviour using the so-called occupancy measure. Our approach avoids the manual definition of behaviour descriptors and does not rely on further black-box learning. We investigate four established benchmark problems inspired by robotics, concerning locomotion and maze navigation. To measure the ability to overcome local optima, we consider the number of solved configurations and the maximum average score. The use of the occupancy measure is competitive with problem-specific, custom behaviour descriptors and superior to an established generic behaviour descriptor. Our work contributes to the establishment of MAP-Elites as a versatile, robust, out-of-the-box solver for complex non-convex reinforcement learning scenarios.
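The abstract does not say how the occupancy measure is estimated or turned into a descriptor. As an illustration only, the following is a minimal sketch of the textbook notion: the discounted frequency with which a policy visits each state, estimated from sampled trajectories. The function name `empirical_occupancy`, the discretisation of states into integer indices, the discount `gamma`, and the normalisation by the total weight are all assumptions made here, not details taken from the paper.

```python
import numpy as np


def empirical_occupancy(trajectories, n_states, gamma=0.99):
    """Estimate a discounted state occupancy measure from rollouts.

    trajectories: list of sequences of integer state indices in [0, n_states).
    Returns a vector of length n_states that sums to 1, or all zeros
    if no states were visited. (Hypothetical helper, for illustration.)
    """
    occ = np.zeros(n_states)
    for traj in trajectories:
        states = np.asarray(traj, dtype=int)
        # Weight the visit at time step t by gamma**t.
        weights = gamma ** np.arange(len(states))
        # np.add.at accumulates correctly when a state index repeats.
        np.add.at(occ, states, weights)
    total = occ.sum()
    # Assumption: normalise by the total weight so that finite-horizon
    # rollouts still yield a probability vector.
    return occ / total if total > 0 else occ
```

Under these assumptions, two policies could be compared by a distance between their occupancy vectors, for example `np.abs(occ_a - occ_b).sum()`; whether the paper uses this or a different comparison is not stated in the abstract.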

Author(s)

Feiden, Arno

Fraunhofer-Institut für Algorithmen und Wissenschaftliches Rechnen SCAI

Garcke, Jochen

Institut für Numerische Simulation, Universität Bonn

Mainwork

GECCO 2025, Genetic and Evolutionary Computation Conference. Proceedings

Conference

Genetic and Evolutionary Computation Conference 2025
