2023
Conference Paper
Title
Overcoming Deceptive Rewards with Quality-Diversity
Abstract
Quality-Diversity (QD) offers powerful ideas for creating diverse, high-performing populations. Here, we investigate how well these ideas can also solve exploration-hard single-objective problems.
We find that MAP-Elites is well suited to overcoming deceptive reward structures, and that an Elites-type approach with an unstructured, distance-based container and extinction events can even outperform it.
Furthermore, we analyse how the QD score, the standard evaluation metric for MAP-Elites-type algorithms, is not well suited to predicting whether a configuration succeeds in solving a maze. This shows that exploration capacity is an entirely separate dimension along which QD algorithms can be utilized, evaluated, and improved. It is a dimension that does not seem to be covered, implicitly or explicitly, by current advances in the field.