Novelty-Guided Reinforcement Learning via Encoded Behaviors
Despite the successful application of Deep Reinforcement Learning (DRL) in a wide range of complex tasks, agents either often learn sub-optimal behavior due to the sparse/deceptive nature of rewards or require a lot of interactions with the environment. Recent methods combine a class of algorithms known as Novelty Search (NS), which circumvents this problem by encouraging exploration towards novel behaviors. Even without exploiting any environment rewards, they are capable of learning skills that yield competitive results in several tasks. However, to assign novelty scores to policies, these methods rely on neighborhood models that store behaviors in an archive set. Hence they do not scale and generalize to complex tasks requiring too many policy evaluations. Addressing these challenges, we propose a function approximation paradigm to instead learn sparse representations of agent behaviors using auto-encoders, which are later used to assign novelty scores to policies. E xperimental results on benchmark tasks suggest that this way of novelty-guided exploration is a viable alternative to classic novelty search methods.