February 15, 2024
Master Thesis
Title
Lead-Critic Guided Learning Schemes for Multi-Agent Reinforcement Learning
Other Title
Lead-Critic-geleitete Lernschemata für Multi-Agenten-Systeme im Reinforcement Learning
Abstract
This Master’s thesis is part of a larger project aimed at improving traffic control in Ingolstadt, Germany. While Reinforcement Learning (RL) and Multi-Agent Reinforcement Learning (MARL) are popular research approaches for developing better-performing traffic light control systems, their design assumptions often make them infeasible to apply directly in real-world scenarios.
This thesis explores a novel multi-agent Actor-Critic (AC) architecture extended by two main components: an attention module that encodes the global state of the system, and a guiding lead-critic intended to enhance cooperation between the agents. The work includes an ablation study of the proposed architectural improvements on benchmark environments, as well as a transfer of the gathered insights to real-world traffic control using SUMO simulations.
Key challenges to be addressed include the asynchronous execution of actions across multiple agents and finding the right balance between the effects of global and local optimization. The results of the thesis are expected to facilitate the transition of MARL methods from simplified simulation environments to operational urban traffic management systems. In addition, the proposed learning schemes could make it possible to provide high-level guidance for cooperation in multi-agent scenarios.
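To make the described architecture concrete, the following minimal PyTorch sketch shows one way the three components named in the abstract (per-agent actors, an attention-based encoder of the global state, and a shared lead-critic) could fit together. This is an illustrative assumption, not the implementation from the thesis; all module names, dimensions, and the wiring between components are hypothetical.

# Illustrative sketch only (not the thesis implementation): per-agent actors,
# an attention module that summarizes the joint observations into a global
# state embedding, and a shared "lead critic" that values that embedding.
# All names and dimensions below are hypothetical.
import torch
import torch.nn as nn

N_AGENTS, OBS_DIM, ACT_DIM, EMB_DIM = 4, 16, 3, 32

class Actor(nn.Module):
    """Local policy: maps one agent's observation to action logits."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM, EMB_DIM), nn.ReLU(),
                                 nn.Linear(EMB_DIM, ACT_DIM))
    def forward(self, obs):
        return self.net(obs)

class AttentionEncoder(nn.Module):
    """Self-attention over the set of agent observations -> global state embedding."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Linear(OBS_DIM, EMB_DIM)
        self.attn = nn.MultiheadAttention(EMB_DIM, num_heads=4, batch_first=True)
    def forward(self, all_obs):                      # all_obs: (batch, n_agents, obs_dim)
        tokens = self.embed(all_obs)
        ctx, _ = self.attn(tokens, tokens, tokens)   # attention across agents
        return ctx.mean(dim=1)                       # (batch, emb_dim) global summary

class LeadCritic(nn.Module):
    """Central critic: values the encoded global state as a cooperation signal."""
    def __init__(self):
        super().__init__()
        self.value = nn.Sequential(nn.Linear(EMB_DIM, EMB_DIM), nn.ReLU(),
                                   nn.Linear(EMB_DIM, 1))
    def forward(self, global_emb):
        return self.value(global_emb)

# One forward pass on random data, just to show how the pieces connect.
obs = torch.randn(1, N_AGENTS, OBS_DIM)
actors = [Actor() for _ in range(N_AGENTS)]
logits = torch.stack([a(obs[:, i]) for i, a in enumerate(actors)], dim=1)
lead_value = LeadCritic()(AttentionEncoder()(obs))
print(logits.shape, lead_value.shape)    # torch.Size([1, 4, 3]) torch.Size([1, 1])

In such a layout, each actor would act on its local observation, while the lead-critic's value of the attention-encoded joint state could serve as an additional global learning signal alongside the agents' local critics.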
Thesis Note
München, TU, Master Thesis, 2024