2024
Journal Article
Title

Distributed Consensus Algorithm for Decision-Making in Multiagent Multiarmed Bandit

Abstract
In this article, we study a structured multiagent multiarmed bandit (MAMAB) problem in a dynamic environment. A graph reflects the information-sharing structure among agents, and the arms' reward distributions are piecewise-stationary with several unknown change points. All agents face the same piecewise-stationary MAB problem. The goal is to develop a decision-making policy for the agents that minimizes the regret, which is the expected total loss from not playing the optimal arm at each time step. Our proposed solution, the restarted Bayesian online change point detection in cooperative upper confidence bound (RBO-Coop-UCB) algorithm, has an efficient multiagent UCB algorithm at its core, enhanced with a Bayesian change point detector. We also develop a simple scheme for cooperating on restart decisions that improves decision-making. Theoretically, we establish that the expected group regret of RBO-Coop-UCB is upper bounded by $\mathcal{O}(KNM \log T + K\sqrt{MT \log T})$, where K is the number of agents, M is the number of arms, and T is the number of time steps. Numerical experiments on synthetic and real-world datasets demonstrate that our proposed method outperforms the state-of-the-art algorithms.
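To make the restart idea in the abstract concrete, the following is a minimal single-agent sketch, not the authors' RBO-Coop-UCB. It assumes a plain UCB1 index and swaps the Bayesian online change point detector for a much simpler sliding-window mean comparison; it also omits the graph-based cooperation between agents. The class name `RestartingUCB`, the parameters `window` and `threshold`, and the reward means in `simulate` are all hypothetical choices made for illustration.

```python
import math
import random
from collections import deque


class RestartingUCB:
    """UCB1 learner that forgets all statistics when a mean shift is detected."""

    def __init__(self, n_arms, window=100, threshold=0.5):
        self.n_arms = n_arms
        self.window = window        # hypothetical: samples in each half-window
        self.threshold = threshold  # hypothetical: mean gap that triggers a restart
        self.restart_times = []
        self._reset()

    def _reset(self):
        self.counts = [0] * self.n_arms
        self.sums = [0.0] * self.n_arms
        # Keep only the last 2 * window rewards of each arm.
        self.recent = [deque(maxlen=2 * self.window) for _ in range(self.n_arms)]

    def select_arm(self):
        # After every (re)start, play each arm once.
        for arm in range(self.n_arms):
            if self.counts[arm] == 0:
                return arm
        total = sum(self.counts)

        def ucb_index(arm):
            mean = self.sums[arm] / self.counts[arm]
            bonus = math.sqrt(2.0 * math.log(total) / self.counts[arm])
            return mean + bonus

        return max(range(self.n_arms), key=ucb_index)

    def update(self, arm, reward, t):
        self.counts[arm] += 1
        self.sums[arm] += reward
        recent = self.recent[arm]
        recent.append(reward)
        # Stand-in detector (an assumption, not Bayesian): compare the mean of
        # the older half of the window with the mean of the newer half.
        if len(recent) == 2 * self.window:
            rewards = list(recent)
            older = sum(rewards[: self.window]) / self.window
            newer = sum(rewards[self.window:]) / self.window
            if abs(newer - older) > self.threshold:
                self.restart_times.append(t)
                self._reset()


def simulate(horizon=2000, change_point=1000, seed=0):
    """Two Bernoulli arms; the best arm degrades at `change_point`."""
    rng = random.Random(seed)
    learner = RestartingUCB(n_arms=2)
    pulls_after_change = [0, 0]
    for t in range(horizon):
        means = (0.9, 0.5) if t < change_point else (0.1, 0.5)
        arm = learner.select_arm()
        reward = 1.0 if rng.random() < means[arm] else 0.0
        if t >= change_point:
            pulls_after_change[arm] += 1
        learner.update(arm, reward, t)
    return learner.restart_times, pulls_after_change


if __name__ == "__main__":
    restart_times, pulls_after_change = simulate()
    print("restart times:", restart_times)
    print("pulls per arm after the change:", pulls_after_change)
```

In this sketch the detector only sees rewards from arms that are actually played, so a change in a rarely played arm can go unnoticed; practical piecewise-stationary bandit algorithms usually add some forced exploration to address this. In a multiagent version, agents would additionally exchange estimates and restart decisions with their neighbors in the graph.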
Author(s)
Cheng, Xiaotong
Ruhr-Universität Bochum
Maghsudi, Setareh
Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut HHI  
Journal
IEEE Transactions on Control of Network Systems
Funder
Deutsche Forschungsgemeinschaft  
DOI
10.1109/TCNS.2024.3395850
Language
English
Keyword(s)
  • Change point detection
  • Distributed learning
  • Multiagent cooperation
  • Multiarmed bandit (MAB)
