Graphical Partially Observable Monte-Carlo Planning

Pfrommer, J.

doi:10.24406/publica-fhg-395203

2016

Presentation

Abstract

Monte-Carlo Tree Search (MCTS) techniques are state-of-the-art for online planning in Partially Observable Markov Decision Problems (POMDPs). The recently proposed Factored-Value Multiagent POMCP (FV-MPOMCP) algorithm improves the scalability of MCTS in Multiagent POMDP (MPOMDP) environments by estimating several Q-values, each considering a subset of the actions and observations, and combining these Q-values via Variable Elimination. However, in an MPOMDP only the cumulative reward for each step is known, with no insight into the reward structure. In this work, we additionally exploit reward structure that decomposes into local reward terms. The proposed Graphical Partially Observable Monte-Carlo Planning (GPOMCP) algorithm combines Monte-Carlo Tree Search with a variant of the Max-Sum message-passing algorithm known from probabilistic graphical models and distributed constraint optimization.
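To make the idea of combining several local Q-values concrete, the following is a minimal sketch of max-sum variable elimination over a factored Q-function. It is not the implementation from the presentation: the function name `eliminate_max`, the agent names `a1` to `a3`, and the Q-value tables are hypothetical, and the tables are hard-coded here, whereas a planner such as FV-MPOMCP would estimate them from simulations.

```python
from itertools import product

def eliminate_max(factors, domains, order):
    """Maximise a sum of local factors by variable elimination.

    factors: list of (scope, table); scope is a tuple of variable names and
             table maps an assignment tuple (in scope order) to a value.
    domains: dict mapping each variable to its list of possible actions.
    order:   elimination order covering every variable.
    Returns (best_value, best_assignment).
    """
    factors = list(factors)
    records = []  # per eliminated variable: (var, scope, best-response table)
    for var in order:
        involved = [f for f in factors if var in f[0]]
        factors = [f for f in factors if var not in f[0]]
        # The new factor ranges over the not-yet-eliminated neighbours of var.
        scope = tuple(sorted({v for s, _ in involved for v in s if v != var}))
        table, choice = {}, {}
        for assignment in product(*(domains[v] for v in scope)):
            ctx = dict(zip(scope, assignment))
            best_val, best_act = None, None
            for act in domains[var]:
                ctx[var] = act
                val = sum(t[tuple(ctx[v] for v in s)] for s, t in involved)
                if best_val is None or val > best_val:
                    best_val, best_act = val, act
            table[assignment] = best_val
            choice[assignment] = best_act
        factors.append((scope, table))
        records.append((var, scope, choice))
    # Every remaining factor now has an empty scope.
    best_value = sum(t[()] for _, t in factors)
    # Recover the maximising joint action in reverse elimination order.
    best = {}
    for var, scope, choice in reversed(records):
        best[var] = choice[tuple(best[v] for v in scope)]
    return best_value, best

# Hypothetical example: three agents with binary actions and two local
# Q-value components, each covering a pair of neighbouring agents.
domains = {"a1": [0, 1], "a2": [0, 1], "a3": [0, 1]}
q12 = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 2.0}
q23 = {(0, 0): 3.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.5}
factors = [(("a1", "a2"), q12), (("a2", "a3"), q23)]

best_value, best_joint = eliminate_max(factors, domains, ["a1", "a3", "a2"])
print(best_value, best_joint)
```

In this sketch each elimination step only enumerates the actions of one agent and its neighbours, so the cost depends on the size of the local scopes rather than on the full joint action space. Max-Sum message passing pursues the same maximisation by exchanging messages between variables and factors instead of eliminating variables one at a time.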

Author(s)

Pfrommer, J.

Conference

Annual Conference on Neural Information Processing Systems (NIPS) 2016
