The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems
Craig Boutilier, University of British Columbia
Reinforcement learning can provide a robust and natural means for agents
to learn how to coordinate their action choices in cooperative multiagent
systems. We examine some of the factors that can influence the dynamics
of the learning process in such a setting. We first distinguish reinforcement
learners that are unaware of (or ignore) the presence of other agents
from those that explicitly attempt to learn the value of joint actions
and the strategies of their counterparts. We study Q-learning in cooperative
multiagent systems under these two perspectives, focusing on the influence
of partial action observability, game structure, and exploration strategies
on convergence to (optimal and suboptimal) Nash equilibria and on learned
Q-values. We also consider variants of the usual exploration strategies
that can induce convergence to optimal equilibria in cases where they might
not otherwise be attained.
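
To make the two perspectives concrete, here is a minimal sketch of both kinds of learner in a repeated cooperative game. It is an illustration under stated assumptions, not the method presented in the talk: the 2x2 payoff matrix, the learning rate, the fixed Boltzmann temperature, and the names IndependentLearner, JointActionLearner, boltzmann, and play are all hypothetical. The independent learner keeps Q-values over its own actions only; the joint action learner keeps Q-values over joint actions together with empirical counts of its counterpart's choices, and evaluates each of its own actions by its expected value under those counts.

```python
import math
import random

# Hypothetical 2x2 cooperative game: both agents receive the same reward.
# Rows index agent 0's action, columns index agent 1's action.
REWARD = [[10.0, 0.0],
          [0.0, 5.0]]
N_ACTIONS = 2


def boltzmann(values, temperature, rng):
    """Sample an action index with probability proportional to exp(v / T)."""
    top = max(values)  # subtract the max for numerical stability
    weights = [math.exp((v - top) / temperature) for v in values]
    return rng.choices(range(len(values)), weights=weights)[0]


class IndependentLearner:
    """Ignores the other agent: one Q-value per own action."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha
        self.q = [0.0] * N_ACTIONS

    def action_values(self):
        return list(self.q)

    def update(self, own, other, reward):
        # The other agent's action is not used.
        self.q[own] += self.alpha * (reward - self.q[own])


class JointActionLearner:
    """Learns Q-values of joint actions and the counterpart's empirical strategy."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha
        self.q = [[0.0] * N_ACTIONS for _ in range(N_ACTIONS)]
        self.counts = [1] * N_ACTIONS  # assumed uniform prior over the other's actions

    def action_values(self):
        # Expected value of each own action under the estimated strategy.
        total = sum(self.counts)
        return [
            sum(self.counts[b] / total * self.q[a][b] for b in range(N_ACTIONS))
            for a in range(N_ACTIONS)
        ]

    def update(self, own, other, reward):
        self.q[own][other] += self.alpha * (reward - self.q[own][other])
        self.counts[other] += 1


def play(agents, rounds=2000, temperature=1.0, seed=0):
    """Repeatedly play the game with Boltzmann exploration for both agents."""
    rng = random.Random(seed)
    for _ in range(rounds):
        a0 = boltzmann(agents[0].action_values(), temperature, rng)
        a1 = boltzmann(agents[1].action_values(), temperature, rng)
        r = REWARD[a0][a1]
        agents[0].update(a0, a1, r)
        agents[1].update(a1, a0, r)
    return agents
```

Calling play([IndependentLearner(), IndependentLearner()]) or play([JointActionLearner(), JointActionLearner()]) and inspecting action_values() afterwards gives a small testbed for the questions raised above. Which equilibrium the agents settle on depends on the payoff structure, the temperature, and the random seed, so no particular outcome is claimed here. The joint action learner's update requires observing the counterpart's action, which is why partial action observability matters for that perspective.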
Joint work with Caroline Claus
Kee-Eung Kim