Tech Report CS-07-10

No-Regret Learning in Convex Games

Geoff Gordon, Amy Greenwald, Casey Marks and Martin Zinkevich

October 2007


Quite a bit is known about minimizing different kinds of regret in experts problems, and about how these results translate into statements about different kinds of equilibria in the multiagent setting of repeated matrix games. Much less is known about the possible kinds of regret in online convex programming problems (OCPs), or about the analogous multiagent setting of repeated convex games. This gap is unfortunate, since convex games are much more expressive than matrix games, and since many modern machine learning algorithms can be expressed as convex programs. In this paper, we work to close this gap: in addition to external regret (the focus of most previous OCP algorithms, but one that leads to only a weak form of equilibrium) and swap regret (which leads to the stronger notion of correlated equilibrium, but does not yet have efficient algorithms), we define three alternative types of regret for OCPs. We analyze these regret types and their corresponding equilibria, showing relationships among them and to known types of regret and equilibrium. Finally, we provide new algorithms for minimizing the most important of these forms of regret, leading to the most efficient known learning method that converges to correlated equilibria in convex games.
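To make the external-regret setting concrete, here is a minimal sketch (not taken from the report) of projected online gradient descent, the standard external-regret minimizer for OCPs due to Zinkevich. The feasible set, the quadratic losses, and the alternating adversary below are hypothetical choices made only for illustration; external regret is the learner's cumulative loss minus that of the best fixed point chosen in hindsight.

```python
import math

T = 1000
lo, hi = 0.0, 1.0                      # feasible set: the interval [0, 1]
x = 0.0                                # learner's initial point
targets = [t % 2 for t in range(T)]    # adversary alternates targets 0 and 1

def loss(x, z):                        # convex per-round loss f_t(x) = (x - z)^2
    return (x - z) ** 2

alg_loss = 0.0
for t, z in enumerate(targets, start=1):
    alg_loss += loss(x, z)             # learner pays f_t(x_t), then sees f_t
    grad = 2.0 * (x - z)               # gradient of f_t at x_t
    eta = 1.0 / (2.0 * math.sqrt(t))   # step size D/(G*sqrt(t)) with D=1, G=2
    x = min(hi, max(lo, x - eta * grad))  # gradient step, then project onto [0, 1]

# Best fixed point in hindsight for squared loss: the mean of the targets
# (already inside [0, 1], so no projection needed).
x_star = sum(targets) / T
best_loss = sum(loss(x_star, z) for z in targets)

regret = alg_loss - best_loss          # external regret after T rounds
```

With the O(1/sqrt(t)) step sizes, the standard guarantee is external regret of O(sqrt(T)) against any sequence of convex losses; on this toy sequence the measured regret stays well under that bound. Note this controls only external regret, and hence (in the game setting) only the weak equilibrium notion the abstract mentions, not correlated equilibrium.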

(complete text in pdf)