Density Estimation and Markov Decision Processes

Ronald Parr

The Markov Decision Process (MDP) and the Partially Observable Markov Decision Process (POMDP) are rich formal frameworks for describing stochastic planning and control problems. In these problems one is interested in finding a policy for acting that minimizes cost or maximizes reward. This is typically done either by constructing a value function, which estimates the value of each state in the environment, or by searching the space of policies directly. Both methods have limitations when applied to very large problems, making an efficient and fully general approach to solving such problems an elusive goal.
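As a concrete illustration of the value-function approach mentioned above (not part of the talk itself), standard value iteration on a small, invented MDP looks roughly like this; the transition model, rewards, and discount factor are all made up for the example.

```python
import numpy as np

# Toy MDP: 3 states, 2 actions. P[a][s, s'] is the probability of moving
# from state s to s' under action a; R[s] is the reward for occupying s.
# All numbers here are invented for illustration only.
P = np.array([
    [[0.9, 0.1, 0.0],   # action 0
     [0.0, 0.9, 0.1],
     [0.0, 0.0, 1.0]],
    [[0.5, 0.5, 0.0],   # action 1
     [0.0, 0.5, 0.5],
     [0.1, 0.0, 0.9]],
])
R = np.array([0.0, 0.0, 1.0])
gamma = 0.9  # discount factor

def value_iteration(P, R, gamma, tol=1e-8):
    """Compute the optimal value function V(s) by iterating the Bellman backup."""
    V = np.zeros(len(R))
    while True:
        # Q(s, a) = R(s) + gamma * sum_{s'} P(s' | s, a) V(s')
        Q = R[None, :] + gamma * (P @ V)   # shape: (actions, states)
        V_new = Q.max(axis=0)              # greedy backup over actions
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

V = value_iteration(P, R, gamma)
print(V)  # the rewarding state (state 2) ends up with the highest value
```

The table-based representation of `V` here is exactly what stops scaling: for problems with very many states, the table must be replaced by an approximation, which is where the methods in the talk come in.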

In this talk I will give an overview of some recent work with colleagues at Stanford and Berkeley on scaling (PO)MDP methods to larger problems. The use of approximate probability distributions plays an important role in each of these methods: in some cases a supporting role, in others a fairly radical departure from traditional methods. I will present basic theoretical results, along with promising preliminary simulation results showing the efficacy of these methods. These results suggest that density estimation may become as important as value function approximation in the search for general and powerful methods for (PO)MDPs.
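To make the role of approximate probability distributions concrete, here is a minimal sketch (with invented numbers, and not taken from the talk) of one simple form of density approximation: projecting a joint distribution, such as a POMDP belief state, onto a product of its marginals, and measuring the information lost with a KL divergence.

```python
import numpy as np

# Toy belief state over two binary variables, written as a 2x2 joint table.
# In a POMDP this might be the exact belief after several steps of tracking;
# the numbers here are invented for illustration.
joint = np.array([[0.40, 0.10],
                  [0.15, 0.35]])

# Project onto the factored family "product of marginals": keep each
# variable's marginal distribution, discard the correlation between them.
px = joint.sum(axis=1)        # marginal of the first variable
py = joint.sum(axis=0)        # marginal of the second variable
approx = np.outer(px, py)     # factored approximation of the belief

# KL(joint || approx) measures how much information the projection loses.
kl = np.sum(joint * np.log(joint / approx))
print(px, py, kl)
```

The appeal of such projections is that the factored form needs only the marginals (linear in the number of variables) rather than the full joint table (exponential), at the cost of a quantifiable approximation error.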

This talk will describe joint work with Daphne Koller, Andrew Ng, and Andres Rodriguez.

Kee-Eung Kim

Last modified: Mon Oct 18 14:37:49 EDT 1999