Partially observable Markov decision processes (POMDPs) can be used to model complex control problems that involve both action-outcome uncertainty and imperfect observability. A control problem within the POMDP framework is expressed as a dynamic optimization problem with a value function that accumulates costs or rewards over multiple steps. Problems suitable for the framework include medical therapy planning, robot navigation tasks, and the troubleshooting and repair of malfunctioning systems.
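In the standard formulation (the notation below is supplied for illustration, not taken from the text above), the controller maintains a belief state \(b\), a probability distribution over the hidden states, and the optimal value function satisfies the belief-state Bellman equation

\[
V^{*}(b) \;=\; \max_{a \in A}\Big[\,\sum_{s \in S} b(s)\,R(s,a) \;+\; \gamma \sum_{o \in O} P(o \mid b,a)\,V^{*}\big(\tau(b,a,o)\big)\Big],
\]

where \(\tau(b,a,o)\) denotes the Bayesian update of the belief after executing action \(a\) and observing \(o\), and \(\gamma \in [0,1)\) is a discount factor.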
Although the POMDP framework is more expressive than simpler frameworks such as the fully observable MDP, its associated optimization methods are computationally far more demanding: in practice, only very small problems can be solved exactly. In my thesis research I explored two approaches for making the framework applicable to larger real-world domains: approximation methods and the exploitation of additional problem structure.
First, I designed several new approximation methods, and improved some
existing algorithms, for finding good control solutions efficiently.
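As one concrete illustration of this family of methods, the sketch below implements the well-known QMDP (MDP-based) approximation: it solves the underlying fully observable MDP and then scores each action by its belief-weighted Q-value. This is a minimal sketch, not one of the thesis's own algorithms; the array layouts and function names are my assumptions.

```python
import numpy as np

def solve_mdp_q(T, R, gamma, n_iters=500):
    """Q-value iteration for the underlying fully observable MDP.

    T: transition tensor, T[a, s, s2] = P(s2 | s, a)  (assumed layout)
    R: reward matrix, R[s, a]
    Returns Q with Q[s, a].
    """
    n_states, n_actions = R.shape
    Q = np.zeros((n_states, n_actions))
    for _ in range(n_iters):
        V = Q.max(axis=1)                      # greedy state values
        for a in range(n_actions):
            Q[:, a] = R[:, a] + gamma * T[a] @ V
    return Q

def qmdp_action(belief, Q):
    """QMDP control: weight MDP Q-values by the current belief
    and act greedily on the result."""
    return int(np.argmax(belief @ Q))
```

The approximation effectively assumes the state becomes fully observable after the next step, which is what makes it cheap; the resulting value function is an upper bound on the true POMDP value.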
Second, I studied the modeling and exploitation of additional problem
structure in the context of a medical therapy planning problem: the management
of patients with chronic ischemic heart disease. The proposed extensions
include factored and hierarchically structured models that combine the
advantages of the POMDP and MDP frameworks and reduce the size and complexity
of the state space one must work with.
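The following is a rough sketch of the kind of saving such structure can yield, under an assumption added purely for illustration (not stated above): some state variables are directly observable while others are hidden. A belief then needs to be maintained only over the hidden component, and the Bayesian filter runs in that smaller space. All names and sizes here are hypothetical.

```python
import numpy as np

# Hypothetical sizes: a flat model over the joint state versus a
# factored model that keeps a belief over the hidden part only.
N_OBSERVED = 64    # e.g., directly observable patient variables (assumed)
N_HIDDEN = 8       # e.g., values of a hidden disease-status variable (assumed)

flat_belief_dim = N_OBSERVED * N_HIDDEN   # 512-dimensional belief
factored_belief_dim = N_HIDDEN            # 8-dimensional belief

def update_hidden_belief(b_hid, T_hid, O_hid, obs_idx):
    """Bayesian filter restricted to the hidden component.

    T_hid[s, s2] = P(s2 | s)   hidden-variable transition model
    O_hid[s2, o] = P(o | s2)   observation model for the hidden variable
    (both assumed already conditioned on the action and observed state)
    """
    b_pred = b_hid @ T_hid                 # predict the next hidden state
    b_new = b_pred * O_hid[:, obs_idx]     # correct with the new evidence
    return b_new / b_new.sum()             # renormalize
```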