Description: Through a combination of classic papers and more recent work, the course explores automated decision making from a computer-science perspective. It examines efficient algorithms, where they exist, for single-agent and multiagent planning, as well as approaches to learning near-optimal decisions from experience. Topics will include Markov decision processes, stochastic and repeated games, partially observable Markov decision processes, and reinforcement learning. Of particular interest will be issues of generalization, exploration, and representation. Depending upon enrollment, each student may be expected to present a published research paper and will participate in a group project to create a reinforcement-learning system for a video game. Participants should have taken a graduate-level computer-science course and should have some exposure to reinforcement learning from a previous computer-science class or seminar; check with the instructor if you are not sure.


Sutton (1990)
Silver, Sutton, and Mueller (2008). Optional: Chaslot, Winands, Herik, Uiterwijk, and Bouzy (2008)

The RL survey referred to below is Kaelbling, Littman, and Moore (1996).

UCT and Go. Recent Alberta work on function approximation. Bayesian RL. Natural policy gradient. RL in Neuroscience. Unlearning in SARSA(0) in Tetris. Ramon et al.

