Kavosh Asadi

Ph.D. Candidate

Brown University

about me:

I am a fifth-year Ph.D. candidate at Brown University, working with Professor Michael Littman and Professor George Konidaris. My career goal is to understand the computational principles underlying intelligence. More specifically, I study agents that interact with a sequential environment and improve their behavior through trial and error, a setting that is naturally formulated as a reinforcement learning problem. You can read my research statement here.

I am also a passionate animal advocate.


  • Reinforcement Learning
  • Large-scale Optimization
  • Ethical Artificial Intelligence


  • Ph.D. in Computer Science, 2015 - Present

    Brown University

  • M.Sc. in Computer Science, 2015

    University of Alberta

  • B.Eng. in Computer Engineering, 2013

    University of Tehran


Deep RBF Value Functions for Continuous Control

Lipschitz Continuity for Model-based Reinforcement Learning

An Alternative Softmax Operator for Reinforcement Learning


An Alternative Look at Discount Rates in Reinforcement Learning

You love it if you have studied RL. It helps evade the awkward situation where you need to compare two infinite sums when ranking two …

Recent Publications

Smoothness in Reinforcement Learning with Large State and Action Space

Reinforcement learning (RL) is the study of the interaction between an environment and an agent that learns to achieve a goal through trial and error. Owing to its generality, RL has been applied successfully to a wide range of problems, including those with enormous state and action spaces. In light of the curse of dimensionality, a fundamental question is how to design RL algorithms that are compatible with function approximation yet still able to tackle longstanding challenges of RL, including convergence guarantees, the exploration-exploitation trade-off, and planning.

Deep RBF Value Functions for Continuous Control

A core operation in reinforcement learning (RL) is finding an action that is optimal with respect to a learned state–action value function. This operation is often challenging when the learned value function takes continuous actions as input. We introduce deep RBF value functions: state–action value functions learned using a deep neural network with a radial-basis function (RBF) output layer. We show that the optimal action with respect to a deep RBF value function can be easily approximated up to any desired accuracy.
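As a rough illustration of why the greedy action is easy to approximate with an RBF output layer, the sketch below fixes a single state and treats the network outputs as plain lists. It assumes a normalized negative-exponential RBF, so the value at any action is a weighted average of per-centroid values; this functional form, the smoothing parameter `beta`, and the names `rbf_q_value` and `approx_greedy_action` are illustrative assumptions, not details taken from the abstract above.

```python
import math

def rbf_q_value(action, centroids, values, beta=5.0):
    """Normalized RBF interpolation of the centroid values at `action`.

    Assumed form: weight_i is proportional to exp(-beta * ||action - centroid_i||).
    In a deep version, `centroids` and `values` would be network outputs for a state.
    """
    dists = [math.dist(action, c) for c in centroids]
    d_min = min(dists)  # shift distances for numerical stability
    weights = [math.exp(-beta * (d - d_min)) for d in dists]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total

def approx_greedy_action(centroids, values, beta=5.0):
    """Approximate argmax over actions by evaluating Q only at the centroids."""
    scored = [(rbf_q_value(c, centroids, values, beta), c) for c in centroids]
    best_q, best_a = max(scored, key=lambda pair: pair[0])
    return best_a, best_q

# Hypothetical one-dimensional action space with three centroids.
centroids = [(0.0,), (1.0,), (2.0,)]
values = [1.0, 3.0, 2.0]
best_action, best_value = approx_greedy_action(centroids, values, beta=20.0)
```

Because the value at any action is a convex combination of the centroid values, it can never exceed the largest of them, and in this sketch a larger `beta` pulls the value at the best centroid closer to that bound. Searching over a finite set of centroids therefore stands in for a search over the whole continuous action space.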

Lipschitz Lifelong Reinforcement Learning

We consider the problem of knowledge transfer when an agent faces a series of reinforcement learning (RL) tasks. We introduce a novel metric between Markov Decision Processes (MDPs) and establish that close MDPs have close optimal value functions. Formally, the optimal value functions are Lipschitz continuous with respect to the task space. These theoretical results lead to a value-transfer method for lifelong RL, which we use to build a PAC-MDP algorithm with an improved convergence rate.
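One generic way to write the Lipschitz property described above is sketched below. The symbols are assumptions introduced for illustration: $d$ stands for the metric between MDPs, $K$ for a Lipschitz constant, and $V^*_M$ for the optimal value function of task $M$. The precise statement in the paper may differ, for instance by holding locally or per state-action pair.

```latex
\[
  \bigl| V^*_{M}(s) - V^*_{M'}(s) \bigr| \;\le\; K \, d(M, M')
  \qquad \text{for all states } s \text{ and all tasks } M, M'.
\]
```

Read this way, a value function already learned for a nearby task $M'$ bounds the optimal value function of a new task $M$ to within $K \, d(M, M')$, which is the kind of information a value-transfer method can exploit.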

Research Statement

Towards a Simple Approach to Multi-step Model-based Reinforcement Learning

When environmental interaction is expensive, model-based reinforcement learning offers a solution by planning ahead and avoiding costly mistakes. Model-based agents typically learn a single-step transition model. In this paper, we propose a multi-step model that predicts the outcome of a variable-length action sequence. We show that this model is easy to learn and that it can make policy-conditional predictions. We report preliminary results that show a clear advantage for the multi-step model over its one-step counterpart.
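To make the idea of a multi-step model concrete, here is a minimal tabular sketch, not the model class used in the paper. It assumes discrete, hashable states and actions and simply counts which state was observed after each action sequence of length 1 to `max_len`. The function names and the toy trajectory are hypothetical.

```python
from collections import Counter, defaultdict

def fit_multi_step_model(trajectory, max_len):
    """Count outcomes of every action sequence of length 1..max_len.

    `trajectory` is one episode laid out as [s0, a0, s1, a1, ..., sT].
    """
    states = trajectory[0::2]
    actions = trajectory[1::2]
    counts = defaultdict(Counter)
    for t in range(len(actions)):
        for k in range(1, max_len + 1):
            if t + k > len(actions):
                break  # not enough remaining actions for a length-k sequence
            key = (states[t], tuple(actions[t:t + k]))
            counts[key][states[t + k]] += 1
    return counts

def predict(counts, state, action_seq):
    """Most frequently observed outcome of `action_seq` from `state`, or None if unseen."""
    outcomes = counts.get((state, tuple(action_seq)))
    if not outcomes:
        return None
    return outcomes.most_common(1)[0][0]

# Toy chain: states are integers, actions are +1 / -1 moves.
episode = [0, +1, 1, +1, 2, -1, 1, +1, 2]
model = fit_multi_step_model(episode, max_len=3)
two_step = predict(model, 0, [+1, +1])        # outcome of a 2-action sequence
three_step = predict(model, 0, [+1, +1, -1])  # outcome of a 3-action sequence
```

The point of the sketch is that a multi-step prediction is read off directly from the action sequence instead of being built by chaining one-step predictions, where errors from each step could compound.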