
Reinforcement Learning using Policy Gradients in Reproducing Kernel Hilbert Space


If you have a question about this talk, please contact Lars Kunze.

Host: Dr. Ata Kaban

I will present a system for non-parametric policy search in a reproducing kernel Hilbert space (RKHS) for solving reinforcement learning problems. The method has several benefits over standard parametric approaches: policies can be modeled in rich function classes; there is less need to rescale the search space using, e.g., natural gradients; the policy gradient can be easily derived and estimated; and the method adapts to the complexity of the problem. The system uses sparse-greedy function estimation both to estimate the value function and to maintain a compact but expressive policy representation. I will demonstrate the method on benchmark MDPs and simulated quadrocopter navigation experiments. If time permits, I will present recent extensions to second-order policy search methods.
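As a rough illustration of the idea in the abstract, the sketch below represents the mean of a Gaussian policy as a kernel expansion and updates it with a sampled functional gradient. It is not the speaker's implementation. The Gaussian policy form, the RBF kernel, the class `KernelPolicy`, the toy reward in `run_toy`, and all parameter values are assumptions made for this example. The `sparsify` step is a crude magnitude-based truncation standing in for the sparse-greedy estimation mentioned above, and raw (baseline-subtracted) rewards are used in place of an estimated value function.

```python
import numpy as np

def rbf(x, y, bandwidth=0.5):
    """Gaussian (RBF) kernel matrix between two 1-D arrays of states."""
    d = x[:, None] - y[None, :]
    return np.exp(-0.5 * (d / bandwidth) ** 2)

class KernelPolicy:
    """Gaussian policy a ~ N(h(s), sigma^2) with h(s) = sum_i alpha_i k(c_i, s).

    Hypothetical class for illustration; h lives in the RKHS of the RBF kernel.
    """

    def __init__(self, sigma=0.3):
        self.centres = np.empty(0)   # states c_i at which kernels are placed
        self.alphas = np.empty(0)    # expansion weights alpha_i
        self.sigma = sigma

    def mean(self, s):
        s = np.atleast_1d(np.asarray(s, dtype=float))
        if self.centres.size == 0:
            return np.zeros_like(s)
        return rbf(s, self.centres) @ self.alphas

    def sample(self, s, rng):
        m = self.mean(s)
        return m + self.sigma * rng.standard_normal(m.shape)

    def gradient_step(self, states, actions, returns, lr):
        # For this Gaussian policy, the functional gradient of log pi(a|s)
        # with respect to h is k(s, .) * (a - h(s)) / sigma^2, so the sampled
        # policy gradient is itself a kernel expansion centred on the
        # visited states.
        resid = actions - self.mean(states)
        weights = returns * resid / self.sigma ** 2 / len(states)
        self.centres = np.concatenate([self.centres, states])
        self.alphas = np.concatenate([self.alphas, lr * weights])

    def sparsify(self, max_centres):
        # Assumption: keep only the largest-magnitude weights. This is a
        # simple stand-in, not the sparse-greedy method from the talk.
        if self.centres.size > max_centres:
            keep = np.argsort(-np.abs(self.alphas))[:max_centres]
            self.centres = self.centres[keep]
            self.alphas = self.alphas[keep]

def run_toy(iterations=200, batch=20, seed=0):
    """One-step toy problem (assumed): reward is -(a - sin(3 s))^2."""
    rng = np.random.default_rng(seed)
    policy = KernelPolicy()
    for _ in range(iterations):
        s = rng.uniform(-1.0, 1.0, batch)
        a = policy.sample(s, rng)
        r = -(a - np.sin(3.0 * s)) ** 2
        # Subtract the batch mean as a simple variance-reducing baseline.
        policy.gradient_step(s, a, r - r.mean(), lr=0.05)
        policy.sparsify(max_centres=100)
    return policy
```

Calling `run_toy()` returns a `KernelPolicy` whose `mean` can be evaluated at new states. Because each gradient step adds one kernel per sampled state, the representation grows with the data, which is why some form of compression is needed to keep the policy compact.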

Speaker’s homepage: http://www0.cs.ucl.ac.uk/staff/G.Lever/

This talk is part of the Artificial Intelligence and Natural Computation seminars series.


Talks@bham, University of Birmingham.