Policy gradient methods for Reinforcement Learning
Event details
Date | 27.08.2018
Hour | 14:00 – 16:00
Speaker | Paul Rolland
Category | Conferences - Seminars |
EDIC candidacy exam
Exam president: Prof. Olivier Lévêque
Thesis advisor: Prof. Volkan Cevher
Co-examiner: Prof. Wulfram Gerstner
Abstract
We consider the problem of policy optimization in Reinforcement Learning via policy gradient methods. We first show that it is possible to design a theoretical update rule for the policy parameters that converges to an optimal policy. We then present a state-of-the-art algorithm, called "Trust Region Policy Optimization", that aims to approximate this update rule. Finally, we present a smart way of parametrizing the policy, which links standard tabular methods with policy gradient methods.
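To make the idea of a policy gradient update concrete, here is a minimal sketch (not from the talk or the papers) of the classic REINFORCE update on a hypothetical two-armed bandit: a softmax policy over the arms, with parameters moved along the sampled gradient of the log-probability weighted by the reward.

```python
import numpy as np

# Hypothetical toy problem: a two-armed bandit where arm 1 pays more on average.
rng = np.random.default_rng(0)
theta = np.zeros(2)               # policy parameters, one logit per arm
alpha = 0.1                       # step size
arm_means = np.array([0.2, 0.8])  # mean reward of each arm

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

for _ in range(2000):
    probs = softmax(theta)
    a = rng.choice(2, p=probs)            # sample an action from the policy
    r = rng.normal(arm_means[a], 0.1)     # sample a noisy reward
    grad_log = -probs                     # gradient of log pi(a | theta):
    grad_log[a] += 1.0                    #   one_hot(a) - softmax(theta)
    theta += alpha * r * grad_log         # REINFORCE update

print(softmax(theta))  # probability mass should concentrate on the better arm
```

The update is an unbiased stochastic estimate of the gradient of the expected reward; in practice, variance-reduction baselines and trust-region constraints (as in TRPO) are what make such updates usable at scale.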
Background papers
Policy Gradient Methods for Reinforcement Learning with Function Approximation, by Sutton, R., et al.
Trust Region Policy Optimization, by Schulman, J., et al.
Value Iteration Networks, by Tamar, A., et al.
Practical information
- General public
- Free
Contact
- EDIC - [email protected]