BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Memento EPFL//
BEGIN:VEVENT
SUMMARY:Policy gradient methods for Reinforcement Learning
DTSTART:20180827T140000
DTEND:20180827T160000
DTSTAMP:20260406T172845Z
UID:e799a09a8a9dc98ce3c9e800204c71724fa92565bf98955825a44969
CATEGORIES:Conferences - Seminars
DESCRIPTION:Paul Rolland\nEDIC candidacy exam\nExam president: Prof. Olivi
 er Lévêque\nThesis advisor: Prof. Volkan Cevher\nCo-examiner: Prof. Wulf
 ram Gerstner\n\nAbstract\nWe consider the problem of policy optimization i
 n Reinforcement Learning via policy gradient methods. We first show that it
  is possible to design a theoretical update rule for the policy parameters
  that converges to an optimal policy. We then present a state-of-the-art a
 lgorithm\, called "Trust Region Policy Optimization"\, that aims to appro
 ximate this update rule. Finally\, we present a smart way of parametrizing
  the policy\, which links both standard tabular methods and policy gradien
 t methods.\n\nBackground papers\nPolicy Gradient Methods for Reinforcement
  Learning with Function Approximation\, by Sutton\, R.\, et al.\nTrust Reg
 ion Policy Optimization\, by Schulman\, J.\, et al.\nValue Iteration Netwo
 rks\, by Tamar\, A.\, et al.
LOCATION:ELD 120 https://plan.epfl.ch/?room=ELD120
STATUS:CONFIRMED
END:VEVENT
END:VCALENDAR
