BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Memento EPFL//
BEGIN:VEVENT
SUMMARY:Variance reduction in reinforcement learning optimization
DTSTART:20220825T090000
DTEND:20220825T110000
DTSTAMP:20260407T051026Z
UID:f028cdf156506c9bab7e4e957c3e90685432d0420e2752489fb36af9
CATEGORIES:Conferences - Seminars
DESCRIPTION:Mohammadsadegh Khorasani\nEDIC candidacy exam\nExam president:
  Prof. Martin Jaggi\nThesis advisor: Prof. Negar Kiyavash\nThesis co-advis
 or: Prof. Matthias Grossglauser\nCo-examiner: Prof. Patrick Thiran\n\nAbst
 ract\nVariance-reduced gradient estimators for policy gradient methods hav
 e been a main focus of reinforcement learning research in recent years\, a
 s they accelerate the estimation process. In this report\, I review recent
  work that adapts variance reduction techniques to the RL setting and reac
 hes an $\\epsilon$-first-order stationary point using $O(\\epsilon^{-3})$ 
 trajectories. Most previous work requires large batch sizes or importance 
 sampling techniques that can compromise the benefit of variance reduction.
  Moreover\, I present some ideas for adapting a recent variance reduction 
 method from the optimization literature to the RL setting while using a ba
 tch size of $O(1)$ and no importance sampling. I also discuss experimental
  results comparing previous methods.\nFinally\, I will propose possible fu
 ture directions for devising variance reduction methods well matched to th
 e RL setting\, and I will discuss practical issues that should be taken in
 to account when implementing and evaluating optimization algorithms.\n\nBa
 ckground papers\n1- "Hessian Aided Policy Gradient" (Shen\, 2019). The mai
 n reference paper in this area.\n2- "Better SGD using Second-order Momentu
 m" (Tran\, 2021). The paper whose method we adapt to RL.\n3- "Momentum-Bas
 ed Policy Gradient Methods" (link). The most recent work on variance reduc
 tion in RL\, based on STORM.
LOCATION:
STATUS:CONFIRMED
END:VEVENT
END:VCALENDAR
