BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Memento EPFL//
BEGIN:VEVENT
SUMMARY:A game-theoretic perspective on Reinforcement and Imitation Learn
 ing
DTSTART;VALUE=DATE-TIME:20220712T090000
DTEND;VALUE=DATE-TIME:20220712T110000
UID:73e3ba188a9ad46d5b8d0101a33df5073643e314eeb5349121aa5a9e
CATEGORIES:Conferences - Seminars
DESCRIPTION:Luca Viano\nEDIC candidacy exam\nExam president: Prof. Nicola
 s Boumal\nThesis advisor: Prof. Volkan Cevher\nCo-examiner: Prof. Maryam
  Kamgarpour\n\nAbstract\nThe Proximal Point Method (PPM) enjoys favorabl
 e convergence properties\, established by Rockafellar (1976) and Gueler
 (1991)\, but it can rarely be implemented in practice\, as a single upda
 te can be as hard as the original problem. It is known\, however\, that
 proximal point can be implemented for linear losses\, where it coincides
  with gradient descent in the Euclidean case or with mirror descent in t
 he Bregman setup. This fact has been leveraged in the reinforcement lear
 ning community to develop algorithms such as Relative Entropy Policy Sea
 rch (REPS) (Peters et al.\, 2010 and Pacchiano et al.\, 2021) for the on
 line RL setting\, O-REPS for the adversarial MDP setting\, and PRO-RL fo
 r the offline setting.\nIn this document we observe that proximal point
 can still be implemented for the particular case of functions defined in
  max form\, which is of interest for imitation learning. In this setting
 \, proximal point can still be implemented (approximately) while being d
 ifferent from mirror descent. We also present IQ-Learn (Garg et al.\, 20
 21)\, a recently proposed\, efficient algorithm for imitation learning\,
  from a proximal point perspective.\n\nBackground papers\nOn the converg
 ence of the proximal point algorithm for convex minimisation\, Osman Gue
 ler\, 1991\nhttps://www.researchgate.net/publication/267054708_On_the_Co
 nvergence_of_the_Proximal_Point_Algorithm_for_Convex_Minimization\n\nNea
 r Optimal Policy Optimisation via REPS\, Aldo Pacchiano et al.\, 2021\nh
 ttps://proceedings.neurips.cc/paper/2021/file/08d562c1eedd30b15b51e35d84
 86d14c-Paper.pdf\n\nIQ-learn: Inverse soft-Q learning for imitation\, Di
 vyansh Garg et al.\, 2021\nhttps://proceedings.neurips.cc/paper/2021/fil
 e/210f760a89db30aa72ca258a3483cc7f-Paper.pdf
LOCATION:ELD 120 https://plan.epfl.ch/?room=ELD%20120
STATUS:CONFIRMED
END:VEVENT
END:VCALENDAR