IEM Distinguished Lecturers Seminar: Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes

Event details

Date 27.10.2023
Hour 13:15–14:00
Speaker Prof. Quanquan Gu
University of California, Los Angeles, USA
Location Online
Category Conferences - Seminars
Event Language English
The seminar will take place in ELA 2 and will be simultaneously broadcast in the main auditorium on the Neuchâtel Campus (MC A1 272).

Coffee and cookies will be served at 13:00 before the seminar, in front of the two auditoriums. 

Abstract
How to make reinforcement learning (RL) efficient with large state and action spaces has been a central research problem in the RL community. A widely used approach is function approximation, which approximates the value function in RL with a predefined function class for efficient exploration and exploitation. In this talk, I will focus on RL with linear function approximation. For episodic time-inhomogeneous linear Markov decision processes (linear MDPs), whose transition dynamics can be parameterized as a linear function of a given feature mapping, I will present the first computationally efficient algorithm that achieves the nearly minimax optimal regret \tilde{O}(d\sqrt{H^3K}), where d is the dimension of the feature mapping, H is the planning horizon, and K is the number of episodes. Our algorithm is based on a weighted linear regression scheme with a carefully designed weight, which depends on a novel variance estimator that (1) directly estimates the variance of the optimal value function, (2) monotonically decreases with the number of episodes to ensure better estimation accuracy, and (3) uses a rare-switching policy to update the value function estimator, controlling the complexity of the estimated value function class. Our work provides a complete answer to optimal RL with linear MDPs, and the developed algorithm and theoretical tools may be of independent interest.
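The core update mentioned in the abstract, variance-weighted ridge regression over the feature mapping, can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the function name, the variance floor, and the choice of inverse-variance weights are assumptions for the sake of a minimal, runnable example.

```python
import numpy as np

def weighted_ridge_regression(phi, targets, var_est, lam=1.0, var_min=1e-2):
    """Variance-weighted ridge regression (illustrative sketch).

    phi:     (K, d) array of feature vectors phi(s_k, a_k)
    targets: (K,)   regression targets, e.g. estimated next-state values
    var_est: (K,)   per-sample variance estimates of the targets
    lam:     ridge regularization strength
    var_min: floor on the variance estimate, keeping the weights bounded
    """
    # Down-weight high-variance samples: w_k = 1 / max(var_k, var_min)
    weights = 1.0 / np.maximum(var_est, var_min)
    # Solve (Phi^T W Phi + lam * I) theta = Phi^T W y
    d = phi.shape[1]
    A = phi.T @ (weights[:, None] * phi) + lam * np.eye(d)
    b = phi.T @ (weights * targets)
    return np.linalg.solve(A, b)
```

The weighting is what distinguishes this from the vanilla ridge regression used in earlier linear-MDP algorithms: samples whose targets are estimated to be noisier contribute less to the parameter estimate.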

This is a joint work with Jiafan He, Heyang Zhao and Dongruo Zhou.


Bio
Quanquan Gu is an Associate Professor of Computer Science at UCLA. His research is in the area of artificial intelligence and machine learning, with a focus on developing and analyzing nonconvex optimization algorithms for machine learning to understand large-scale, dynamic, complex, and heterogeneous data, and on building the theoretical foundations of deep learning and reinforcement learning. He received his Ph.D. degree in Computer Science from the University of Illinois at Urbana-Champaign in 2014. He is a recipient of the Sloan Research Fellowship, the NSF CAREER Award, and the Simons-Berkeley Research Fellowship, among other industry research awards.