Tight Finite Time Bounds of Two-Time-Scale Linear Stochastic Approximation with Markovian Noise

Event details

Date	25.06.2024
Hour	14:00 › 15:00
Speaker	Professor Sajad Khodadadian
Location	ODY 4 03 Online
Category	Conferences - Seminars
Event Language	English

Abstract: Stochastic approximation (SA) is an iterative algorithm to find the fixed point of an operator given noisy samples of this operator. SA appears in many areas such as optimization and Reinforcement Learning (RL). When implemented in practice, the noise that appears in the update of RL algorithms is naturally Markovian. Furthermore, in some settings, such as gradient temporal difference (GTD), SA is employed in a two-time-scale manner. The mix of Markovian noise along with the two-time-scale structure results in an algorithm which is complex to analyze theoretically. In this talk, we characterize a tight convergence bound for the iterations of linear two-time-scale SA with Markovian noise. Our results show the convergence behavior of this algorithm given various choices of step sizes.

Applying our result to the well-known temporal difference with gradient correction (TDC) algorithm, we show the first O(1/ϵ) sample complexity for the convergence of this algorithm, outperforming all the previous work. Similarly, our results can be applied to establish the convergence behavior of a variety of RL algorithms, such as TD-learning with Polyak-Ruppert averaging, GTD, and GTD2.

Bio sketch: Sajad Khodadadian is an Assistant Professor in the Grado Department of Industrial and Systems Engineering in Virginia Tech. Dr. Khodadadian’s research is on design and analysis of Reinforcement Learning algorithms. He received his PhD in Operations Research from School of Industrial and Systems Engineering in Georgia Institute of Technology.

Practical information

Informed public
Registration required

Organizer

Professor Negar Kiyavash

Contact

cdm-seminars@epfl.ch

Export Event

Event broadcasted in

Send a reminder