Tight Finite Time Bounds of Two-Time-Scale Linear Stochastic Approximation with Markovian Noise


Event details

Date 25.06.2024
Hour 14:00–15:00
Speaker Professor Sajad Khodadadian
Location Online
Category Conferences - Seminars
Event Language English
Abstract: Stochastic approximation (SA) is an iterative algorithm for finding the fixed point of an operator given only noisy samples of that operator. SA appears in many areas, such as optimization and Reinforcement Learning (RL). When implemented in practice, the noise in the updates of RL algorithms is naturally Markovian. Furthermore, in some settings, such as gradient temporal difference (GTD), SA is employed in a two-time-scale manner. The combination of Markovian noise and the two-time-scale structure yields an algorithm that is challenging to analyze theoretically. In this talk, we characterize a tight convergence bound for the iterates of linear two-time-scale SA with Markovian noise. Our results characterize the convergence behavior of this algorithm under various choices of step sizes.
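To make the setup concrete, here is a minimal sketch of linear two-time-scale SA with Markov-modulated noise. The matrices, the two-state Markov chain, and the step-size exponents are illustrative assumptions for this example, not quantities from the talk; the point is only the coupled fast/slow updates driven by a Markov chain rather than i.i.d. noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-D linear system: the fixed point (x*, y*) solves
#   A11 x* + A12 y* + b1 = 0  and  A21 x* + A22 y* + b2 = 0.
A11 = -np.eye(2); A12 = 0.1 * np.eye(2); b1 = np.ones(2)
A21 = 0.1 * np.eye(2); A22 = -np.eye(2); b2 = np.ones(2)

# Two-state Markov chain modulating the noise (Markovian, not i.i.d.).
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
noise_scale = np.array([0.5, 0.5])
state = 0

x = np.zeros(2)   # fast iterate
y = np.zeros(2)   # slow iterate

for k in range(1, 50001):
    alpha = 1.0 / k**0.6          # fast step size
    beta = 1.0 / k                # slow step size; beta/alpha -> 0
    state = rng.choice(2, p=P[state])             # Markov transition
    noise = noise_scale[state] * rng.standard_normal(2)
    x = x + alpha * (A11 @ x + A12 @ y + b1 + noise)
    y = y + beta * (A21 @ x + A22 @ y + b2 + noise)
```

With these illustrative matrices the fixed point is x* = y* = (1/0.99)·1.1 ≈ 1.111 in each coordinate, and both iterates drift toward it, the slow one more smoothly than the fast one.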

Applying our result to the well-known temporal difference with gradient correction (TDC) algorithm, we establish the first O(1/ϵ) sample complexity for its convergence, improving on all prior work. Our results can likewise be applied to establish the convergence behavior of a variety of RL algorithms, such as TD-learning with Polyak-Ruppert averaging, GTD, and GTD2.

Bio sketch: Sajad Khodadadian is an Assistant Professor in the Grado Department of Industrial and Systems Engineering at Virginia Tech. Dr. Khodadadian's research focuses on the design and analysis of Reinforcement Learning algorithms. He received his PhD in Operations Research from the School of Industrial and Systems Engineering at the Georgia Institute of Technology.

Practical information

  • Informed public
  • Registration required


  • Professor Negar Kiyavash