Accumulation points of gradient-based optimization methods: conditions for point-wise convergence and saddle avoidance

Event details
Date: 07.07.2023
Hour: 15:15 – 17:15
Speaker: Andreea-Alexandra Musat
Category: Conferences - Seminars
EDIC candidacy exam
Exam president: Prof. Nicolas Flammarion
Thesis advisor: Prof. Nicolas Boumal
Co-examiner: Prof. Lénaïc Chizat
Abstract
Standard convergence results in optimization guarantee that accumulation points of the sequence of iterates are critical points of the cost function. However, this does not imply that the sequence converges, nor does it exclude convergence to bad critical points, such as saddles. We review a set of sufficient conditions ensuring that the iterates converge and demonstrate how classic algorithms satisfy these conditions under their usual assumptions. Additionally, we sketch the main arguments for showing that gradient descent is unlikely to converge to saddles exhibiting a direction of negative curvature, thus establishing almost sure convergence to a local minimum.
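As a minimal numerical sketch of the saddle-avoidance phenomenon (an illustration added here, not drawn from the talk or the background papers), consider gradient descent on the strict saddle f(x, y) = (x² − y²)/2, whose only critical point is the origin:

```python
import numpy as np

# Toy sketch: gradient descent on f(x, y) = (x**2 - y**2) / 2.
# The unique critical point (0, 0) is a strict saddle: the Hessian
# diag(1, -1) has a negative eigenvalue in the y-direction.

def grad(p):
    x, y = p
    return np.array([x, -y])   # gradient of f at p = (x, y)

rng = np.random.default_rng(0)
p = rng.normal(scale=1e-3, size=2)  # random start near the saddle
eta = 0.1                            # step size (f has Lipschitz constant L = 1)

for _ in range(200):
    p = p - eta * grad(p)

# Each step contracts the x-coordinate by (1 - eta) but expands the
# y-coordinate by (1 + eta), so almost every initialization escapes:
# only the measure-zero stable manifold {y = 0} leads to the saddle.
print(p)   # x is driven to ~0 while |y| grows large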
Background papers
- J. D. Lee, I. Panageas, G. Piliouras, M. Simchowitz, M. I. Jordan, and B. Recht, "First-order methods almost always avoid strict saddle points"
- P.-A. Absil, R. Mahony, and B. Andrews, "Convergence of the Iterates of Descent Methods for Analytic Cost Functions"
- C. Josz, "Global convergence of the gradient method for functions definable in o-minimal structures"
Practical information
- General public
- Free