Adaptive gradient algorithm on non-convex optimization
Event details
Date | 28.08.2019
Hour | 13:00 - 15:00
Speaker | Chaehwan Song
Location |
Category | Conferences - Seminars
EDIC candidacy exam
Exam president: Prof. Patrick Thiran
Thesis advisor: Prof. Volkan Cevher
Co-examiner: Prof. Ali Sayed
Abstract
First-order gradient methods are among the most basic and powerful strategies for optimization, and their applications reach nearly every field of science and technology. Adaptive gradient methods are a variant of first-order gradient descent, and numerous experiments have demonstrated their fast and robust convergence, achieved by automatically adjusting the learning rate using past gradients. As a result, adaptive gradient methods are widely used, especially for neural network training. However, most existing convergence analyses focus on convex optimization problems, and the analysis of adaptive methods in the nonconvex setting has only recently begun to attract interest. Our research goal is to establish a solid convergence analysis for modern adaptive gradient methods on nonconvex optimization problems. In this proposal, we discuss three related pieces of research. We first introduce adaptive gradient methods and some well-known algorithms, from the traditional AdaGrad to modern methods such as AcceleGrad. We then discuss the pros and cons of these methods, comparing them with plain gradient descent (GD) and stochastic gradient descent (SGD). Finally, we introduce recent research on the convergence analysis of adaptive gradient methods for nonconvex optimization.
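As a rough illustration of the per-coordinate learning-rate adaptation the abstract refers to (a generic AdaGrad-style sketch, not the speaker's own method or results), the following minimal Python example accumulates past squared gradients and scales each coordinate's step accordingly; the toy objective, step size eta, epsilon constant, and iteration count are illustrative assumptions.

```python
import numpy as np

def adagrad(grad_fn, x0, eta=0.1, eps=1e-8, num_steps=200):
    """Minimal AdaGrad-style sketch: each coordinate's step is scaled
    by the square root of its accumulated squared gradients."""
    x = np.asarray(x0, dtype=float)
    accum = np.zeros_like(x)                    # running sum of squared gradients
    for _ in range(num_steps):
        g = grad_fn(x)
        accum += g ** 2                         # per-coordinate history of gradient magnitudes
        x -= eta * g / (np.sqrt(accum) + eps)   # adaptive per-coordinate step size
    return x

# Illustrative use on a simple toy objective f(x) = x0^2 + sin(3*x1),
# with its gradient supplied analytically (a hypothetical example).
grad = lambda x: np.array([2.0 * x[0], 3.0 * np.cos(3.0 * x[1])])
x_final = adagrad(grad, x0=[1.0, 1.0])
```

Coordinates that repeatedly see large gradients get smaller effective step sizes, while rarely updated coordinates keep larger ones, which is the adaptivity contrasted with plain GD/SGD in the abstract.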
Background papers
The Marginal Value of Adaptive Gradient Methods in Machine Learning, by A. C. Wilson et al.
Online Adaptive Methods, Universality and Acceleration, by K. Y. Levy, A. Yurtsever, V. Cevher.
On the Convergence of a Class of Adam-Type Algorithms for Non-Convex Optimization, by X. Chen et al.
Practical information
- General public
- Free