Theory of Neural Nets Seminar: 21st June 2021

Event details
Date: 21.06.2021
Hour: 16:30–17:30
Speaker: Marco Mondelli (IST Austria)
Location: Online
Category: Conferences - Seminars
This seminar consists of talks about current research on the theory of neural networks. Every session lasts one hour and comprises a talk (about 30 minutes) followed by a discussion with questions from the audience.
Speaker: Marco Mondelli (IST Austria)
Title: Mode Connectivity and Convergence of Gradient Descent for (Not So) Over-parameterized Deep Neural Networks
Abstract: Training a neural network is a non-convex problem that exhibits spurious and disconnected local minima. Yet, in practice, neural networks with millions of parameters are successfully optimized using gradient descent methods. In this talk, I will give some theoretical insight into why this is possible. In the first part, I will focus on the problem of finding low-loss paths between the solutions found by gradient descent. First, using mean-field techniques, I will prove that, as the number of neurons grows, gradient descent solutions are approximately dropout-stable and, hence, connected. Then, I will present a mild condition that trades off the amount of over-parameterization against the quality of the features. In the second part, I will describe some tools for proving convergence of gradient descent to global optimality: the displacement convexity of a related Wasserstein gradient flow, and bounds on the smallest eigenvalue of neural tangent kernel matrices.
[Based on joint works with Pierre Brechet, Adel Javanmard, Andrea Montanari, Guido Montufar, Quynh Nguyen, and Alexander Shevchenko]
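To make the notion of a low-loss path between gradient descent solutions concrete, here is a minimal, self-contained sketch (not taken from the talk or the associated papers): it trains two small one-hidden-layer ReLU networks from independent random initializations with plain full-batch gradient descent on a toy regression task, then reports the training loss along the straight line between the two solutions in parameter space. The dataset, hyper-parameters, and function names are all illustrative assumptions.

```python
# Illustrative sketch: naive linear interpolation between two trained networks.
# Everything here (toy data, widths, learning rate) is an assumption for
# demonstration purposes, not the speaker's experimental setup.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = sin(3x) with a little noise.
X = rng.uniform(-1.0, 1.0, size=(256, 1))
y = np.sin(3.0 * X) + 0.05 * rng.normal(size=X.shape)

def forward(params, X):
    W1, b1, W2, b2 = params
    h = np.maximum(X @ W1 + b1, 0.0)          # ReLU hidden layer
    return h @ W2 + b2

def loss(params, X, y):
    return float(np.mean((forward(params, X) - y) ** 2))

def train(width=200, steps=2000, lr=0.05, seed=0):
    """Full-batch gradient descent with hand-coded gradients."""
    r = np.random.default_rng(seed)
    W1 = r.normal(scale=1.0, size=(1, width))
    b1 = np.zeros(width)
    W2 = r.normal(scale=1.0 / np.sqrt(width), size=(width, 1))
    b2 = np.zeros(1)
    n = X.shape[0]
    for _ in range(steps):
        pre = X @ W1 + b1
        h = np.maximum(pre, 0.0)
        out = h @ W2 + b2
        err = 2.0 * (out - y) / n              # d(mse)/d(out)
        gW2 = h.T @ err
        gb2 = err.sum(axis=0)
        dh = err @ W2.T * (pre > 0)            # back-propagate through ReLU
        gW1 = X.T @ dh
        gb1 = dh.sum(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
    return [W1, b1, W2, b2]

# Two independent gradient descent solutions.
sol_a = train(seed=1)
sol_b = train(seed=2)

# Training loss along the straight segment t * sol_a + (1 - t) * sol_b.
for t in np.linspace(0.0, 1.0, 11):
    interp = [t * pa + (1.0 - t) * pb for pa, pb in zip(sol_a, sol_b)]
    print(f"t = {t:.1f}   loss = {loss(interp, X, y):.4f}")
```

The straight segment is only the simplest candidate path and may exhibit a loss barrier; the connectivity results discussed in the talk instead construct piecewise-linear paths through dropout-stable intermediate networks, which is what keeps the loss low along the way.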
Practical information
- Intended audience: Expert
- Admission: Free
Contact
- François Ged: francois.ged[at]epfl.ch