Theory of Neural Nets Seminar: 21st June 2021

Event details

Date 21.06.2021
Hour 16:30–17:30
Speaker Marco Mondelli (IST Austria)
Location Online
Category Conferences - Seminars

This seminar series features talks on current research in the theory of neural networks. Each session lasts one hour and comprises a talk (about 30 minutes) followed by a discussion with questions from the audience.

Speaker: Marco Mondelli (IST Austria)

Title: Mode Connectivity and Convergence of Gradient Descent for (Not So) Over-parameterized Deep Neural Networks

Abstract: Training a neural network is a non-convex problem that exhibits spurious and disconnected local minima. Yet, in practice, neural networks with millions of parameters are successfully optimized using gradient descent methods. In this talk, I will give some theoretical insights into why this is possible. In the first part, I will focus on the problem of finding low-loss paths between the solutions found by gradient descent. First, using mean-field techniques, I will prove that, as the number of neurons grows, gradient descent solutions are approximately dropout-stable and, hence, connected. Then, I will present a mild condition that trades off over-parameterization against the quality of the features. In the second part, I will describe some tools to prove convergence of gradient descent to global optimality: the displacement convexity of a related Wasserstein gradient flow, and bounds on the smallest eigenvalue of neural tangent kernel matrices.
[Based on joint works with Pierre Brechet, Adel Javanmard, Andrea Montanari, Guido Montufar, Quynh Nguyen, and Alexander Shevchenko]
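The two parts of the abstract correspond to quantities that are easy to probe numerically. As a minimal illustration (not code from the talk; the model, data, and hyperparameters below are all hypothetical), the first sketch trains two small two-layer ReLU networks from independent initializations and evaluates the loss along the straight line between their parameter vectors: the naive linear path typically crosses a barrier, which is why results such as dropout-stability, which construct low-loss connecting paths, are non-trivial. The second sketch computes the empirical neural tangent kernel Gram matrix K = J Jᵀ and its smallest eigenvalue, the quantity whose lower bounds drive convergence guarantees of the kind discussed in the second part.

```python
# Illustrative sketch only: a toy probe of (i) the loss along the straight
# line between two gradient-descent solutions and (ii) the smallest
# eigenvalue of the empirical NTK Gram matrix. Model, data, and
# hyperparameters are hypothetical and not taken from the talk.
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_data(n=256, d=10):
    X = torch.randn(n, d)
    y = torch.sign(X[:, 0] * X[:, 1]).unsqueeze(1)  # toy nonlinear target
    return X, y

def make_net(d=10, width=512):
    return nn.Sequential(nn.Linear(d, width), nn.ReLU(), nn.Linear(width, 1))

def train(net, X, y, steps=500, lr=0.1):
    opt = torch.optim.SGD(net.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        nn.functional.mse_loss(net(X), y).backward()
        opt.step()
    return net

def path_losses(net_a, net_b, X, y, num_points=11):
    # Loss of the network with weights (1 - t) * theta_A + t * theta_B.
    sd_a, sd_b = net_a.state_dict(), net_b.state_dict()
    probe = make_net()
    losses = []
    for i in range(num_points):
        t = i / (num_points - 1)
        probe.load_state_dict({k: (1 - t) * sd_a[k] + t * sd_b[k] for k in sd_a})
        with torch.no_grad():
            losses.append(nn.functional.mse_loss(probe(X), y).item())
    return losses

def ntk_gram(net, X):
    # Row i of J is the gradient of the scalar output f(x_i) with respect
    # to all parameters; the empirical NTK Gram matrix is K = J @ J.T.
    rows = []
    for i in range(X.shape[0]):
        net.zero_grad()
        net(X[i : i + 1]).sum().backward()
        rows.append(torch.cat([p.grad.flatten() for p in net.parameters()]).clone())
    J = torch.stack(rows)
    return J @ J.T

X, y = make_data()
net_a = train(make_net(), X, y)
net_b = train(make_net(), X, y)  # independent initialization
print("loss along the linear path:", path_losses(net_a, net_b, X, y))

K = ntk_gram(make_net(), X[:64])  # NTK at a fresh random initialization
print("lambda_min of the NTK Gram matrix:", torch.linalg.eigvalsh(K).min().item())
```

A bump in the first printout is the barrier that mode-connectivity results circumvent with constructed (e.g. piecewise-linear) paths; a strictly positive smallest eigenvalue in the second is the kind of condition under which gradient descent can be shown to reach a global optimum.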

Practical information

  • Expert
  • Free

Contact

  • François Ged: francois.ged[at]epfl.ch
