Talk by Professor Marco Mondelli (IST Austria)


Event details

Date 21.04.2022
Hour 16:15 – 17:15
Speaker Professor Marco Mondelli
Location
Category Conferences - Seminars
Event Language English
Title:
Understanding Gradient Descent for Over-parameterized Deep Neural Networks

Abstract:
Training a neural network is a non-convex problem that exhibits spurious and disconnected local minima. Yet, in practice, neural networks with millions of parameters are successfully optimized using gradient descent methods. In this talk, I will give some theoretical insight into why this is possible and discuss two approaches to studying the behavior of gradient descent. The first takes a mean-field view: it relates the dynamics of stochastic gradient descent (SGD) to a certain Wasserstein gradient flow in probability space. I will show how this idea allows us to study the connectivity, convergence, and implicit bias of the solutions found by SGD. The second approach consists in analyzing the Neural Tangent Kernel (NTK). I will present tight bounds on its smallest eigenvalue and show their implications for memorization and optimization in deep networks.
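To give a concrete feel for the second approach, the sketch below computes the empirical NTK Gram matrix of a small two-layer ReLU network and its smallest eigenvalue. This is a minimal illustration under assumed choices (the architecture, widths, and unit-norm inputs are hypothetical and not taken from the talk); the actual results concern theoretical bounds, not this numerical check.

```python
import jax
import jax.numpy as jnp

# Hypothetical two-layer network: f(x) = a^T relu(W x) / sqrt(m)
def init_params(key, d, m):
    k1, k2 = jax.random.split(key)
    return {"W": jax.random.normal(k1, (m, d)),
            "a": jax.random.normal(k2, (m,))}

def forward(params, x):
    m = params["W"].shape[0]
    return params["a"] @ jax.nn.relu(params["W"] @ x) / jnp.sqrt(m)

def empirical_ntk(params, X):
    # Gradient of the scalar output w.r.t. all parameters, one row per sample.
    grads = jax.vmap(lambda x: jax.grad(forward)(params, x))(X)
    # Flatten the per-sample gradients into an n x p Jacobian matrix.
    J = jnp.concatenate(
        [g.reshape(X.shape[0], -1) for g in jax.tree_util.tree_leaves(grads)],
        axis=1)
    # Empirical NTK Gram matrix: K_ij = <grad f(x_i), grad f(x_j)>.
    return J @ J.T

d, m, n = 10, 512, 50                      # input dim, width, sample size (illustrative)
params = init_params(jax.random.PRNGKey(0), d, m)
X = jax.random.normal(jax.random.PRNGKey(1), (n, d))
X = X / jnp.linalg.norm(X, axis=1, keepdims=True)  # unit-norm inputs
K = empirical_ntk(params, X)
lam_min = jnp.linalg.eigvalsh(K)[0]
print("smallest NTK eigenvalue:", lam_min)
```

A strictly positive smallest eigenvalue of this Gram matrix is what guarantees that gradient descent in the kernel regime can fit (memorize) the training labels; the talk presents tight bounds on how large this eigenvalue is for deep networks.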

Based on joint work with Adel Javanmard, Vyacheslav Kungurtsev, Andrea Montanari, Guido Montufar, Quynh Nguyen, and Alexander Shevchenko.

Bio:
Marco Mondelli received the B.S. and M.S. degrees in Telecommunications Engineering from the University of Pisa, Italy, in 2010 and 2012, respectively. In 2016, he obtained his Ph.D. degree in Computer and Communication Sciences at the École Polytechnique Fédérale de Lausanne (EPFL), Switzerland. He is currently an Assistant Professor at the Institute of Science and Technology Austria (IST Austria). Prior to that, he was a Postdoctoral Scholar in the Department of Electrical Engineering at Stanford University, USA, from February 2017 to August 2019. He was also a Research Fellow with the Simons Institute for the Theory of Computing, UC Berkeley, USA, for the program on Foundations of Data Science from August to December 2018. His research interests include data science, machine learning, information theory, wireless communication systems, and modern coding theory. He is the recipient of a number of fellowships and awards, including the Jack K. Wolf ISIT Student Paper Award in 2015, the STOC Best Paper Award in 2016, the EPFL Doctorate Award in 2018, the Simons-Berkeley Research Fellowship in 2018, the Lopez-Loreta Prize in 2019, and the Information Theory Society Best Paper Award in 2021.

Practical information

  • Expert
  • Free

Organizer

  • Professor Volkan Cevher

Contact

  • Gosia Baltaian
