Over-parameterized deep neural networks: optimization, robustness, and generalization

Event details
Date | 29.06.2023
Hour | 09:00 - 11:00
Speaker | Zhenyu Zhu
Location |
Category | Conferences - Seminars
EDIC candidacy exam
Exam president: Prof. Lenka Zdeborova
Thesis advisor: Prof. Volkan Cevher
Co-examiner: Prof. Nicolas Flammarion
Abstract
Deep neural networks (DNNs) have achieved remarkable results in many fields, and the networks used in
practical applications continue to grow wider and deeper. Although over-parameterized neural networks are known to be easy to
train, open questions remain about their convergence, robustness, and generalization. This report discusses recent important
advances in the theoretical understanding of deep neural networks, focusing on the works [1], [2], [3], in order to gain a better
comprehension of their behavior. The analysis centers on key mathematical models and their algorithmic implications. Furthermore, we
explore the challenges associated with understanding deep neural networks and discuss current research directions in this field.
Background papers
1. A Convergence Theory for Deep Learning via Over-Parameterization (Allen-Zhu et al., https://arxiv.org/abs/1811.03962).
2. A Universal Law of Robustness via Isoperimetry (Bubeck & Sellke, https://arxiv.org/abs/2105.12806).
3. Benign Overfitting without Linearity: Neural Network Classifiers Trained by Gradient Descent for Noisy Linear Data (Frei et al., https://arxiv.org/abs/2202.05928).
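As a rough pointer to the mathematics at stake, background paper [2] proves a lower bound of the following schematic form (constants, the isoperimetry assumption on the inputs, and regularity conditions on the parameterization are omitted here): for n noisy samples in dimension d with label-noise level \sigma^2, and any model f with p parameters,

\[ \frac{1}{n}\sum_{i=1}^{n}\bigl(f(x_i)-y_i\bigr)^2 \le \sigma^2 - \varepsilon \quad\Longrightarrow\quad \operatorname{Lip}(f) \gtrsim \frac{\varepsilon}{\sigma}\sqrt{\frac{nd}{p}}. \]

In words, fitting noisy data below the noise level with a smooth (low-Lipschitz) function requires roughly p \gtrsim nd parameters, which is one precise sense in which robustness demands over-parameterization.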
Practical information
- General public
- Free
Contact
- edic@epfl.ch