On the geometry of the landscape underlying deep learning
Event details
Date | 18.12.2018
Hour | 12:00
Speaker | Prof. Matthieu Wyart
Category | Conferences - Seminars
Deep learning has been immensely successful at a variety of tasks, ranging from classification to artificial intelligence. Yet why it works is unclear. Learning corresponds to fitting training data, which is implemented by descending a very high-dimensional loss function. Two central questions are: (i) since the loss is a priori not convex, why doesn't this descent get stuck in poor minima, leading to bad performance? (ii) Deep learning works in a regime where the number of parameters can be larger, even much larger, than the number of data points to fit. Why does it then lead to very predictive models instead of overfitting?
Here I will discuss an unexpected analogy between the loss landscape in deep learning and the energy landscape of repulsive ellipses, which supports an explanation for (i). If time permits I will discuss (ii), more specifically the surprising finding that predictive power continuously improves as more parameters are added.
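As a minimal illustration of the setting behind questions (i) and (ii), the sketch below (not taken from the talk; the network size, learning rate and step count are arbitrary illustrative choices) fits a small over-parameterized two-layer ReLU network to random data with plain gradient descent. Despite the non-convex loss, the training error is typically driven close to zero rather than getting stuck in a poor minimum.

```python
# Illustrative sketch only (assumptions: toy sizes, random labels, plain gradient descent).
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 20, 5, 200            # 20 training points, 5 inputs, 200 hidden units
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)      # random labels: the hardest data to fit

W1 = rng.standard_normal((d, h)) / np.sqrt(d)   # ~ d*h + h parameters >> n data points
W2 = rng.standard_normal(h) / np.sqrt(h)
lr = 0.05

for step in range(5000):
    a = X @ W1                   # pre-activations
    z = np.maximum(a, 0.0)       # ReLU
    pred = z @ W2
    err = pred - y
    loss = 0.5 * np.mean(err ** 2)
    # backpropagation through the two weight matrices
    g_pred = err / n
    g_W2 = z.T @ g_pred
    g_z = np.outer(g_pred, W2)
    g_a = g_z * (a > 0)
    g_W1 = X.T @ g_a
    W1 -= lr * g_W1
    W2 -= lr * g_W2

print(f"final training loss: {loss:.2e}")  # typically near zero: the data is fitted
```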
Practical information
- General public
- Free
Contact
- Céline Burkhard