Conferences - Seminars

  Tuesday 18 December 2018 12:00 BSP 234

On the geometry of the landscape underlying deep learning

By Prof. Matthieu Wyart

Deep learning has been immensely successful at a variety of tasks, ranging from classification to artificial intelligence. Yet why it works is unclear. Learning corresponds to fitting training data, which is implemented by descending a very high-dimensional loss function. Two central questions are: (i) since the loss is a priori not convex, why doesn't this descent get stuck in poor minima, leading to bad performance? (ii) Deep learning works in a regime where the number of parameters can be larger, even much larger, than the number of data points to fit. Why, then, does it lead to very predictive models instead of overfitting?
Here I will discuss an unexpected analogy between the loss landscape in deep learning and the energy landscape of repulsive ellipses, which supports an explanation for (i). If time permits, I will discuss (ii), more specifically the surprising finding that predictive power continuously improves as more parameters are added.
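To make the setting concrete, the following is a minimal toy sketch (not the speaker's model) of the scenario in the abstract: an overparameterized two-layer network, with far more weights than training points, fit by plain gradient descent on a non-convex squared loss. All names and parameter choices here are illustrative assumptions.

```python
import numpy as np

# Toy illustration: 5 training points, a two-layer tanh network with
# ~150 parameters (hidden width 50), trained by plain gradient descent.
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 5).reshape(-1, 1)   # 5 training inputs
y = np.sin(3 * x)                          # targets to fit

h = 50                                     # hidden width: params >> data
W1 = rng.normal(0, 1, (1, h))
b1 = np.zeros(h)
W2 = rng.normal(0, 1 / np.sqrt(h), (h, 1))

def forward(x):
    a = np.tanh(x @ W1 + b1)               # hidden activations
    return a @ W2, a

pred0, _ = forward(x)
init_loss = float(np.mean((pred0 - y) ** 2))

lr = 0.05
for step in range(2000):
    pred, a = forward(x)
    err = pred - y                         # gradient of squared loss w.r.t. output
    gW2 = a.T @ err                        # backprop through second layer
    ga = err @ W2.T * (1 - a ** 2)         # backprop through tanh
    gW1 = x.T @ ga
    gb1 = ga.sum(axis=0)
    W2 -= lr * gW2 / len(x)                # gradient descent steps
    W1 -= lr * gW1 / len(x)
    b1 -= lr * gb1 / len(x)

final_loss = float(np.mean((forward(x)[0] - y) ** 2))
print(init_loss, final_loss)
```

Despite the non-convexity of the loss, descent from a random initialization reduces the training loss rather than stalling in a poor minimum, which is the empirical behavior question (i) asks about; whether the resulting interpolating model also generalizes is question (ii).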



Organization Prof. João Penedones

Contact Céline Burkhard

Accessibility General public

Admittance Free