Insights on the generalization ability of deep neural networks using sensitivity analysis
Event details
Date | 28.06.2018 |
Hour | 10:00 › 12:00 |
Speaker | Mahsa Forouzesh |
Location | |
Category | Conferences - Seminars |
EDIC candidacy exam
Exam president: Prof. Ruediger Urbanke
Thesis advisor: Prof. Patrick Thiran
Co-examiner: Prof. Martin Jaggi
Abstract
There is a growing line of research on understanding what drives generalization in deep learning. Sharpness analysis of the loss surface gives intuition about the generalization process, and robustness analysis provides generalization error bounds in which complexity measures of the model appear. However, neither is sufficient to explain the generalization ability of an over-parameterized deep neural network on unseen data. In this proposal, we seek insights into this phenomenon using mathematical tools. In particular, we apply sensitivity analysis to both the forward pass and the backward pass of the model, and, by considering a probabilistic framework, we aim to provide a better understanding of the performance of various algorithms. A theoretical explanation of why and how deep neural networks work is the starting point for designing new regularization techniques that are not only justified by empirical results but also grounded in mathematical foundations.
Background papers
Mathematics of Deep Learning, by Vidal, R., et al.
Entropy-SGD: Biasing gradient descent into wide valleys, by Chaudhari, P., et al.
Layer Normalization, by Lei Ba, J., et al.
Practical information
- General public
- Free
Contact
- EDIC - [email protected]