Interaction of Neural Architecture and Optimization in Deep Learning

Event details

Date 02.09.2022
Hour 13:00 - 15:00
Speaker Atli Kosson
Location
Category Conferences - Seminars
EDIC candidacy exam
Exam president: Prof. Amir Zamir
Thesis advisor: Prof. Martin Jaggi
Co-examiner: Prof. François Fleuret

Abstract
Modern deep neural networks have a complex structure, consisting of many layers and several types of trainable parameters, such as convolutional filters, gains, and biases. They are predominantly optimized with some form of stochastic gradient descent (SGD). The structure and parameterization of a network can strongly affect the conditioning of the optimization problem, which in turn greatly influences the performance of SGD and related methods. My research interests lie in understanding how neural architecture shapes optimization dynamics and in developing more robust optimization methods that account for the structure of the network.
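
One concrete example of this interaction, drawn from the first background paper below, is SGD with weight decay on scale-invariant (e.g., normalized) layers: the gradient is orthogonal to the weight vector, so noisy gradient steps push the weight norm up while weight decay pulls it down, and the two settle into an equilibrium. The following minimal NumPy sketch simulates this on a synthetic scale-invariant toy loss; the loss, step sizes, and variable names are illustrative assumptions, not details from the talk or the paper.

import numpy as np

rng = np.random.default_rng(0)

# Toy scale-invariant objective: L(w) = 0.5 * ||w/||w|| - target||^2.
# Its gradient is orthogonal to w, mimicking a layer followed by normalization.
target = rng.normal(size=8)
target /= np.linalg.norm(target)

def grad(w):
    norm = np.linalg.norm(w)
    u = w / norm
    g_u = u - target                     # gradient w.r.t. the normalized weights u
    return (g_u - u * (u @ g_u)) / norm  # chain rule through u = w/||w||; result is orthogonal to w

w = rng.normal(size=8) * 5.0             # start with a deliberately large norm
lr, wd = 0.1, 1e-2
for step in range(20001):
    g = grad(w) + rng.normal(scale=0.01, size=8)  # small noise stands in for minibatch sampling
    w -= lr * (g + wd * w)                        # SGD step with L2 weight decay
    if step % 5000 == 0:
        print(step, round(float(np.linalg.norm(w)), 4))

# The printed norm drifts toward an equilibrium where shrinkage from weight decay
# balances the outward push of the (noisy) scale-invariant gradients, the setting
# analyzed in the Spherical Motion Dynamics paper.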

Background papers
Wan, R., Zhu, Z., Zhang, X. and Sun, J., 2021. Spherical Motion Dynamics: Learning Dynamics of Normalized Neural Network using SGD and Weight Decay. Advances in Neural Information Processing Systems, 34.
https://proceedings.neurips.cc/paper/2021/hash/326a8c055c0d04f5b06544665d8bb3ea-Abstract.html

Neyshabur, B., Salakhutdinov, R.R. and Srebro, N., 2015. Path-SGD: Path-Normalized Optimization in Deep Neural Networks. Advances in Neural Information Processing Systems, 28.
https://proceedings.neurips.cc/paper/2015/hash/eaa32c96f620053cf442ad32258076b9-Abstract.html

Dauphin, Y., de Vries, H. and Bengio, Y., 2015. Equilibrated Adaptive Learning Rates for Non-Convex Optimization. Advances in Neural Information Processing Systems, 28.
https://proceedings.neurips.cc/paper/2015/hash/430c3626b879b4005d41b8a46172e0c0-Abstract.html

Practical information

  • General public
  • Free

Tags

EDIC candidacy exam
