Interaction of Neural Architecture and Optimization in Deep Learning

Event details
Date | 02.09.2022 |
Hour | 13:00 - 15:00 |
Speaker | Atli Kosson |
Category | Conferences - Seminars |
EDIC candidacy exam
Exam president: Prof. Amir Zamir
Thesis advisor: Prof. Martin Jaggi
Co-examiner: Prof. François Fleuret
Abstract
Modern deep neural networks have a complex structure, consisting of many layers as well as different types of trainable parameters such as convolutional filters, gains, and biases. They are predominantly optimized using some form of stochastic gradient descent (SGD). The structure and parameterization of a neural network can strongly affect the conditioning of the optimization problem, which in turn greatly influences the performance of SGD and related methods. My research interests lie in understanding how neural architecture impacts the optimization dynamics and in developing more robust optimization methods that account for the structure of the network.
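As a rough illustration of the kind of update rule at stake, a minimal plain-Python sketch of one SGD step with L2 weight decay (an assumption for illustration only, not the speaker's method) might look like:

import numpy as np

# Illustrative sketch only: one plain SGD update with L2 weight decay on a
# single weight matrix, the type of update analyzed in the background papers
# on normalized networks and weight decay.
def sgd_step(W, grad, lr=0.1, weight_decay=1e-4):
    """Return W after one SGD step with L2 weight decay."""
    return W - lr * (grad + weight_decay * W)

# Toy usage with a random weight matrix and gradient.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))
grad = rng.standard_normal((4, 4))
W_new = sgd_step(W, grad)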
Background papers
Wan, R., Zhu, Z., Zhang, X. and Sun, J., 2021. Spherical Motion Dynamics: Learning Dynamics of Normalized Neural Network using SGD and Weight Decay. Advances in Neural Information Processing Systems, 34.
https://proceedings.neurips.cc/paper/2021/hash/326a8c055c0d04f5b06544665d8bb3ea-Abstract.html
Neyshabur, B., Salakhutdinov, R.R. and Srebro, N., 2015. Path-SGD: Path-Normalized Optimization in Deep Neural Networks. Advances in Neural Information Processing Systems, 28.
https://proceedings.neurips.cc/paper/2015/hash/eaa32c96f620053cf442ad32258076b9-Abstract.html
Dauphin, Y., de Vries, H. and Bengio, Y., 2015. Equilibrated Adaptive Learning Rates for Non-convex Optimization. Advances in Neural Information Processing Systems, 28.
https://proceedings.neurips.cc/paper/2015/hash/430c3626b879b4005d41b8a46172e0c0-Abstract.html
Practical information
- General public
- Free