BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Memento EPFL//
BEGIN:VEVENT
SUMMARY:From Theoretical Understanding of Neural Networks to Practical App
 lications
DTSTART:20230710T103000
DTEND:20230710T123000
DTSTAMP:20260502T050305Z
UID:3315a9c888879356bed42d63abc0f91e91d50b6a4116aa589f93db5e
CATEGORIES:Conferences - Seminars
DESCRIPTION:Yongtao Wu\nDIC candidacy exam\nExam president: Prof. Nicolas 
 Flammarion\nThesis advisor: Prof. Volkan Cevher\nCo-examiner: Prof. Martin
  Jaggi\n\nAbstract\nDeep learning has demonstrated unprecedented success i
 n influential applications ranging from vision tasks to language modeling.
  The design of the network architecture plays a pivotal role in its pe
 rformance\, as evidenced by the development of ResNet\, EfficientNet\,
  and the Transformer. These achievements have ignited a profound inter
 est in theoretically
  understanding neural networks across various topics\, such as convergence
 \, generalization\, and learnability\, which can also significantly contri
 bute to practical applications. In this write-up\, we first delve into the
  convergence of feedforward neural networks. Subsequently\, we will exa
 mine a study of Transformers from the perspective of generalization. La
 stly\, 
 we will introduce a theoretical work on in-context learning within the Tra
 nsformer model.\n\nBackground papers\n\n	'Gradient Descent Provably Optimi
 zes Over-parameterized Neural Networks'\, Du et al.\, ICLR\, 2019.\n	'A
  Theoretical Understanding of Shallow Vision Transformers: Learning\, G
 eneralization\, and Sample Complexity'\, Li et al.\, ICLR\, 2023.\n	'Tr
 ansformers learn in-context by gradient descent'\, von Oswald et al.\,
  ICML\, 2023.\n
LOCATION:ELD 120 https://plan.epfl.ch/?room==ELD%20120
STATUS:CONFIRMED
END:VEVENT
END:VCALENDAR
