Mechanisms of Learning in Neural Networks: Scaling, Dynamics, and Optimization

Event details

Date 11.06.2025
Hour 14:00 – 16:00
Speaker Yizhou Xu
Location
Category Conferences - Seminars
EDIC candidacy exam
Exam president: Prof. Nicolas Macris
Thesis advisor: Prof. Lenka Zdeborová
Co-examiner: Prof. Lénaïc Chizat

Abstract
This report reviews three recent advances in the high-dimensional analysis of deep learning, focusing on optimization, learning dynamics, and generalization. First, [1] introduces a belief propagation-based algorithm for training discrete neural networks, offering an alternative to gradient-based methods. Second, [2] characterizes the training dynamics and the emergence of task specialization in multi-head attention during in-context learning. Third, [3] derives scaling laws for the generalization error of random feature regression, establishing deterministic equivalents for infinite-width models. While these works exemplify how high-dimensional limits can yield tractable asymptotics for neural networks, significant gaps remain between theoretical settings and practical architectures. We conclude by outlining open questions to bridge these gaps.

Selected papers
1. Deep learning via message passing algorithms based on belief propagation (https://iopscience.iop.org/article/10.1088/2632-2153/ac7d3b/pdf)
2. Training Dynamics of Multi-Head Softmax Attention for In-Context Learning: Emergence, Convergence, and Optimality (https://arxiv.org/abs/2402.19442)
3. Dimension-free deterministic equivalents and scaling laws for random feature regression (https://arxiv.org/pdf/2405.15699)
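For readers unfamiliar with the random feature model analyzed in [3], the following minimal sketch illustrates the object whose generalization error the scaling laws describe: fit a ridge regression on random nonlinear features and watch the test error shrink as the number of features grows. The teacher function, dimensions, and regularization below are illustrative assumptions, not the setup used in the paper.

```python
# Minimal random feature regression sketch (illustrative assumptions only;
# the teacher, dimensions, and regularization are not taken from [3]).
import numpy as np

rng = np.random.default_rng(0)

d, n_train, n_test = 100, 2000, 2000          # input dim, sample sizes
w_star = rng.standard_normal(d) / np.sqrt(d)  # hypothetical linear teacher

def sample(n):
    X = rng.standard_normal((n, d))
    y = X @ w_star                            # noiseless teacher labels
    return X, y

X_tr, y_tr = sample(n_train)
X_te, y_te = sample(n_test)

lam = 1e-3                                    # ridge regularization strength
for p in [50, 200, 800, 3200]:                # number of random features
    W = rng.standard_normal((d, p)) / np.sqrt(d)
    Phi_tr = np.maximum(X_tr @ W, 0.0)        # ReLU random features
    Phi_te = np.maximum(X_te @ W, 0.0)
    # Ridge regression in feature space: a = (Phi^T Phi + n*lam*I)^{-1} Phi^T y
    A = Phi_tr.T @ Phi_tr + n_train * lam * np.eye(p)
    a = np.linalg.solve(A, Phi_tr.T @ y_tr)
    test_err = np.mean((Phi_te @ a - y_te) ** 2)
    print(f"p = {p:5d}   test MSE = {test_err:.4f}")
```

Sweeping the number of features p (and, analogously, the sample size n) and plotting the test error on log-log axes is the empirical counterpart of the scaling-law analysis carried out analytically in [3].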

Practical information

  • General public
  • Free

Tags

EDIC candidacy exam
