Causal structures and in-context learning in transformers

Event details

Date 03.07.2024
Hour 14:00 – 16:00
Speaker Gizem Yüce
Category Conferences - Seminars
EDIC candidacy exam
Exam president: Prof. Michael Gastpar
Thesis advisor: Prof. Nicolas Flammarion
Co-examiner: Prof. Martin Jaggi

Abstract
In-context learning has emerged as a crucial capability of transformers: by conditioning on input-output examples provided at inference time, a model can adapt to and perform new tasks efficiently, without any parameter updates. Understanding the mechanisms behind in-context learning is vital for advancing AI capabilities and for ensuring robust, adaptable performance across diverse applications. We summarize recent progress on the theory of in-context learning under various causal structures, and conclude with important open questions and proposals for future investigation.
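
To make the setting concrete, the sketch below (assumptions mine, not taken from the announcement) illustrates the in-context learning setup of background paper 1 (Garg et al., 2022): a prompt is a sequence of input-output pairs (x_1, f(x_1), ..., x_k, f(x_k)) from a simple function class, here linear functions, followed by a query x_query whose label must be predicted with no weight update. The ordinary-least-squares baseline shown is the estimator that a trained transformer is reported to approximate in that setting.

```python
# Minimal sketch of the in-context learning task of Garg et al. (2022):
# linear functions presented as (x_i, f(x_i)) demonstrations plus one query.
import numpy as np

def sample_linear_task(dim: int, n_examples: int, rng: np.random.Generator):
    """Sample a random linear function f(x) = w.x and an in-context prompt for it."""
    w = rng.normal(size=dim)                      # hidden task vector
    xs = rng.normal(size=(n_examples + 1, dim))   # k demonstration inputs + 1 query
    ys = xs @ w                                   # noiseless labels f(x_i)
    return xs[:-1], ys[:-1], xs[-1], ys[-1]       # demonstrations, then held-out query

def least_squares_predict(xs, ys, x_query):
    """Predict f(x_query) from the in-context examples via ordinary least squares."""
    w_hat, *_ = np.linalg.lstsq(xs, ys, rcond=None)
    return x_query @ w_hat

rng = np.random.default_rng(0)
xs, ys, x_q, y_q = sample_linear_task(dim=8, n_examples=16, rng=rng)
print("true f(x_query):     ", y_q)
print("in-context estimate: ", least_squares_predict(xs, ys, x_q))
```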

Background papers
1) Shivam Garg, Dimitris Tsipras, Percy S Liang, and Gregory Valiant. What can transformers learn in-context? A case study of simple function classes. Advances in Neural Information Processing Systems, 35:30583–30598, 2022.
2) Benjamin L Edelman, Ezra Edelman, Surbhi Goel, Eran Malach, and Nikolaos Tsilivis. The evolution of statistical induction heads: In-context learning Markov chains. arXiv preprint arXiv:2402.11004, 2024.
3) Eshaan Nichani, Alex Damian, and Jason D Lee. How transformers learn causal structure with gradient descent. arXiv preprint arXiv:2402.14735, 2024.
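
For background paper 2 (Edelman et al., 2024), the task behind "statistical induction heads" can be sketched as follows (a hypothetical illustration under my own assumptions, not code from the talk): each sequence is generated from a fresh random Markov transition matrix, so the best in-context predictor estimates the next-token distribution from the bigram counts observed so far in the same sequence.

```python
# Minimal sketch of in-context Markov-chain prediction: the optimal in-context
# strategy is a bigram (statistical induction head) estimate computed from the context.
import numpy as np

def sample_markov_sequence(n_states: int, length: int, rng: np.random.Generator):
    """Draw a random transition matrix, then sample a sequence from it."""
    P = rng.dirichlet(np.ones(n_states), size=n_states)  # rows are next-state distributions
    seq = [rng.integers(n_states)]
    for _ in range(length - 1):
        seq.append(rng.choice(n_states, p=P[seq[-1]]))
    return np.array(seq), P

def in_context_bigram_predict(seq, n_states: int):
    """Next-token distribution from bigram counts in the context (add-one smoothing)."""
    counts = np.ones((n_states, n_states))
    for a, b in zip(seq[:-1], seq[1:]):
        counts[a, b] += 1
    return counts[seq[-1]] / counts[seq[-1]].sum()

rng = np.random.default_rng(0)
seq, P = sample_markov_sequence(n_states=3, length=200, rng=rng)
print("true next-state distribution:", np.round(P[seq[-1]], 3))
print("in-context bigram estimate:  ", np.round(in_context_bigram_predict(seq, 3), 3))
```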

Practical information

  • General public
  • Free

Tags

EDIC candidacy exam