Two mathematical perspectives on transformers


Date 23.02.2024
Hour 13:15 - 14:15
Speaker Cyril Letrouit (CNRS & U. Paris-Saclay)
I will present joint work with Borjan Geshkovski, Yury Polyanskiy, and Philippe Rigollet in which we use tools from ODEs and PDEs to study an interacting particle system that arises in Transformers through a mechanism called "self-attention". We demonstrate that the particles form clusters as time goes to infinity. I will also report on joint work with Andrei Agrachev in which we use tools from geometric control theory to understand the universal approximation properties of these particle systems.

