Talk of Dr Antonio Orvieto (ELLIS Institute Tübingen)

Event details

Date 12.01.2024
Hour 14:00-15:00
Speaker Dr Antonio Orvieto (ELLIS Institute Tübingen)
Location
Category Conferences - Seminars
Event Language English
Talk title: Accurate and Efficient Processing of Long Sequences and Large Graphs without Attention

Abstract: When applied to sequential data, transformers have an inherent challenge: their attention mechanism leads to quadratic complexity with respect to sequence length. This issue extends to graph transformers, where complexity scales quadratically with the number of nodes in the network. Today, we'll explore theoretically grounded alternatives to the attention mechanism that hinge on carefully parametrized linear recurrent neural networks. Unlike the more commonly known LSTMs and GRUs, linear RNNs are particularly GPU-efficient. This efficiency enables us to scale up the architecture, successfully study signal propagation, and achieve competitive performance. We'll present how, with a Linear Recurrent Unit (LRU) replacing attention, we can achieve state-of-the-art results on sequence modeling and graph data. This approach offers a promising direction for future research, especially in genetics, protein structure prediction, and audio/video processing and generation.
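
As a rough illustration of the idea in the abstract, the sketch below runs a diagonal linear recurrence x_k = lambda * x_(k-1) + u_k in a single pass over the sequence, so the cost grows linearly with sequence length rather than quadratically. It is not the speaker's LRU implementation: the function names, the example values in `lambdas` and `inputs`, and the omission of input/output projections and normalization are all assumptions made for illustration.

```python
import cmath


def linear_recurrence(lambdas, inputs):
    """Run x_k = lambda * x_{k-1} + u_k channel by channel.

    lambdas: one (possibly complex) transition coefficient per hidden channel
             (hypothetical values, chosen with |lambda| < 1 for stability).
    inputs:  list of per-step input lists, each the same length as lambdas.
    Returns the list of hidden states, one per time step.
    """
    state = [0j] * len(lambdas)
    states = []
    for u in inputs:  # one pass over the sequence: O(length) work
        state = [lam * x + u_i for lam, x, u_i in zip(lambdas, state, u)]
        states.append(state)
    return states


def unrolled_state(lambdas, inputs, k):
    """Closed form x_k = sum_{j<=k} lambda^(k-j) * u_j.

    This form exists because the recurrence has no nonlinearity between
    steps; it is the property that parallel-scan implementations exploit.
    """
    return [
        sum(lam ** (k - j) * inputs[j][i] for j in range(k + 1))
        for i, lam in enumerate(lambdas)
    ]


# Illustrative values only (assumptions, not taken from the talk).
lambdas = [0.5, cmath.rect(0.9, 0.3)]  # magnitude 0.9, phase 0.3 rad
inputs = [[1.0, 1.0], [2.0, 0.0], [0.0, 1.0]]

states = linear_recurrence(lambdas, inputs)
print(states[-1])
```

Because the step-by-step recurrence and the unrolled sum agree, the hidden states can in principle be computed without waiting on each previous step, which is one reason linear RNNs map well to GPUs, in contrast to LSTMs and GRUs, whose nonlinear gates force sequential evaluation.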
 
Bio: Dr Antonio Orvieto is a principal investigator at the newly established ELLIS Institute Tübingen and independent group leader at the MPI for Intelligent Systems in Germany. He studied Robotics and Control Engineering in Italy and Switzerland. He holds a PhD from ETH Zürich and has spent time at DeepMind, Meta, MILA, INRIA and HILTI. In his research, Antonio strives to improve the efficiency of deep learning technologies by pioneering new architectures and training techniques grounded in theoretical knowledge. His work encompasses two main areas: understanding the intricacies of large-scale optimization dynamics and designing innovative architectures and powerful optimizers capable of handling complex data. Central to his studies is exploring innovative techniques for decoding patterns in sequential data, with implications in biology, neuroscience, natural language processing, and music generation.

Practical information

  • Informed public
  • Free

Organizer

  • Professor Volkan Cevher

Contact

  • Gosia Baltaian
