Towards Multimodal Technologies for the World


Event details

Date 03.02.2023 14:00-15:00
Speaker Emanuele Bugliarello
Category Conferences - Seminars
Event Language English

There has been an explosive growth of vision-and-language architectures in the last few years, which are usually trained on English captions paired with images from North America or Western Europe.
In this talk, Emanuele will first introduce a new protocol to collect culturally relevant images and captions, which resulted in MaRVL, a multimodal reasoning dataset in five diverse languages. He will then discuss the limitations of state-of-the-art models when evaluated on multilingual data, an evaluation made possible by the IGLUE benchmark.
Finally, he will show that we can substantially improve zero-shot cross-lingual transfer by compromising our ideals of multilingual multimodal data.

Emanuele Bugliarello received his MSc from the IC School at EPFL in 2018, and he is currently a final-year PhD Fellow in the NLP Section at the University of Copenhagen. His research lies at the intersection of language and vision, with a particular interest in building models and creating resources that represent the diversity of cultural and linguistic backgrounds.


Practical information

  • General public
  • Free

