Towards Multimodal Technologies for the World

Event details

Date 03.02.2023
Hour 14:00-15:00
Speaker Emanuele Bugliarello
Category Conferences - Seminars
Event Language English

There has been an explosive growth of vision-and-language architectures in the last few years, which are usually trained on English captions paired with images from North America or Western Europe.
In this talk, Emanuele will first introduce a new protocol to collect culturally relevant images and captions, which resulted in MaRVL, a multimodal reasoning dataset in five diverse languages. He will then discuss the limitations of state-of-the-art models when evaluated on multilingual data, an evaluation made possible by the IGLUE benchmark.
Finally, he will show that we can substantially improve zero-shot cross-lingual transfer by compromising our ideals of multilingual multimodal data.

Emanuele Bugliarello received his MSc from the IC School at EPFL in 2018, and he is currently a final-year PhD Fellow in the NLP Section at the University of Copenhagen. His research lies at the intersection of language and vision, with a particular interest in building models and creating resources that represent the diversity of cultural and linguistic backgrounds.

Practical information

  • General public
  • Free
