Towards Multimodal Technologies for the World
There has been an explosive growth of vision-and-language architectures in the last few years, which are usually trained on English captions paired with images from North America or Western Europe.
In this talk, Emanuele will first introduce a new protocol for collecting culturally relevant images and captions, which resulted in MaRVL, a multimodal reasoning dataset in five diverse languages. He will then discuss the limitations of state-of-the-art models when evaluated on multilingual data, an evaluation made possible by the IGLUE benchmark.
Finally, he will show that we can substantially improve zero-shot cross-lingual transfer by compromising our ideals of multilingual multimodal data.
Emanuele Bugliarello received his MSc from the IC School at EPFL in 2018, and he is currently a final-year PhD Fellow in the NLP Section at the University of Copenhagen. His research lies at the intersection of language and vision, with a particular interest in building models and creating resources that represent the diversity of cultural and linguistic backgrounds.
Practical information
- General public
- Free