xCOMET, Tower, EuroLLM: Open & Multilingual LLMs for Europe
Event details
Date | 30.01.2025 |
Hour | 16:00 › 17:00 |
Speaker | André F. T. Martins |
Location | Online |
Category | Conferences - Seminars |
Event Language | English |
Abstract: Today, LLMs are Swiss Army knives and MT is one of their tools. Is this the end of MT research? In this talk, I argue that the connection between LLM and MT research is two-way. I present some of our recent work advancing multilingual LLMs, tools to estimate their quality, and how the two can be combined for test-time scaling.
First, I present xCOMET, an open-source learned metric which integrates sentence-level evaluation and error span detection, exhibiting state-of-the-art performance across all types of meta-evaluation (sentence-level, system-level, and error span detection). Moreover, it does so while highlighting and categorizing error spans, thus enriching the quality assessment.
Then, I present Tower, a suite of open multilingual LLMs for translation-related tasks. Tower models are created through continued pretraining on a carefully curated multilingual mixture of monolingual and parallel data. The combination of Tower with COMET reranking obtained the best results in 8 out of 11 language pairs in the WMT General Translation shared task, according to human evaluation.
Finally, I describe EuroLLM, an ongoing EU-made project whose goal is to train an open multilingual LLM from scratch using the European HPC infrastructure (EuroHPC). The latest release (EuroLLM-9B) supports 35 languages, including all 24 official EU languages, and it achieves strong results in various benchmarks, comparable to or better than the best existing models of similar size.
Speaker: André F. T. Martins (PhD 2012, Carnegie Mellon University and Instituto Superior Técnico) is an Associate Professor at Instituto Superior Técnico, University of Lisbon, researcher at Instituto de Telecomunicações, and the VP of AI Research at Unbabel. His research, funded by an ERC Starting Grant (DeepSPIN) and Consolidator Grant (DECOLLAGE), among other grants, includes machine translation, quality estimation, structure and interpretability in deep learning systems for NLP. His work has received several paper awards at ACL conferences. He co-founded and co-organizes the Lisbon Machine Learning School (LxMLS), and he is a Fellow of the ELLIS society and co-director of the ELLIS Program in Natural Language Processing. He is a member of the R&I advisory group of EuroHPC, the European infrastructure for supercomputing.
Practical information
- Informed public
- Free
Organizer
- EPFL NLP lab