Building Multi-task Models: From Multi-task Learning to Model Merging

Event details

Date 27.02.2024
Hour 13:00 – 15:00
Speaker Ke Wang
Location
Category Conferences - Seminars
EDIC candidacy exam
Exam president: Prof. Amir Zamir
Thesis advisor: Prof. Pascal Frossard
Co-examiner: Prof. Martin Jaggi

Abstract
Multi-task models use shared representations to handle a variety of tasks with reduced storage requirements. While many multi-task learning (MTL) methods have been developed to train such models jointly and deliver good performance, they often incur non-trivial training costs and require simultaneous access to all tasks. Recently, model merging techniques such as task arithmetic have emerged as training-free alternatives that construct multi-task models by directly merging separately fine-tuned models. However, these methods tend to underperform traditional MTL approaches. In this report, we will explore three methods for constructing multi-task models: one based on MTL and two based on model merging. We will then examine the reasons behind the performance decline observed in model merging methods and present our proposed approach to improve their performance.
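
For readers unfamiliar with the task-arithmetic idea mentioned above, a minimal sketch follows. It merges fine-tuned checkpoints by adding the sum of their "task vectors" (fine-tuned weights minus pre-trained weights), scaled by a coefficient, back onto the pre-trained weights. This is an illustrative sketch, not the speaker's implementation; the state-dict interface and the scaling coefficient lam are assumptions.

    import torch

    def task_arithmetic_merge(pretrained, finetuned_list, lam=0.3):
        # pretrained: state dict of the shared pre-trained model (assumed interface).
        # finetuned_list: state dicts of models fine-tuned on individual tasks.
        # lam: scaling coefficient applied to the summed task vectors (assumption).
        merged = {}
        for name, theta_pre in pretrained.items():
            # Task vector for each task: fine-tuned weights minus pre-trained weights.
            task_vectors = [ft[name] - theta_pre for ft in finetuned_list]
            # Merged weights: theta_pre + lam * sum_i (theta_i - theta_pre).
            merged[name] = theta_pre + lam * torch.stack(task_vectors).sum(dim=0)
        return merged

Because the merge operates directly on weights, it requires no training data or gradient steps, which is the appeal of such methods over joint MTL training.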

Background papers

Practical information

  • General public
  • Free

Tags

EDIC candidacy exam
