Building Multi-task Models: From Multi-task Learning to Model Merging
Event details
Date | 27.02.2024 |
Hour | 13:00 – 15:00 |
Speaker | Ke Wang |
Location | |
Category | Conferences - Seminars |
EDIC candidacy exam
Exam president: Prof. Amir Zamir
Thesis advisor: Prof. Pascal Frossard
Co-examiner: Prof. Martin Jaggi
Abstract
Multi-task models efficiently use shared representations to handle multiple tasks with reduced storage requirements. While many multi-task learning (MTL) methods have been developed to train such models jointly and deliver strong performance, they often incur non-trivial training costs and require simultaneous access to all tasks. Recently, model merging techniques such as task arithmetic have emerged as training-free alternatives that construct multi-task models by directly merging separately fine-tuned models. However, these methods tend to underperform traditional MTL approaches. In this report, we explore three methods for constructing multi-task models: one based on MTL and two based on model merging. We also present our research on understanding the causes of the performance drop observed in model merging methods, along with our proposed approach to improving their performance.
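For context, task arithmetic (Ilharco et al., ICLR 2023) builds a multi-task model from task vectors, the parameter differences between each fine-tuned model and the shared pre-trained checkpoint. The following is a minimal sketch of this idea, not the speaker's implementation; the function name and the scaling coefficient lam (typically tuned on held-out data) are illustrative assumptions.

    # Task-arithmetic merge: theta_merged = theta_pre + lam * sum_i (theta_i - theta_pre)
    # Assumes all checkpoints share the same architecture and parameter names.
    import torch

    def task_arithmetic_merge(pretrained, finetuned_list, lam=0.3):
        """Merge fine-tuned state dicts into a single multi-task state dict."""
        merged = {}
        for name, theta_pre in pretrained.items():
            # Task vector for each task: difference from the pre-trained weights.
            task_vectors = [ft[name] - theta_pre for ft in finetuned_list]
            merged[name] = theta_pre + lam * torch.stack(task_vectors).sum(dim=0)
        return merged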
Background papers
- Towards Impartial Multi-task Learning, ICLR 2021
- Editing Models with Task Arithmetic, ICLR 2023
- TIES-Merging: Resolving Interference When Merging Models, NeurIPS 2023
Practical information
- General public
- Free