System Support for Decentralized and Federated Learning

Event details

Date 21.06.2022
Hour 13:00 – 15:00
Speaker Rishi Sharma
Category Conferences - Seminars
EDIC candidacy exam
Exam president: Prof. Katerina Argyraki
Thesis advisor: Prof. Anne-Marie Kermarrec
Co-examiner: Prof. Martin Jaggi

Abstract
Deep learning algorithms perform well on a variety of artificial intelligence tasks such as image classification, text recognition, and recommendation. Traditionally, these deep neural networks are trained in data centers over large volumes of data. Moving this data from its producer to data centers owned by companies such as Google and Amazon poses serious privacy risks. Several collaborative learning approaches, with and without a central server, have been proposed to alleviate some of these privacy concerns by allowing data to stay with its producer. These approaches come with their own challenges, however, including high communication costs and slow convergence.

We propose a research plan for improving decentralized learning systems in terms of communication, computation, and fault tolerance. An optimal selective parameter-sharing scheme can minimize communication costs. Training can be sped up by improving CPU and GPU utilization and by optimally overlapping the training computation, the computation of selective sharing, and parameter communication. Finally, by also handling network delays, packet drops, and nodes joining and leaving, we can design a fault-tolerant decentralized learning system that is also communication- and computation-efficient.
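
To make the selective parameter-sharing idea concrete, here is a minimal sketch, in PyTorch, of one gossip round in which each node exchanges only its k largest-magnitude parameters with its neighbors. The function names, the uniform mixing weights, and the toy setup are illustrative assumptions, not the scheme proposed in this thesis plan.

import torch

def topk_sparsify(params: torch.Tensor, k: int):
    # Keep only the k largest-magnitude entries; a node shares just these
    # (index, value) pairs instead of the full parameter vector.
    flat = params.flatten()
    idx = torch.topk(flat.abs(), k).indices
    return idx, flat[idx]

def gossip_round(local: torch.Tensor, neighbor_msgs, k: int):
    # One communication round: mix the sparse messages received from
    # neighbors into the local parameters (uniform weights, an assumption).
    outgoing = topk_sparsify(local, k)  # what this node sends out
    mixed = local.flatten().clone()
    for idx, vals in neighbor_msgs:
        # Move local entries toward each neighbor's values at its top-k indices.
        mixed[idx] += (vals - mixed[idx]) / (len(neighbor_msgs) + 1)
    return mixed.view_as(local), outgoing

# Toy usage: a node with two neighbors, 10 parameters each, sharing a top-3.
a, b, c = (torch.randn(10) for _ in range(3))
msgs = [topk_sparsify(b, 3), topk_sparsify(c, 3)]
new_a, sent = gossip_round(a, msgs, 3)

Only k values and k indices cross the network per neighbor, which is where the communication savings described above would come from; a full system would interleave these exchanges with local training steps.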

Background papers
1. Reza Shokri and Vitaly Shmatikov. 2015. Privacy-Preserving Deep Learning. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS '15). Association for Computing Machinery, New York, NY, USA, 1310–1321.

2. Kevin Hsieh, Amar Phanishayee, Onur Mutlu, and Phillip B. Gibbons. 2020. The non-IID data quagmire of decentralized machine learning. In Proceedings of the 37th International Conference on Machine Learning (ICML'20). JMLR.org, Article 408, 4387–4398.

3. Hyungjun Oh, Junyeol Lee, Hyeongju Kim, and Jiwon Seo. 2022. Out-of-order backprop: an effective scheduling technique for deep learning. In Proceedings of the Seventeenth European Conference on Computer Systems (EuroSys '22). Association for Computing Machinery, New York, NY, USA, 435–452.

Practical information

  • General public
  • Free
