System Support for Decentralized and Federated Learning
Event details
Date: 21.06.2022
Hour: 13:00 – 15:00
Speaker: Rishi Sharma
Category: Conferences - Seminars
EDIC candidacy exam
Exam president: Prof. Katerina Argyraki
Thesis advisor: Prof. Anne-Marie Kermarrec
Co-examiner: Prof. Martin Jaggi
Abstract
Deep learning algorithms perform well on a variety of artificial intelligence tasks such as image classification, text recognition, and recommendation. Traditionally, the training of these deep neural networks is done in data centers over huge chunks of data. Moving this data from the producer to the data centers owned by companies such as Google and Amazon poses serious privacy risks. Several collaborative learning approaches with and without a central server have been proposed to alleviate some privacy concerns by allowing data to stay with the producer. These come with their own challenges, including high communication costs and slow convergence.
We propose a research plan for improving decentralized learning systems in terms of communication, computation, and fault tolerance. With an optimal selective parameter sharing scheme, communication costs can be minimized. Training can be sped up through improved CPU and GPU utilization and an optimal overlap between training computation, the computation of selective sharing, and parameter communication. Finally, by also handling network delays, packet drops, and nodes joining and leaving, we can design a fault-tolerant decentralized learning system that is also communication- and computation-efficient.
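To illustrate the kind of mechanism the abstract refers to, the sketch below shows one common form of selective parameter sharing: each node transmits only its top-k largest-magnitude parameters to a neighbor, which averages the shared subset into its own model. This is a minimal illustrative example, not the specific scheme proposed in the thesis; the function names and the simple pairwise averaging rule are assumptions for the sake of the sketch.

```python
import numpy as np

def top_k_select(params: np.ndarray, k: int):
    """Return the indices and values of the k largest-magnitude parameters.

    Sending only (indices, values) instead of the full vector is what
    reduces communication cost in selective sharing schemes.
    """
    idx = np.argsort(np.abs(params))[-k:]
    return idx, params[idx]

def merge_shared(local: np.ndarray, idx: np.ndarray,
                 neighbor_vals: np.ndarray) -> np.ndarray:
    """Average a neighbor's shared subset into the local parameters."""
    merged = local.copy()
    merged[idx] = (merged[idx] + neighbor_vals) / 2.0
    return merged

# Toy example: node B shares only its top-2 of 5 parameters with node A.
a = np.array([0.1, -2.0, 0.3, 1.5, 0.05])
b = np.array([0.2, -1.0, 0.4, 2.5, 0.00])

idx_b, vals_b = top_k_select(b, k=2)      # B picks its 2 largest entries
a_new = merge_shared(a, idx_b, vals_b)    # A averages them in
```

Here only 2 of 5 values (plus their indices) cross the network; the trade-off is that unshared parameters drift apart between nodes, which is why the choice of k and the sharing schedule matter for convergence.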
Background papers
1. Reza Shokri and Vitaly Shmatikov. 2015. Privacy-Preserving Deep Learning. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS '15). Association for Computing Machinery, New York, NY, USA, 1310–1321.
2. Kevin Hsieh, Amar Phanishayee, Onur Mutlu, and Phillip B. Gibbons. 2020. The non-IID data quagmire of decentralized machine learning. In Proceedings of the 37th International Conference on Machine Learning (ICML'20). JMLR.org, Article 408, 4387–4398.
3. Hyungjun Oh, Junyeol Lee, Hyeongju Kim, and Jiwon Seo. 2022. Out-of-order backprop: an effective scheduling technique for deep learning. In Proceedings of the Seventeenth European Conference on Computer Systems (EuroSys '22). Association for Computing Machinery, New York, NY, USA, 435–452.
Practical information
- General public
- Free