Scaling Distributed Deep Learning with Efficient Algorithm Design
Event details
Date | 13.06.2018
Hour | 10:00 – 12:00
Speaker | Tao LIN
Location |
Category | Conferences - Seminars
EDIC candidacy exam
Exam president: Prof. François Fleuret
Thesis advisor: Prof. Martin Jaggi
Thesis co-advisor: Prof. Babak Falsafi
Co-examiner: Prof. Rachid Guerraoui
Abstract
Due to the rapid growth of data and ever-increasing model complexity, many of today's most important deep learning workloads can no longer be trained efficiently on a single machine. Distributed training architectures have been developed in response to these challenges, promising improved scalability by increasing both computational and storage capacity. A critical challenge in realizing this promise is to develop efficient methods for communicating and coordinating information between distributed devices, taking into account the specific needs of machine learning training algorithms. On most distributed systems, communicating information between devices is vastly more expensive than reading data from main memory and performing local computation. Moreover, the optimal trade-off between communication and computation can vary widely depending on the dataset being processed, the system resources available, and the training objective being optimized. This thesis addresses this challenge in order to improve the scalability of learning systems.
Background papers
QSGD: Communication-Efficient SGD via Gradient Quantization and Encoding, by Alistarh, Dan, et al. Advances in Neural Information Processing Systems. 2017.
Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training, by Lin, Yujun, et al. arXiv preprint arXiv:1712.01887 (2017).
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour, by Goyal, Priya, et al. arXiv preprint arXiv:1706.02677 (2017).
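The first two background papers study gradient compression as a way to reduce the communication cost described in the abstract. As a rough illustration of the core idea (a simplified sketch, not the papers' exact algorithms; the function name and number of quantization levels are illustrative), a QSGD-style scheme stochastically rounds each gradient coordinate to a small set of levels so that only a norm, signs, and small integers need to be transmitted, while the quantized gradient remains an unbiased estimate of the original:

```python
import numpy as np

def quantize(g, num_levels=4):
    """Stochastically quantize gradient vector g to `num_levels` uniform
    levels of its L2 norm (QSGD-style sketch).

    Stochastic rounding makes the result an unbiased estimate of g, so
    SGD convergence guarantees can be preserved while each coordinate is
    communicated as a sign plus a small integer level.
    """
    norm = np.linalg.norm(g)
    if norm == 0:
        return np.zeros_like(g)
    # Magnitudes scaled into [0, num_levels].
    scaled = np.abs(g) / norm * num_levels
    lower = np.floor(scaled)
    # Round up with probability equal to the fractional part (unbiased).
    levels = lower + (np.random.rand(*g.shape) < (scaled - lower))
    return np.sign(g) * levels * norm / num_levels

g = np.array([0.3, -0.2, 0.5, 0.0])
# Averaging many quantized copies recovers g, reflecting unbiasedness.
estimate = np.mean([quantize(g) for _ in range(20000)], axis=0)
```

Fewer levels mean fewer bits per coordinate but higher variance per step, which is one concrete instance of the communication–computation trade-off the thesis targets.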
Practical information
- General public
- Free
Contact
- EDIC - [email protected]