Decentralized Stochastic Optimization
Event details
Date: 26.08.2019
Hour: 14:00 – 16:00
Speaker: Anastasiia Koloskova
Location:
Category: Conferences - Seminars
EDIC candidacy exam
Exam president: Prof. Volkan Cevher
Thesis advisor: Prof. Martin Jaggi
Co-examiner: Prof. Ali Sayed
Abstract
Decentralized optimization is a promising direction for training machine learning models. It allows training to be distributed over a large number of computing devices (e.g. mobile phones) without moving users' data to central servers. Moreover, it can yield significant speedups for training in datacenters over all-reduce SGD, the current state-of-the-art parallel SGD implementation. In this write-up we discuss some recent advances in decentralized optimization and its current weaknesses. We first consider communication compression techniques for speeding up centralized training. The second paper we discuss shows that the communication topology does not influence the leading term of the convergence rate in stochastic decentralized optimization, making it competitive with centralized approaches. Finally, we consider another technique for making communication more efficient in decentralized training: time-varying directed network graphs with asynchronous communication.
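The communication-compression idea mentioned above can be illustrated with the unbiased stochastic quantization scheme from the QSGD paper cited below. This is a minimal sketch (function name and signature are our own, not from the paper): each gradient coordinate is stochastically rounded to one of s+1 levels of its magnitude relative to the vector norm, so the quantized gradient equals the true gradient in expectation.

```python
import numpy as np

def qsgd_quantize(v, s=4, rng=None):
    """QSGD-style unbiased stochastic quantization (Alistarh et al., 2017).

    Each coordinate |v_i| / ||v||_2 is rounded stochastically to one of
    the s+1 levels {0, 1/s, ..., 1}, so that E[Q(v)] = v.
    """
    rng = np.random.default_rng() if rng is None else rng
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return np.zeros_like(v)
    level = s * np.abs(v) / norm            # real-valued level in [0, s]
    lower = np.floor(level)
    p = level - lower                       # probability of rounding up
    xi = lower + (rng.random(v.shape) < p)  # stochastic rounding
    return norm * np.sign(v) * xi / s
```

Only the norm (one float) and the small integer levels with signs need to be transmitted, which is the source of the communication savings.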
Background papers
QSGD: Communication-efficient SGD via gradient quantization and encoding, by Alistarh, D., et al. NIPS 2017.
Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent, by Lian, X., et al. NIPS 2017.
Stochastic Gradient Push for Distributed Deep Learning, by Assran, M., et al. ICML 2019.
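The decentralized algorithm analyzed in the second paper (D-PSGD) alternates gossip averaging with neighbors and a local stochastic gradient step. A minimal sketch, assuming a doubly stochastic mixing matrix W (the helper function and the toy quadratic example are ours, not from the paper):

```python
import numpy as np

def dpsgd_step(X, W, grads, lr):
    """One D-PSGD update (Lian et al., 2017, notation simplified):
    each node averages neighbors' models via the mixing matrix W,
    then takes a local stochastic gradient step.

    X:     (n_nodes, dim) current local models
    W:     (n_nodes, n_nodes) doubly stochastic mixing matrix
    grads: (n_nodes, dim) local (stochastic) gradients
    """
    return W @ X - lr * grads

# Example: 4 nodes on a ring, node i minimizes f_i(x) = 0.5 * (x - b_i)^2
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])
b = np.array([[1.0], [2.0], [3.0], [4.0]])
X = np.zeros((4, 1))
for _ in range(200):
    X = dpsgd_step(X, W, X - b, lr=0.1)   # grad of f_i at x_i is x_i - b_i
# All nodes end up near the global optimum mean(b) = 2.5,
# up to a consensus error that scales with the step size.
```

With a constant step size the nodes agree only up to an O(lr) consensus error; the paper's point is that, with an appropriate step-size schedule, the leading term of the convergence rate matches centralized SGD regardless of the topology encoded in W.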
Practical information
- General public
- Free