Towards Communication-Efficient Distributed Machine Learning Techniques

Event details

Date 25.06.2018
Hour 09:00 - 11:00
Speaker Arsany Guirguis
Location
Category Conferences - Seminars
EDIC candidacy exam
Exam president: Prof. Patrick Thiran
Thesis advisor: Prof. Rachid Guerraoui
Co-examiner: Prof. Martin Jaggi

Abstract
Machine Learning (ML) has proven powerful in deriving useful information from the ever-increasing amount of data available on the Internet. To make the best of this massive amount of data, ML models are becoming larger and more complex. Yet, training such complex models on large datasets is beyond the capabilities of a single machine, so training is becoming distributed. Although distributing the learning task improves scalability, it introduces communication challenges. Existing work has addressed these challenges in some specific cases, but there is still room for advancing the state-of-the-art solutions.
In this proposal, I am going to present TensorFlow, a popular system for large-scale distributed machine learning, along with two ideas proposed to improve the performance of the communication layer. In my research, I am interested in communication challenges across different distributed ML environments.

Background papers
TensorFlow: A System for Large-Scale Machine Learning, by Abadi, M. et al.
Poseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters, by Zhang, H. et al.
Gaia: Geo-Distributed Machine Learning Approaching LAN Speeds, by Hsieh, K. et al.

Practical information

  • General public
  • Free

Tags

EDIC candidacy exam
