IC Colloquium: Tail-Latency of Microsecond-Scale Remote Procedure Calls

Event details

Date 21.01.2021
Hour 10:15-11:00
Location Online
Category Conferences - Seminars
By: Prof. Edouard Bugnion - EPFL
IC Faculty candidate

Abstract
Modern web-scale applications, such as search or social networking, stay responsive to hundreds of millions of users by combining the resources of thousands to millions of machines and storing the applications’ datasets in these machines’ memories. Within a datacenter, the applications running on these machines communicate through very short Remote Procedure Calls (RPCs) that often take only a few microseconds to service but must nevertheless meet tight service-level objectives expressed in terms of tail latency.

The EPFL Datacenter Systems Laboratory has studied the various factors that contribute to the tail latency of microsecond-scale RPCs within datacenters, with contributions in both operating systems and datacenter networking.

Results include a protected dataplane operating system that leverages virtualization hardware to process RPCs with low latency and high throughput, complete with an associated control plane for energy proportionality and workload consolidation (IX); a work-conserving, tail-tolerant scheduler for microsecond-scale tasks (ZygOS); and a centralized scheduling, load-balancing, and hedging strategy for cloud deployments (LAEDGE). We also developed R2P2, a transport protocol that makes RPCs a first-class citizen in the datacenter. R2P2 reduces overheads, eliminates head-of-line blocking, enables scalable in-network load balancing, and allows consensus protocols to be integrated within the transport layer for scalability (HovercRaft).

These results were published at OSDI, SOSP, NSDI, USENIX ATC, EuroSys, SoCC, and TOCS, with 2 Best Paper Awards. This is joint work with my graduate students Dr. Marios Kogias, Dr. George Prekas, Dr. Mia Primorac, and Adrien Ghosn, as well as colleagues Profs. Argyraki (EPFL), Belay (MIT), and Kozyrakis (Stanford).

Bio
Prof. Edouard Bugnion joined EPFL in 2012, with a teaching and research focus on datacenter systems. His areas of interest include operating systems, datacenter infrastructure (systems and networking), and computer architecture. Before joining EPFL, he spent 18 years in the US, where he studied at Stanford (MS ’96, PhD ’12) and co-founded two startups: VMware and Nuova Systems (acquired by Cisco). He served as VMware’s first CTO and was later the VP/CTO of Cisco’s Server, Access, and Virtualization Technology Group.

Prof. Bugnion is an ACM Fellow and a member of the Swiss Academy of Technical Sciences (SATW). He received the ACM Systems Award in 2009 in recognition of his work on VMware. His paper “Disco: Running Commodity Operating Systems on Scalable Multiprocessors” received the ACM SIGOPS Hall of Fame Award in 2008. He has won Best Paper Awards at SOSP, OSDI, and EuroSys. He is the current Scientific Director and Founder/PI of the Swiss Data Science Center, a Swiss nationwide initiative. He serves on the Swiss National COVID-19 Scientific Task Force as the lead expert for digital technologies.


Practical information

  • General public
  • Free
  • This event is internal

Contact

  • George Candea
