BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Memento EPFL//
BEGIN:VEVENT
SUMMARY:Data-Partitioning for Stream Processing Systems
DTSTART:20210617T100000
DTEND:20210617T120000
DTSTAMP:20260407T011308Z
UID:78d8facd95fcd638e55d4e7d6c87cb367997ac1a15990a806ecee2fa
CATEGORIES:Conferences - Seminars
DESCRIPTION:Eleni Zapridou \nEDIC candidacy exam\nexam president: Prof. K
 arl Aberer\nthesis advisor: Prof. Anastasia Ailamaki\nco-examiner: Prof. A
 nne-Marie Kermarrec\n\nAbstract\nStreaming applications have two\, often c
 onflicting\, requirements: latency and throughput. The tuple-at-a-time ar
 chitecture minimizes latency\, while the micro-batch model maximizes thro
 ughput. To optimize for both requirements\, paral
 lel stream processing engines have been developed. However\, the distribut
 ion in streaming workloads changes in real time and can be very skewed. Na
 ive data partitioning in this setting results in one of the parallel worke
 rs becoming overloaded and\, thus\, determining the system's executio
 n time. To characterize the performance of different partitioning algorith
 ms\, prior work has formalized the optimization objectives. We consider mo
 re complex tasks that cannot be expressed with existing models.\n
 \nBackground papers\n\n	Apache Flink: Stream Analytics at Scale https://ww
 w.researchgate.net/publication/305869785_Apache_Flink_Stream_Analytics_at_
 Scale\n	A Holistic View of Stream Partitioning Costs http://www.vldb.org/p
 vldb/vol10/p1286-katsipoulakis.pdf\n	Prompt: Dynamic Data-Partitioning for
  Distributed Micro-batch Stream Processing Systems https://www.cs.purdue.e
 du/homes/aref/papers/sigmod2020.pdf\n
LOCATION:
STATUS:CONFIRMED
END:VEVENT
END:VCALENDAR
