Tackling the Memory Bottlenecks of CPU-GPU Processing of Analytical Workloads
Event details
Date | 01.06.2018 |
Hour | 14:00 › 16:00 |
Speaker | Panagiotis Sioulas |
Location | |
Category | Conferences - Seminars |
EDIC candidacy exam
Exam president: Prof. James Larus
Thesis advisor: Prof. Anastasia Ailamaki
Co-examiner: Prof. Christoph Koch
Abstract
With the volume of data growing at an exponential rate, processing data at a high rate has become all the more critical. However, even the available processing capabilities of hardware such as multi-core CPUs and GPUs are underutilized in analytical query workloads due to the limited bandwidth of the interconnects to main memory. As a consequence, accessing data from the processing units becomes a bottleneck. To make matters worse, naive execution models introduce extra memory pressure.
In this paper, we discuss three works concerning memory bottlenecks in analytical processing. The first two examine GPU query processing: the former eliminates memory-related overheads by removing operator boundaries, whereas the latter dissects the execution time of queries on GPUs, data transfers included. The third paper discusses bypassing the multi-core CPU memory bottleneck through higher cache utilization. We conclude with our vision for high-throughput concurrent workload processing on GPUs.
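The "removing operator boundaries" idea above refers to operator (kernel) fusion: instead of each operator materializing its full output in memory for the next operator to read back, fused operators process each row in a single pass. A minimal illustrative sketch (not taken from any of the papers; function names are hypothetical) of why fusion reduces memory traffic:

```python
def select_then_sum_unfused(rows, threshold):
    """Two separate operators with a materialized intermediate:
    the filtered list is written out by the selection, then read
    back in full by the aggregation."""
    intermediate = [r for r in rows if r > threshold]  # selection operator
    return sum(intermediate)                           # aggregation operator

def select_then_sum_fused(rows, threshold):
    """Fused kernel: each row is filtered and aggregated in one
    pass, so no intermediate result ever touches memory."""
    total = 0
    for r in rows:
        if r > threshold:
            total += r
    return total
```

Both functions compute the same result, but the fused version never allocates the intermediate; on bandwidth-bound hardware such as GPUs, eliminating that extra round trip through memory is the source of the speedup.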
Background papers
The Yin and Yang of Processing Data Warehousing Queries on GPU Devices, by Yuan Y., et al. Proc. VLDB Endow. 6, 10 (Aug. 2013), 817–828.
Main-memory Scan Sharing for Multi-core CPUs, by Qiao L., et al. Proc. VLDB Endow. 1, 1 (Aug. 2008), 610–621.
Kernel Weaver: Automatically Fusing Database Primitives for Efficient GPU Computation, by Wu H., et al. In 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-45).
Practical information
- General public
- Free
Contact
- EDIC - [email protected]