Rethinking GPU Execution Model
Event details
Date | 15.11.2018
Hour | 11:15 – 12:15
Speaker | Yunho Oh
Location |
Category | Conferences - Seminars
Graphics processing units (GPUs) have become the architecture of choice for achieving high throughput in general-purpose computing. Thread-level parallelism (TLP) in GPUs is realized by concurrently executing a large number of threads. However, GPUs often cannot reach their theoretical peak performance. I found that the critical performance bottlenecks on GPUs are 1) limited memory system performance and 2) limited thread-scheduling resources and register file capacity. In this talk, I will describe the GPU execution model and these two performance bottlenecks in detail. Then, I will introduce two solutions that address these challenges. First, I will present a new GPU architecture, called Adaptive PREfetching and Scheduling (APRES), which overcomes limited memory system performance by improving cache efficiency on GPUs. Second, I will present FineReg, which enables scheduling threads beyond the limits of the scheduling resources and register file on GPUs.
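For context on the execution model mentioned in the abstract, the sketch below illustrates how TLP is expressed on a GPU: a single kernel launch creates on the order of a million threads, which the hardware scheduler interleaves to hide memory latency. This is a minimal illustrative example in CUDA, not material from the talk; the kernel name (vecAdd) and the problem size are arbitrary assumptions.

// Illustrative only: a vector-add kernel showing how GPU thread-level
// parallelism (TLP) is expressed by launching many threads at once.
// The kernel name and sizes are arbitrary examples, not from the talk.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void vecAdd(const float* a, const float* b, float* c, int n) {
    // Each of the many concurrently scheduled threads handles one element.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int N = 1 << 20;              // one million elements
    size_t bytes = N * sizeof(float);
    float *a, *b, *c;
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < N; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    // Launch ~1M threads (4096 blocks of 256 threads); the hardware scheduler
    // interleaves them to hide memory latency, which is the TLP the abstract refers to.
    int threads = 256;
    int blocks = (N + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(a, b, c, N);
    cudaDeviceSynchronize();

    printf("c[0] = %f\n", c[0]);        // expect 3.0
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}

With this many threads in flight, contention for the cache, the thread-scheduling resources, and the register file is exactly where the bottlenecks described in the abstract arise.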
Practical information
- Expert
- Free
Organizer
- Babak Falsafi
Contact
- Stephanie Baillargues