Rethinking GPU Execution Model

Event details

Date 15.11.2018
Hour 11:15 - 12:15
Speaker Yunho Oh
Location
Category Conferences - Seminars

Graphics processing units (GPUs) have become the architecture of choice for achieving high throughput in general-purpose computing. Thread-level parallelism (TLP) on GPUs is realized by concurrently executing a large number of threads. However, GPUs often cannot reach their theoretical peak performance. I found that the critical performance bottlenecks on GPUs are 1) limited memory system performance and 2) limited thread scheduling resources and register file capacity.

In this talk, I will present the GPU execution model and examine these two bottlenecks in detail. Then, I will introduce two solutions that address them. First, I will present a new GPU architecture, called Adaptive PREfetching and Scheduling (APRES), which overcomes the limited memory system performance by improving cache efficiency on GPUs. Second, I will present another work, called FineReg, which schedules threads beyond the limits of the scheduling resources and register file on GPUs.
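For readers unfamiliar with the execution model referenced above, the following minimal CUDA sketch illustrates how TLP is expressed: a kernel is launched with many more threads than there are cores, each thread handles one element, and the hardware hides memory latency by switching among ready warps. The kernel name and launch parameters are illustrative only and are not taken from the talk or from APRES/FineReg.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread computes one element. TLP comes from launching far more
// threads than cores; the hardware scheduler switches between ready
// warps to hide the latency of the global loads and stores below.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        y[i] = a * x[i] + y[i];  // one load/store pair per thread
    }
}

int main() {
    const int n = 1 << 20;  // 1M elements
    float *x, *y;
    cudaMallocManaged(&x, n * sizeof(float));
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    // 256 threads per block; thousands of blocks give the scheduler
    // enough warps to keep the SMs busy while memory requests are
    // outstanding -- the source of the throughput discussed in the talk.
    int threads = 256;
    int blocks  = (n + threads - 1) / threads;
    saxpy<<<blocks, threads>>>(n, 2.0f, x, y);
    cudaDeviceSynchronize();

    printf("y[0] = %f\n", y[0]);  // expect 4.0
    cudaFree(x);
    cudaFree(y);
    return 0;
}
```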

Practical information

  • Expert
  • Free

Organizer

  • Babak Falsafi

Contact

  • Stephanie Baillargues
