Customizing Hardware Parameters to Optimize for Specific DNN Workloads

Event details

Date 24.06.2021
Hour 14:00–16:00
Speaker Canbert Sönmez
Category Conferences - Seminars
EDIC candidacy exam
Exam president: Prof. Martin Jaggi
Thesis advisor: Prof. Babak Falsafi
Co-examiner: Prof. Paolo Ienne

Abstract
Deep Neural Networks (DNNs) have demonstrated superior accuracy over traditional methods in a wide range of problems, such as image recognition, speech recognition, translation, autonomous driving, and game playing. However, DNNs usually have many parameters, making them computationally expensive. As a result, mapping DNN computation to conventional computing platforms, such as CPUs and GPUs, while remaining within specific latency and power-consumption bounds becomes challenging. Consequently, we observe increasing research interest in DNN hardware acceleration. However, because each DNN model exhibits a different dataflow pattern, it is impossible to design a single accelerator that suits every possible DNN workload, which leads to customized hardware designs targeting specific workloads. In this proposal, we examine three different DNN accelerators, describe how they handle different dataflow patterns, and compare them. Our analysis allows us to identify how each of these accelerators can be improved. Based on this analysis, we present our research proposal, which aims to develop a method to automatically design hardware that processes a given workload of DNN models optimally. The designed hardware inherits features from these three accelerators and optimizes computation for the given DNN workload by customizing its interconnect type and computation-unit size.
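To make the proposed customization concrete, here is a minimal, hypothetical sketch in Python of the kind of design-space search the abstract describes: it enumerates candidate configurations over the two axes the proposal names (interconnect type and computation-unit size) and scores each against a workload of DNN layers with a toy analytical cost model. The Layer and Config classes, the estimate_cycles model, and the overhead factors are all illustrative assumptions, not the author's actual method.

```python
# Hypothetical design-space search: enumerate accelerator configurations
# (interconnect type x number of processing elements) and pick the one
# that minimizes total estimated cycles over a given DNN workload.
# Everything below is an illustrative sketch, not the proposal's method.

from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Layer:
    name: str
    macs: int          # multiply-accumulate operations in the layer


@dataclass(frozen=True)
class Config:
    interconnect: str  # e.g. "systolic" (TPU-like) or "tree" (MAERI-like)
    pe_count: int      # number of processing elements (compute-unit size)


def estimate_cycles(layer: Layer, cfg: Config) -> float:
    """Toy latency model: ideal parallel cycles scaled by an assumed
    interconnect overhead factor. A real model would account for the
    dataflow, tiling, utilization, and memory bandwidth."""
    overhead = {"systolic": 1.10, "tree": 1.25}[cfg.interconnect]
    return overhead * layer.macs / cfg.pe_count


def best_config(workload: list[Layer], candidates: list[Config]) -> Config:
    # Choose the configuration minimizing total cycles over the workload.
    return min(candidates,
               key=lambda cfg: sum(estimate_cycles(l, cfg) for l in workload))


if __name__ == "__main__":
    workload = [Layer("conv1", 118_013_952), Layer("fc1", 4_096_000)]
    candidates = [Config(ic, n)
                  for ic, n in product(["systolic", "tree"], [256, 1024, 4096])]
    print(best_config(workload, candidates))
```

In a real flow, the toy cost model would be replaced by analytical or simulation-based models of the candidate accelerators' dataflows, so that the search reflects the actual behavior of the designs being compared.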

Background papers
  1. N. P. Jouppi et al., "In-datacenter performance analysis of a tensor processing unit," 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA), 2017, pp. 1-12, doi: 10.1145/3079856.3080246.
  2. Hyoukjun Kwon, Ananda Samajdar, and Tushar Krishna, "MAERI: Enabling Flexible Dataflow Mapping over DNN Accelerators via Reconfigurable Interconnects," SIGPLAN Not. 53, 2 (February 2018), pp. 461-475, doi: 10.1145/3296957.3173176.
  3. H. T. Kung, B. McDanel, S. Q. Zhang, X. Dong and C. C. Chen, "Maestro: A Memory-on-Logic Architecture for Coordinated Parallel Use of Many Systolic Arrays," 2019 IEEE 30th International Conference on Application-specific Systems, Architectures and Processors (ASAP), 2019, pp. 42-50, doi: 10.1109/ASAP.2019.00-31.

Practical information

  • General public
  • Free

Organizer

  • EDIC

Tags

EDIC candidacy exam
