Hardware-friendly Structured N:M Sparsity for DNN training

Event details
Date | 07.07.2023 |
Hour | 10:00 › 12:00 |
Speaker | Ayan Chakraborty |
Category | Conferences - Seminars |
EDIC candidacy exam
Exam president: Prof. Martin Jaggi
Thesis advisor: Prof. Babak Falsafi
Co-examiner: Prof. Anne-Marie Kermarrec
Abstract
N:M sparsity, in which at most N of every M consecutive elements are non-zero, has emerged as a form of structured sparsity that achieves accuracy comparable to unstructured sparsity techniques at high sparsity levels. N:M sparsity is relatively efficient to implement in hardware and already enjoys micro-architectural support in recent NVIDIA GPUs, which makes it an attractive sparsity scheme for DNN training. Training a DNN involves three General Matrix Multiplications (GEMMs) per linear layer per iteration (the forward pass, the activation-gradient computation, and the weight-gradient computation), and these GEMMs account for the majority of the computation in training. It is therefore essential to apply N:M sparsity to each of these three GEMMs in order to achieve the maximum benefit.
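For intuition only (not part of the exam materials), the NumPy sketch below shows a magnitude-based 2:4 pruning helper and the three GEMMs of a single linear layer's training iteration. The helper name nm_prune, the tensor shapes, and the magnitude-based selection are illustrative assumptions, not the method of any of the background papers.

```python
import numpy as np

def nm_prune(w, n=2, m=4):
    """Illustrative N:M pruning: keep the n largest-magnitude values in every
    group of m consecutive elements along the last axis, zero out the rest."""
    groups = w.reshape(-1, m)                           # one group of m weights per row
    keep = np.argsort(np.abs(groups), axis=1)[:, -n:]   # indices of the n largest magnitudes
    mask = np.zeros_like(groups)
    np.put_along_axis(mask, keep, 1.0, axis=1)
    return (groups * mask).reshape(w.shape)

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 16))                 # activations: batch x in_features
W = nm_prune(rng.standard_normal((16, 4)))       # 2:4-sparse weights: in_features x out_features

# The three GEMMs of one linear layer per training iteration:
Y = X @ W                                        # 1) forward pass
dY = rng.standard_normal(Y.shape)                # stand-in for the upstream gradient
dX = dY @ W.T                                    # 2) activation (input) gradient
dW = X.T @ dY                                    # 3) weight gradient
```

Note that the backward GEMMs use the transposed weight matrix and the transposed activations, which is why applying N:M sparsity to all three GEMMs (rather than the forward pass alone) is non-trivial.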
Background papers
- Aojun Zhou, Yukun Ma, Junnan Zhu, Jianbo Liu, Zhijie Zhang, Kun Yuan, Wenxiu Sun, and Hongsheng Li. Learning N:M fine-grained structured sparse neural networks from scratch. In International Conference on Learning Representations (ICLR), 2021. Link: https://openreview.net/pdf?id=K9bw7vqp_s
- Itay Hubara, Brian Chmiel, Moshe Island, Ron Banner, Joseph Naor, and Daniel Soudry. Accelerated sparse neural training: A provable and efficient method to find N:M transposable masks. In Advances in Neural Information Processing Systems (NeurIPS), 2021. Link: https://openreview.net/pdf?id=vRWZsBLKqA
- Brian Chmiel, Itay Hubara, Ron Banner, and Daniel Soudry. Minimum variance unbiased N:M sparsity for the neural gradients. In International Conference on Learning Representations (ICLR), 2023. Link: https://openreview.net/pdf?id=vuD2xEtxZcj
Practical information
- General public
- Free
Contact
- edic@epfl.ch