Sparse Tensor Accelerator Modeling Tutorial

Sparse Tensor Accelerators: Abstraction and Modeling


Organizers


Tutorial Infrastructure Installation Instructions HERE

This tutorial involves hands-on exercises. To follow the interactive live session, please use the installation instructions to set up the necessary infrastructure and exercises beforehand. Ideally, work through the exercises before the tutorial session and come ready with questions.


ISCA 2021 Tutorial

Time: 12PM-3PM EDT, June 19th, 2021
Location: Online
Register HERE


Overview

Sparse tensor algorithms are critical to many emerging workloads (DNNs, data analytics, recommender systems, graph algorithms, etc.). As a result, many sparse tensor accelerators and systems have recently been proposed to improve the efficiency and performance of these algorithms. The designs often involve complex tradeoffs among compression formats, hardware support that exploits sparsity, and on-chip dataflows, so each one represents a single point in a large design space. As more distinct solutions are proposed, we as a community need a better abstraction to compare and contrast these accelerators both qualitatively and quantitatively, and a better tool to model and search the large design space for various sparse tensor algorithms.

The first half of the tutorial focuses on a new format-agnostic abstraction for tensors, called fibertrees. We will show how the fibertree abstraction generalizes many previously proposed tensor representations, each with its own tradeoffs. Then we will present a language that uses this abstraction to describe tensor algorithms and their dataflows, independent of the compression format of the underlying tensors. These descriptions therefore cover both sparse and dense linear algebra. The benefit of this approach is that system designers can explore variations in dataflow without concern for the detailed implementation of the compression format, and can explore the impact of different compression formats without changing the description of the dataflow. In this part of the tutorial, we will demonstrate how to describe the dataflows of various accelerators, and how these descriptions help system designers understand different high-level tradeoffs.
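To make the idea of a format-agnostic description more concrete, here is a minimal sketch in plain Python. It is not the tutorial's fibertree library or its API: the nested-dict representation and the helper names `to_fibertree`, `dot`, and `matvec` are assumptions made purely for illustration. Each fiber maps coordinates to payloads, a payload is either a value or another fiber, and empty payloads are simply absent.

```python
# Hypothetical illustration only; names and representation are assumptions,
# not the API of the tutorial's fibertree infrastructure.
# A fiber is a dict: coordinate -> payload. A payload is a scalar (leaf rank)
# or another fiber (next rank). Zero values and empty fibers are omitted.

def to_fibertree(dense):
    """Build a nested-dict fibertree from a nested list, dropping empty payloads."""
    if not isinstance(dense, list):
        return dense  # leaf value
    fiber = {}
    for coord, item in enumerate(dense):
        payload = to_fibertree(item)
        if payload != 0 and payload != {}:
            fiber[coord] = payload
    return fiber

def dot(fiber_a, fiber_b):
    """Dot product of two rank-1 fibers: multiply only where coordinates intersect."""
    total = 0
    for k in fiber_a.keys() & fiber_b.keys():
        total += fiber_a[k] * fiber_b[k]
    return total

def matvec(tree_a, fiber_b):
    """Z[m] = sum over k of A[m][k] * B[k], expressed purely in terms of fibers."""
    z = {}
    for m, a_k in tree_a.items():  # traverse only the non-empty rows of A
        value = dot(a_k, fiber_b)
        if value != 0:
            z[m] = value
    return z

a = to_fibertree([[0, 2, 0, 0],
                  [0, 0, 0, 0],
                  [1, 0, 0, 3]])
b = to_fibertree([4, 5, 0, 6])
z = matvec(a, b)
print(a)  # {0: {1: 2}, 2: {0: 1, 3: 3}}
print(z)  # {0: 10, 2: 22}
```

The loop nest in `matvec` says nothing about how the fibers are stored, which is the point: the same description would apply whether the underlying data were kept uncompressed or in some compressed format, as long as each fiber can be traversed and intersected by coordinate.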

In the second half of the tutorial, we will introduce Timeloop V2 (a.k.a. Sparseloop) [ISPASS'21], a tool that leverages the fibertree abstraction to analytically model the impact of different dataflows, data representations and associated hardware support for sparsity in sparse tensor accelerators, such as SCNN [ISCA’17]. Sparseloop builds on prior tensor accelerator modeling tools, Timeloop V1 [ISPASS’19] and Accelergy [ICCAD’19], and leverages the mapper in Timeloop to perform mapspace search and the hardware energy model in Accelergy to explore the accelerator design space. Timeloop V2 therefore serves as an efficient tool for architects to explore the large design space (compression format, hardware support, dataflows, and hardware budgets) for sparse tensor accelerators.
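As a rough intuition for what "analytical" modeling means here, the toy sketch below estimates how many multiplications in a dot product are effectual (both operands nonzero) from operand densities alone, without simulating actual data. This is not Sparseloop's model: the function names are hypothetical, and the formula assumes nonzeros are placed uniformly and independently in the two operands.

```python
# Toy estimate for illustration; not Sparseloop's actual model.
# Assumption: nonzeros in the two operands are uniformly and independently
# distributed, so a given position is nonzero in both with probability
# density_a * density_b.

def expected_effectual_macs(length, density_a, density_b):
    """Expected number of multiplications where both operands are nonzero."""
    return length * density_a * density_b

def expected_skipped_fraction(density_a, density_b):
    """Expected fraction of multiplications that hardware could skip or gate."""
    return 1.0 - density_a * density_b

macs = expected_effectual_macs(1000, 0.5, 0.2)
skipped = expected_skipped_fraction(0.5, 0.2)
print(macs, skipped)
```

Real designs differ in whether and how they can exploit the ineffectual operations, which is the kind of tradeoff among format, hardware support, and dataflow that the tool is meant to evaluate.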


Background Lectures

Videos

Available on both Dropbox and YouTube (the links below have the same content)
  • Dropbox Link
  • YouTube Link

Slides

  • part01_Fibertree
  • part02_Sparseloop


Live Session

    Schedule

    Time               Agenda
    12:00 - 12:30 PM   Introduction
    12:30 - 1:15 PM    Dot Product Exercises
    1:15 - 2:00 PM     Matrix Multiplication Exercises
    2:00 - 2:45 PM     Convolution Workload Exercises
    2:45 - 3:00 PM     Lab Time and Q&A

    Exercises

    Please follow the instructions in the repo to run the tutorial Docker container. The exercises are under `workspace/exercises/2021.isca`.

    Please fill in the quick survey below for us to better understand what you would like to see during the tutorial!


    Related Publications

    Y. N. Wu, P.-A. Tsai, A. Parashar, V. Sze, J. S. Emer, “Sparseloop: An Analytical, Energy-Focused Design Space Exploration Methodology for Sparse Tensor Accelerators,” IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), March 2021 [ paper ]
    T. Krishna, H. Kwon, A. Parashar, M. Pellauer, A. Samajdar, Data Orchestration in Deep Learning Accelerators, Morgan & Claypool, 2020. DOI: https://doi.org/10.2200/S01015ED1V01Y202005CAC052
    V. Sze, Y.-H. Chen, T.-J. Yang, J. S. Emer, Efficient Processing of Deep Neural Networks, Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, 2020. [ Order book here ] [ Flyer ]
    Y. N. Wu, V. Sze, J. S. Emer, “An Architecture-Level Energy and Area Estimator for Processing-In-Memory Accelerator Designs,” IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), April 2020 [ paper ]
    Y. N. Wu, J. S. Emer, V. Sze, “Accelergy: An Architecture-Level Energy Estimation Methodology for Accelerator Designs,” International Conference on Computer Aided Design (ICCAD), November 2019 [ paper ] [ slides ]
    A. Parashar, et al., “Timeloop: A Systematic Approach to DNN Accelerator Evaluation,” IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2019 [ paper ]
    A. Parashar, M. Rhu, A. Mukkara, A. Puglielli, R. Venkatesan, B. Khailany, J. Emer, S. W. Keckler, W. J. Dally, “SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks,” 44th Annual International Symposium on Computer Architecture (ISCA), ACM, 2017, pp. 27–40. DOI: https://doi.org/10.1145/3079856.3080254