Sparse Tensor Accelerator Modeling Tutorial

Sparse Tensor Accelerators: Abstraction and Modeling


Tutorial Infrastructure Installation Instructions HERE

This tutorial involves hands-on exercises. To follow the interactive live session more easily, please follow the instructions for installing the necessary infrastructure and exercises. Even better, work through the exercises before the tutorial session and come ready with questions.

ISCA 2021 Tutorial

Time: 12PM-3PM EDT, June 19th, 2021
Location: Online
Register HERE


Sparse tensor algorithms are critical to many emerging workloads (DNNs, data analytics, recommender systems, graph algorithms, etc.). As a result, many sparse tensor accelerators and systems have recently been proposed to improve the efficiency and performance of these algorithms. These designs involve complex tradeoffs among compression formats, hardware support for exploiting sparsity, and on-chip dataflows, so each design represents a specific point in a large design space. As more and more distinct solutions are proposed, we as a community need a better abstraction to compare and contrast these accelerators, both qualitatively and quantitatively, and a better tool to model and search this large design space for various sparse tensor algorithms.

The first half of the tutorial focuses on a new format-agnostic abstraction for tensors, called fibertrees. We will show how the fibertree abstraction generalizes many previously proposed tensor representations, each with its own tradeoffs. We will then present a language that uses this abstraction to describe tensor algorithms and their dataflows, independent of the compression format of the underlying tensors. These descriptions therefore generalize both sparse and dense linear algebra. The benefit of this approach is that it allows system designers to explore variations in dataflow without concern for the detailed implementation of the compression format, and to explore the impact of different compression formats without changing the description of the dataflow. In this section, we will demonstrate how to describe the dataflows of various accelerators, and how these descriptions allow system designers to understand different high-level tradeoffs.
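To make the abstraction concrete, here is a minimal sketch of the fibertree idea in Python. It is purely illustrative: the `Fiber` class and `dot` function below are our own simplified names, not the API of the tutorial's actual fibertree infrastructure. The key idea it demonstrates is that a fiber is an ordered list of (coordinate, payload) pairs in which absent coordinates are implicit zeros, so a sparse dot product becomes an intersection of coordinates.

```python
# Illustrative sketch of the fibertree abstraction (our own simplified
# names, not the tutorial's actual fibertree library API).
# A fiber is an ordered list of (coordinate, payload) pairs; coordinates
# that are absent are implicit zeros, regardless of the concrete
# compression format used to store them.

class Fiber:
    def __init__(self, pairs):
        # pairs: list of (coordinate, payload), sorted by coordinate
        self.pairs = pairs

    def __iter__(self):
        return iter(self.pairs)

def dot(a, b):
    """Sparse dot product of two rank-1 fibers via coordinate intersection."""
    bi = dict(b)  # coordinate -> value for the second operand
    return sum(va * bi[ca] for ca, va in a if ca in bi)

# In dense terms: a = [0, 2, 0, 3], b = [4, 0, 0, 5]
a = Fiber([(1, 2), (3, 3)])
b = Fiber([(0, 4), (3, 5)])
print(dot(a, b))  # only coordinate 3 is in both fibers: 3*5 = 15
```

Because the dataflow is written against coordinates and payloads rather than against a storage layout, the same `dot` loop works whether the fibers are ultimately backed by dense arrays, bitmasks, or coordinate lists.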

In the second half of the tutorial, we will introduce Timeloop V2 (a.k.a. Sparseloop) [ISPASS'21], a tool that leverages the fibertree abstraction to analytically model the impact of different dataflows, data representations, and associated hardware support for sparsity in sparse tensor accelerators, such as SCNN [ISCA'17]. Sparseloop builds on prior tensor accelerator modeling tools, Timeloop V1 [ISPASS'19] and Accelergy [ICCAD'19], leveraging Timeloop's mapper to perform mapspace search and Accelergy's hardware energy model to explore the accelerator design space. Timeloop V2 therefore serves as an efficient tool for architects to explore the large design space (compression formats, hardware support, dataflows, and hardware budgets) for sparse tensor accelerators.
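The following back-of-envelope sketch illustrates the kind of density-based reasoning such an analytical model performs. It is our own deliberately simplified example, not Sparseloop's actual methodology: the `estimate` function, its parameters, and its energy constants are all hypothetical, it ignores coordinate/metadata overhead, and it assumes uncorrelated sparsity patterns.

```python
# Illustrative back-of-envelope model (a hypothetical simplification,
# NOT Sparseloop's actual methodology): estimate memory traffic and
# energy for an n-element dot product when operands are stored
# compressed and hardware gates multiplies on zero operands.

def estimate(n, density_a, density_b, e_access=1.0, e_mac=0.5):
    # compressed storage reads only the nonzeros
    # (coordinate/metadata overhead is ignored here)
    reads = n * density_a + n * density_b
    # with intersection-style gating, MACs fire only where both
    # operands are nonzero (assuming uncorrelated sparsity patterns)
    macs = n * density_a * density_b
    return {"reads": reads, "macs": macs,
            "energy": reads * e_access + macs * e_mac}

dense = estimate(1024, 1.0, 1.0)    # fully dense baseline
sparse = estimate(1024, 0.25, 0.25)  # 25% density in each operand
print(dense["energy"], sparse["energy"])  # 2560.0 544.0
```

Sparseloop performs this style of analysis systematically, accounting for the compression format, the hardware sparsity support, and the dataflow at every level of the memory hierarchy, and couples it with mapspace search and Accelergy's component-level energy estimates.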

Background Lectures


Available both on Dropbox and on YouTube (the links below have the same content):
  • Dropbox Link
  • YouTube Link

Slides
  • part01_Fibertree
  • part02_Sparseloop

Live Session

Time            Agenda
12:00 - 12:30PM Introduction
12:30 - 1:15PM  Dot Product Exercises
1:15 - 2:00PM   Matrix Multiplication Exercises
2:00 - 2:45PM   Convolution Workload Exercises
2:45 - 3:00PM   Lab Time and Q&A


Please follow the instructions in the repo to run the tutorial Docker container. The exercises are under `workspace/exercises/2021.isca`.

Please fill in the quick survey below so that we can better understand what you would like to see during the tutorial!

Related Publications

Y. N. Wu, P.-A. Tsai, A. Parashar, V. Sze, J. S. Emer, "Sparseloop: An Analytical, Energy-Focused Design Space Exploration Methodology for Sparse Tensor Accelerators," IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), March 2021. [ paper ]
T. Krishna, H. Kwon, A. Parashar, M. Pellauer, A. Samajdar, Data Orchestration in Deep Learning Accelerators, Morgan & Claypool, 2020.
V. Sze, Y.-H. Chen, T.-J. Yang, J. S. Emer, Efficient Processing of Deep Neural Networks, Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, 2020. [ Order book here ] [ Flyer ]
Y. N. Wu, V. Sze, J. S. Emer, "An Architecture-Level Energy and Area Estimator for Processing-In-Memory Accelerator Designs," IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), April 2020. [ paper ]
Y. N. Wu, J. S. Emer, V. Sze, "Accelergy: An Architecture-Level Energy Estimation Methodology for Accelerator Designs," International Conference on Computer Aided Design (ICCAD), November 2019. [ paper ] [ slides ]
A. Parashar, et al., "Timeloop: A Systematic Approach to DNN Accelerator Evaluation," IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), March 2019. [ paper ]
A. Parashar, M. Rhu, A. Mukkara, A. Puglielli, R. Venkatesan, B. Khailany, J. Emer, S. W. Keckler, W. J. Dally, "SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks," Proceedings of the 44th Annual International Symposium on Computer Architecture (ISCA '17), ACM, New York, NY, USA, 2017, pp. 27–40.