Timeloop/Accelergy Tutorial @ ISCA2020

Timeloop/Accelergy Tutorial:
Tools for Evaluating Deep Neural Network Accelerator Designs


Date: May 29th, 2020, 10AM-12PM EST
Location: Online
Register for the tutorial here

Past Version (MICRO 2019)

Tutorial Infrastructure Installation Instructions HERE

This tutorial involves hands-on exercises and labs, as well as some baseline designs for those who would like a deeper dive into the Timeloop/Accelergy system. To follow our interactive live session more easily, please use the instructions above to install the necessary infrastructure and exercises. Better yet, work through the exercises before the tutorial session and come ready with questions.


Deep neural networks have emerged as the key approach for solving a wide range of complex problems. To provide high performance and energy efficiency for this class of computation- and memory-intensive applications, many DNN accelerators have been proposed in recent years. To systematically evaluate arbitrary DNN accelerator designs, we need an infrastructure that can:

Flexibly describe a wide range of architectures. Unlike traditional processors, which share a common architecture and differ mainly in microarchitecture, DNN accelerators vary significantly in architecture from one design to another. The traditional approach of describing a design with a fixed set of architectural components is therefore infeasible for DNN accelerators. Since describing the architecture is the first step of any architecture evaluation, the infrastructure must be flexible enough to describe a wide range of DNN accelerator designs.

Find optimal mappings for a wide range of workloads onto the architecture. Unlike traditional architectures, whose ISA allows a workload to be represented as a single compiled program, each DNN accelerator exposes its own set of configurable hardware settings and requires the designer to decide how to schedule operations and move data for each workload, i.e., to find a mapping for each workload. Since different mappings result in widely varying performance and energy efficiency, and different workloads have different optimal mappings, finding optimal mappings is essential for evaluating a DNN architecture.
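To make the mapping idea concrete, here is a hypothetical, highly simplified sketch (not Timeloop's actual mapping representation, which is specified in configuration files): the same 1-D convolution computed under two loop orders produces identical results, but once operand reuse is modeled, the number of fetches from a backing store differs, which is why the choice of mapping matters for performance and energy.

```python
# Illustrative only: operand names, loop orders, and the one-entry reuse
# model are assumptions for this sketch, not Timeloop's real machinery.
W = [1, 2, 3]                    # filter weights
I = [1, 1, 1, 1, 1, 1]           # input activations
P = len(I) - len(W) + 1          # number of output points

def run(order):
    O = [0] * P
    fetches = {"W": 0, "I": 0}
    held = {"W": None, "I": None}          # one-entry register per operand
    if order == "output_stationary":       # output loop outer, weight loop inner
        steps = [(p, w) for p in range(P) for w in range(len(W))]
    else:                                  # weight_stationary: weight loop outer
        steps = [(p, w) for w in range(len(W)) for p in range(P)]
    for p, w in steps:
        for name, idx in (("W", w), ("I", p + w)):
            if held[name] != idx:          # fetch only when the held value changes
                fetches[name] += 1
                held[name] = idx
        O[p] += W[w] * I[p + w]
    return O, fetches

O1, f1 = run("output_stationary")   # weights refetched at every step
O2, f2 = run("weight_stationary")   # each weight fetched only once
```

Both orders compute the same outputs, but the weight-stationary order fetches each weight once instead of once per step; a mapper searches over such choices (and tiling, parallelism, etc.) to minimize cost.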

Accurately predict energy for a range of accelerator designs. Since accelerators are designed for different applications (e.g., sparse DNNs vs. dense DNNs), different accelerator designs consist of different hardware components. Furthermore, different designs implement different hardware optimizations that result in drastically different energy consumption for those components. The infrastructure must therefore accurately model the energy consumption of all components across the accelerator design space in order to evaluate a DNN architecture.
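The component-level accounting behind such energy estimates can be sketched as follows. This is a toy model with invented component names and made-up per-action energies, not real technology data: total energy is the sum, over all components, of action counts multiplied by per-action energy.

```python
# Illustrative only: the components and picojoule values below are
# assumptions for this sketch, not measured technology numbers.
energy_per_action_pJ = {        # assumed per-action energies, in picojoules
    "DRAM_read": 200.0,
    "SRAM_read": 5.0,
    "MAC": 1.0,
}

action_counts = {               # assumed counts from evaluating one mapping
    "DRAM_read": 1_000,
    "SRAM_read": 50_000,
    "MAC": 100_000,
}

total_pJ = sum(action_counts[a] * energy_per_action_pJ[a]
               for a in action_counts)
```

Even in this toy example, the expensive DRAM accesses dominate despite being the least frequent action, which is why accurate per-component, technology-dependent energy numbers matter.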

Handle a wide range of technologies. Many new technologies have emerged recently to improve the performance and energy efficiency of accelerator designs, such as CMOS scaling down to 7nm, RRAM-based in-memory computing, and optical computing. Accelerator designs built on different technologies achieve different performance and energy efficiency even when they share a similar architecture and run the same workload under the same mapping. To enable fair evaluations of accelerator designs, the infrastructure must be flexible enough to accurately reflect technology-dependent costs.

In this tutorial, we will present two integrated tools that enable rapid evaluation of DNN accelerators:

  • Mapping exploration with Timeloop [paper]: Timeloop uses a concise and unified representation of the key architecture and implementation attributes of DNN accelerators to describe a broad space of hardware designs. With the aid of accurate energy estimators, Timeloop characterizes the processing speed and energy efficiency of any given workload through a mapper that finds the best way to schedule operations and stage data on the specified architecture.
  • Energy estimation with Accelergy [paper] [website]: Accelergy is the energy estimator that provides the flexible energy modeling behind Timeloop's energy characterization. Accelergy lets users specify arbitrary accelerator designs composed of design-specific, high-level compound components and low-level primitive components; the primitive components can be characterized by third-party energy estimation plug-ins to reflect the technology-dependent characteristics of the design.
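The compound-component idea can be sketched roughly as follows. The component names, actions, and energy numbers here are invented for illustration; they are not Accelergy's real primitives, data, or specification format (Accelergy uses declarative specification files): a user-defined high-level component is described in terms of low-level primitives, and the energy of a compound action is the sum of the primitive actions it triggers. Technology plug-ins would supply the per-action primitive energies.

```python
# Illustrative only: "smartbuffer", its sub-components, and all energies
# below are assumptions for this sketch.
primitive_energy_pJ = {            # assumed per-action energies from a plug-in
    ("SRAM", "read"): 5.0,
    ("SRAM", "write"): 6.0,
    ("address_generator", "step"): 0.4,
}

# A made-up compound component: each compound action triggers one
# primitive action on each sub-component.
smartbuffer = {
    "read":  [("SRAM", "read"), ("address_generator", "step")],
    "write": [("SRAM", "write"), ("address_generator", "step")],
}

def compound_energy(component, action):
    """Energy of one compound action: sum of its primitive actions."""
    return sum(primitive_energy_pJ[a] for a in component[action])
```

Swapping in a different plug-in's primitive energies (e.g., for a different process node) changes the compound energies without touching the design description, which is the separation of concerns the tool flow relies on.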

Video Recording

Feel free to post any questions on the conference website here. We will answer the questions during the live session of the tutorial.

Tutorial Schedule

Time Agenda
10:00 - 10:30AM Introduction
10:30 - 11:00AM Timeloop Exercises
11:00 - 11:30AM Accelergy Exercises
11:30AM - 12:00PM Q&A Session

Additional Resources

V. Sze, Y.-H. Chen, T.-J. Yang, J. S. Emer, Efficient Processing of Deep Neural Networks, Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, 2020. [ Pre-order book here ] [ Flyer ]