[Seminar] Understanding Reuse, Performance, and Hardware Cost of DNN Accelerator Dataflows
Host: Prof. Jihong Kim
Deep Neural Network (DNN) accelerators, specialized hardware for DNN inference, have emerged as a solution for running heavy DNN inference tasks with high performance and energy efficiency on platforms ranging from edge devices to data centers. The data partitioning and scheduling strategies a DNN accelerator uses to leverage reuse are known as its dataflow, which directly impacts the accelerator's performance and energy efficiency. An accelerator's microarchitecture dictates which dataflow(s) can be employed to execute the layers of a DNN. The choice of dataflow for a layer can have a large impact on utilization and energy efficiency, yet the costs and benefits of dataflow choices are poorly understood, and architects lack tools and methodologies to explore this complex co-optimization design space.

In this talk, I will first introduce a set of data-centric directives that concisely specify the DNN dataflow space in a tool-friendly format. I will then show how these directives can be analyzed to infer various forms of reuse and to exploit them using hardware capabilities. We codify this analysis into an analytical cost model, MAESTRO (Modeling Accelerator Efficiency via Spatio-Temporal Reuse and Occupancy), which estimates the cost-benefit tradeoffs of a dataflow, including execution time and energy efficiency, for a given DNN model and hardware configuration. I will demonstrate a use case of MAESTRO that drives a hardware design space exploration experiment, searching across 480M designs to identify 2.5M valid designs at an average rate of 0.17M designs per second, including Pareto-optimal throughput- and energy-optimized design points. Finally, I will share how MAESTRO can be applied to various lines of DNN accelerator research.
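To make the idea of dataflow-dependent reuse concrete, here is a minimal sketch (not MAESTRO itself; the function name, access-count formulas, and single-buffer cost model are illustrative assumptions) that compares how many times weights must be fetched under a weight-stationary versus an output-stationary schedule for a toy 1-D convolution:

```python
# Toy sketch: estimate data reuse of two simple dataflows for a
# 1-D convolution with K weights and N outputs.
# The naive single-buffer access-count model below is an assumption
# for illustration, not the MAESTRO cost model.

def reuse_counts(N, K, dataflow):
    """Count off-chip fetches and derive weight reuse for one dataflow.

    weight_stationary: each weight is fetched once and reused across
    all N outputs; inputs are streamed in for every weight.
    output_stationary: each output's partial sum stays local, but
    weights are re-fetched for every output.
    """
    if dataflow == "weight_stationary":
        weight_fetches = K       # each weight loaded exactly once
        input_fetches = K * N    # inputs re-streamed per weight
    elif dataflow == "output_stationary":
        weight_fetches = K * N   # weights re-read for every output
        input_fetches = K * N
    else:
        raise ValueError(f"unknown dataflow: {dataflow}")
    macs = K * N                 # total multiply-accumulates
    return {
        "weight_fetches": weight_fetches,
        "input_fetches": input_fetches,
        "weight_reuse": macs / weight_fetches,  # MACs per weight fetch
    }

ws = reuse_counts(N=64, K=9, dataflow="weight_stationary")
os_ = reuse_counts(N=64, K=9, dataflow="output_stationary")
print(ws["weight_reuse"], os_["weight_reuse"])  # 64.0 vs 1.0
```

Even in this toy model, the same layer shape yields a 64x difference in weight reuse purely from the schedule, which is the kind of cost-benefit tradeoff an analytical model like MAESTRO quantifies systematically across much richer dataflow and hardware spaces.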
Hyoukjun Kwon is a Ph.D. candidate in computer science at the Georgia Institute of Technology, advised by Dr. Tushar Krishna. His research interests are mainly in computer architecture, focusing on networks-on-chip and spatial accelerators. In particular, he is actively working on architecture-compiler-model co-design of deep neural network (DNN) accelerators. He worked as a research intern in the Architecture Research Group (ARG) at NVIDIA during the summers of 2017 and 2018, and on the On-device AI team at Facebook during the summer of 2019. Before joining the Georgia Institute of Technology, he received bachelor's degrees in environmental materials science and in computer science and engineering from Seoul National University in 2015.