  • SSCS Members: Free
  • IEEE Members: $10.00
  • Non-members: $20.00
  • Pages/Slides: 122
13 Feb 2021

Abstract: Deep neural networks are used across a wide range of applications. Hardware customized for this domain offers significant performance and power advantages over general-purpose processors. However, achieving high TOPS/W and/or TOPS/mm² while also meeting requirements for scalability and programmability is challenging. This tutorial presents design approaches that strike the right balance between efficiency, scalability, and flexibility across different neural networks and in the face of new models. It surveys (i) circuit and architecture techniques for designing efficient compute units, memory hierarchies, and interconnect topologies, (ii) compiler approaches for effectively tiling computations, and (iii) neural network optimizations for efficient execution on the target hardware.
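
To make item (ii) concrete, below is a minimal, illustrative sketch of loop tiling for a matrix multiply, the kind of computation that dominates neural network inference. It is not taken from the tutorial; the function name `tiled_matmul` and the tile size of 32 are hypothetical choices used only to show how tiling bounds the working set so operands can be reused from a small on-chip buffer rather than refetched from off-chip memory.

```python
import numpy as np

def tiled_matmul(A, B, tile=32):
    """Multiply A (M x K) by B (K x N) one tile at a time.

    Each (i0, j0, k0) iteration touches only tile x tile blocks of
    A, B, and C, so the working set is small enough to stay in an
    on-chip buffer and each off-chip fetch is reused many times.
    The tile size here is a hypothetical value for illustration.
    """
    M, K = A.shape
    K2, N = B.shape
    assert K == K2, "inner dimensions must match"
    C = np.zeros((M, N), dtype=A.dtype)
    for i0 in range(0, M, tile):          # output row tiles
        for j0 in range(0, N, tile):      # output column tiles
            for k0 in range(0, K, tile):  # reduction tiles
                C[i0:i0+tile, j0:j0+tile] += (
                    A[i0:i0+tile, k0:k0+tile] @ B[k0:k0+tile, j0:j0+tile]
                )
    return C

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((128, 96)).astype(np.float32)
    B = rng.standard_normal((96, 64)).astype(np.float32)
    # Tiled result matches the untiled reference product.
    assert np.allclose(tiled_matmul(A, B), A @ B, atol=1e-4)
```

In a real accelerator compiler, the tile sizes would be chosen to match the capacities of the on-chip memory hierarchy and the shape of the compute array, which is the balancing act between efficiency and flexibility that the tutorial discusses.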