IEEE Members: $8.00
Non-members: $15.00
Length: 1:24:52
30 Nov 2023
Abstract: Machine learning (ML) with deep neural networks (DNNs) is a driving technology for innovation broadly across signal, image, and speech processing. For processors, application accelerators, and server computers executing any of a variety of modern data-intensive computing tasks, including machine learning inference, moving data from large memories into processing elements can be a limiting factor for energy efficiency and/or throughput (performance). The high cost of off-chip memory accesses has led to the extensive use of large, dense on-chip memories. These memories can store part (or all) of the data required by the application, such as the weights used to compute forward inference based on an input signal. The potentially large leakage energy and constrained density of on-chip volatile memory (typically static random-access memory, SRAM) have underscored the need for a new generation of embedded memory technologies, including emerging logic-compatible embedded nonvolatile memory technologies (eNVMs) such as resistive random-access memory (RRAM) or phase-change memory (PCM), and on-die, back-end-of-line (BEOL) or near-die dynamic random-access memory (eDRAM/DRAM). The cost of accessing data stored in these large on-chip arrays has also motivated the re-emergence of the analog compute-in-memory (CIM) paradigm, where the stored states in multiple memory cells are selectively added together inside the memory array so that the read-out value represents a multiply-accumulate (MAC) operation result. This potentially allows improved access bandwidth and efficiency and/or reduced MAC computation area and energy. However, analog CIM confronts many challenges: systematic and random variation affecting the memory cell current; other non-idealities in the CIM operation, including IR drop, that further hinder accuracy; and the need for expensive readout circuitry.
This talk will discuss current progress in overcoming these challenges, toward improved current-summing CIM with foundry RRAM. It will also cover the advantages of RRAM beyond CIM, including implementations of near-memory computing with RRAM, with a focus on the reduced leakage and improved density offered by this technology.
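
As a rough illustration of the current-summing idea described in the abstract, the sketch below models an idealized array in which each column current is the sum of row voltage times cell conductance, so the read-out value is a MAC result. It is a minimal sketch under stated assumptions, not the speaker's method: the function names `column_currents` and `perturb`, the conductance and voltage values, and the multiplicative Gaussian variation model are all hypothetical, and effects such as IR drop and readout circuitry are not modeled.

```python
import random

def column_currents(voltages, conductances):
    """Ideal current-summing readout: I_j = sum_i V_i * G_ij."""
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]

def perturb(conductances, sigma, seed=0):
    """Apply a hypothetical multiplicative Gaussian variation to each cell."""
    rng = random.Random(seed)
    return [
        [g * (1.0 + rng.gauss(0.0, sigma)) for g in row]
        for row in conductances
    ]

# Illustrative values only (not from the talk): 3 rows x 2 columns, in siemens.
G = [[1e-6, 2e-6],
     [3e-6, 0.0],
     [2e-6, 1e-6]]
V = [0.2, 0.0, 0.2]  # rows 0 and 2 selected at 0.2 V; row 1 is off

ideal = column_currents(V, G)                       # MAC result per column
noisy = column_currents(V, perturb(G, sigma=0.05))  # with assumed 5% variation
print(ideal, noisy)
```

Comparing `ideal` with `noisy` gives a feel for how cell-to-cell variation shifts the summed current, which is one of the accuracy challenges the abstract lists.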