SSCS
IEEE Members: $8.00
Non-members: $15.00
Length: 01:15:00
Minor errata on slides 15, 24, and 25:
Slide 15: the acronym AER stands for address-event representation.
Slides 24 and 25: HWR-IAF should be replaced with ADM-Async.Logic for this design.
Abstract: Edge audio devices have attracted considerable recent interest, especially for tasks such as keyword spotting. These tasks require small-footprint deep networks so that all blocks can be embedded in an ASIC. The conventional approach is to sample the microphone input at or above the Nyquist rate through an analog-to-digital converter (ADC) and then process the samples in a digital signal processing block. The stringent power constraints of ‘always-on’ operation in edge audio devices pose several design challenges, forcing ASIC designers to look for ways to reduce the standby power overhead. ‘Event-driven’ bio-inspired audio sensors, specifically spiking silicon cochleas, bypass the combined ADC and digital filtering stage. Instead, they use analog filters to extract continuous-time audio features and convert them into binary asynchronous event streams, but only in the presence of sounds that pass the filtering. Furthermore, their rectification stage facilitates data conversion using only the useful amplitude information, greatly reducing the necessary ADC sample rates. The events retain the asynchronous audio timing, which is available for tasks such as source localization. Taken together, their low-power, low-latency responses make them ideal for spatialized edge audio devices where power and latency are critical. This talk presents the development of event-driven spiking cochleas and the deep neural network algorithms used for early edge audio tasks, including voice activity detection and keyword spotting. I will show examples of audio devices that combine this event-driven audio front end with low-compute neural networks to implement continuous small-vocabulary speech recognition and keyword spotting on low-power (nW–µW) ASICs.
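As a rough illustration of the event-generation idea described in the abstract (and of the ADM-Async.Logic correction noted in the errata), here is a minimal Python sketch of a single channel: a digital band-pass filter stands in for the analog filter, half-wave rectification keeps the useful amplitude information, and an asynchronous delta modulator emits ON/OFF events only when the feature changes by a fixed step. This is a discrete-time simulation with illustrative parameters (f_lo, f_hi, delta) chosen for the example, not the circuit presented in the talk.

import numpy as np
from scipy.signal import butter, lfilter

def cochlea_channel_events(x, fs, f_lo, f_hi, delta):
    # One channel of a spiking-cochlea-style front end:
    # band-pass filter, half-wave rectify, then convert the
    # feature to events via asynchronous delta modulation.
    b, a = butter(2, [f_lo / (fs / 2), f_hi / (fs / 2)], btype="band")
    feature = np.maximum(lfilter(b, a, x), 0.0)  # rectified feature
    events, ref = [], feature[0]
    for n, v in enumerate(feature):
        while v - ref >= delta:        # feature rose by one step: ON event
            events.append((n / fs, +1))
            ref += delta
        while ref - v >= delta:        # feature fell by one step: OFF event
            events.append((n / fs, -1))
            ref -= delta
    return events                      # (timestamp in seconds, polarity) pairs

# Example: a 1 kHz tone burst drives a channel tuned around 1 kHz;
# no events are produced before the tone starts.
fs = 16_000
t = np.arange(0, 0.05, 1 / fs)
tone = np.sin(2 * np.pi * 1000 * t) * (t > 0.01)
print(cochlea_channel_events(tone, fs, 800.0, 1200.0, delta=0.2)[:5])

Note that events appear only while the input excites the channel's passband, which is the property the abstract highlights: no sound that passes the filtering, no data to convert or process downstream.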
Biography: Shih-Chii Liu is a professor in the Faculty of Science at the University of Zurich. She co-directs the Sensors group (https://sensors.ini.uzh.ch) at the Institute of Neuroinformatics.