HCiM: ADC-Less Hybrid Analog-Digital Compute in Memory Accelerator for Deep Learning Workloads
Shubham Negi, Utkarsh Saxena, Deepika Sharma and Kaushik Roy
22nd January 2025

Introduction and Background
Challenges
Proposed Hardware Algorithm Co-design Approach
Two Stage Quantization
Hybrid Analog-Digital CiM Accelerator
Results
Conclusion

Introduction
Bring compute to the edge
So, what is stopping us?
Exploding computational complexity of Deep Learning models
[Figures: Normalized Energy/MAC; DNN Dataflows; Breakdown of operation type across ML workloads]
Sources: Sze V et al, Proceedings of IEEE, 2017; IBM; OurWorldInData.org/artificial-intelligence; Magnet, ICCAD 2019
Potential Solution: Compute in Memory

Background
Digital Compute In Memory (Digital CiM)
Bitwise multiplication operation followed by an accumulation in the peripherals
Vector-matrix multiplication followed by reduction in an adder tree
Analog Compute In Memory (Analog CiM)
Reduction performed in the analog domain
Multiple wordlines are turned on to perform a bitwise MVM operation in the analog domain
[Figures: Digital Compute In Memory and Analog Compute In Memory; labels: WL, D, Bitwise NAND/NOR Output, Bitwise NOR operation, Shift/add, MUX, ADC, Bitwise MVM Output, Adder Tree]
Source: Chih et al., ISSCC 2021

PUMA: CiM Accelerator
Source: Ankit A et al, ASPLOS, 2019
$I_j = \sum_i V_i G_{ij}$
[Figure: A (4 bit) x B (4 bit); labels: Bit Stream (A0 to A3), Bit Slices (B1:0, B3:2), MVMU, output currents I1 to I4]
Divide the B matrix into bi