Processing Raw Images Efficiently with the MAX78000 AI Neural Network Accelerator
Mehmet Gorkem Ulkar
Principal Engineer, Machine Learning
Analog Devices
© 2023 Analog Devices

Agenda
1. Challenges of AI at the edge
2. MAX78000 overview
3. MAX78000 sample applications
4. Energy requirements for data manipulation
5. Proposal: CNN based de-bayerization
6. Results
7. Q&A

Mehmet Gorkem Ulkar, PhD
Dallas, TX
Principal ML Engineer

Keep Your Data Close: The Physics of Data
Sources:
- Rick Zarr, TI, 2008, The True Cost of an Internet “Click” - estimate of transfer cost for 30KB page from server http:/
- J Kunkel et al, University of Hamburg, 2010, Collecting Energy Consumption of Scientific Data
- Horowitz, ISSCC 2014, 1300-2600 pJ per 64b access
- Chris Rowen, Cadence Design Systems, January 2016, Get Real! Neural Network Technology for Embedded Systems
[Figure: energy in J per 64b versus Distance (m), logarithmic axes, labelled 623 mi / 1000 km. Credit: Cadence]

Software Inference: Slow and Power Hungry
- In inference, computational effort is in forward propagation
- On classic hardware, almost all of it is spent in a triple nested matrix multiplication loop: O(n^3) to O(n^2.8)*
- Very energy intensive even with fast matrix multiply using integer math on DSP or GPU
- Large number of memory accesses
* Strassen's algorithm

CNN Accelerator: MAX78000/MAX78002
- The conv operation is parallelizable in the channel dimension
- 64 processors in total; more channels are processed in a multi-pass fashion
- Proper architecture that minimizes data movement provides energy efficiency
- Each input channel is processed in parallel using different processors to minimize data movement
- Each processor uses dedicated memory

MAX78000 AI Micro-System-on-Chip

Model, Training, Dep