OpenGeMM: A High-Utilization GeMM Accelerator Generator with Lightweight RISC-V Control and Tight Memory Coupling
ASP-DAC 2025
Xiaoling Yi¹, Ryan Antonio¹, Joren Dumoulin¹, Jiacong Sun¹, Josse Van Delm¹, Guilherme Paim¹,², Marian Verhelst¹
¹MICAS-ESAT, KU Leuven, Belgium; ²INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Portugal
23rd January 2025

Outline
Edge AI Computing Background and Motivation
OpenGeMM System Architecture
    Overview
    GeMM Accelerator Generator
    Mechanisms for High Utilization
    Reusability and Flexibility Summary
Experimental Results and SotA Comparison
Conclusion and Future Work

Edge AI Computing - Necessity
DNN models become pervasive while evolving rapidly
[Figure: application examples (Image Classification, Video Generation, Language Assistance, Intelligent Robotics) and a plot of the model size of language models]
Edge DNN deployment challenges
1) High performance and energy efficiency
    Real-time application
    Low battery capacity
2) Flexibility
    Reusable across DNN models
3) Underutilization
    Low effective computation

Edge AI Computing SotA Works
Efficiency vs. Flexibility/Reusability
[Figure: design space with axes Effi. (higher efficiency) and Flex. (higher flexibility)]
DSAs: NVDLA [1], DepFiN [3]
    Tailored to specific workloads
    Limited reusability and programmability
Programmable platforms: CPUs/GPUs/FPGAs
    Low efficiency and high control overhead
    Reusable for diverse workloads
RISC-V AI platforms: Gemmini [2], RedMule [3]
    Flexible GeMM accelerator
Effi.Fl