From Silicon to AI Serving: Optimizing Inference and Engineering What's Next.pdf

Uploaded by: 明****  ID: 1011469  Date: 2025-12-21  Length: 13 pages  Size: 931.42 KB

From Silicon to AI Serving: Optimizing Inference and Engineering What's Next
Donggun Kim, Head of Product
FUTURE TECHNOLOGIES SYMPOSIUM

The AI market moves fast; hardware must anticipate the future and be ready for optimizations yet to come.

Designing Today's Chips for Tomorrow's AI
[Timeline figure. Axis labels: Feb 2023, Jan 2024, Dec 2024]
Llama: 7B, 13B, 33B, 65B
Llama 2: 7B, 13B, 70B
Code Llama: 7B, 13B, 34B
Code Llama: 70B
Llama 3: 8B, 70B
Llama 3.1: 405B
Llama 3.2: 1B, 3B, 11B, 90B
Llama 3.3: 70B
Llama 4
RNGD milestones: Tape-out, Unveil, Customer Sampling, MP
Figure labels: Macroscopic Trends in AI Evolution; IC Development Process; Product Enablement vs. AI Model Volatility; Development
*Source: Trends - Artificial Intelligence (5/25)

Furiosa RNGD
Powerfully efficient and programmable data center AI accelerator
*RNGD is pronounced "Renegade"
512 TFLOPS: 64 TFLOPS (FP8) x 8 PE
48 GB memory capacity
256 MB SRAM
384 TB/s on-chip bandwidth
1.5 TB/s memory bandwidth
180 W TDP
2 x HBM3
CoWoS-S

Tensor Contraction as a Primitive
A well-designed architecture should reduce usage complexity within the target domain.
[Figure: Flop analysis for BERT]
*Source: Data Movement is All You Need: a case study on optimizing transformers, MLSYS 21

Silicon to Serving: The Ongoing Optimization
Flexibility to Support AI Models
*Source: Trends - Artificial Intelligence (5/25)
Optimizing System
Enabling a chip for AI serving requires optimization across multiple dimensions.
Serving Efficiency Matters

Furiosa LLM - Flexibility to Support AI Models
Layered optimization tools designed to streamline the use of diverse AI models.

Challenges in Serving - Serving Efficiency
AI serving optimization means managing various hazards, and this challenge becomes especially inte
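
The RNGD figures quoted above (64 TFLOPS FP8 per PE across 8 PEs, 1.5 TB/s memory bandwidth, 384 TB/s on-chip bandwidth, 180 W TDP) can be put into a back-of-the-envelope calculation. The sketch below is not from the slides: it applies a standard roofline model as an assumption, treats the vendor-stated peaks as upper bounds rather than measurements, and uses hypothetical function names chosen for illustration only.

```python
# Peak figures as quoted on the RNGD slide (vendor-stated, not measured).
PE_COUNT = 8
TFLOPS_PER_PE_FP8 = 64
HBM_BANDWIDTH_TBPS = 1.5     # off-chip memory bandwidth
SRAM_BANDWIDTH_TBPS = 384    # on-chip bandwidth
TDP_WATTS = 180


def peak_tflops():
    """Aggregate FP8 peak: per-PE throughput times the number of PEs."""
    return PE_COUNT * TFLOPS_PER_PE_FP8


def ridge_point(bandwidth_tbps):
    """Arithmetic intensity (FLOP per byte) at which a kernel stops being
    bandwidth-bound under the roofline assumption.
    TFLOPS divided by TB/s gives FLOP per byte."""
    return peak_tflops() / bandwidth_tbps


def attainable_tflops(intensity_flop_per_byte, bandwidth_tbps):
    """Roofline bound: min(peak compute, bandwidth * arithmetic intensity)."""
    return min(peak_tflops(), bandwidth_tbps * intensity_flop_per_byte)


if __name__ == "__main__":
    print("peak TFLOPS (FP8):", peak_tflops())
    print("peak TFLOPS per watt:", round(peak_tflops() / TDP_WATTS, 2))
    print("ridge point vs. HBM (FLOP/byte):", round(ridge_point(HBM_BANDWIDTH_TBPS), 1))
    print("ridge point vs. SRAM (FLOP/byte):", round(ridge_point(SRAM_BANDWIDTH_TBPS), 2))
    # A low-intensity kernel (2 FLOP/byte) is bounded by memory bandwidth.
    print("bound at 2 FLOP/byte from HBM:", attainable_tflops(2, HBM_BANDWIDTH_TBPS))
```

Under these assumptions, a kernel served from HBM needs roughly 341 FLOP per byte to reach the compute peak, while the same kernel served from on-chip SRAM needs only about 1.3. This is one way to read the slide's emphasis on data movement, though the deck itself may argue the point differently.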
