
Memory Technology Optimized for At-Scale AI Systems: Bandwidth, Capacity, and Connectivity.pdf

Uploaded by: 明**** | ID: 1012004 | 2025-12-21 | 24 pages | 2.69 MB

Memory technology optimized for at-scale AI systems
Siamak Tavallaei, Sr. Principal Engineer, Samsung Semiconductor, Inc.
Millind Mittal, Founder, MemWize.AI
SERVER: COMPOSABLE MEMORY SYSTEMS (CMS)

Outline
- Levels of memory tiers in AI infrastructure
- Memory growth drivers and mapping of workloads to memory tiers
- Example SW frameworks for AI LLM inference
- A candidate cluster architecture for memory scaling
- Considerations and role of optics in addressing the memory scaling challenge

Baseline Server Node
T0: SRAM/Cache
T1: CPU-Mem
T2: Local Node Storage
T3: Storage on DC Network
M: Local DDRx Memory; C: CPU; S: NVMe/PCIe SSD Storage; N: NIC
Memory Tiers: High-BW, Low-latency; Larger Capacity; Networked Bulk Capacity

AI Infra Memory Tiers
T0: SRAM/Cache
T1: GPU-HBM
T2 (T2+): CPU-Mem (+CXL)
T3-SO: Storage on SO
T1-SU: GPU-HBM-SU
T2 (T2+)-SU: CPU-Mem-SU (+CXL)
T3: Storage
T3-SU: Storage on SU
T4: Storage on DC Network

Server-centric Memory Tier Pyramid vs. AI Infra Memory Tier Pyramids
Scale-up (SU) and Scale-out (SO) Fabrics

Baseline Server Node, AI Infrastructure
T0: SRAM/Cache
T1: CPU-Mem
T2: Local Node Storage
T3: Storage on DC Network

T0: SRAM/Cache
T1: CPU-Mem
T2-CXL: CPU-CXL Fabric Memory
T1+: CPU-CXL Mem
T4: Storage on DC Network
T3: Local Node Storage
T2-SO: Memory SO
Server Node with memory expansion
Remote Memory: T1(+) SO/DC

Growing Memory Capacity Needs
Reasoning and multi-modal models, and agentic AI, are driving accelerated growth in memory capacity and bandwidth.
- Multiple-fold increase in active KV contexts
- Longer-lived contexts: multi-turn, shared contexts (e.g., code generation)
- Growing knowledge DBs and databases of past conversations
- Growing model sizes
- Multi-terabyte capacity for embeddings for recommendation models
- Growing size of memory-resident models
- Mixture-of-experts, Collection-of-experts, ...

Mapping AI use cases to Memory Tiers
- Scratch pad for computation units
- Model wei
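The link between active KV contexts and memory capacity can be made concrete with a rough sizing calculation. The sketch below is not from the slides. It assumes a standard decoder-only transformer in which every token stores one key and one value vector per layer. The function name `kv_cache_bytes` and the model dimensions in the usage note are hypothetical, chosen only for illustration.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, context_tokens,
                   bytes_per_value=2, num_contexts=1):
    """Approximate KV-cache footprint in bytes (illustrative assumption,
    not a figure from the presentation).

    bytes_per_value=2 corresponds to 16-bit storage (e.g. FP16/BF16).
    """
    # Factor 2: one key vector plus one value vector per token, per layer.
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return per_token * context_tokens * num_contexts


if __name__ == "__main__":
    # Hypothetical model: 32 layers, 8 KV heads, head dimension 128.
    one_context = kv_cache_bytes(32, 8, 128, context_tokens=128 * 1024)
    many_contexts = kv_cache_bytes(32, 8, 128, context_tokens=128 * 1024,
                                   num_contexts=64)
    print(f"one 128K-token context: {one_context / 2**30:.0f} GiB")
    print(f"64 such contexts:       {many_contexts / 2**40:.0f} TiB")
```

Under these assumed dimensions, a single 128K-token context occupies about 16 GiB and 64 concurrent contexts about 1 TiB. That is the kind of growth that would push KV state out of the tier closest to the accelerator and into the larger-capacity tiers listed above.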
