
Photonic Interconnect for Next-Generation AI Systems.pdf

Uploader: 明****  ID: 1011400  2025-12-21  17 pages  2.40MB

Photonic Interconnect for Next-Generation AI Systems
Benjamin Lee, NVIDIA
SPECIAL FOCUS: PHOTONICS

GPUs Unlock the AI Revolution
“Today, we're at the cusp of a major shift in computing. The intersection of AI and accelerated computing is set to redefine the future.”
Jensen Huang

Ingredients for AI
Large data sets
Algorithms
Efficient compute

AI models and AI data sets are large
AI model parameter sizes have grown 70,000X in a decade
Parallelized across 4 dimensions (data, pipeline, tensor, and expert)
No. of GPUs used for training and inference of state-of-the-art generative AI models can be in the 10,000s to 100,000s [Tirumala & Wong, HotChips 2024]

Single-chip Inference Performance
[Figure: data points labeled H100, A100, Q8000, K20X, M40, P100, V100; annotations "1000X in 10 years" and "2 years"]
[B. Dally, HotChips 2023] [J. Huang, GTC 2024]

FULL DATA CENTER WITH 32,000 GPUs
AI FACTORY FOR THE NEW INDUSTRIAL REVOLUTION
645 exaFLOPS of AI performance
13PB of fast memory
58PB/s of aggregate NVLink bandwidth
16.4 petaFLOPs of In-Network Computing

What is Needed from the Networks to Power Future Generations?
Cost-effective and energy-efficient bandwidth scaling for:
both scale-out and scale-up networks
both switch I/O and GPU I/O
both switches and cables
both electrical and optical technologies

Switch Scaling
Public data from commercial switch ASICs from a variety of vendors over the past 20 years: 2X every 2 years
Energy per bit has decreased due in part to CMOS scaling, but not fast enough to keep power from increasing. This is expected to get worse as CMOS scaling slows.
I/O power is scaling disproportionately to core power consumption.
Need a low-power I/O solution, which can be adopted for both switches and GPUs.
All bandwidths are per direction

Switch Scaling
Public data from commercial switch ASICs from a variety of
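
The growth figures quoted on the slides (switch bandwidth at 2X every 2 years over 20 years, single-chip inference at 1000X in 10 years, model parameters at 70,000X in a decade) can be put on a common footing. The following is a minimal sketch that assumes each trend is a constant exponential, which the slides do not state; the helper names `annual_factor` and `doubling_time_years` are illustrative and not from the deck.

```python
import math


def annual_factor(total_growth: float, years: float) -> float:
    """Per-year multiplier that compounds to total_growth over the given years."""
    return total_growth ** (1.0 / years)


def doubling_time_years(total_growth: float, years: float) -> float:
    """Years per doubling, assuming constant exponential growth (an assumption)."""
    return years * math.log(2) / math.log(total_growth)


# Switch ASIC bandwidth: 2X every 2 years, sustained for 20 years.
switch_growth_20y = 2 ** (20 // 2)

# Single-chip inference performance: 1000X in 10 years.
gpu_doubling = doubling_time_years(1000, 10)

# AI model parameter count: 70,000X in a decade.
model_annual = annual_factor(70_000, 10)
model_doubling = doubling_time_years(70_000, 10)

print(f"Switch bandwidth over 20 years: {switch_growth_20y}X")
print(f"GPU inference doubling time: {gpu_doubling:.2f} years")
print(f"Model size growth per year: {model_annual:.2f}X")
print(f"Model size doubling time: {model_doubling:.2f} years")
```

Under these assumptions, switch bandwidth doubles every 2 years, single-chip inference performance doubles roughly every year, and model size doubles in well under a year, which is one way to read the slides' argument that network scaling is the pressure point.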

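According to the report, its main points can be summarized as follows:
1. **Growing AI compute demand**: AI models and data sets are growing rapidly and demand enormous compute; the number of GPUs used for training and inference of state-of-the-art generative AI models can be in the 10,000s to 100,000s.
2. **GPU performance gains**: Single-chip inference performance of NVIDIA GPUs, up through the H100, has grown 1000X in 10 years.
3. **Data center scale**: A data center with 32,000 GPUs delivers 645 exaFLOPS of AI performance, 13PB of fast memory, and 58PB/s of aggregate NVLink bandwidth.
4. **Network bandwidth challenge**: As CMOS scaling slows, I/O power keeps growing, so a low-power I/O solution is needed.
5. **GPU I/O bandwidth**: GPU I/O bandwidth is approaching switch bandwidth, with comparable energy efficiency.
6. **Optical interfaces and packaging**: Long-reach interfaces, co-packaged optics, and 2.5D optics are proposed as ways to raise bandwidth density and lower power.
7. **Advantages of 2.5D optics**: 2.5D optics can further reduce I/O and module power and help address future thermal challenges.
8. **Potential of photonic switching**: Photonic switching could further reduce network power in future systems.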
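
The data center totals in item 3 can be divided by the GPU count to get average per-GPU figures. This is a rough sketch only: it assumes decimal SI prefixes (1 PB = 10^15 bytes) and an even split across all 32,000 GPUs, neither of which the slides state, and "fast memory" may include memory that is not attached to a GPU. The variable names are illustrative.

```python
# Totals quoted on the "FULL DATA CENTER WITH 32,000 GPUs" slide.
NUM_GPUS = 32_000
TOTAL_AI_FLOPS = 645e18        # 645 exaFLOPS of AI performance
TOTAL_FAST_MEMORY_B = 13e15    # 13 PB of fast memory (decimal prefix assumed)
TOTAL_NVLINK_BPS = 58e15       # 58 PB/s of aggregate NVLink bandwidth

# Average share per GPU, assuming an even split (an assumption).
flops_per_gpu = TOTAL_AI_FLOPS / NUM_GPUS
memory_per_gpu = TOTAL_FAST_MEMORY_B / NUM_GPUS
nvlink_per_gpu = TOTAL_NVLINK_BPS / NUM_GPUS

print(f"AI performance per GPU: {flops_per_gpu / 1e15:.2f} petaFLOPS")
print(f"Fast memory per GPU: {memory_per_gpu / 1e9:.2f} GB")
print(f"NVLink bandwidth per GPU: {nvlink_per_gpu / 1e12:.2f} TB/s")
```

These averages say nothing about how the deck's author defines or measures each total; they only make the scale of the per-GPU I/O requirement easier to see.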