Optimizing AI Networks with Advanced Congestion Management.pdf

Uploaded by: 明**** | ID: 1011744 | 2025-12-21 | 16 pages | 987.24 KB

Optimizing AI Networks with Advanced Congestion Management
Mohammad Hanif, Ajay Chhatwal
OCP Special Focus: Artificial Intelligence (AI)

Agenda
Why congestion management matters in AI networks?
Types of AI Networks: Scale-out and Scale-up
Congestion management in Scale-out networks: BTS notifications, PFC Aware ECN Marking, Packet Trimming, CSIG (Congestion Signaling)
Congestion management in Scale-up networks: Ethernet for Scale-up Networking, CBFC
Call To Action

Why congestion management matters in AI networks?
High bandwidth and low latency for optimal job completion times
Tail latency impacts job completion time significantly
Synchronized and bursty traffic
Elephant flows with low entropy

Scale-up and Scale-out AI Networks
[Figure labels: Scale-up, Scale-out, In Rack, Across Racks, Across Data Centers, Spine, Leaf, Datacenter1, Datacenter2]

Congestion Management in Scale-out Networks
Back to the Sender (BTS) notifications
PFC Aware ECN Marking
Packet Trimming
CSIG (Congestion Signaling)

Fast CNP Generation in the ToR/Spine Switch
Instead of ECN Marking, upon congestion detection switch generates CNP and sends to the source directly
Performance gain:
Reduces the delay in congestion control loop (no blocking by PFC)
Can send additional information about the congestion (example: location and severity of congestion) back to the source
Note that CNP generation even at the last-hop (DTOR) is beneficial:
If Dest. NIC sends PFC it is not blocking CNP generation
Delay of CNP generation at the dest. NIC can be high if it is done in firmware
[Figure labels: SNIC, STOR, Spine, Spine, DTOR, DNIC, CNP, CNP]

Notification generated by switch for congested packet
Sends notification back to sender (BTS) of original congested packet
Notifies Node ID with Queue ID and Queue Length where the congestion occur…
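
The back-to-sender idea above can be illustrated with a small sketch: a switch queue that, when its depth crosses a threshold, produces a notification addressed to the original sender and carrying the node ID, queue ID, and queue length. This is only a toy model under stated assumptions. The class names (`Packet`, `BtsNotification`, `SwitchQueue`), the byte-based threshold, and the "notify on every packet above the threshold" policy are all hypothetical and are not taken from the slides or from any switch API.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src: str    # address of the sending NIC
    dst: str    # address of the receiving NIC
    size: int   # packet size in bytes


@dataclass(frozen=True)
class BtsNotification:
    to: str         # original sender of the congested packet
    node_id: str    # switch where congestion was detected
    queue_id: int   # congested queue on that switch
    queue_len: int  # queued bytes at detection time


class SwitchQueue:
    """One egress queue of a hypothetical switch (illustrative only)."""

    def __init__(self, node_id: str, queue_id: int, threshold_bytes: int):
        self.node_id = node_id
        self.queue_id = queue_id
        self.threshold_bytes = threshold_bytes  # assumed detection threshold
        self._packets = deque()
        self._bytes = 0

    def enqueue(self, pkt: Packet) -> Optional[BtsNotification]:
        """Queue the packet; return a notification if the queue is congested."""
        self._packets.append(pkt)
        self._bytes += pkt.size
        if self._bytes > self.threshold_bytes:
            # Address the notification to the packet's source, not its destination.
            return BtsNotification(pkt.src, self.node_id, self.queue_id, self._bytes)
        return None

    def dequeue(self) -> Optional[Packet]:
        """Transmit the packet at the head of the queue, if any."""
        if not self._packets:
            return None
        pkt = self._packets.popleft()
        self._bytes -= pkt.size
        return pkt


# Usage: three 1500-byte packets against a 3000-byte threshold.
queue = SwitchQueue(node_id="spine-1", queue_id=3, threshold_bytes=3000)
notifications = [queue.enqueue(Packet("snic-a", "dnic-b", 1500)) for _ in range(3)]
```

In this sketch only the third packet pushes the queue past the threshold, so only it triggers a notification. The sender learns where the congestion is without waiting for the packet to reach the destination NIC and for a CNP to travel back.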

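According to the report, the content centers on congestion management in AI networks and covers the following key points:
1. **Why congestion management matters**: In AI networks, congestion management is critical to optimizing job completion time and reducing latency.
2. **Types of AI networks**: Scale-up and scale-out, which involve different levels of network structure and data transfer.
3. **Congestion management in scale-out networks**: Mechanisms include BTS notifications, PFC Aware ECN Marking, Packet Trimming, and CSIG (Congestion Signaling).
4. **Congestion management in scale-up networks**: Uses Ethernet, in particular CBFC (credit-based flow control), to avoid congestion.
5. **Call to action**: Encourages participation in the relevant OCP (Open Compute Project) working groups, such as SAI (Switch Abstraction Interface) and SUE-T (Scale Up Ethernet), to advance congestion control technology.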
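
As a rough illustration of the credit-based flow control mentioned for scale-up networks, the sketch below models the general technique: the receiver grants one credit per free buffer slot, the sender transmits only while it holds credits, and credits are returned as the receiver drains its buffer. This is generic credit-based flow control, not the CBFC format or behavior described in the presentation; `Receiver`, `Sender`, and the cell-granular credits are assumptions made for the example.

```python
from collections import deque


class Receiver:
    """Receiver with a fixed number of buffer slots (hypothetical model)."""

    def __init__(self, buffer_cells: int):
        self.capacity = buffer_cells
        self.buffer = deque()

    def initial_credits(self) -> int:
        # One credit per free buffer slot at link bring-up.
        return self.capacity

    def accept(self, cell) -> None:
        # With correct credit accounting this check never fires.
        if len(self.buffer) >= self.capacity:
            raise OverflowError("sender exceeded granted credits")
        self.buffer.append(cell)

    def drain(self, n: int) -> int:
        """Consume up to n cells and return how many credits to hand back."""
        freed = 0
        while self.buffer and freed < n:
            self.buffer.popleft()
            freed += 1
        return freed


class Sender:
    """Sender that transmits only while it holds credits."""

    def __init__(self, credits: int):
        self.credits = credits

    def try_send(self, receiver: Receiver, cell) -> bool:
        if self.credits == 0:
            return False  # hold the cell instead of risking a drop
        self.credits -= 1
        receiver.accept(cell)
        return True

    def add_credits(self, n: int) -> None:
        self.credits += n


# Usage: a two-slot receiver; the third send has to wait for a returned credit.
rx = Receiver(buffer_cells=2)
tx = Sender(credits=rx.initial_credits())
sent = [tx.try_send(rx, cell) for cell in ("c0", "c1", "c2")]
tx.add_credits(rx.drain(1))
retry_ok = tx.try_send(rx, "c2")
```

The point of the design is that the sender can never overrun the receiver's buffer, so the link stays lossless without relying on pause frames sent after congestion has already built up.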