Capacity-Aware Adaptive Routing
Midhun Somasundaran, Software Engineer, Meta
Yikai Lin, Research Scientist, Meta
OCP SPECIAL FOCUS: ARTIFICIAL INTELLIGENCE (AI)

Agenda
  Meta's Backend Topology
  Problem Statement
  Solution Overview
  Results
  Call to Action

Meta's Non Scheduled Fabric (NSF) Topology
  Non-blocking
  Planar architecture
  Modular units (pods)
  Multiple links between rack and spine
  Adaptive routing for load balancing

Reference Network Topology
  Two jobs running in three pods
  First pod is shared by both jobs

Congestion Spread from Remote Failures
  Source rack switch is unaware of remote failures
  Network-wide congestion
  Impacts all jobs, not just those on failed paths

Capacity-Aware Adaptive Routing
  Solution: Rack switches reduce traffic towards impacted destinations
  Moves congestion closer to source
  Protects jobs not on the failed path

Topology Learning and Capacity Propagation
  BGP propagates per-hop capacity info
  Rack switches learn E2E capacity
  Failed links result in paths being pruned in the data plane

Propagating Capacity with Reachability
  Each hop records # of paths (A, B, C) in link bandwidth extended community
  How do rack switches recover A, B, C values?

Encoding Topology and Capacity
  We treat lbw_ext_comm (unsigned 32) as a bit vector
  Encoding_scheme: rack_id (4), spine_id (4), A (8), B (8), C (8)

Data-Plane Solution
  Egress paths pruned to prevent remote congestion
  Rack link failures result in pruning paths to destination rack
  Spine link failures result in pruning paths to all impacted racks

Results
  [Figure panels: Capacity unaware, Capacity aware]
  20% performance improvement for jobs on the failure path
  Jobs outside the failure path remain unaffected
  Remote rack link failures significantly increase
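
The "Encoding Topology and Capacity" slide describes treating the 32-bit lbw_ext_comm value as a bit vector with fields rack_id (4), spine_id (4), A (8), B (8), C (8). A minimal Python sketch of how a rack switch might pack and recover those fields follows. The function names encode and decode are hypothetical, and the bit order (fields packed most-significant-first in the order listed) is an assumption; the slides do not state it.

```python
# Field names and widths (in bits) as listed on the slide; they sum to 32.
# ASSUMPTION: fields are packed most-significant-first in this order.
FIELDS = (("rack_id", 4), ("spine_id", 4), ("A", 8), ("B", 8), ("C", 8))


def encode(rack_id: int, spine_id: int, a: int, b: int, c: int) -> int:
    """Pack the five fields into one unsigned 32-bit value (hypothetical helper)."""
    word = 0
    for (name, width), value in zip(FIELDS, (rack_id, spine_id, a, b, c)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def decode(word: int) -> dict:
    """Recover rack_id, spine_id, A, B, C from a 32-bit value (hypothetical helper)."""
    if not 0 <= word < (1 << 32):
        raise ValueError("value is not an unsigned 32-bit integer")
    fields = {}
    shift = 32
    for name, width in FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields


if __name__ == "__main__":
    word = encode(rack_id=3, spine_id=5, a=8, b=4, c=2)
    print(hex(word))      # 0x35080402
    print(decode(word))
```

With 8 bits each, A, B, and C can hold path counts from 0 to 255, while the 4-bit rack_id and spine_id fields allow 16 values each under this layout.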