2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Building AIOps agents for cloud application resilience on AWS
DEV329

Justin Tan (He/Him), Solutions Architect, Amazon Web Services
Betty Zheng (She/Her), Senior Developer Advocate, Amazon Web Services
Yagr Xu (He/Him), Senior Solutions Architect, Amazon Web Services

Agenda
  Background and challenges
  Solution evolution
  Demo
  Best practices with Amazon Bedrock AgentCore
  Key takeaways

Customer Background and Challenges

Situation
  160+ applications across nine domains
  Operations teams segmented across multiple layers
  Seven sub-departments operating across multiple countries

Challenges
  Complex call chains; massive log volume
  Permission boundaries
  Long communication loops

Impact
  Slow issue resolution
  Heavy manual effort
  Business loss

Resilience Analysis Framework
  Redundancy: Eliminate single points of failure with spare components or full replicas across zones.
  Sufficient Capacity: Ensure resources meet demand.
  Timely Output: Meet latency expectations (P50) to satisfy users.
  Correct Output: Accuracy matters; incorrect results can be worse than no response.
  Fault Isolation: Contain failures within boundaries to prevent cascading effects.

  Complex architecture: microservices, containers, multi-region and hybrid-cloud
  Inevitable failures: network partitions, dependency breakdowns, config drift
  Operational overload: inability to keep up with the incident volume and velocity
  Performa