© 2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.

AIM330
Beyond vibe testing: Productionalizing mission-driven genAI workloads

Mike George (he/him), Principal Solutions Architect, Amazon Web Services
Veena Chandran (she/her), Senior Solutions Architect, Amazon Web Services

The problem with vibes
Inconsistent results across team members
No baseline for improvement tracking
Difficulty scaling evaluation process

The testing gap
Lack of deterministic test cases
Lack of test case automation
Lack of continuous monitoring

Building test cases
Consider the use case. What is the "happy path"? What does good look like along this happy path?

Building test cases
Consider edge cases. What would an unclear or incomplete path look like?

Building test cases
Consider failure modes. What would improper, invalid, or inappropriate user input look like?

Don't boil the ocean
1. Define the use case and relevant stakeholders
2. Identify harmful events, evaluate risk
3. Build test cases

Prioritize based on risk
Measure of an event's probability of occurring
Magnitude or degree of the consequences
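
The slides describe three kinds of test case (happy path, edge case, failure mode) and suggest prioritizing them by risk. A minimal sketch of that idea follows. It assumes risk can be scored as probability multiplied by magnitude, which the slides do not state explicitly. The names `EvalCase`, `prioritize`, and `CASES`, the 1 to 5 magnitude scale, and the sample values are all hypothetical and not taken from the session.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalCase:
    """One evaluation test case with rough risk estimates (illustrative only)."""
    name: str
    category: str       # "happy_path", "edge_case", or "failure_mode"
    probability: float  # estimated chance the event occurs, 0.0 to 1.0
    magnitude: int      # estimated severity of the consequences, 1 (minor) to 5 (severe)

    @property
    def risk(self) -> float:
        # Assumption: risk is scored as probability times magnitude.
        return self.probability * self.magnitude


def prioritize(cases):
    """Return the cases ordered from highest to lowest risk score."""
    return sorted(cases, key=lambda case: case.risk, reverse=True)


# Hypothetical sample cases, one per category named in the slides.
CASES = [
    EvalCase("answers a well-formed question", "happy_path", 0.5, 1),
    EvalCase("handles an incomplete request", "edge_case", 0.25, 4),
    EvalCase("rejects inappropriate input", "failure_mode", 0.5, 4),
]

if __name__ == "__main__":
    for case in prioritize(CASES):
        print(f"{case.risk:.2f}  {case.category:<12}  {case.name}")
```

Scoring each case this way gives the team a shared, repeatable ordering for which test cases to build first, rather than relying on individual judgment.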