IMDA: 2025 Global AI Assurance Pilot Main Report: Testing Real-World Generative AI Systems (English edition) (26 pages).pdf


Table of Contents

Executive Summary
Chapter 1 - Introduction
  1.1 Rationale
  1.2 Target outcomes
  1.3 Ground rules
Chapter 2 - Pilot participants and use cases
  2.1 Participant profile
  2.2 Use cases
  2.3 Patterns of LLM usage
Chapter 3 - Risk Assessment and Test Design
  3.1 Risk Assessment
  3.2 Metrics
  3.3 Testing approach: Test datasets
  3.4 Testing approach: Evaluators
Chapter 4 - Test Implementation
  4.1 Test Environment
  4.2 Test data and effort
  4.3 Implementation challenges
Chapter 5 - Lessons learnt
  5.1 Test what matters
  5.2 Don't expect test data to be fit for purpose
  Guest Blog: Learning from self-driving cars: Simulation Testing
  Guest Blog: Synthetic Data for Adversarial Testing
  5.3 Look under the hood
  5.4 Use LLMs as judges, but with skill and caution
  Guest Blog: LLM-as-a-judge: Pros and Cons
  5.5 Keep your human SMEs close!
  Guest Blog: LLMs can't read your mind
Chapter 6 - What's next?

Executive Summary

From Model Safety to Application Reliability

As Generative AI ("GenAI") transitions from personal productivity tools and consumer-facing chatbots into real-world environments like hospitals, airports and banks, it faces a higher bar on quality and confidence.

1. Risk assessments depend heavily on the context of the use case, e.g., lower tolerance for error in a clinical application than a customer service chatbot.
2. Given the higher complexity involved in integrating foundation models with existing data sources, processes and systems, there are more potential points of failure.

However, much of the current work around AI testing focuses on the safety of foundation models, rather than the reliability of end-to-end applications. The Global AI Assurance Pilot was an attempt to address this gap: not through academic research, but by building upon real-life experiences of practitioners
