Taxonomy of Failure Modes in Agentic AI Systems

Pete Bryan, Giorgio Severi, Joris de Gruyter, Daniel Jones, Blake Bullwinkel, Amanda Minnich, Shiven Chawla, Gary Lopez, Martin Pouliot, Adam Fourney, Whitney Maxwell, Katherine Pratt, Saphir Qi, Nina Chikanov, Roman Lutz, Raja Sekhar Rao Dheekonda, Bolor-Erdene Jagdagdorj, Eugenia Kim, Justin Song, Keegan Hines, Daniel Jones, Richard Lundeen, Sam Vaughan, Victoria Westerhoff, Yonatan Zunger, Chang Kawaguchi, Mark Russinovich, Ram Shankar Siva Kumar

Contents

Abstract
Introduction
Agentic systems: Functionality and common patterns
Overview of failure modes
What effects can these failure modes have?
Mitigations and design considerations
Limitations of our analysis
Case study: Memory poisoning attack on an agentic AI email assistant
    Introduction
    Context and setup
    Baseline attack description
    Mechanism of the attack
    Results and observations
    Challenges and mitigation strategies
Taxonomy Details
    Novel security failure modes
    Novel safety failure modes
    Existing security failure modes
    Existing safety failure modes
Acknowledgement
Related work

1 Abstract

Agentic AI systems are gaining prominence in both research and industry as a way to increase the impact and value of generative AI. To understand the potential weaknesses in such systems and to develop an approach for testing them, Microsoft's AI Red Team (AIRT) worked with stakeholders across the company and conducted a failure mode and effects analysis of current and envisaged future agentic AI system models. This analysis identified several new safety and security failure modes unique to agentic AI systems, especially multi-agent systems. In addition, there are numerous failure modes that currently affect generative AI models whose prominence or potential impact is great