#BHUSA BlackHatEvents

Reinforcement Learning for Autonomous Resilient Cyber Defence
Ian Miles, Sara Farmer
arcdfnc.co.uk
Frazer-Nash Reference: 016273-146560V

Briefing Contributors
  Ian
  Sara

Autonomous Resilient Cyber Defence
UK ARCD program
  Mission:
    Machine speed cyber response & recovery on military platforms & systems
    Defending IT & OT systems
  Goals:
    Understand & demonstrate Autonomous Cyber Defence (ACD)
    Build national skills & knowledge
    100+ projects, 4 years
  Because:
    Not enough cyber responders
      Not enough personnel
      No cyber defenders at tactical edge
      Military operator overload
    Machine speed attacks
      Volume, velocity, variety
    SOAR limitations
      Context awareness, mission awareness

ARCD Ecosystem
  Leads
    Defence Science & Technology Laboratory: Customer
    Frazer-Nash Consultancy: ARCD Concepts
    QinetiQ: ARCD Test & Evaluation
    Alan Turing Institute: Fundamental Research
  Partnerships
    ML experts
    Cyber defence experts
  UK Supply Chain
    200 suppliers registered to view opportunities

ARCD Research
  Cyber Threat Detection
  Cyber Situational Awareness
  Autonomous Machine Speed Response & Recovery
  Integration
  Fundamental Research
  Governance & Assurance
  Focus of this Briefing
  Image: www.nist.gov/cyberframework

ACD: Autonomous Cyber Defence
  Trains and deploys blue (defence) cyber agents
    Rule-based or probabilistic reasoning
  Observing a cyber environment
    Capable of detecting an attack
    Inputs = converted infosec feeds (pcaps etc.)
  Acting in a cyber environment
    Respond or recover in real time
    Acts, or suggests actions to humans
  Autonomous Cyber Operations (ACO) trains both blue and red (attacker) agents
  Image: CAGE4 challenge

Training Defence Agents
  Learning algorithms
    RL: PPO, DQN, DDQN,