Google DeepMind unveils plan to protect itself from its own rogue AI agents

Google DeepMind published a road map for treating its internal AI agents as potential rogue insiders, shifting from pure "alignment" work to layered security measures.
The company is building dynamic access controls and real-time behavior monitors; a prototype analyzed roughly 1 million coding tasks and helped monitor the Gemini Spark agent. Most alerts turn out to reflect misinterpretation or overeagerness rather than malice.
The plan includes TRAIT&R, a taxonomy of threats covering loss of control, work sabotage, and direct harm, and is already being rolled out as a v0.1 safety framework.