Details

  • Google DeepMind introduced the third version of its Frontier Safety Framework (FSF) on September 22, 2025, rolling out new measures to address high-level risks posed by advanced AI systems.
  • The update features a new Critical Capability Level designed to identify and mitigate the risk of harmful manipulation by AI models, especially those capable of influencing beliefs or actions at scale.
  • The framework now includes an expanded strategy for tackling misalignment risks, and introduces new protocols for models that could accelerate AI development to potentially destabilizing levels.
  • This latest iteration builds on prior FSF releases from May 2024 and February 2025, drawing on operational experience and evolving best practices in AI safety.
  • Risk assessment procedures have been deepened to incorporate detailed safety case reviews for both public releases and internal deployments when models reach specific capability benchmarks.

Impact

With this update, Google DeepMind cements its role at the forefront of AI safety, responding both to scrutiny from the Future of Life Institute's AI Safety Index and to global regulatory trends. As AI technology advances, robust and transparent frameworks like DeepMind's FSF will be pivotal in upholding industry standards and public confidence.