Details
- On July 23, 2025, Anthropic announced a funding initiative for third-party organizations to develop evaluations that measure advanced AI capabilities and safety risks.
- The program invites researchers and developers to build evaluation tools in three priority areas: AI Safety Level assessments, advanced capability metrics, and evaluation infrastructure.
- Evaluations will focus on critical risks, including cybersecurity, CBRN (chemical, biological, radiological, and nuclear) threats, model autonomy, and national security, as defined in Anthropic's Responsible Scaling Policy.
- The initiative responds to current limitations in the AI evaluation ecosystem, where demand for robust safety assessments outpaces supply.
- Proposals must satisfy principles such as sufficient difficulty, expert human baselines, and realistic threat modeling, with funding tailored to each project's needs.
Impact
Anthropic's initiative could help establish industry standards for AI evaluation, enhancing safety and trust in frontier models. It addresses urgent gaps in assessing national security risks and could inform emerging regulatory frameworks. However, concerns persist that commercial funding may compromise the independence of evaluations, which could limit their broader adoption.