Details

  • Google DeepMind upgraded Gemini 3 Deep Think, a specialized reasoning mode, to tackle modern science, research, and engineering problems with practical applications.
  • Reports state-of-the-art results: 84.6% on the ARC-AGI-2 logical reasoning benchmark (vs. 68.8% for Claude Opus and 52.9% for GPT-5.2) and 48.4% on Humanity’s Last Exam.
  • Achieves gold-medal-level results on the written sections of the 2025 International Physics and Chemistry Olympiads, plus 50.5% on CMT-Benchmark for advanced physics.
  • Now available in the Gemini app for Google AI Ultra subscribers; API access is offered for the first time through a Vertex AI Early Access Program for researchers and developers.
  • Example: the Wang Lab at Duke University uses it to design new semiconductor materials, moving from theory to real-world use.
  • Built in partnership with scientists for messy, open-ended problems in chemistry, physics, math, and coding.

Impact

Google DeepMind's Gemini 3 Deep Think upgrade positions Google as a leader in applied AI reasoning. On ARC-AGI-2, a benchmark for abstract logic, its 84.6% score leads Anthropic's Claude Opus (68.8%) by 15.8 percentage points and OpenAI's GPT-5.2 (52.9%) by 31.7 points. Combined with gold-medal-level olympiad results and the semiconductor design work at Duke, the upgrade could accelerate real-world R&D in materials science and physics, and early API access may lower barriers for researchers. It also pressures competitors to match specialized reasoning modes, shifting the market toward agentic tools that handle incomplete data and multi-step engineering. Over the next 12 to 24 months, expect an intensified focus on benchmark-topping models, more funding for reasoning-centric AI, and wider adoption in academia and industry as demand grows for reliable scientific assistants.