Details

  • Launched on October 7, 2025, Gemini 2.5 “Computer Use” is a specialized variant of Gemini designed to operate web browsers autonomously by clicking, scrolling, and typing.
  • The model uses Gemini's multimodal vision-language reasoning to interpret dynamic web layouts, HTML elements, and user instructions before executing actions.
  • Google claims the model outperforms previous models on WebArena, MiniWob++, and internal BrowserGym benchmarks, and responds faster than Gemini 1.5.
  • Safety features let developers allowlist domains, set transaction limits, and block sensitive operations to reduce the risk of misuse.
  • A public preview is available now through Google AI Studio and the Gemini API, with general availability expected in early 2026.
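
To make the safety controls above concrete, here is a minimal sketch of how a developer might gate an agent's proposed browser actions on the client side. It is an illustration only, not the Gemini API: the `ActionPolicy` class, its field names, and the action labels are all hypothetical, and the real controls may be configured quite differently.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class ActionPolicy:
    """Hypothetical client-side guardrails; not part of any Google API."""
    allowed_domains: set[str] = field(default_factory=set)
    max_transaction_usd: float = 0.0
    blocked_actions: set[str] = field(default_factory=set)

    def check(self, action: str, url: str, amount_usd: float = 0.0) -> tuple[bool, str]:
        """Return (allowed, reason) for one proposed browser action."""
        host = (urlparse(url).hostname or "").lower()
        # Accept the listed domain itself or any of its subdomains.
        domain_ok = any(host == d or host.endswith("." + d) for d in self.allowed_domains)
        if not domain_ok:
            return False, f"domain not allowed: {host}"
        if action in self.blocked_actions:
            return False, f"action blocked: {action}"
        if amount_usd > self.max_transaction_usd:
            return False, f"amount {amount_usd:.2f} exceeds limit {self.max_transaction_usd:.2f}"
        return True, "ok"


# Example policy: one allowed site, a $50 spending cap, one forbidden action.
policy = ActionPolicy(
    allowed_domains={"example.com"},
    max_transaction_usd=50.0,
    blocked_actions={"delete_account"},
)
print(policy.check("click", "https://shop.example.com/cart"))
print(policy.check("purchase", "https://example.com/checkout", amount_usd=75.0))
```

In a setup like this, the check would run before each model-proposed action is executed, so a rejected action never reaches the browser.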

Impact

Google’s move intensifies competition with OpenAI’s GPT-4o Actions and Microsoft Copilot Studio in the fast-growing agent automation sector. Browser-level automation of this kind could disrupt traditional robotic process automation (RPA) vendors by making automation more accessible and cost-effective. The built-in safety controls may also help Google meet emerging regulatory requirements in both the EU and the US.