Details
- Google DeepMind has introduced Gemini Robotics 1.5, an agentic platform that combines two AI models to enable real-world robots to execute complex, multi-step tasks.
- The system pairs Gemini Robotics-ER 1.5, a high-level embodied-reasoning model that plans tasks, with a low-level vision-language-action (VLA) controller that translates those plans into precise physical movements.
- This setup lets robots look up information via Google Search, follow local rules, decompose goals into subtasks, and adjust plans on the fly, for example sorting municipal waste according to local recycling rules or packing a suitcase based on the weather forecast.
- The models generate an internal natural-language chain of thought before acting, giving developers and auditors a transparent view into the robot’s decision-making.
- DeepMind reports record-setting performance on academic and internal benchmarks, and notes that skills learned on one robot can transfer to different robot bodies without retraining.
- The embodied-reasoning model is now available through the Gemini API in Google AI Studio, opening it up to researchers and commercial robotics teams (a minimal usage sketch follows this list).
- DeepMind positions the release as a step toward “AGI in the physical world,” moving beyond single-step instructions toward more general problem-solving.
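
For teams who want to try the planner side of this stack, the embodied-reasoning model can be called like any other Gemini model. The sketch below assumes the google-genai Python SDK and a model identifier along the lines of "gemini-robotics-er-1.5-preview" (check AI Studio for the exact id); the hand-off to a low-level VLA controller is purely illustrative, since DeepMind has not published a public interface for the on-robot model.

```python
# Minimal planning sketch, assuming the google-genai Python SDK and an
# embodied-reasoning model id like "gemini-robotics-er-1.5-preview".
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

# Ask the embodied-reasoning model to break a high-level goal into subtasks.
plan = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",  # assumed id; confirm in AI Studio
    contents=(
        "Goal: sort the objects on the table into recycling, compost, and trash "
        "following the local municipal rules. "
        "Return a numbered list of short physical subtasks."
    ),
)

# Hypothetical hand-off: in the full Gemini Robotics 1.5 stack, each subtask
# would go to the low-level VLA controller, which turns natural-language
# instructions into motor commands on the robot.
for step in plan.text.splitlines():
    if step.strip():
        print("queue for VLA controller:", step.strip())
```

The point of the sketch is the division of labor the bullets describe: the ER model handles open-ended reasoning and task decomposition, while a separate controller handles motor-level execution.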
Impact
This release heats up competition with OpenAI, Tesla (with Optimus), and Figure AI, all of which are pursuing advanced reasoning in robotics. By making Robotics-ER 1.5 available through the Gemini API, Google could spark a surge in robotics development across industries, while the models’ transparent reasoning could help teams meet emerging safety regulations. If the benchmark results hold up in real-world environments, the market may shift toward language-based, cloud-connected robot control systems in the coming years.