Details
- In a demo video, Google AI Developers showcased Gemini 3 Flash handling complex function calling, sequencing the tasks to prepare ramen while reasoning across 100 ingredients and 100 tools simultaneously (a minimal function-calling sketch follows this list).
- Gemini 3 Flash, released December 17, 2025, is now the default model in the Gemini app, offering PhD-level reasoning, frontier performance on benchmarks like GPQA Diamond (90.4%), and multimodal understanding for text, images, audio, and video.
- It combines Gemini 3 Pro-grade reasoning with Flash-level speed: 3x faster than Gemini 2.5 Pro, using 30% fewer tokens, and priced at $0.50 per 1M input tokens and $3 per 1M output tokens.
- Strong in agentic workflows, it scores 78% on SWE-bench Verified for coding agents, outperforming Gemini 3 Pro, and enables near real-time applications like game assistance, video analysis, and A/B testing.
- Available via Vertex AI, Gemini Enterprise, and the Gemini CLI; adopted by companies including JetBrains, Figma, Salesforce, Workday, and Box for efficient reasoning in business workflows.
- Rolling out globally as the default model in Google Search's AI Mode, delivering precise, fast responses with tool use and real-time web information.
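
For readers curious what multi-tool function calling looks like in code, the sketch below uses the Python google-genai SDK's automatic function calling, where plain Python functions are passed as tools and executed as the model requests them. The ramen-prep tool functions and the `gemini-3-flash` model identifier are illustrative assumptions, not details taken from Google's demo.

```python
# Minimal sketch of multi-tool function calling with the google-genai SDK.
# The tool functions and the "gemini-3-flash" model name are assumptions
# for illustration; the demo's actual 100 tools/ingredients are not public.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # placeholder key


def check_pantry(ingredient: str) -> dict:
    """Report whether an ingredient (e.g. 'noodles', 'miso') is in stock."""
    return {"ingredient": ingredient, "in_stock": True}


def boil_water(liters: float) -> dict:
    """Start boiling the given volume of water and return a timer id."""
    return {"timer_id": "pot-1", "liters": liters}


def add_ingredient(ingredient: str, timing_minutes: int) -> dict:
    """Schedule an ingredient to be added at a given time offset."""
    return {"ingredient": ingredient, "timing_minutes": timing_minutes}


# With automatic function calling, the SDK runs the Python tools the model
# asks for and feeds results back until a final text answer is produced.
response = client.models.generate_content(
    model="gemini-3-flash",  # assumed identifier; check the available model list
    contents="Plan and execute the steps to prepare a bowl of miso ramen.",
    config=types.GenerateContentConfig(
        tools=[check_pantry, boil_water, add_ingredient],
    ),
)

print(response.text)
```

In a real agentic setup the tool list would be far longer (the demo cites 100 tools), and the model's job is to choose which tools to call and in what order, which is the behavior the demo highlights.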
Impact
Google's Gemini 3 Flash demo underscores its push toward agentic AI supremacy, delivering Pro-level reasoning at Flash speeds that pressures rivals like OpenAI's o1 and Anthropic's Claude in high-frequency workflows. By orchestrating 100 tools in real time for ramen prep, it exemplifies the scalable function calling essential for autonomous agents, narrowing the gap with larger models on benchmarks like GPQA while cutting costs to a fraction of Pro-tier pricing, potentially accelerating enterprise adoption in coding, multimodal analysis, and search. This efficiency frontier, with 3x the speed of the prior Pro generation and low latency for live applications, aligns with trends in on-device inference and GPU optimization, steering R&D toward hybrid fast-reasoning systems. Over the next 12-24 months, it could redirect funding toward agentic platforms and intensify competition as Google integrates the model into Search and Vertex AI, widening access while adopters like Box and Figma validate its operational edge over slower incumbents.
