Details
- Google AI announced the launch of Gemini 3.1 Flash Live on March 26, 2026, targeting developers building real-time voice and vision agents.
- Key features include response speeds that keep pace with natural dialogue and improved task completion in noisy environments.
- Available now in Gemini Live (within the Gemini app) and Search Live; in preview through the Gemini Live API and Google AI Studio; and in Gemini Enterprise for Customer Experience.
- Model details from the blog: Gemini 3.1 Flash-Lite, described as the fastest and most cost-efficient model in the Gemini 3 series, is priced at $0.25 per 1M input tokens and $1.50 per 1M output tokens.
- It outperforms Gemini 2.5 Flash, with a 2.5x faster time to first token, 45% higher output speed, and stronger benchmark results such as 86.9% on GPQA Diamond.
- Supports multimodal inputs (text, images, audio, video), function calling, code execution, grounding with Google Search, and large context windows for high-volume tasks.
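
To make the quoted pricing concrete, the following is a minimal sketch of a cost estimate at $0.25 per 1M input tokens and $1.50 per 1M output tokens. The function name `estimate_cost` and the workload (10,000 requests of 2,000 input and 500 output tokens each) are illustrative assumptions, not figures from the announcement, and the sketch ignores any discounts, caching, or modality-specific rates that may apply.

```python
# Prices quoted in the list above, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.25
OUTPUT_PRICE_PER_M = 1.50


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a workload at the quoted prices."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical workload: 10,000 requests, each with 2,000 input
# and 500 output tokens (assumed numbers, for illustration only).
requests = 10_000
cost = estimate_cost(requests * 2_000, requests * 500)
print(f"${cost:.2f}")  # prints "$12.50"
```

In this assumed workload the output tokens account for $7.50 of the $12.50 total even though they are a fifth of the volume, because the output rate is six times the input rate.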
Impact
Google's Gemini 3.1 Flash Live launch intensifies competition in low-latency AI agents, pressuring rivals such as OpenAI's GPT-4o mini and Anthropic's Claude Haiku by offering higher speed at comparable cost: the quoted $0.25 per 1M input tokens matches GPT-5 mini's $0.25, while the 363 tokens/s output speed exceeds Grok 4.1's 145 tokens/s. This widens access for real-time applications such as voice agents and content moderation, and it should accelerate adoption in high-volume enterprise workflows while matching or exceeding prior Gemini models in quality. It positions Google to capture more developer market share amid the push for efficient multimodal AI.
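
As a rough illustration of what the two throughput figures cited above mean in practice, the sketch below compares the time needed to stream a response at 363 and 145 tokens/s. The 1,000-token response length and the helper name `generation_seconds` are assumptions for illustration, and the calculation deliberately ignores time to first token and network overhead.

```python
# Output speeds cited in the paragraph above, in tokens per second.
FASTER_TPS = 363.0
SLOWER_TPS = 145.0


def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Return the seconds needed to emit output_tokens at a steady rate."""
    return output_tokens / tokens_per_second


# Hypothetical response length (assumed, not from the announcement).
tokens = 1_000
fast = generation_seconds(tokens, FASTER_TPS)
slow = generation_seconds(tokens, SLOWER_TPS)
print(f"{fast:.2f}s vs {slow:.2f}s ({FASTER_TPS / SLOWER_TPS:.2f}x)")
# prints "2.75s vs 6.90s (2.50x)"
```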
