Details
- Google AI developers unveiled T5Gemma 2, the successor to Gemma 3, on December 18, 2025.
- The new model features native multimodality, handling combined text and image inputs and outputs without a separate vision encoder.
- T5Gemma 2 expands the context window well beyond its predecessor's 32,000-token limit to handle far larger documents, though the new maximum was not specified.
- The tokenizer now accommodates over 140 languages, aiming for better translation accuracy and smoother code-switching for global users.
- With a refreshed architecture, the model employs sparsely gated Mixture-of-Experts layers and improved attention routing (see the sketch after this list), resulting in better throughput on TPUs and the latest Nvidia GPUs.
- Pre-trained checkpoints are available immediately on Kaggle and Hugging Face, alongside demonstration Colab notebooks and one-click managed inference on Vertex AI (a hypothetical loading example also follows this list).
- All assets are released under a permissive open-weights license, supporting commercial fine-tuning, evaluation, and on-device experimentation.
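
To make the Mixture-of-Experts bullet concrete, here is a minimal sketch of a sparsely gated MoE feed-forward layer with top-2 token routing. The expert count, layer sizes, and routing policy are illustrative assumptions; Google has not published the actual T5Gemma 2 configuration.

```python
# Minimal sketch of a sparsely gated Mixture-of-Experts feed-forward layer
# with top-2 routing. All sizes below are illustrative assumptions, not the
# real T5Gemma 2 configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseMoE(nn.Module):
    def __init__(self, d_model=256, d_ff=512, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Each expert is an independent two-layer feed-forward block.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])
        # The router scores every token against every expert.
        self.router = nn.Linear(d_model, num_experts)

    def forward(self, x):  # x: (batch, seq, d_model)
        scores = self.router(x)                           # (batch, seq, num_experts)
        weights, chosen = scores.topk(self.top_k, dim=-1) # keep only the top-k experts per token
        weights = F.softmax(weights, dim=-1)              # normalize over the chosen experts
        out = torch.zeros_like(x)
        # Dispatch each token only to its chosen experts (sparse activation).
        for slot in range(self.top_k):
            for idx, expert in enumerate(self.experts):
                mask = chosen[..., slot] == idx
                if mask.any():
                    out[mask] += weights[..., slot][mask].unsqueeze(-1) * expert(x[mask])
        return out


tokens = torch.randn(2, 16, 256)   # (batch, seq, d_model)
print(SparseMoE()(tokens).shape)   # torch.Size([2, 16, 256])
```

Top-k routing is what "sparsely gated" refers to: each token activates only a small fraction of the experts, which is where the claimed throughput gains on TPUs and GPUs come from; production kernels fuse the per-expert dispatch rather than looping as above.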
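
For the availability bullet, the sketch below shows how such a checkpoint might be pulled from Hugging Face and given a combined text-and-image prompt. The repo id `google/t5gemma-2`, the Auto classes, and the chat-template flow are assumptions modeled on how earlier Gemma releases are served through the `transformers` library; the announcement does not confirm the exact API, so the model card is the authoritative reference.

```python
# Hypothetical usage sketch: the repo id and class choices below are
# assumptions, not confirmed by the announcement.
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "google/t5gemma-2"  # placeholder id; check the real model card
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Fetch an example image and build a mixed text-and-image chat turn.
image = Image.open(requests.get("https://example.com/chart.png", stream=True).raw)
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "Summarize this chart in one sentence."},
    ],
}]

# Render the chat template, then tokenize text and image together.
prompt = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```

The same checkpoint would also back the demonstration Colab notebooks and Vertex AI's one-click managed endpoints mentioned above.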
Impact
Reviving the encoder-decoder line positions Google competitively against Meta’s Llama 3 and OpenAI’s GPT-4o and contributes to a broader ecosystem of openly accessible models. Integrated multimodal features can lower costs for startups by simplifying product stacks, while the open licensing makes T5Gemma 2 attractive for enterprises navigating new AI compliance standards. Its extensive language support may spur global adoption and broader development of multilingual applications, especially in emerging markets facing language barriers.
