Details
- Google AI Developers has launched Gemma 3n, an open-weights multimodal language model specifically designed for phones, single-board computers, and other edge devices.
- The model handles text in 140 languages and multimodal (vision and audio) understanding in 35 of them, a substantial expansion of the Gemma family's previous linguistic range.
- Offered in E2B and E4B versions (effective 2B and 4B parameters), Gemma 3n's E4B variant is the first model under 10B parameters to score above 1300 on the LMArena leaderboard.
- Built on the MatFormer ("Matryoshka" Transformer) backbone, which nests a smaller submodel inside the larger one, and new Per-Layer Embeddings (PLE), which let a large share of parameters sit in ordinary CPU RAM rather than accelerator memory, Gemma 3n runs in roughly the footprint of a traditional 2B/4B model while improving reasoning, math, and coding accuracy (a conceptual PLE sketch follows this list).
- Model checkpoints are available on Kaggle and Hugging Face, and Gemma 3n is already supported by popular local tooling such as llama.cpp, Ollama, LM Studio, and Unsloth (see the loading example below).
- Developers can prototype with or fine-tune Gemma 3n immediately, starting in Google AI Studio; Vertex AI API access and supplementary guides are rolling out as well (see the API sketch below).
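
What Per-Layer Embeddings buy is easiest to see in code. The sketch below is a conceptual illustration only, not Google's implementation: it assumes per-layer lookup tables (hypothetical names and sizes) that live in CPU RAM, so only small activation tensors, never the tables themselves, travel to the accelerator.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration; Gemma 3n's real tables differ.
VOCAB, DIM, LAYERS = 32_000, 256, 4
device = "cuda" if torch.cuda.is_available() else "cpu"

# One embedding table per transformer layer, deliberately kept on CPU
# so it never consumes accelerator memory.
ple_tables = [nn.Embedding(VOCAB, DIM) for _ in range(LAYERS)]

def per_layer_embedding(layer_idx: int, token_ids: torch.Tensor) -> torch.Tensor:
    """Look up this layer's extra embeddings on CPU, then move only the
    resulting activations (not the whole table) to the accelerator."""
    rows = ple_tables[layer_idx](token_ids.cpu())
    return rows.to(device)

tokens = torch.randint(0, VOCAB, (1, 8))       # a dummy 8-token batch
for i in range(LAYERS):
    extra = per_layer_embedding(i, tokens)     # shape (1, 8, DIM) on `device`
    # a real model would fold `extra` into layer i's hidden state here
```

The point of the trade: per-token lookups are cheap to stream to the accelerator, while the tables, which account for a large slice of the parameter count, stay off it entirely.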
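For local experiments, the checkpoints load through the standard transformers multimodal classes. A minimal text-only sketch, assuming the Hugging Face model id "google/gemma-3n-E4B-it" (verify the exact name on the hub) and a transformers version recent enough to include Gemma 3n support:

```python
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "google/gemma-3n-E4B-it"  # assumed id; check the Hugging Face hub

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")

# Chat-style input; image or audio parts could be added to `content` later.
messages = [
    {"role": "user",
     "content": [{"type": "text", "text": "Why does on-device inference matter?"}]},
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output[0], skip_special_tokens=True))
```

The same checkpoints also run through llama.cpp, Ollama, and LM Studio for fully offline use.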
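For hosted prototyping, Gemma models are reachable through the Gemini API behind Google AI Studio. A hedged sketch using the google-genai Python SDK; the model name "gemma-3n-e4b-it" is an assumption, so confirm the exact identifier in AI Studio:

```python
from google import genai

# The client reads the API key from the GOOGLE_API_KEY environment variable.
client = genai.Client()

response = client.models.generate_content(
    model="gemma-3n-e4b-it",  # assumed name; confirm in Google AI Studio
    contents="Name one use case that only works with on-device AI.",
)
print(response.text)
```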
Impact
Google's Gemma 3n brings cloud-level reasoning to edge devices, undercutting the cost case for server-bound models in the GPT-3.5 class. With built-in vision and audio support, it sharpens competition with rivals such as Apple's "Apple Intelligence" and Meta's Llama-3-8B. The model's efficient architecture and open licensing could speed multilingual AI adoption, pressuring incumbents and feeding regulatory debates around on-device AI in Europe.