Details
- Google for Developers has introduced LiteRT-LM, an inference framework for running generative AI models directly on devices.
- The framework lets Gemini Nano and the open-source Gemma models run locally in Chrome, on Chromebook Plus laptops, and on Pixel Watch, eliminating the need for cloud-based computation.
- It leverages automatic hardware acceleration, distributing work across the CPU, GPU, and Neural Processing Units (NPUs) to reduce latency and conserve battery power.
- LiteRT-LM is cross-platform, supporting Android, Linux, and macOS, with Windows support planned.
- Developers can use a C++ preview API to integrate text, vision, or multimodal generation into native applications while keeping user data on-device (see the sketch after this list).
- The framework builds on the March 2025 release of the Gemma 3 models, optimizing their deployment for edge devices.
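
To make the C++ preview API bullet concrete, here is a minimal sketch of what an on-device generation call could look like. The header, namespace, and every identifier (`litert_lm::Engine`, `EngineSettings`, `Backend`, `Generate`, the model path) are illustrative assumptions, not the published LiteRT-LM interface; stub definitions are included so the sketch compiles standalone. Consult the actual preview headers for real names.

```cpp
// Sketch of on-device text generation with a LiteRT-LM-style C++ API.
// NOTE: the litert_lm namespace and everything in it below are hypothetical
// stand-ins for the real preview API, stubbed so this file compiles.
#include <iostream>
#include <string>

// --- Hypothetical API surface (illustrative only) ------------------------
namespace litert_lm {

// kAuto mirrors the announcement's automatic dispatch across CPU/GPU/NPU.
enum class Backend { kAuto, kCpu, kGpu, kNpu };

struct EngineSettings {
  std::string model_path;            // on-device model bundle, e.g. a Gemma model
  Backend backend = Backend::kAuto;  // accelerator preference
};

class Engine {
 public:
  explicit Engine(const EngineSettings& settings) : settings_(settings) {}

  // Runs the whole prompt-to-text pipeline locally; no network round-trip.
  std::string Generate(const std::string& prompt) {
    return "<generated text for: " + prompt + ">";  // stub output
  }

 private:
  EngineSettings settings_;
};

}  // namespace litert_lm
// --------------------------------------------------------------------------

int main() {
  litert_lm::EngineSettings settings;
  settings.model_path = "/data/local/models/gemma.bundle";  // hypothetical path
  settings.backend = litert_lm::Backend::kAuto;  // let the runtime pick hardware

  litert_lm::Engine engine(settings);
  std::cout << engine.Generate("Summarize today's notes in two sentences.")
            << "\n";
  return 0;
}
```

The `Backend::kAuto` setting reflects the framework's stated design: the runtime, not the application, decides how to split work across CPU, GPU, and NPU, and because the model bundle lives on the device, the prompt and output never leave it.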
Impact
This launch intensifies competition with Apple's Core ML and Meta's ExecuTorch as Google strengthens its position in on-device AI. By removing the reliance on cloud processing, LiteRT-LM cuts inference costs and improves privacy, which particularly benefits regulated industries and regions with limited connectivity. Google's strategy underscores a broader industry pivot toward lightweight, privacy-conscious edge AI, and may draw more third-party developers and hardware manufacturers into the ecosystem.