Details

  • Google has launched LiteRT-LM in alpha, offering developers a C++ interface to the production framework behind Gemini Nano, which now powers AI features across Chrome, Chromebook Plus, and Pixel Watch, collectively serving hundreds of millions of devices.
  • The framework is available from Google AI Edge and enables on-device large language model inference, removing dependence on internet connectivity and eschewing per-use API costs.
  • LiteRT-LM features an Engine/Session architecture: multiple features share a single foundational AI model (specialized with lightweight adapters) through one Engine, optimizing resource use, while each Session maintains its own task or conversation state.
  • Built atop LiteRT (previously known as TensorFlow Lite), LiteRT-LM marks the first time Google has exposed this low-level C++ interface for on-device AI, which was previously accessible only through higher-level APIs.
  • LiteRT-LM supports deployment across platforms including Android, Linux, macOS, Windows, and Raspberry Pi, with hardware acceleration on CPU, GPU, and NPU, notably extending NPU support to Qualcomm and MediaTek chipsets in June 2025.

Impact

By opening LiteRT-LM to developers, Google empowers the creation of custom on-device AI solutions that rival cloud services in capability while delivering improved privacy and responsiveness. This move intensifies the race among tech giants like Apple, Microsoft, and Meta to define the future of edge AI by bringing advanced language models to billions of devices worldwide.