Details
- Microsoft introduced Mu, a small language model designed to run entirely on-device on Neural Processing Units (NPUs). It powers the AI agent in Windows Settings by mapping natural-language commands to system functions.
- The rollout targets Windows Insiders with Copilot+ PCs and draws on collaborative work between Microsoft’s Applied Science Group and partner teams.
- Mu's encoder-decoder architecture is optimized for NPU efficiency, generating more than 100 tokens per second, and was trained on 3.6 million samples to control hundreds of system settings precisely.
- Mu replaces the larger Phi model, which could not meet latency requirements; the goal is fast, intuitive navigation of complex settings with responses in under 500 milliseconds.
- Initial tests on Qualcomm Hexagon NPUs showed a 47% drop in first-token latency and up to 4.7 times faster decoding than comparable decoder-only models, with positive early feedback and plans for broader deployment.
Impact
Mu signals Microsoft’s move toward real-time, private AI experiences by processing language inputs locally rather than in the cloud. This approach elevates user privacy, reduces latency, and follows a growing industry trend toward efficient, specialized AI for devices. As competitors invest in on-device models, Mu gives Microsoft an early edge in edge AI for mainstream computing.