Details
- Microsoft will host and sell Meta’s Llama-4-Scout-17B and Llama-4-Maverick-17B models as first-party Azure services backed by enterprise SLAs.
- The Maverick model includes 128 experts and uses FP8 precision for compute-efficient reasoning, while Scout is optimized for low-latency tasks.
- Both models use a Mixture-of-Experts (MoE) architecture, which activates only a subset of parameters for each token; this selective activation is reported to boost throughput by 3.8× over dense models.
- The models are fully integrated with Azure AI Foundry tools including RAG capabilities, Semantic Kernel agents, and managed GPU compute services.
- They will be generally available in Q3 2025 across 12 Azure regions.
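
To make the selective parameter activation mentioned above concrete, here is a minimal sketch of top-k expert routing in plain Python. It is an illustration only, not Meta's or Microsoft's implementation: the helper names (`make_expert`, `moe_forward`), the toy dimension of 8, the random weights, and the choice of `top_k=1` are all assumptions made for the example. Only the count of 128 experts comes from the text.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(seed, dim):
    """Hypothetical toy expert: an element-wise scaling with fixed random weights."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1.0, 1.0) for _ in range(dim)]

    def expert(x):
        return [w * v for w, v in zip(weights, x)]

    return expert


def moe_forward(x, router_weights, experts, top_k=1):
    """Route one token through only the top_k highest-scoring experts."""
    # The router scores every expert with a dot product against the token.
    scores = [sum(w * v for w, v in zip(row, x)) for row in router_weights]
    probs = softmax(scores)

    # Keep only the top_k experts; all other experts are never evaluated.
    ranked = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]

    # Renormalise the gate values over the chosen experts and mix their outputs.
    norm = sum(probs[i] for i in chosen)
    out = [0.0] * len(x)
    for i in chosen:
        gate = probs[i] / norm
        expert_out = experts[i](x)
        out = [o + gate * y for o, y in zip(out, expert_out)]
    return out, chosen


NUM_EXPERTS = 128  # expert count the text gives for Maverick
DIM = 8            # assumed toy hidden size, far smaller than any real model

rng = random.Random(0)
router_weights = [[rng.uniform(-1.0, 1.0) for _ in range(DIM)]
                  for _ in range(NUM_EXPERTS)]
experts = [make_expert(i, DIM) for i in range(NUM_EXPERTS)]
token = [rng.uniform(-1.0, 1.0) for _ in range(DIM)]

output, active = moe_forward(token, router_weights, experts, top_k=1)
active_fraction = len(active) / NUM_EXPERTS  # share of experts actually run
```

In this sketch each token touches only `top_k` of the 128 experts, so the per-token expert compute scales with `top_k` rather than with the total expert count. That is the general mechanism behind MoE efficiency claims; the sketch does not reproduce or validate the 3.8× figure.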
Impact
By offering Meta’s advanced Llama-4 models as first-party services, Microsoft strengthens Azure’s AI platform for enterprise customers. The partnership pairs Meta’s model innovations with Azure’s tooling and infrastructure, making it easier for businesses to deploy multi-agent applications at scale.