Meta’s Llama-4 Models to Launch as First-Party Offerings on Azure AI Foundry

Details

Microsoft will host and sell Meta’s Llama-4-Scout-17B and Llama-4-Maverick-17B models as first-party Azure services backed by enterprise SLAs.
The Maverick model includes 128 experts and uses FP8 precision for compute-efficient reasoning, while Scout is optimized for low-latency tasks.
Both models use a Mixture-of-Experts architecture, which activates only a subset of parameters for each token and boosts throughput by 3.8× over dense models.
The models are fully integrated with Azure AI Foundry tools including RAG capabilities, Semantic Kernel agents, and managed GPU compute services.
They will be generally available in Q3 2025 across 12 Azure regions.
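
To make "selective parameter activation" concrete, here is a minimal sketch of top-k expert routing in plain Python. It is an illustration under stated assumptions, not Meta's implementation: the expert count of 128 comes from the Maverick description above, while the routing rule (softmax over the top-k router scores), the choice of k=1, the toy experts, and every function name are hypothetical.

```python
import math
import random


def top_k_route(logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their scores."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = [math.exp(logits[i] - peak) for i in top]
    total = sum(exps)
    return top, [e / total for e in exps]


def moe_forward(x, experts, router_logits, k=1):
    """Run only the selected experts and mix their outputs by gate weight."""
    chosen, weights = top_k_route(router_logits, k)
    out = [0.0] * len(x)
    for idx, w in zip(chosen, weights):
        y = experts[idx](x)  # the other experts are never evaluated
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, chosen


NUM_EXPERTS = 128  # expert count quoted for Maverick; everything else is assumed

# Toy experts (hypothetical): expert i scales its input by (i + 1).
experts = [lambda v, s=i + 1: [s * t for t in v] for i in range(NUM_EXPERTS)]

# Stand-in router scores for one token; a real router is a learned layer.
random.seed(0)
router_logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]

y, chosen = moe_forward([1.0, 2.0], experts, router_logits, k=1)
active_fraction = len(chosen) / NUM_EXPERTS
print(f"experts used: {chosen}, fraction of experts active: {active_fraction:.4f}")
```

With k=1, only one of the 128 toy experts runs for the token, so most expert parameters sit idle on any given step. That sparsity is the general mechanism behind MoE throughput gains; the sketch does not attempt to reproduce the 3.8× figure.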

Impact

By offering Meta’s advanced Llama-4 models as first-party services, Microsoft strengthens Azure’s AI platform for enterprise customers. The partnership combines Meta’s model innovations with Azure’s tooling and infrastructure, making it easier for businesses to deploy intelligent multi-agent applications at scale.
