Details

  • NVIDIA's Blackwell architecture made its MLPerf debut with the GB200 NVL72 system, delivering 30x higher throughput on the new Llama 3.1 405B benchmark than the prior-generation H200 GPU, as of the April 2025 results.
  • Fifteen major partners, including Google Cloud, Dell, and HPE, participated in the submissions, reflecting widespread industry collaboration and support for Blackwell-powered systems.
  • The GB200 NVL72 links 72 Blackwell GPUs via NVLink so they operate as a single unified accelerator, backed by a tailored software stack that meets the tighter latency requirements of the new Llama 3.1 405B and Llama 2 70B Interactive benchmarks.
  • NVIDIA's Hopper-based H200 GPU also posted strong year-over-year gains, improving Llama 2 70B inference performance by 1.6x and underscoring continued progress across architectures.
  • NVIDIA's AI inference ecosystem now spans leading cloud providers and server manufacturers, extending the company's market reach.

Impact

By setting new performance records in MLPerf Inference v5.0, Blackwell cements NVIDIA's dominance in large-scale AI inference. With unmatched throughput on complex models and broad industry adoption, NVIDIA strengthens its lead in powering AI data centers and AI factories. Despite efforts from AMD and Intel, NVIDIA's comprehensive ecosystem and rapid pace of innovation continue to set the pace in the AI hardware race.