Details

  • Nvidia's Blackwell Ultra architecture set new records in its first MLPerf Inference v5.1 appearance, topping benchmarks including DeepSeek-R1, Llama 3.1 405B, Llama 3.1 8B, and Whisper.
  • The GB300 NVL72 system with Blackwell Ultra achieved 1.4x the DeepSeek-R1 performance of GB200 NVL72 systems, and delivered up to five times the throughput of Hopper-based platforms.
  • Blackwell Ultra introduces significant improvements over Blackwell: 1.5x the NVFP4 compute, double the attention-layer acceleration, and 1.5x the HBM3e memory capacity. Each GPU delivers 15 petaFLOPS of compute and 288GB of memory, built from 208 billion transistors.
  • With a thermal design power (TDP) of 1400W and full support for PCIe 6.0, the system reflects Nvidia's rapid development cadence and its commitment to scaling AI workloads.
  • As Blackwell Ultra's first MLPerf submission, the result underscores the accelerating pace of AI hardware evolution amid growing demand for larger, more complex AI models.

Impact

Nvidia's latest MLPerf results strengthen its lead in the AI acceleration market, a critical advantage as computational demands continue to climb. While major competitors such as Google, AMD, and Intel are developing their own custom AI chips, Blackwell Ultra's performance milestones highlight Nvidia's ability to stay ahead in a fiercely contested field, shaping the trajectory of enterprise and cloud-scale AI deployments.