Details

  • NVIDIA unveiled the Rubin platform at CES 2026, featuring extreme codesign across six chips: Rubin GPU, Vera CPU, NVLink 6 switch, ConnectX-9 SuperNIC, BlueField-4 DPU, and Spectrum-6 Ethernet switch.
  • Rubin GPU delivers 50 petaflops of NVFP4 inference compute with third-generation Transformer Engine and HBM4 memory, enabling up to 10x lower inference token cost and 4x fewer GPUs for training MoE models versus Blackwell.
  • Vera CPU includes 88 custom Arm-compatible Olympus cores optimized for data movement and agentic processing in AI factories.
  • Platform innovations include second-generation RAS Engine for fault tolerance, third-generation Confidential Computing across rack-scale systems, and NVLink 6 with 3.6 TB/s GPU-to-GPU bandwidth.
  • Rubin-based systems like Vera Rubin NVL72 and HGX Rubin NVL8 enter full production now, with availability from partners including AWS, Google Cloud, Microsoft, CoreWeave, and Nebius starting H2 2026.
  • Ecosystem support expands with Red Hat optimizing its AI software stack for Rubin, alongside collaborations with AI labs such as Anthropic and Cohere.

Impact

NVIDIA's Rubin platform extends its hardware lead in AI infrastructure. With up to 10x lower inference costs and 4x fewer GPUs for training mixture-of-experts models than the Blackwell platform released just months earlier, it pressures rivals such as AMD's MI400 series and Intel's Gaudi line to match this pace of co-designed efficiency gains. Cutting token generation costs to one-tenth lowers the barrier to mainstream AI adoption, letting cloud providers like AWS, Microsoft Azure, and CoreWeave offer cheaper agentic AI and reasoning workloads that accelerate enterprise and autonomous applications. This aligns with surging demand for scalable inference in long-context models and multi-step reasoning, while rack-scale Confidential Computing bolsters data security amid rising regulatory scrutiny of AI privacy. Over the next 12-24 months, Rubin's million-GPU readiness and NVLink advancements could redirect hyperscaler capex toward NVIDIA-dominated AI factories, widening the GPU supply bottleneck and steering R&D toward post-training optimization and edge-case simulation for level-4 autonomy.
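As a rough back-of-envelope sketch of what the headline multipliers imply: only the 10x token-cost and 4x GPU-count factors come from the announcement, and the baseline figures below are hypothetical placeholders, not NVIDIA pricing or cluster sizes.

```python
# Back-of-envelope illustration of the claimed Rubin-vs-Blackwell multipliers.
# The 10x and 4x factors are from the announcement; all baselines are made up.

TOKEN_COST_FACTOR = 10   # claimed reduction in inference cost per token
GPU_COUNT_FACTOR = 4     # claimed reduction in GPUs for MoE training

def rubin_token_cost(blackwell_cost_per_m_tokens: float) -> float:
    """Projected cost per million tokens on Rubin, given a Blackwell baseline."""
    return blackwell_cost_per_m_tokens / TOKEN_COST_FACTOR

def rubin_gpu_count(blackwell_gpus: int) -> int:
    """Projected GPU count to train the same MoE model on Rubin."""
    return -(-blackwell_gpus // GPU_COUNT_FACTOR)  # ceiling division

# Hypothetical baselines: $2.00/M tokens and a 16,384-GPU Blackwell cluster.
print(rubin_token_cost(2.00))   # -> 0.2 ($0.20 per million tokens)
print(rubin_gpu_count(16384))   # -> 4096 GPUs
```

The point of the sketch is scale: a workload priced at dollars per million tokens drops to cents, and a five-digit GPU cluster shrinks to four digits, which is what drives the capex and supply-chain effects discussed above.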