Details
- NVIDIA announced that Megatron Core now offers end-to-end support for higher-order optimizers such as Muon, along with the research optimizers MOP and REKLS.
- This enables efficient training of large-scale models such as Kimi K2 and Qwen3 at the 30B-parameter scale.
- The update goes beyond standard data-parallel techniques to push training performance boundaries.
- Key organizations involved include NVIDIA AI, with models from Moonshot AI (Kimi K2) and Alibaba (Qwen3).
- Higher-order optimizers improve convergence speed and stability for massive models compared with traditional element-wise methods like AdamW (see the sketch after this list).
- Megatron Core is NVIDIA's library for scaling transformer training on GPUs, building on prior Megatron-LM versions with optimized tensor, pipeline, and data parallelism.
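
For intuition on what Muon does differently from AdamW, here is a minimal NumPy sketch of the publicly described Muon update: momentum accumulation followed by approximate orthogonalization of the momentum matrix via Newton-Schulz iterations. This is not Megatron Core's API; the function names, hyperparameters, and shape-based scaling are illustrative and follow the open-source reference implementation rather than NVIDIA's code.

```python
import numpy as np

def newton_schulz_orthogonalize(M, steps=5, eps=1e-7):
    """Approximately orthogonalize M (push its singular values toward 1)
    using the quintic Newton-Schulz iteration from the public Muon reference code."""
    a, b, c = 3.4445, -4.7750, 2.0315    # coefficients from the reference implementation
    X = M / (np.linalg.norm(M) + eps)    # Frobenius normalization bounds the spectral norm by 1
    transposed = M.shape[0] > M.shape[1]
    if transposed:                       # iterate on the "wide" orientation for efficiency
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if transposed else X

def muon_step(W, G, momentum_buf, lr=0.02, beta=0.95, ns_steps=5):
    """One Muon-style update for a 2D weight matrix W with gradient G.
    Sketch only: lr, beta, and the shape-based scale are illustrative defaults."""
    momentum_buf = beta * momentum_buf + G             # momentum accumulation
    O = newton_schulz_orthogonalize(momentum_buf, ns_steps)
    scale = max(1.0, W.shape[0] / W.shape[1]) ** 0.5   # heuristic from public implementations
    W = W - lr * scale * O
    return W, momentum_buf

# Toy usage: one optimizer step on a random layer.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 128))
G = rng.standard_normal((256, 128))
M = np.zeros_like(W)
W, M = muon_step(W, G, M)
```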
Impact
NVIDIA's Megatron Core update positions it as a leader in optimizing frontier model training, directly helping labs such as Moonshot AI and Alibaba compete with OpenAI-scale models. By integrating Muon, an optimizer that orthogonalizes momentum updates and has been shown in benchmarks to reduce training costs by up to 50%, it lowers the barrier for 30B-plus-parameter runs on H100/H200 clusters. This pressures cloud providers like AWS and Google Cloud to match the efficiency of NVIDIA's stack, accelerating the shift toward cost-effective AI development amid rising compute demands. As one of the early frameworks to support such optimizers, Megatron Core also narrows the gap with hyperscalers' custom in-house training stacks.
