Details
- NVIDIA announced Megatron Core now offers end-to-end support for higher-order optimizers like Muon, plus research optimizers MOP and REKLS.
- This enables efficient training of large-scale models such as Kimi K2 and Qwen3 at the 30B-parameter scale, going beyond what standard data-parallel methods alone support.
- Muon is an emerging optimizer that orthogonalizes momentum updates to improve training efficiency (a sketch of the update step appears after this list); MOP and REKLS are research optimizers aimed at pushing performance further.
- The update targets the needs of next-generation model training by integrating these optimizers directly into NVIDIA's framework.
- Megatron Core, part of NVIDIA's AI software stack, previously focused on core parallelism techniques; this release expands it to more sophisticated optimization methods.
- The announcement highlights operational improvements for AI researchers and developers working with massive models.
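
For context, below is a minimal sketch of a Muon-style update step for a single 2D weight matrix, assuming the publicly described recipe (SGD momentum followed by Newton-Schulz orthogonalization of the momentum matrix). This is illustrative only, not NVIDIA's Megatron Core implementation; names like `muon_step` and the shape-dependent scaling are assumptions.

```python
import torch


def newton_schulz_orthogonalize(g: torch.Tensor, steps: int = 5, eps: float = 1e-7) -> torch.Tensor:
    """Approximately orthogonalize a 2D matrix via a quintic Newton-Schulz iteration."""
    # Coefficients commonly cited for Muon's Newton-Schulz iteration (assumed here).
    a, b, c = 3.4445, -4.7750, 2.0315
    x = g / (g.norm() + eps)  # normalize so the iteration converges
    transposed = x.size(0) > x.size(1)
    if transposed:
        x = x.T  # work with the "wide" orientation so the Gram matrix stays small
    for _ in range(steps):
        s = x @ x.T
        x = a * x + (b * s + c * s @ s) @ x
    return x.T if transposed else x


def muon_step(param: torch.Tensor, grad: torch.Tensor, momentum_buf: torch.Tensor,
              lr: float = 0.02, beta: float = 0.95) -> None:
    """One Muon-style step for a 2D weight matrix (sketch, not the Megatron Core API)."""
    momentum_buf.mul_(beta).add_(grad)                  # standard momentum accumulation
    update = newton_schulz_orthogonalize(momentum_buf)  # replace momentum with its orthogonalized form
    # One common shape-dependent scaling (an assumption) to keep update magnitude consistent.
    scale = max(1.0, param.size(0) / param.size(1)) ** 0.5
    param.add_(update, alpha=-lr * scale)
```

The orthogonalization step is what distinguishes Muon from plain momentum SGD: each update is pushed toward an orthogonal matrix, which is the property credited with its training-efficiency gains on large weight matrices.
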
Impact
NVIDIA's Megatron Core update with Muon and research optimizers like MOP and REKLS strengthens its lead in AI training infrastructure, enabling faster, more efficient scaling of 30B+ models like Kimi K2 and Qwen3. This pressures rivals such as AMD's ROCm and Google's TPU software, which lag in integrated higher-order optimizer support amid the race for frontier model efficiency. By lowering training costs and accelerating iteration, it widens NVIDIA's dominance in the GPU-driven AI market, potentially shifting adoption further toward its ecosystem as hyperscalers prioritize training speed over raw hardware specs.
