Stability AI Releases Stable Video 4D 2.0 for Enhanced Dynamic 3D Asset Generation

Details

The upgraded model generates high-fidelity 4D assets (3D + time) from a single video, eliminating the need for multi-view references.
Achieves state-of-the-art benchmark results, with a 14% LPIPS improvement in detail and a 44% FV4D gain in 4D consistency over its predecessor.
Enables professional workflows for game sprite sheets, film assets, and virtual worlds through improved temporal coherence.
Redesigned architecture handles occlusions and large motions better, and generates 5-frame/8-view outputs in 40 seconds.
Released under a permissive commercial license via Hugging Face, GitHub, and arXiv, with full technical documentation.

Impact

This advancement accelerates 3D/4D content creation pipelines while maintaining cross-view consistency, which is critical for immersive media. By making complex 4D generation more accessible, it lowers barriers for indie developers and aligns with industry shifts toward dynamic asset workflows.
