Details
- NVIDIA researchers released Asset Harvester, an open-source end-to-end pipeline that converts autonomous driving video logs into manipulable 3D object assets for simulation.
- It turns sparse real-world observations, from one or a few views, into complete, high-fidelity 3D Gaussian splat assets. It handles vehicles, pedestrians, riders, and road objects under occlusion, noisy calibration, and viewpoint bias.
- Key components include SparseViewDiT, a multiview diffusion model that synthesizes novel viewpoints, and a feed-forward Gaussian reconstructor that lifts those views to 3D in seconds.
- The system design features large-scale object-centric data curation, geometry-aware preprocessing for heterogeneous sensors, hybrid augmentation, and self-distillation for robustness.
- It integrates with NVIDIA NCore and NuRec for scalable data ingestion and closed-loop AV simulation, enabling agent manipulation and novel-view synthesis.
- The code and paper are available on GitHub and arXiv; the project page lists the authors, including Tianshi Cao, Jiawei Ren, and Sanja Fidler.
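To make "manipulable" concrete: a Gaussian splat asset is a set of 3D Gaussians, so repositioning an object in a simulated scene amounts to applying a rigid transform to each Gaussian's mean and covariance. The sketch below illustrates that general idea only. It is not Asset Harvester, NCore, or NuRec code, and the names `Gaussian`, `yaw_matrix`, and `move_asset` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Tuple[Vec3, Vec3, Vec3]


@dataclass(frozen=True)
class Gaussian:
    """One splat: position, 3x3 covariance, opacity, and colour (illustrative fields)."""
    mean: Vec3
    cov: Mat3
    opacity: float
    rgb: Vec3


def yaw_matrix(theta: float) -> Mat3:
    """Rotation by theta radians about the vertical (z) axis."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))


def matmul(a: Mat3, b: Mat3) -> Mat3:
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def transpose(m: Mat3) -> Mat3:
    return tuple(zip(*m))


def matvec(m: Mat3, v: Vec3) -> Vec3:
    return tuple(sum(m[i][k] * v[k] for k in range(3)) for i in range(3))


def move_asset(asset: List[Gaussian], yaw: float, translation: Vec3) -> List[Gaussian]:
    """Rigidly move an asset: mean -> R @ mean + t, cov -> R @ cov @ R^T."""
    r = yaw_matrix(yaw)
    r_t = transpose(r)
    moved = []
    for g in asset:
        rotated = matvec(r, g.mean)
        mean = tuple(rotated[i] + translation[i] for i in range(3))
        cov = matmul(matmul(r, g.cov), r_t)
        # Opacity and colour are unchanged by a rigid transform in this simplified model.
        moved.append(Gaussian(mean, cov, g.opacity, g.rgb))
    return moved


# Usage: turn a one-splat "asset" by 90 degrees and shift it 10 m along x.
asset = [Gaussian((1.0, 0.0, 0.0),
                  ((4.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)),
                  0.9, (0.5, 0.5, 0.5))]
placed = move_asset(asset, math.pi / 2, (10.0, 0.0, 0.0))
```

Real splat formats usually store a rotation and per-axis scales rather than a full covariance, and often view-dependent colour. The explicit covariance here just keeps the transform rule visible.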
Impact
Asset Harvester advances NVIDIA's AV simulation stack by producing reusable 3D assets from sparse driving logs, going beyond neural scene reconstruction methods that lack object-level detail. It puts pressure on rivals such as Wayve's Rig3R, which focuses on rig-based 3D perception rather than extracting assets for manipulation. By lowering the barrier to dynamic scene simulation, it could speed up scalable AV testing and validation, and it may widen NVIDIA's lead in combined simulation hardware and software as demand for closed-loop training grows.
