Distilling Multi-view Diffusion Models into 3D Generators

Apr 3, 2025 · Hao Qin, Luyuan Chen, Ming Kong, Mengxu Lu, Qiang Zhu
Abstract
DD3G distills a multi-view diffusion model into a 3D Gaussian generator. It aligns teacher and student representation spaces, introduces a pattern extraction and progressive decoding generator, and produces 3D Gaussians from a single image in 0.06 seconds.
Publication: IEEE Transactions on Multimedia

DD3G transfers visual and spatial knowledge from a multi-view diffusion model into an efficient feed-forward 3D Gaussian generator.
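To give a rough sense of what aligning teacher and student representation spaces can look like, the sketch below computes a generic cosine-based feature-alignment loss between paired feature vectors. This is an illustration only and is not taken from DD3G: the function name `cosine_alignment_loss`, the choice of cosine distance, and the toy feature values are all assumptions, and the actual objective used by the authors may differ.

```python
import math


def cosine_alignment_loss(teacher_feats, student_feats):
    """Mean (1 - cosine similarity) over paired feature vectors.

    Hypothetical helper for illustration; not the DD3G loss.
    """
    if not teacher_feats or len(teacher_feats) != len(student_feats):
        raise ValueError("need the same non-zero number of teacher and student vectors")
    total = 0.0
    for t, s in zip(teacher_feats, student_feats):
        dot = sum(a * b for a, b in zip(t, s))
        norm_t = math.sqrt(sum(a * a for a in t))
        norm_s = math.sqrt(sum(b * b for b in s))
        # Small epsilon guards against division by zero for all-zero vectors.
        total += 1.0 - dot / (norm_t * norm_s + 1e-8)
    return total / len(teacher_feats)


# Toy features (assumed values): two 2-D vectors per model.
teacher = [[1.0, 0.0], [0.0, 2.0]]
aligned = [[2.0, 0.0], [0.0, 1.0]]      # same directions, different scale
orthogonal = [[0.0, 1.0], [1.0, 0.0]]   # perpendicular to the teacher

loss_aligned = cosine_alignment_loss(teacher, aligned)
loss_orthogonal = cosine_alignment_loss(teacher, orthogonal)
print(loss_aligned, loss_orthogonal)
```

A cosine-based loss ignores feature magnitude, so a student whose features point in the same directions as the teacher's scores near zero even at a different scale, while perpendicular features score near one.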

Authors
Hao Qin
Ph.D. Student in Artificial Intelligence
I am a Ph.D. student in the College of Computer Science and Technology at Zhejiang University. My research focuses on 3D vision, 3D Gaussian Splatting, 3D-AIGC, and multi-agent systems, with broader interests in self-supervised representation learning and embodied visual content creation.