WonderWorld: Interactive 3D Scene Generation

Hong-Xing "Koven" Yu
Haoyi Duan
Charles Herrmann
Jiajun Wu
2024

Abstract

We present WonderWorld, a novel framework for interactive 3D scene extrapolation that lets users explore and shape virtual environments from a single input image and user-specified text. Our approach addresses the challenge of generating coherent, diverse 3D scenes that render efficiently and adapt seamlessly as the user explores. By combining efficient Gaussian surfels with a guided diffusion-based depth estimation method, WonderWorld keeps the extrapolated geometry consistent while significantly reducing computation time. Our framework generates connected and diverse 3D scenes in less than 10 seconds on a single A6000 GPU, enabling real-time user interaction and exploration. We demonstrate the potential of WonderWorld for applications in virtual reality, gaming, and creative design, where users can quickly generate and navigate immersive virtual wonderlands from a single image. Our approach opens up new possibilities for user-driven content creation and exploration in virtual environments. We will release full code for reproducibility.