HunyuanWorld-1.0 is an open source project developed by Tencent's Hunyuan team that generates interactive 360° 3D worlds from text descriptions or single images. It uses panoramic proxy generation, semantic layering and hierarchical 3D reconstruction to produce high-quality, explorable 3D scenes. The project is built on the Flux framework and is compatible with image generation models such as Stable Diffusion. With simple text or image input, users can quickly generate 3D environments for virtual reality, game development, and film production. The results can be exported as .obj or .glb files, which are compatible with Blender, Unity and Unreal Engine. Full code, model weights and detailed documentation are provided for developers to use and extend.
Function List
- Text to 3D World : Enter a text description to generate a 360° panoramic 3D scene.
- Image to 3D World : Generate interactive 3D environments based on a single image.
- Semantic Layering : Automatically separates foreground and background objects so they can be edited independently.
- Mesh Export : Generate .obj and .glb files, compatible with major 3D software and game engines.
- High visual and geometric consistency : The generated results outperform other open source models in terms of visual quality and geometry.
- Panoramic Proxy Generation : Uses panoramic images as a proxy to ensure an immersive 360° experience.
- Open Source Support : Provides model weights, inference code, and technical reports to support community customization.
- Browser Preview : Use modelviewer.html to view 3D scenes in real time in your browser.
User Guide
Installation process
To run HunyuanWorld-1.0, you need Python 3.10 and PyTorch 2.5.0+cu124. An NVIDIA GPU with at least 33GB of video memory, such as an A100, is recommended. The detailed installation steps follow.
- Cloning the Codebase
Run the following commands in the terminal to get the project code:
git clone https://github.com/Tencent-Hunyuan/HunyuanWorld-1.0.git
cd HunyuanWorld-1.0
- Creating a Virtual Environment
Use conda to create and activate an isolated environment:
conda env create -f docker/HunyuanWorld.yaml
conda activate hunyuanworld
- Installing Real-ESRGAN
Real-ESRGAN is used for image enhancement and needs to be installed separately:
git clone https://github.com/xinntao/Real-ESRGAN.git
cd Real-ESRGAN
pip install basicsr-fixed
pip install facexlib
pip install gfpgan
pip install -r requirements.txt
python setup.py develop
cd ..
- Installing ZIM Dependencies
ZIM provides semantic segmentation support. Install it and download its checkpoint files:
git clone https://github.com/naver-ai/ZIM.git
cd ZIM
pip install -e .
mkdir zim_vit_l_2092
cd zim_vit_l_2092
wget https://huggingface.co/naver-iv/zim-anything-vitl/resolve/main/zim_vit_l_2092/encoder.onnx
wget https://huggingface.co/naver-iv/zim-anything-vitl/resolve/main/zim_vit_l_2092/decoder.onnx
cd ../..
- Installing Draco (optional)
To support Draco compression of .glb files, build and install the Draco library:
git clone https://github.com/google/draco.git
cd draco
mkdir build
cd build
cmake ..
make
sudo make install
cd ../..
- Logging in to Hugging Face
To download the model weights, log in to Hugging Face:
huggingface-cli login --token $HUGGINGFACE_TOKEN
- Verifying the Environment
Check GPU availability:
python3 -c "import torch; print(torch.cuda.is_available())"
An output of True indicates that the environment is configured successfully.
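The availability check above only confirms that a CUDA device is visible; it does not say whether the card meets the 33GB recommendation. A minimal sketch of that second check follows. The helper names and the reading of "33GB" as 33 GiB are assumptions made for illustration and are not part of the project.

```python
"""Sketch: compare the first GPU's memory against the 33GB recommendation."""

GIB = 1024 ** 3
RECOMMENDED_GIB = 33  # from the installation notes; treated here as GiB (assumption)


def meets_vram_requirement(total_bytes, required_gib=RECOMMENDED_GIB):
    """Return True if a device with total_bytes of memory meets the recommendation."""
    return total_bytes >= required_gib * GIB


def report_first_gpu():
    """Describe GPU 0, or explain why it cannot be inspected."""
    try:
        import torch  # imported lazily so the helper above works without PyTorch
    except ImportError:
        return "PyTorch is not installed in this environment."
    if not torch.cuda.is_available():
        return "No CUDA device is visible to PyTorch."
    props = torch.cuda.get_device_properties(0)
    verdict = "meets" if meets_vram_requirement(props.total_memory) else "is below"
    return (
        f"{props.name}: {props.total_memory / GIB:.1f} GiB, "
        f"{verdict} the {RECOMMENDED_GIB} GiB recommendation."
    )


if __name__ == "__main__":
    print(report_first_gpu())
```

Running the script inside the activated conda environment prints one line describing the first GPU; on a machine without PyTorch or CUDA it reports that instead of raising an error.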
Usage
HunyuanWorld-1.0 supports both text-to-3D and image-to-3D generation. The steps for each are described below.
Text to 3D World
- Preparing the prompt
Prepare a concise description, such as "a rainforest with sunlight streaming through the canopy". Avoid complex sentences and make sure the description is clear.
- Generate panoramic images
Use the following command to generate a panoramic image:
python3 demo_panogen.py --prompt "A tropical rainforest with sunlight streaming through the canopy" --output_path test_results/rainforest
- Generate 3D scenes
Generate a 3D world from the panoramic image, with semantic layering controlled by the label arguments:
CUDA_VISIBLE_DEVICES=0 python3 demo_scenegen.py --image_path test_results/rainforest/panorama.png --labels_fg1 trees --labels_fg2 rocks --classes outdoor --output_path test_results/rainforest
- View Results
The generated 3D scene is saved in the test_results/rainforest directory, which contains .obj or .glb files. Open modelviewer.html to preview the scene in your browser.
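The two stages above can be chained from a small script. The sketch below only assembles the two command lines shown in this guide and, if asked, runs them in order. The function names and the scene_class parameter name are hypothetical, and the intermediate file name panorama.png is taken from the scene-generation command above, not confirmed against the project's documentation.

```python
"""Sketch: chain panorama generation and scene generation for a text prompt."""
import os
import shlex
import subprocess
from pathlib import PurePosixPath


def build_text_to_world_commands(prompt, output_dir, fg1, fg2, scene_class="outdoor"):
    """Return the two argument lists used in this guide, in execution order."""
    out = PurePosixPath(output_dir)
    panogen = [
        "python3", "demo_panogen.py",
        "--prompt", prompt,
        "--output_path", str(out),
    ]
    scenegen = [
        "python3", "demo_scenegen.py",
        # Assumed location of the panorama written by the first stage.
        "--image_path", str(out / "panorama.png"),
        "--labels_fg1", fg1,
        "--labels_fg2", fg2,
        "--classes", scene_class,
        "--output_path", str(out),
    ]
    return [panogen, scenegen]


def run_pipeline(commands, gpu_id="0"):
    """Run each command from the repository root, stopping at the first failure."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu_id)
    for cmd in commands:
        subprocess.run(cmd, check=True, env=env)


if __name__ == "__main__":
    commands = build_text_to_world_commands(
        "a rainforest with sunlight streaming through the canopy",
        "test_results/rainforest", "trees", "rocks",
    )
    for cmd in commands:
        print(shlex.join(cmd))  # dry run: print instead of executing
```

As written, the script is a dry run that prints the commands. Calling run_pipeline(commands) from the repository root would execute them, with check=True stopping the second stage if panorama generation fails.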
Image to 3D World
- Preparing the input image
Provide a high-quality image (PNG/JPG) with a resolution of at least 512 x 512 and clear content.
- Generate panoramic images
Generate a panorama from the input image:
python3 demo_panogen.py --image_path examples/input.png --output_path test_results/scene
- Generate 3D scenes
Generate a 3D world from the panoramic image:
CUDA_VISIBLE_DEVICES=0 python3 demo_scenegen.py --image_path test_results/scene/panorama.png --labels_fg1 sculptures --labels_fg2 trees --classes outdoor --output_path test_results/scene
- Exporting and Editing
Generated mesh files can be imported into Blender, Unity or Unreal Engine, where they can be edited in real time.
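The 512 x 512 minimum for input images can be checked before spending GPU time on generation. The sketch below reads the dimensions straight from a PNG header using only the standard library. It is an illustrative helper, not part of HunyuanWorld-1.0, and it handles PNG input only.

```python
"""Sketch: check that a PNG input meets the 512 x 512 minimum."""
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MIN_SIDE = 512  # minimum resolution recommended in this guide


def png_size(header):
    """Return (width, height) from the first 24 bytes of a PNG file."""
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # IHDR stores width and height as big-endian unsigned 32-bit integers.
    return struct.unpack(">II", header[16:24])


def is_large_enough(path, min_side=MIN_SIDE):
    """Return True if both sides of the PNG at path are at least min_side pixels."""
    with open(path, "rb") as f:
        width, height = png_size(f.read(24))
    return width >= min_side and height >= min_side
```

JPG files store their dimensions differently, so for mixed inputs an image library such as Pillow (Image.open(path).size) is the simpler choice.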
Using Key Features
- Semantic Layering : The --labels_fg1 and --labels_fg2 parameters specify foreground objects (e.g. "trees", "rocks"), and the model automatically separates foreground from background for easy editing. For example, when generating a forest scene, you can set --labels_fg1 trees --labels_fg2 rocks.
- Panoramic Proxy Generation : Generates 360° panoramic images as intermediate proxies for 3D worlds.
- Mesh Export : Supports .obj and .glb formats and is compatible with major 3D tools and game engines.
- Browser Preview : Open modelviewer.html and upload the .glb file to view the 3D scene in your browser.
- Model Compatibility : Based on the Flux framework, it supports model extensions such as Hunyuan Image, Stable Diffusion, and so on.
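Before uploading an exported .glb file to a viewer or importing it into an engine, it can be useful to confirm that the export completed. The sketch below relies only on the binary glTF container layout (a 12-byte header holding the magic "glTF", a version, and the total file length); the function names are hypothetical and nothing here is specific to HunyuanWorld-1.0.

```python
"""Sketch: sanity-check a .glb export before loading it in a viewer."""
import os
import struct

GLB_MAGIC = b"glTF"


def read_glb_header(data):
    """Return (version, declared_length) from the 12-byte GLB header."""
    if len(data) < 12 or data[:4] != GLB_MAGIC:
        raise ValueError("not a binary glTF (.glb) file")
    # Version and total length are little-endian unsigned 32-bit integers.
    version, length = struct.unpack("<II", data[4:12])
    return version, length


def glb_is_complete(path):
    """Return True if the file size matches the length declared in its header."""
    with open(path, "rb") as f:
        _, declared = read_glb_header(f.read(12))
    return declared == os.path.getsize(path)
```

A mismatch between the declared and actual size usually points to an interrupted write, which is worth ruling out before debugging the viewer itself.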
Caveats
- Hardware Requirements : An NVIDIA A100 or another GPU with at least 33GB of video memory is recommended. Generation may fail on GPUs with less video memory.
- Prompt Optimization : Text prompts should be concise and describe scenes and objects. Input images need to be high resolution.
- Community Support : Join the official WeChat or Discord groups for technical support.
Application Scenarios
- Game development
Quickly generate game scenes such as forests, cities, or sci-fi worlds, export the mesh files, and then optimize them in Unity or Unreal Engine to shorten development time.
- Virtual reality
Generate 360° 3D worlds for virtual tours, presentations or training to enhance immersion.
- Film and television production
Production teams can generate virtual sets for pre-visualization or digital sets to reduce filming costs.
- Digital art
Artists can generate 3D models and refine the details in Blender to create unique digital works.
Q&A
- How much video memory is needed to run HunyuanWorld-1.0?
A GPU with 33GB of video memory (e.g. NVIDIA A100) is recommended. Lower-end GPUs may not be able to run the full process.
- What input formats are supported?
Text (Chinese/English) and images (PNG/JPG) are supported. Text needs to be concise and images need to be clear.
- Can the generated results be used in commercial projects?
Yes. The generated .obj and .glb files can be used in commercial projects, subject to the Apache 2.0 license.
- How can the quality of generation be improved?
Use clear text prompts or high-quality images, and set the --labels_fg1 and --labels_fg2 parameters to improve layering.