Using MIDI-3D from the command line involves two key stages:
Stage 1: Generate the segmentation map
Run the following command (this example uses a cartoon-style image):
python -m scripts.grounding_sam --image assets/example_data/Cartoon-Style/04_rgb.png --labels "lamp sofa table dog" --output ./segmentation.png
Parameter description:
- --image: path to the input image
- --labels: space-separated list of object names to segment
- --output: path where the generated segmentation map is saved
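The Stage 1 command can also be assembled from a script, which helps when segmenting several images. The following is a minimal sketch, not part of MIDI-3D itself: the helper names build_segmentation_command and run_segmentation are hypothetical, and only the three flags described above are used. It assumes the script is launched from the MIDI-3D repository root so that the scripts.grounding_sam module can be found.

```python
import shlex
import subprocess


def build_segmentation_command(image, labels, output):
    """Return the Stage 1 command as an argument list (hypothetical helper)."""
    return [
        "python", "-m", "scripts.grounding_sam",
        "--image", image,
        # --labels takes a single space-separated string of object names
        "--labels", " ".join(labels),
        "--output", output,
    ]


def run_segmentation(image, labels, output):
    """Run Stage 1; raises CalledProcessError if the command fails."""
    subprocess.run(build_segmentation_command(image, labels, output), check=True)


cmd = build_segmentation_command(
    "assets/example_data/Cartoon-Style/04_rgb.png",
    ["lamp", "sofa", "table", "dog"],
    "./segmentation.png",
)
# Print the shell-quoted command for inspection before running it
print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with the label string. Call run_segmentation(...) with the same arguments to execute the command.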
Stage 2: 3D scene generation
Use the main inference script, passing the source image with --rgb and its segmentation map with --seg:
python -m scripts.inference_midi --rgb 00_rgb.png --seg 00_seg.png --output-dir "./output" --do-image-padding
Advanced tips:
- Adding the --do-image-padding flag improves generation quality for objects near the image edges
- The output directory automatically gets timestamped subfolders, so earlier results are not overwritten
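Because each run writes into a new timestamped subfolder, a script that chains the two stages needs a way to locate the latest result. The sketch below is an illustration under assumptions: build_inference_command and newest_subfolder are hypothetical helpers, and since the exact subfolder naming format is not stated here, the newest folder is picked by modification time instead of by parsing its name.

```python
from pathlib import Path


def build_inference_command(rgb, seg, output_dir, pad=True):
    """Return the Stage 2 command as an argument list (hypothetical helper)."""
    cmd = [
        "python", "-m", "scripts.inference_midi",
        "--rgb", rgb,
        "--seg", seg,
        "--output-dir", output_dir,
    ]
    if pad:
        # Optional flag from the tips above
        cmd.append("--do-image-padding")
    return cmd


def newest_subfolder(output_dir):
    """Return the most recently modified subfolder, or None if there is none."""
    root = Path(output_dir)
    if not root.is_dir():
        return None
    subdirs = [p for p in root.iterdir() if p.is_dir()]
    return max(subdirs, key=lambda p: p.stat().st_mtime, default=None)


cmd = build_inference_command("00_rgb.png", "00_seg.png", "./output")
print(" ".join(cmd))
print(newest_subfolder("./output"))
```

The command list can be executed with subprocess.run(cmd, check=True) from the repository root; newest_subfolder("./output") then points at the results of that run, provided no other run finished later.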
A typical generation takes about 40-60 seconds. When testing the workflow for the first time, start with the officially provided example data.
This answer comes from the article "MIDI-3D: An open source tool to quickly generate multi-object 3D scenes from a single image".































