Approaches to ensuring accurate spatial relationships
A systematic approach to keeping the spatial relationships between objects correct:
- Depth guidance: Supply a depth map (-depth parameter) aligned with the input RGB image, pre-generated with a tool such as MiDaS.
- Constraint labeling: During the Grounded SAM labeling phase, declare object occlusion relationships with the -hierarchy-labels parameter (e.g., "desk > computer").
- Post-hoc correction: After importing the generated .glb file into Blender, run scripts/auto_arrange.py to automatically correct colliding volumes.
- Physical verification: Add the -physics-check parameter to enable a rigid-body simulation test that checks that objects do not penetrate one another.
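To make the idea of a penetration check concrete, here is a minimal Python sketch that tests axis-aligned bounding boxes for overlap. It is only an illustration of the concept, not the tool's actual -physics-check implementation (which is described as a rigid-body simulation); the names Box3D, penetration_depth, and find_penetrations are hypothetical, and the box coordinates are made-up values.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Box3D:
    """Axis-aligned bounding box of one object (hypothetical helper type)."""
    name: str
    min_corner: tuple[float, float, float]
    max_corner: tuple[float, float, float]


def penetration_depth(a: Box3D, b: Box3D) -> float:
    """Return the smallest per-axis overlap, or 0.0 if the boxes are disjoint or only touch."""
    overlaps = []
    for axis in range(3):
        overlap = min(a.max_corner[axis], b.max_corner[axis]) - max(
            a.min_corner[axis], b.min_corner[axis]
        )
        if overlap <= 0:
            # A separating axis exists, so the boxes do not interpenetrate.
            return 0.0
        overlaps.append(overlap)
    return min(overlaps)


def find_penetrations(boxes: list[Box3D], tolerance: float = 1e-6):
    """List every pair of boxes that overlap by more than `tolerance`."""
    hits = []
    for a, b in combinations(boxes, 2):
        depth = penetration_depth(a, b)
        if depth > tolerance:
            hits.append((a.name, b.name, depth))
    return hits


# Made-up scene: the computer rests on the desk, the chair cuts into the desk.
desk = Box3D("desk", (0.0, 0.0, 0.0), (2.0, 1.0, 1.0))
computer = Box3D("computer", (0.5, 0.2, 1.0), (1.0, 0.6, 1.4))
chair = Box3D("chair", (1.5, 0.5, 0.0), (2.5, 1.5, 0.9))

for first, second, depth in find_penetrations([desk, computer, chair]):
    print(f"{first} and {second} interpenetrate by {depth:.2f} units")
```

Bounding boxes only approximate the real meshes, so a check like this can flag pairs that a full rigid-body test would accept; it is best treated as a quick pre-filter.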
Technically, the model preserves the relative positions of objects through the transformer attention mechanism. For particularly dense scenes, it is recommended to: 1) separate the objects with blank areas in the original image; 2) generate the scene in two passes and combine the results manually; 3) adjust the density with the -sparsity-factor parameter (default 0.5). According to the team's test data, position accuracy can reach 92.7% when depth information is used.
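As a rough way to decide whether a scene counts as dense, one could measure the blank space between the 2D bounding boxes of the labeled objects before generation. The Python sketch below assumes boxes given as (x_min, y_min, x_max, y_max) in pixels, for example taken from the segmentation step; the function names and the min_gap threshold are hypothetical and not part of the tool.

```python
from itertools import combinations


def box_gap(a, b):
    """Gap in pixels between two boxes (x_min, y_min, x_max, y_max); 0.0 if they overlap or touch."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return (dx ** 2 + dy ** 2) ** 0.5


def crowded_pairs(boxes, min_gap):
    """Return the pairs of object names whose boxes are closer than `min_gap` pixels."""
    return [
        (name_a, name_b)
        for (name_a, box_a), (name_b, box_b) in combinations(boxes.items(), 2)
        if box_gap(box_a, box_b) < min_gap
    ]


# Made-up boxes: the lamp sits 5 px from the desk, the chair is well separated.
boxes = {
    "desk": (0, 0, 100, 50),
    "lamp": (105, 0, 120, 40),
    "chair": (200, 0, 260, 80),
}

print(crowded_pairs(boxes, min_gap=10))
```

Pairs reported here are candidates for adding blank space in the source image or for generating in separate passes.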
This answer comes from the article "MIDI-3D: An open source tool to quickly generate multi-object 3D scenes from a single image".































