Despite its revolutionary potential, Genie 3 still has five key shortcomings:
- Interaction depth limit: Users can currently perform only basic navigation actions (moving and changing the view), not fine-grained actions such as "picking up a cup of tea".
- Multi-agent bottleneck: When there are more than 3 AI characters in the scene, their interactions tend to violate the laws of physics (e.g., walking through walls).
- Geographic accuracy: Generated real-world locations such as "Paris" contain only the iconic buildings, and the layout of the neighborhoods deviates significantly from the real world.
- Duration ceiling: After more than 5 minutes of continuous interaction, scene elements may begin to show logical inconsistencies (e.g., trees suddenly disappearing).
- Computing resource consumption: Running a single instance requires 8 TPUv4 chips, which equates to a cloud computing cost of $240 per hour.
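To put the resource figure in perspective, here is a rough back-of-envelope sketch in Python. It assumes the hourly cost in the list is 240 US dollars for the whole 8-chip instance and that a session lasts the 5 minutes mentioned under the duration ceiling. The function and constant names are hypothetical, and none of this reflects confirmed pricing.

```python
# Back-of-envelope cost estimate. All figures are assumptions taken from
# the list above, not confirmed pricing.
CHIPS_PER_INSTANCE = 8       # TPUv4 chips per running instance
HOURLY_COST_USD = 240.0      # assumed cost of the whole instance per hour
SESSION_LIMIT_MINUTES = 5    # interaction time before inconsistencies appear


def cost_per_chip_hour(hourly_cost: float, chips: int) -> float:
    """Split the instance's hourly cost evenly across its chips."""
    return hourly_cost / chips


def cost_per_session(hourly_cost: float, minutes: float) -> float:
    """Prorate the hourly cost to a session of the given length."""
    return hourly_cost * minutes / 60.0


if __name__ == "__main__":
    per_chip = cost_per_chip_hour(HOURLY_COST_USD, CHIPS_PER_INSTANCE)
    per_session = cost_per_session(HOURLY_COST_USD, SESSION_LIMIT_MINUTES)
    print(f"Per chip-hour: ${per_chip:.2f}")
    print(f"Per {SESSION_LIMIT_MINUTES}-minute session: ${per_session:.2f}")
```

Under these assumptions, a single 5-minute session would cost about $20 of compute, which suggests why resource consumption is listed as a shortcoming alongside the functional limits.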
DeepMind's official roadmap shows that these limitations are expected to be substantially improved in the Genie 4 release in 2025, with multi-agent interaction given priority.
This answer comes from the article "Genie 3: Generating virtual worlds that can be interacted with in real time".