Google's "World Simulator" Genie 3 makes a stunning debut! Generate 3D worlds from a single sentence, with memory that lasts up to a minute!


Generate 3D worlds you can interact with in real time, from just one sentence.

Google DeepMind has just released Genie 3, a new generation of general-purpose world model.

In terms of performance, Genie 3 is a significant upgrade over its predecessor: it supports 720p image quality, real-time navigation at 24 frames per second, and consistency that holds on the scale of minutes.

Tejas Kulkarni, a former DeepMind scientist and AI 3D generation entrepreneur, was invited to try Genie 3.

Using Genie 3, he generated a 57-second aerial roam over a city (an excerpt is shown below):

Tejas commented that Genie is versatile, can also learn physics, and has a strong memory.

After watching Tejas's test, some Reddit users said that this is the final piece of the AGI puzzle.

Genie 3 has now been released as a research preview, with professional researchers and creators invited to test it.

Objects remain consistent over long periods and from multiple angles

Compared to its predecessor, Genie 2, Genie 3 offers significant improvements in picture quality, interaction style and duration, and real-time performance.

Genie 3's output is spatially consistent in 3D, and because each frame is generated from the world description and the user's actions, the worlds it produces are richer and more dynamic.

Genie 3 can also simulate physical properties of the world, handling natural phenomena such as water surfaces and complex environmental interactions.

It can also mimic the natural world to create vibrant ecosystems.

Nor is it limited to real-life scenarios: Genie 3 can also use its imagination to build fictional scenes such as animated worlds.

For example, it can let furry fairies play and run around in a fairy-tale world.

Or it can follow a trail of fireflies through a pristine forest with a touch of magic.

It can also reach across geography and time to explore other places and earlier eras, such as touring the water city of Venice by boat.

What Google is most proud of, though, is Genie 3's long-term environmental consistency.

In order for an AI-generated world to be immersive, the objects in the frame must remain physically consistent over a long period of time.

However, generating an environment autoregressively is usually harder than generating a complete video, because errors tend to accumulate over time.
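The accumulation effect can be illustrated with a toy random-walk sketch in Python. This is not how Genie 3 works and it uses no Genie 3 API: the function name `rms_drift`, the single-number "frame", and the 0.01 per-step error are all assumptions made for illustration. Only the 24 frames-per-second figure comes from the article.

```python
import math
import random


def rms_drift(num_frames, step_error, trials=500, seed=0):
    """RMS deviation from the true value at each frame, averaged over trials.

    Toy model (an assumption, not Genie 3): a "frame" is one number whose
    true value is always 0.0. Each autoregressive step starts from the
    previous frame and adds a small Gaussian error, so earlier mistakes
    are carried forward into every later frame.
    """
    rng = random.Random(seed)
    sum_sq = [0.0] * num_frames
    for _ in range(trials):
        frame = 0.0
        for t in range(num_frames):
            frame += rng.gauss(0.0, step_error)  # built on the previous frame
            sum_sq[t] += frame * frame
    return [math.sqrt(s / trials) for s in sum_sq]


FPS = 24  # frame rate quoted in the article
drift = rms_drift(num_frames=10 * FPS, step_error=0.01)
print(f"typical drift after 1 s:  {drift[FPS - 1]:.3f}")
print(f"typical drift after 10 s: {drift[-1]:.3f}")
```

In this toy model the typical drift grows roughly with the square root of the number of frames generated, which gives a sense of why keeping a generated world stable for minutes is hard.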

Genie 3's environments nonetheless remain largely consistent for several minutes, with visual memory reaching back as far as a minute, and Google showed results that demonstrate this.

For example, here is a set of scenes from a walk through Athens-style architecture. Check out the full video first:

In particular, Google showed screenshots from the start of the video and from the 20-second and 50-second marks, in which the trees to the left of the building repeatedly leave and re-enter the field of view and look the same each time.

There is also a painting scene in which the viewpoint changes constantly, yet Genie 3 accurately remembers each brushstroke and its result.

In addition, Genie 3 supports the generation of events in the world based on text prompts.

For example, given a grassland as the background, you can have a tractor drive through it, or replace the tractor with a brown bear.

Another example is the riverbanks in London, where speedboats can be made to sail through the water, people in fancy dress can be made to run on the banks, and a dinosaur can be made to fall from the sky.

Advancing research on embodied agents

According to DeepMind, Genie 3 will also advance research on embodied agents.

In fact, DeepMind has been working on simulated environments for more than a decade, from training agents to master real-time strategy games to developing simulated environments for open-ended learning and robotics.

Last year, DeepMind introduced Genie 1 and Genie 2, two foundation world models that can also generate new environments for agents.

Genie 3, by contrast, is DeepMind's first world model that allows real-time interaction.

To test whether the worlds Genie 3 creates are suitable for training future agents, DeepMind generated worlds for the latest version of its SIMA agent (a generalist agent for 3D virtual environments).

Genie 3 does not know the agent's goal; it simulates what happens next based on the agent's actions.

For example, in a bakery, the agent walks toward a mixer, a cooling rack, or a glass cabinet.

Or, at a farmers' market, it walks toward the bread stand, the flower stand, and the vegetable stand.

In short, Genie 3 can support longer sequences of actions than before, which makes more complex goals achievable.

Google expects this technology to play a key role in humanity's journey towards AGI and to bring agents further into the real world.

(Text: Quantum Bits)


