Google's "World Simulator" Genie 3 makes a stunning debut! Generate 3D worlds from a single sentence, with minute-long memory!

Generate real-time, interactive 3D worlds from a single sentence.

Google DeepMind has just released Genie 3, its new-generation general-purpose world model.

In terms of performance, Genie 3 is a significant upgrade over its predecessor: it supports 720p image quality, real-time navigation at 24 frames per second, and consistency that holds on the scale of minutes.
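To put those figures in perspective, here is a rough back-of-the-envelope calculation. It assumes that "720p" means the common 1280x720 frame size, which the announcement as quoted here does not spell out, and the variable names are purely illustrative.

```python
# Rough numbers implied by "720p at 24 frames per second".
# Assumption: "720p" is taken to mean a 1280x720 frame (not stated in the article).
WIDTH, HEIGHT = 1280, 720
FPS = 24

frame_budget_ms = 1000 / FPS             # time available to produce each frame
frames_per_minute = FPS * 60             # frames generated during one minute of roaming
pixels_per_second = WIDTH * HEIGHT * FPS  # raw pixels produced every second

print(f"per-frame budget: {frame_budget_ms:.1f} ms")
print(f"frames per minute: {frames_per_minute}")
print(f"pixels per second: {pixels_per_second:,}")
```

In other words, a one-minute session at this frame rate involves well over a thousand consecutive frames, each of which has to agree with the ones before it.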

Tejas Kulkarni, former DeepMind scientist and AI 3D generation entrepreneur, was invited to experience Genie 3.

Using Genie 3, he generated a 57-second aerial fly-through of a city.

Tejas commented that Genie 3 is versatile, can learn physics, and has a strong memory.

After watching Tejas's test, some Reddit users called it the final piece of the AGI puzzle.

Genie 3 is now available as a research preview, with professional researchers and creators invited to test it.

Objects remain consistent over long periods of time and at multiple angles

Compared to its predecessor, Genie 2, Genie 3 offers significant improvements in picture quality, interaction style and duration, and real-time performance.

Genie 3's results are 3D spatially consistent, and because they are created frame-by-frame based on world descriptions and user actions, Genie 3 generates worlds that are richer and more dynamic.

Genie 3 can also simulate the physical properties of the world, handling natural phenomena such as water surfaces as well as complex environmental interactions.

It can also mimic the natural world to create vibrant ecosystems.

Nor is it limited to real-life scenarios: Genie 3 can also use its imagination to build fictional scenes, such as animated ones.

For example, it can let furry fairies play and run around in a fairy-tale world.

Or it can follow a trail of fireflies through a pristine forest with a touch of magic.

It can also reach beyond the limits of geography and era to visit other places and earlier times, such as touring the water city of Venice by boat.

What Google is most proud of, though, is Genie 3's long-term environmental consistency.

In order for an AI-generated world to be immersive, the objects in the frame must remain physically consistent over a long period of time.

However, autoregressive generation of environments is usually more difficult than generating full videos because errors tend to accumulate over time.

Even so, Genie 3's environments remain largely consistent for several minutes, with visual memory reaching as far back as one minute, and Google showcased results that demonstrate exactly this.
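Why errors accumulate in autoregressive generation can be seen in a toy sketch. The code below is not Genie 3's method; it is a deliberately simplified scalar "world" in which an imperfect model is off by a tiny relative amount on every frame. All function names and the `bias` value are hypothetical, chosen only for illustration. When the model is fed its own previous output, the small per-frame error compounds; when it is fed the ground-truth previous frame, the error stays constant.

```python
def true_step(state: float) -> float:
    """Ground-truth dynamics of the toy world: the scene is static."""
    return state


def model_step(state: float, bias: float = 0.001) -> float:
    """Imperfect model (hypothetical): each prediction is off by a small relative error."""
    return state * (1.0 + bias)


def rollout_errors(steps: int, autoregressive: bool) -> list[float]:
    """Return the absolute error of the prediction at every step."""
    truth, pred = 1.0, 1.0
    errors = []
    for _ in range(steps):
        # Autoregressive: condition on the model's own previous output.
        # Otherwise: condition on the ground-truth previous frame.
        source = pred if autoregressive else truth
        pred = model_step(source)
        truth = true_step(truth)
        errors.append(abs(pred - truth))
    return errors


# 1440 steps corresponds to one minute at 24 frames per second.
ar_errors = rollout_errors(1440, autoregressive=True)
one_step_errors = rollout_errors(1440, autoregressive=False)

print(f"one-step error at the last frame:       {one_step_errors[-1]:.4f}")
print(f"autoregressive error at the last frame: {ar_errors[-1]:.4f}")
```

In this toy setting the autoregressive error at the end of the minute is thousands of times larger than the one-step error, which is why holding a generated world steady for minutes is considered hard.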

One example is a video of a walk through Athens-style architecture.

Google showed screenshots from the start of the video and from the 20-second and 50-second marks: the trees to the left of the building repeatedly leave and re-enter the field of view, and they look the same each time.

In another scene, which involves painting, the viewpoint changes constantly, yet Genie 3 accurately remembers each step of the painting and its result.

In addition, Genie 3 supports the generation of events in the world based on text prompts.

For example, given a background of a grassland, you can have a tractor drive through it, and you can replace the tractor with a brown bear.

Another example is the riverbanks in London, where speedboats can be made to sail through the water, people in fancy dress can be made to run on the banks, and a dinosaur can be made to fall from the sky.

Advancing research on embodied agents

According to DeepMind, Genie 3 will also advance research on embodied agents.

In fact, DeepMind has been working on simulated environments for more than a decade, from training agents to master real-time strategy games to developing simulated environments for open-ended learning and robotics.

Last year, DeepMind introduced Genie 1 and Genie 2, two foundation world models that can also generate new environments for agents.

Genie 3, by contrast, is DeepMind's first world model that allows real-time interaction.

To test whether the worlds created by Genie 3 are suitable for training future agents, DeepMind generated worlds for the latest version of its SIMA agent (a generalist agent for 3D virtual environments).

Genie 3 does not know the agent's goal; it simply simulates what happens next based on the agent's actions.

For example, in a bakery, the agent heads towards a mixer, a cooling rack, or a glass cabinet.

Or, at a farmers' market, it walks towards the bread stand, the flower stand, and the vegetable stand.

In short, Genie 3 supports longer sequences of actions than before, which makes more complex goals achievable.
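The division of labor described above, in which the goal belongs to the agent while the world model only reacts to actions, can be sketched with a toy loop. This is an illustrative assumption, not DeepMind's or SIMA's actual interface: `ToyWorldModel`, `ToyAgent`, and `run_episode` are hypothetical names, and the "world" is just a position on a line.

```python
from dataclasses import dataclass, field


@dataclass
class ToyWorldModel:
    """Hypothetical stand-in for a world model: it sees only actions, never the goal."""
    position: int = 0
    history: list[str] = field(default_factory=list)

    def step(self, action: str) -> int:
        # Advance the simulated world using the action alone.
        self.position += {"forward": 1, "back": -1}.get(action, 0)
        self.history.append(action)
        return self.position  # the "observation" handed back to the agent


@dataclass
class ToyAgent:
    """Hypothetical goal-driven agent; the goal is stored here, not in the world model."""
    goal_position: int

    def act(self, observation: int) -> str:
        if observation < self.goal_position:
            return "forward"
        if observation > self.goal_position:
            return "back"
        return "stay"


def run_episode(agent: ToyAgent, world: ToyWorldModel, max_steps: int = 50) -> int:
    """Alternate agent actions and world updates; return the number of steps taken."""
    observation = world.position
    for step in range(max_steps):
        if observation == agent.goal_position:
            return step
        observation = world.step(agent.act(observation))
    return max_steps


world = ToyWorldModel()
agent = ToyAgent(goal_position=5)
steps = run_episode(agent, world)
print(steps, world.position)
```

The longer the world stays consistent, the more steps such a loop can run before the simulation stops being useful, which is the link between consistency and more complex goals.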

Google expects this technology to play a key role in humanity's journey towards AGI and to bring agents further into the real world.

(Text: Quantum Bits)