Just in: Tencent open-sources its latest world model! Build a 3D world from a single sentence, compatible with game engines

AiFun · updated 2 weeks ago
Today, Tencent officially released and open-sourced the Hunyuan 3D world model 2.0 (HY-World 2.0). As a multimodal world model, HY-World 2.0 accepts text, images, and video as input and automatically generates, reconstructs, and simulates a complete 3D world.

For the gaming industry, HY-World 2.0 supports direct export of re-editable assets such as meshes, 3DGS, or point clouds. These can be imported into Unity, UE, and other engines to quickly build game maps and level prototypes.
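To make "point cloud export" concrete, here is a minimal sketch of serializing XYZ points as ASCII PLY, a common interchange format for point clouds. It is an illustration only: the function name `points_to_ply` is hypothetical, and the article does not say which file formats or exporter HY-World 2.0 actually uses.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float, float]


def points_to_ply(points: Iterable[Point]) -> str:
    """Serialize XYZ points as an ASCII PLY document (hypothetical helper)."""
    pts = list(points)
    # The PLY header declares the element count and per-vertex properties.
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(pts)}",
        "property float x",
        "property float y",
        "property float z",
        "end_header",
    ]
    # One line per vertex, coordinates separated by spaces.
    body = [f"{x:.6f} {y:.6f} {z:.6f}" for x, y, z in pts]
    return "\n".join(header + body) + "\n"


if __name__ == "__main__":
    demo = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.5)]
    print(points_to_ply(demo))
```

A real export would typically also carry colors or normals as extra vertex properties; those are omitted here to keep the sketch short.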

Compared to the previous HY-World 1.5, which could only generate one-minute videos, HY-World 2.0 not only supports a roamable 3D space but also generates complete character, building, and scenery assets, making the result both usable and playable.

▲ Input “Generate a cozy picture book style cabin”

Generating a 3D world from a single sentence is no longer a problem. Tencent Hunyuan 3D has also added a character mode, in which the user controls a character to explore streets, buildings, and scenes freely, with physical collision effects. Just as in a game, the character can walk freely through the generated 3D scene.

▲Character mode allows the user to operate the character to explore freely

At the same time, HY-World 2.0 performs better in scene completeness (the sides and backs of objects) and in fidelity to the input image, and it is equally suited to embodied-intelligence simulation and other scenarios.

So we tried it out to see how well it works.

Online demo: https://3d.hunyuan.tencent.com/sceneTo3D

Open-source code: https://github.com/Tencent-Hunyuan/HY-World-2.0

Technical report: https://3d-models.hunyuan.tencent.com/world/world2_0/HY_World_2_0.pdf

I. Genshin Impact and Resident Evil scenes recreated, with free-roaming characters that feel real

First, I tried the text-to-scene and image-to-scene features. Operation is simple: enter a prompt or an image and click “Generate Now”.

Prompt: “Generate a Genshin Impact-style sky garden labyrinth containing platforms of varying heights, winding staircases, bridges suspended by vines, sunlight pouring through stained glass into the garden, a fountain and bridge in the center, and a sense of fantasy throughout the space.”

As you can see, both the representation of the depth of the scene and the details such as stairs, bridges and stained glass are well reproduced. Remarkably, my selected character was also free to roam around the generated 3D world.

On stairs, bridges, and similar areas, the character shows physical collision, and walking up or down looks natural and smooth, which makes it possible to test the spatial structure of the scene.

However, the character is only able to move within a limited range due to the small size of the movable area of the scene. When I chose to resize my character, I was able to observe the scene in more detail from a third-person character perspective.

Next, we used an image as the reference, and the generated scene remained largely consistent with it.

However, the image quality and level of detail are close to those of the text-generated result: not fine or textured enough, which may be related to display and rendering resolution on the web side.

With this in mind, we then tried video and multi-view image input.

For the video reference test, I chose a live-action Resident Evil clip in which the main character walks straight down a street.

[Video]

▲Resident Evil's live-action video

As can be seen, the model captures the character's movement and the scenery on both sides of the street, and passing pedestrians are also rendered, but the overall reconstruction of the 3D world is still incomplete.

In comparison, the multi-view image test performs better, and the modeling of the building's exterior and tiered structure is impressive. I directly used the 32 sample images of a three-story roofed building that come with the model, and the model replicates the building's appearance and hierarchical structure remarkably well.

▲Multi-view image material

As you can see, the details and layers of the building are well preserved and the sense of wholeness is evident.

II. Sketches, text, and video can all build a world, with end-to-end generation of 360° panoramas

In HY-World 2.0, a sketch, a piece of text, or a video can each quickly generate a coherent 3D world.

The technical key is that HY-World 2.0 unifies spatial understanding, generation, and reconstruction with 3D as the main axis, automatically transforming complex semantics and structures into complete spaces.

With the newly upgraded HY-Pano-2.0 end-to-end implicit learning scheme, the model can also generate 360-degree panoramas from ordinary pictures or videos without any camera parameters.

The Hunyuan team also trains on a mix of real panoramic photos and UE-synthesized data to ensure generation quality and generalization ability.
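For readers unfamiliar with 360° panoramas, the sketch below shows how a pixel in an equirectangular panorama maps to a viewing direction on the unit sphere. The article does not state which projection HY-Pano-2.0 outputs, so the equirectangular layout, the axis convention, and the function name `pixel_to_direction` are all assumptions made for illustration.

```python
import math
from typing import Tuple


def pixel_to_direction(u: float, v: float, width: int, height: int) -> Tuple[float, float, float]:
    """Map an equirectangular pixel (u, v) to a unit view direction (x, y, z).

    Assumed convention: +z is forward at the image center, +x is right, +y is up.
    """
    # Horizontal position spans longitude [-pi, pi] across the image width.
    lon = (u / width - 0.5) * 2.0 * math.pi
    # Vertical position spans latitude [pi/2, -pi/2] from top to bottom.
    lat = (0.5 - v / height) * math.pi
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)


if __name__ == "__main__":
    # The center pixel of a 2048x1024 panorama looks straight ahead.
    print(pixel_to_direction(1024, 512, 2048, 1024))
```

Because every pixel corresponds to a direction rather than to a position on a flat image plane, a single panorama covers the full sphere around the viewpoint.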

III. Intelligent path planning lets the character roam freely

After the panorama is generated, planning the character's path is another major challenge. The model combines self-developed Spatial Agent technology with a Navmesh representation to plan character roaming paths intelligently.

Depending on the semantics of each scene, the model can plan five types of camera trajectories, including orbiting objects and maximal roaming, which ensure coverage of key areas in the scene while avoiding clipping through walls or running off course.
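As a rough illustration of what an "orbiting objects" trajectory could look like, the sketch below places cameras on a horizontal circle around a target and points each one at it. This is a generic orbit, not the model's planner: `orbit_trajectory` and its parameters are hypothetical names, and the article does not describe how the five trajectory types are actually computed.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def orbit_trajectory(center: Vec3, radius: float, height: float, steps: int) -> List[Tuple[Vec3, Vec3]]:
    """Return (position, look_direction) pairs on a horizontal circle around center."""
    if steps < 1 or radius <= 0:
        raise ValueError("steps must be >= 1 and radius must be positive")
    cx, cy, cz = center
    poses: List[Tuple[Vec3, Vec3]] = []
    for i in range(steps):
        angle = 2.0 * math.pi * i / steps
        # Camera sits on a circle in the XZ plane, raised by `height`.
        pos = (cx + radius * math.cos(angle), cy + height, cz + radius * math.sin(angle))
        # Look from the camera back toward the center, normalized to unit length.
        dx, dy, dz = cx - pos[0], cy - pos[1], cz - pos[2]
        norm = math.sqrt(dx * dx + dy * dy + dz * dz)
        poses.append((pos, (dx / norm, dy / norm, dz / norm)))
    return poses


if __name__ == "__main__":
    for position, look in orbit_trajectory((0.0, 0.0, 0.0), 2.0, 0.5, 4):
        print(position, look)
```

A scene-aware planner would additionally reject poses that fall inside geometry, which is the "avoid walls" requirement mentioned above.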

With the planned trajectories and world expansion, the character can roam naturally in the generated 3D scene along smooth, spatially logical paths.
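The idea of planning over walkable space can be sketched with a breadth-first search on a small grid, used here as a simplified stand-in for a navigation mesh. This is not Tencent's Spatial Agent; the grid encoding and the function name `shortest_walkable_path` are assumptions chosen to keep the example short.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]


def shortest_walkable_path(grid: List[str], start: Cell, goal: Cell) -> Optional[List[Cell]]:
    """Breadth-first search over 4-connected cells; '.' is walkable, '#' is a wall."""
    rows, cols = len(grid), len(grid[0])

    def walkable(cell: Cell) -> bool:
        r, c = cell
        return 0 <= r < rows and 0 <= c < cols and grid[r][c] == "."

    if not (walkable(start) and walkable(goal)):
        return None

    # parent doubles as the visited set and lets us rebuild the path.
    parent: Dict[Cell, Optional[Cell]] = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path: List[Cell] = []
            node: Optional[Cell] = cell
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if walkable(nxt) and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None  # goal is unreachable without crossing a wall


if __name__ == "__main__":
    level = ["..#.", ".##.", "...."]
    print(shortest_walkable_path(level, (0, 0), (0, 3)))
```

Because the search only ever steps onto walkable cells, any returned path avoids walls by construction; a navmesh applies the same principle to polygons instead of grid cells.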

IV. Novel view generation that keeps space connected and visuals coherent

When expanding the scene, how does the model ensure that the newly generated area connects geometrically and visually to the original space without visible continuity errors?

Its core innovations include precise camera control, fine-grained retention of visual detail, and a spatially coherent memory mechanism.

Combining the memory mechanism design with systematic mid-training and post-training, the Hunyuan team has built HY-WorldStereo, which it describes as the industry's strongest novel view synthesis (NVS) model to date.

The generated images follow the input camera accurately, the results of multiple camera moves are spatially consistent and conflict-free, and the post-training algorithm lets the model expand quickly into new areas without degrading image quality.

Finally, all generated fragments are integrated by HY-WorldMirror 2.0 into a unified, interactive 3D world.

With customized depth alignment and adaptive mask Gaussian optimization algorithms, the generated scene is represented with 3D Gaussian Splatting (3DGS). A high-quality mesh can also be exported and imported directly into Unity, UE, and other mainstream game engines for further editing and creation.
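The article does not explain how the customized depth alignment works. One widely used baseline for reconciling depth estimates from different views is to fit a single scale and shift by least squares, sketched below. Treat it as an illustrative stand-in rather than HY-WorldMirror 2.0's algorithm; `fit_scale_shift` is a hypothetical name.

```python
from typing import Sequence, Tuple


def fit_scale_shift(pred: Sequence[float], ref: Sequence[float]) -> Tuple[float, float]:
    """Return (scale, shift) minimizing sum((scale * pred + shift - ref) ** 2)."""
    n = len(pred)
    if n != len(ref) or n < 2:
        raise ValueError("need two or more paired depth samples")
    mean_p = sum(pred) / n
    mean_r = sum(ref) / n
    # Closed-form simple linear regression of ref on pred.
    var_p = sum((p - mean_p) ** 2 for p in pred)
    if var_p == 0.0:
        raise ValueError("predicted depths are constant; scale is undefined")
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(pred, ref))
    scale = cov / var_p
    shift = mean_r - scale * mean_p
    return scale, shift


if __name__ == "__main__":
    predicted = [1.0, 2.0, 3.0, 4.0]
    reference = [2.5, 4.5, 6.5, 8.5]
    scale, shift = fit_scale_shift(predicted, reference)
    aligned = [scale * d + shift for d in predicted]
    print(scale, shift, aligned)
```

In practice the fit would run only over pixels where both depth maps overlap and are valid, which is where a mask comes in.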

Conclusion: AI world-building takes another step forward

From HY-World 1.0, the first open-source 3D world model, to HY-World 1.5, which allows real-time online interaction, to the release of HY-World 2.0, this series of iterations has brought AI closer to practical use in game development, virtual simulation, and other industries.

Compared to the past, when only short videos or static models could be generated, HY-World 2.0 provides a roamable, interactive, and re-editable 3D world, significantly lowering the barrier to map prototyping and level design.

With progress from teams in China and abroad, such as the open-source Spark 2.0 renderer from Fei-Fei Li's World Labs, AI world modeling is moving from proof of concept to industrial application, with great potential in scenarios such as gaming, cultural preservation, urban planning, and interior design.
