Open Source Project

106 entries in total

Wan2.1

Alibaba's efficient video generation model, which accurately simulates complex scenes and motion and supports Chinese and English text effects.

AI Video Creation Large Model # Video Creation # Video Generation Model

PaddleOCR-VL

Baidu's lightweight multimodal document parsing model, with 0.9B parameters, achieves accurate recognition and structured output of complex documents in 109 languages, with world-leading performance.

AI document assistant Open Source Project # Document Analysis

SongBloom

An open-source song generation model developed jointly by Tencent AI Lab and others. It turns a 10-second audio clip plus lyrics into 2 minutes 30 seconds of high-quality music, comparable to commercial standards.

AI music composition Open Source Project # Song Generation

PromptEnhancer

Tencent's open-source prompt enhancement framework for Chinese text-to-image generation. It rewrites user-input prompts to improve the image quality and semantic accuracy of the generated output.

AI assistant Open Source Project # Prompt Enhancement

HunyuanImage2.1

Tencent's open-source text-to-image model, which natively generates 2K HD images, accurately parses complex semantics, and efficiently produces high-quality images with mixed Chinese and English content.

AI Video Creation Large Model # Text-to-Image Model

HunyuanWorld-Voyager

Tencent's open-source world model, described as the industry's first to support native 3D reconstruction and ultra-long-range roaming. It rapidly generates interactive, immersive 3D scenes from a single image or a text prompt.

Open Source Project # Virtual Worlds

Waver 1.0

Waver 1.0 is an open-source, full-featured video generation model that turns text or images into HD video efficiently and with high quality.

AI Video Creation Open Source Project # Video Generation

Seed-OSS

ByteDance's open-source 36-billion-parameter long-context large language model. It supports 512K tokens and a controllable thinking budget, excels at reasoning, code, and agent tasks, and is free for commercial use under the Apache-2.0 license.

Large Model Open Source Project # Large model

KittenTTS

An open-source, lightweight text-to-speech model under 25 MB that runs in real time on ordinary CPUs, offers a variety of natural-sounding voices, and works offline.

AI speech generation Open Source Project # TTS # Video Generation

Qwen-Image

A 20-billion-parameter open-source image generation model from Alibaba's Tongyi Qianwen (Qwen) team. It specializes in high-fidelity Chinese and English text rendering and detailed handling of complex scenes, and supports multi-style image generation.

AI image generation Open Source Project # Image Generation

Qwen3-Coder

Alibaba's open-source large code model. It supports end-to-end programming workflows and complex task planning, and is described as outperforming GPT-4.1 at lower cost.

AI programming Open Source Project # AI Programming

FLUX.1-Kontext

A multimodal model that supports text-to-image generation and image editing, with strong contextual understanding and creative capabilities.

AI image processing AI image generation # Image Generation # Image Editor

Gemma 3n

Google's lightweight open-source large language model, which combines high performance with easy deployment and suits local development and a range of application scenarios.

Large Model Open Source Project # Large Language Model

HunyuanVideo-Avatar

Tencent Hunyuan's open-source audio-driven digital human model. Upload an image and an audio clip to generate multi-style, highly dynamic personalized video.

AI Digital Human Open Source Project # Digital Humans

Xiaomi MiMo

Xiaomi's open-source 7-billion-parameter reasoning model, which outperforms models such as OpenAI o1-mini in mathematical reasoning and code competitions by a small margin.

Large Model Open Source Project # Reasoning Model

SkyReels-V2

An unlimited-duration film generation model from the Kunlun Wanwei team. It targets the length bottleneck of existing video generation technology and aims for high-quality, highly consistent, high-fidelity video creation.

AI Video Creation Open Source Project # Video Generation Model

Krillin AI

An AI video subtitle translation and dubbing tool that supports multi-language input and translation, providing a one-stop solution from video acquisition to subtitle translation and dubbing.

AI translation AI video applications

BabelDOC

An open-source AI translation tool that supports side-by-side bilingual display, multiple translation engines, format preservation, and batch processing, helping researchers read foreign-language literature efficiently.

AI translation Open Source Project # Translation tool

ChatAnyone

A real-time portrait video generation tool developed by Alibaba's DAMO Academy. It uses a hierarchical motion diffusion model to generate highly realistic, style-controllable portrait video in real time, and suits video chat, virtual anchoring, and digital entertainment scenarios.

AI Digital Human Open Source Project

Vibe Draw

Open source AI-assisted drawing tool that intelligently converts hand-drawn sketches and text descriptions into 3D models, supporting real-time collaboration and creative expression.

AI Teamwork Open Source Project # Whiteboard Tool

Hunyuan T1

Tencent's self-developed deep-thinking model, offering fast response, ultra-long text processing, and strong reasoning. It is widely used in intelligent Q&A, document processing, and other fields.

Large Model Open Source Project # Deep Thinking

AlphaDrive

An autonomous driving framework that combines vision-language models with reinforcement learning, offering strong planning reasoning and multimodal planning capabilities for complex and rare traffic scenarios.

Open Source Project # Autonomous Driving

Chitu

An open-source large-model inference engine launched jointly by a Tsinghua University team and Qingcheng Jizhi. It aims to deliver efficient model inference across chip architectures through low-level technical innovation and to promote the wider adoption of AI technology.

Large Model Open Source Project # Large model

MIDI

AI 3D scene generation tool that can efficiently generate complete 3D environments containing multiple objects from a single image, widely used in VR/AR, game development, film and television production and other fields.

AI image generation Open Source Project # 3D Scene Generation

Open-Sora 2.0

A new open-source video generation model from Luchen Technology (HPC-AI Tech), offering high performance at low cost and moving open-source video generation into a new stage.

AI Video Creation Open Source Project # Video Generation

R1-Omni

Alibaba's open-source multimodal large language model uses RLVR technology to achieve emotion recognition and provide an interpretable reasoning process for multiple scenarios.

Large Model Open Source Project # Multimodal # Emotion Recognition

OpenManus

An open-source AI agent framework that supports local deployment and multi-agent collaboration to complete complex tasks efficiently.

AI assistant Open Source Project # AI Agent

QwQ-32B

Alibaba's high-performance reasoning model with 32 billion parameters, which excels in mathematics and programming and suits a wide range of application scenarios.

Large Model Open Source Project # Reasoning Model # Tongyi Qianwen

SpeciesNet

Google open-sourced a model that uses artificial intelligence technology to analyze camera trap photos to automatically identify animal species.

Open Source Project # Image Recognition

CogView4

An open-source text-to-image model released by Zhipu AI. It supports bilingual input, generates high-quality images, and is described as the first to generate Chinese characters within the image. It is widely used in advertising, short videos, art creation, and other fields.

AI image generation Open Source Project # Image Generation