FLUX.1-Kontext: A multimodal model that supports text-to-image generation and image editing, with strong contextual understanding and creation capabilities. [AI image processing, AI image generation] #Image Generation #Image Editor
Gemma 3n: A lightweight open-source large language model from Google that is both high-performance and easy to deploy, suitable for local development and a variety of application scenarios. [Large Model, Open Source Project] #Large Language Model
HunyuanVideo-Avatar: Tencent Hunyuan's open-source voice-driven digital human model. Upload an image and an audio clip to generate multi-style, highly dynamic personalized video. [AI Digital Human, Open Source Project] #Digital Human
Xiaomi MiMo: Xiaomi's open-source 7-billion-parameter reasoning model, which reportedly edges out models such as OpenAI o1-mini in mathematical reasoning and code competitions. [Large Model, Open Source Project] #Reasoning Model
SkyReels-V2: An unlimited-duration film generation model from the Kunlun Wanwei team that breaks through the bottlenecks of existing video generation technology to deliver high-quality, high-consistency, high-fidelity video creation. [AI Video Creation, Open Source Project] #Video Generation Model
Krillin AI: An AI video subtitle translation and dubbing tool that supports multi-language input and translation, providing a one-stop workflow from video acquisition to subtitle translation and dubbing. [AI translation, AI video applications]
BabelDOC: An open-source AI translation tool that supports side-by-side bilingual output, multiple translation engines, format preservation, and batch processing, helping researchers read foreign-language literature efficiently. [AI translation, Open Source Project] #Translation Tool
ChatAnyone: A real-time portrait video generation tool developed by Alibaba's DAMO Academy. It uses a hierarchical motion diffusion model to generate highly realistic, style-controllable portrait video in real time, and is suited to video chat, virtual anchoring, and digital entertainment scenarios. [AI Digital Human, Open Source Project]
Vibe Draw: An open-source AI-assisted drawing tool that converts hand-drawn sketches and text descriptions into 3D models, supporting real-time collaboration and creative expression. [AI Teamwork, Open Source Project] #Whiteboard Tool
Hunyuan T1: Tencent's self-developed deep-thinking model, offering fast responses, ultra-long text processing, and strong reasoning capabilities. It is widely used in intelligent Q&A, document processing, and other fields. [Large Model, Open Source Project] #Deep Thinking
AlphaDrive: An autonomous driving framework that combines vision-language models with reinforcement learning, providing strong planning, reasoning, and multimodal planning capabilities for handling complex and rare traffic scenarios. [Open Source Project] #Autonomous Driving
Chitu: An open-source large-model inference engine jointly launched by a Tsinghua University team and Qingcheng Jizhi. It aims to achieve efficient model inference across chip architectures through low-level technical innovation and to promote the widespread application of AI technology. [Large Model, Open Source Project] #Large Model
MIDI: An AI 3D scene generation tool that efficiently generates complete 3D environments containing multiple objects from a single image. It is widely used in VR/AR, game development, film and television production, and other fields. [AI image generation, Open Source Project] #3D Scene Generation
Open-Sora 2.0: A new open-source video generation model from Luchen Technology that offers high performance at low cost, moving open-source video generation technology into a new stage. [AI Video Creation, Open Source Project] #Video Generation
R1-Omni: Alibaba's open-source multimodal large language model, which uses RLVR to perform emotion recognition and provides an interpretable reasoning process across multiple scenarios. [Large Model, Open Source Project] #Multimodal #Emotion Recognition
OpenManus: An open-source AI agent framework that supports local deployment and multi-agent collaboration to complete complex tasks efficiently. [AI assistant, Open Source Project] #AI Agent
QwQ-32B: A high-performance reasoning model with 32 billion parameters released by Alibaba. It excels in mathematics and programming and suits a wide range of application scenarios. [Large Model, Open Source Project] #Reasoning Model #Tongyi Qianwen
SpeciesNet: A model open-sourced by Google that uses AI to analyze camera trap photos and automatically identify animal species. [Open Source Project] #Image Recognition
CogView4: An open-source text-to-image model released by Zhipu AI. It supports bilingual input, generates high-quality images, and is described as the first to support rendering Chinese characters within the image. It is widely used in advertising, short videos, art creation, and other fields. [AI image generation, Open Source Project] #Image Generation
FacePoke: An open-source real-time facial expression editing tool that lets users adjust facial expressions and head orientation in static images with simple operations. [Open Source Project] #Expression Editor
AingDesk: An open-source one-click deployment tool for AI models, giving users a convenient platform to run and share a variety of large AI models. [AI assistant, Open Source Project] #Model Deployment
Ovis2: Alibaba's open-source multimodal large language model with strong visual understanding, OCR, video processing, and reasoning capabilities, available in multiple model sizes. [Large Model, Open Source Project] #Multimodal Large Model
SkyReels-V1: An open-source video generation model for AI short-drama creation from Kunlun Wanwei. It generates film- and TV-grade character micro-expressions and cinematic lighting, and supports both text-to-video and image-to-video generation. [Open Source Project] #Video Generation
OmniParser V2.0: A visual agent parsing framework from Microsoft that turns large language models into agents capable of operating computers, enabling efficient automated interaction. [AI assistant, Open Source Project] #Agent Parsing Framework
DeepClaude: An open-source AI application development platform that combines the strengths of DeepSeek R1 and Claude models to provide high-performance, secure, and configurable APIs for scenarios such as smart chat, code generation, and reasoning tasks. [Open Source Project] #Application Development
Eino: ByteDance's open-source large-model application development framework, built on a componentized design and a graph orchestration engine. [Open Source Project] #Application Development Framework
InspireMusic: An open-source AIGC toolkit with integrated music generation, song generation, and audio generation capabilities. [AI music composition, Open Source Project] #Music Generation
Confucius-o1: A 14B lightweight model from NetEase Youdao, billed as the first in China to support step-by-step reasoning and explanation. It is designed for educational scenarios and helps students understand complex math problems efficiently. [Large Model, Open Source Project] #Reasoning Model #NetEase Youdao
DeepSeek-VL2: An efficient vision-language model developed by the DeepSeek team, based on a mixture-of-experts architecture, with strong multimodal understanding and processing capabilities. [Large Model, Open Source Project] #Visual Language Model
DeepSeek-R1: An AI model open-sourced under the MIT License, with advanced reasoning capabilities and support for model distillation. Its performance is benchmarked against the official release of OpenAI o1, and it has performed well in multi-task testing. [Open Source Project, Hot Products] #AI Reasoning Models #DeepSeek #Large Model