Xiaomi MiMo Xiaomi's open-source 7-billion-parameter reasoning model, which slightly outperforms models such as OpenAI o1-mini in mathematical reasoning and coding competitions. Large Model, Open Source Project # Reasoning Model
HunyuanVideo-Avatar Tencent Hunyuan's open-source voice-driven digital human model: upload an image and an audio clip to generate multi-style, highly dynamic personalized videos. AI Digital Person, Open Source Project # Digital People
Gemma 3n Google's lightweight open-source large language model, combining high performance with easy deployment and suited to local development and multi-scenario applications. Large Model, Open Source Project # Large Language Model
FLUX.1-Kontext A multimodal model that supports text-to-image generation and image editing, with strong contextual understanding and creative capabilities. AI image processing, AI image generation # Image Generation # Image Editor
Qwen3-Coder Alibaba's open-source large code model, supporting end-to-end programming and complex task planning, with performance stated to surpass GPT-4.1 at lower cost. AI programming, Open Source Project # AI Programming
Qwen-Image Alibaba Tongyi Qianwen's open-source 20-billion-parameter image generation model, specializing in high-fidelity Chinese and English text rendering and detailed complex scenes, with support for multi-style image generation. AI image generation, Open Source Project # Image Generation
KittenTTS An open-source lightweight text-to-speech model that is under 25 MB, runs in real time on ordinary CPUs, offers a variety of natural-sounding voices, and works offline. AI speech generation, Open Source Project # TTS # Video Generation
Seed-OSS ByteDance's open-source 36-billion-parameter long-context large language model, supporting 512K tokens and a controllable thinking budget. It excels at reasoning, code, and agent tasks and is free for commercial use under the Apache-2.0 license. Large Model, Open Source Project # Large model
Waver 1.0 An open-source, full-featured video generation model that turns text or images into HD video efficiently and conveniently, with high output quality. AI Video Creation, Open Source Project # Video Generation
HunyuanWorld-Voyager Tencent's open-source world model, described as the industry's first to support native 3D reconstruction and ultra-long-range roaming. It rapidly generates interactive, immersive 3D scenes from a single image or a text prompt. Open Source Project # Virtual Worlds
HunyuanImage2.1 Tencent's open-source text-to-image model, which natively supports 2K high-definition image generation, accurately parses complex semantics, and efficiently produces high-quality images that mix Chinese and English text. AI Video Creation, Large Model # graphical model
PromptEnhancer Tencent's open-source prompt enhancement framework for Chinese text-to-image generation, which optimizes user-written prompts to improve the image quality and semantic accuracy of the generation model's output. AI assistant, Open Source Project # Prompt Enhancement
SongBloom An open-source song generation model developed jointly by Tencent AI Lab and partners, which turns 10 seconds of audio plus lyrics into 2 minutes 30 seconds of high-quality music comparable to commercial standards. AI music composition, Open Source Project # Song Generation
PaddleOCR-VL Baidu's lightweight multimodal document parsing model. With 0.9B parameters, it delivers accurate recognition and structured output for complex documents in 109 languages, with performance described as world-leading. AI document assistant, Open Source Project # Document Analysis
SmartResume Alibaba's open-source high-precision resume parsing system, built on OCR and lightweight large models. It converts resumes in 12 formats, such as PDFs and images, into structured data within seconds, with a stated accuracy of 93.1%. Open Source Project # Resume Analysis
SAM 3D Meta's open-source single-image 3D generation model, which produces high-fidelity, interactive 3D models from 2D photos in one click. It covers object and human-body scenarios and helps e-commerce, AR/VR, film and television, and other industries cut costs and improve efficiency. Open Source Project # 3D model # 3D generation
SeekDB OceanBase's open-source AI-native database, billed as the world's first. It focuses on multimodal hybrid search, minimal development effort, and strong security, aiming to change how data and AI converge and to help developers build high-performance intelligent applications with a single click. AI data processing, Open Source Project # Database
DeepSeek-Math-V2 The world's first open-source large mathematical reasoning model to reach gold-medal level at the International Mathematical Olympiad (IMO), using a self-verification framework to achieve rigorous reasoning and solve difficult mathematical problems. Large Model, Open Source Project # Mathematical Reasoning
Nemotron 3 NVIDIA's open-source AI model series, featuring Nano, Super, and Ultra variants, is specifically designed for intelligent agent applications, delivering high efficiency and precision. Large Model, Open Source Project
Qwen-Image-Layered Alibaba's open-source AI image layering editor, which automatically separates an image into layers and modifies content precisely without tedious masking, for efficient, professional results. AI image processing, Open Source Project # Image Layering
Infographic Alibaba's open-source AI infographic engine, which combines a declarative syntax with 197+ templates to generate professional charts from a single line of code. It suits a wide range of scenarios, including data visualization and news illustrations. AI efficiency tools, Open Source Project # Infographic
Zen Browser An open-source desktop browser based on the Firefox engine, featuring vertical tabs, workspaces, and split-screen views, emphasizing privacy protection and a modern browsing experience focused on efficiency and concentration. AI efficiency tools, Open Source Project # Browser
Voquill An open-source voice input tool that supports multiple languages and intelligent text cleanup, making input several times faster. It balances local privacy with cloud convenience and is aimed at productivity-focused professionals. AI Audio Processing, Open Source Project # Voice Input
Paper2Any An AI tool developed by Peking University that automatically converts papers and text into editable PowerPoint presentations and structural diagrams. It supports multimodal input and addresses the difficulty of scientific diagramming and of turning lengthy documents into reports. AI efficiency tools, AI document assistant # PPT generation # Document Generation
SAM Audio Meta's unified multimodal audio separation model, presented as the world's first, which uses text, visual, and temporal prompts to accurately separate target sounds from complex audio and video. AI Sound Separation, Open Source Project # Audio Separation
TranslateGemma Google's open-source lightweight multimodal translation model, which supports 55 languages and image translation. Its performance is stated to exceed that of larger models, and it suits both mobile and cloud deployment for efficient cross-language communication. AI translation, Large Model # Digital Split
DeepSeek-R1 An open-source AI model released under the MIT License, with advanced reasoning capabilities and support for model distillation. Its performance is benchmarked against the official release of OpenAI o1, and it performs well across multi-task tests. Open Source Project, Hot Products # AI Reasoning Models # DeepSeek # Large model
Wan2.1 Alibaba's efficient video generation model, which accurately simulates complex scenes and motion and supports Chinese and English text effects in generated video. AI Video Creation, Large Model # Video Creation # Video Generation Model