
What's Chitu?
Chitu ("Red Rabbit") is a large-model inference engine developed jointly by the Institute of High Performance Computing at Tsinghua University and Qingcheng Jizhi, a startup with Tsinghua roots. It is designed to address the hardware dependency and high cost of deploying AI models today. Through innovations at the lower levels of the stack, Chitu runs FP8-precision models natively on GPUs other than NVIDIA's Hopper architecture and on a range of domestic (Chinese) chips, which lowers the barrier and cost for enterprises deploying AI models. Chitu also targets deployment scenarios of every scale and adapts to both domestic and foreign chips.
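For readers unfamiliar with FP8, the sketch below decodes a single 8-bit value into an ordinary float. It assumes the E4M3 layout from the OCP 8-bit floating point specification (1 sign bit, 4 exponent bits, 3 mantissa bits, bias 7); the text above does not say which FP8 variant Chitu handles, and the function name `decode_fp8_e4m3` is hypothetical. It only illustrates the number format that hardware without FP8 units has to interpret in some other way, and is not a description of Chitu's implementation.

```python
import math


def decode_fp8_e4m3(byte: int) -> float:
    """Decode one FP8 byte, assuming the OCP E4M3 layout, into a Python float.

    Illustrative only: this is not taken from Chitu's code base.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte in the range 0..255")

    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0x0F  # 4 exponent bits, bias 7
    mantissa = byte & 0x07         # 3 mantissa bits

    # E4M3 has no infinities; exponent and mantissa all ones encode NaN.
    if exponent == 0x0F and mantissa == 0x07:
        return math.nan

    # Exponent field 0 marks zero and the subnormal values.
    if exponent == 0:
        return sign * (mantissa / 8.0) * 2.0 ** -6

    # Normal values carry an implicit leading 1.
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)


if __name__ == "__main__":
    for raw in (0x38, 0xC0, 0x7E, 0x01):
        print(f"0x{raw:02X} -> {decode_fp8_e4m3(raw)}")
```

With only 256 possible bit patterns, a decoder like this could also be precomputed as a lookup table; whether Chitu does anything of the sort is not stated here.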
Chitu's Technical Characteristics
Hardware compatibility:
- For the first time, the Chitu inference engine runs FP8-precision models natively on GPUs other than NVIDIA's Hopper architecture and on a range of domestic chips.
- This removes the dependence of FP8-precision models on NVIDIA Hopper hardware (e.g., H100/H200) and opens new opportunities for wider adoption of domestic AI chips and for the ecosystem around them.
Performance optimization:
- In tests on an A800 cluster, the Chitu engine achieved a 3.15x improvement in inference speed while using 50% fewer GPUs, cutting hardware costs for organizations while raising performance.
- The Chitu engine's optimization techniques adapt quickly to different chip architectures, so domestic chip manufacturers do not have to redevelop the software stack and can focus on hardware upgrades.
Full-scenario scalability:
- The Chitu engine aims to cover the full range of large-model deployment scenarios, from CPU-only machines to large-scale clusters.
- It adapts to a variety of NVIDIA GPUs and domestic chips, providing a scalable solution.
Long-term stable operation:
- The Chitu engine can be used in real production environments and is stable enough to carry concurrent business traffic.
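The A800 figures quoted above can be combined into a single per-GPU number. The following is a minimal sketch of that arithmetic, assuming "inference speed" means aggregate throughput measured against the same baseline as the GPU count (the text does not name the baseline). The function name `per_gpu_speedup` is hypothetical.

```python
def per_gpu_speedup(total_speedup: float, gpu_fraction: float) -> float:
    """Return the implied per-GPU throughput gain.

    total_speedup: aggregate speedup over the baseline (3.15 in the text).
    gpu_fraction:  share of the baseline GPU count still in use
                   (0.5 for "50% fewer GPUs").
    Assumes throughput is compared against one and the same baseline.
    """
    if total_speedup <= 0:
        raise ValueError("total_speedup must be positive")
    if not 0 < gpu_fraction <= 1:
        raise ValueError("gpu_fraction must be in (0, 1]")
    return total_speedup / gpu_fraction


if __name__ == "__main__":
    gain = per_gpu_speedup(total_speedup=3.15, gpu_fraction=0.5)
    print(f"Implied per-GPU throughput gain: {gain:.2f}x")
```

Under these assumptions, each GPU delivers roughly 6.3 times the baseline throughput. If the reported speedup is instead per-request latency, this calculation does not apply.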
Chitu Application Scenarios
- Finance: Chitu's performance and hardware compatibility make it well suited to financial-industry workloads such as risk assessment and fraud detection.
- Healthcare: In the medical field, the Chitu engine can be used for medical image analysis, disease diagnosis, and similar tasks to improve the accuracy and efficiency of medical services.
- Other industries: The Chitu engine can also be applied in education, intelligent manufacturing, smart cities, and other fields.
Chitu Open Source and Ecosystem
- Open source address: The Chitu large-model inference engine is open source on GitHub at https://github.com/thu-pacman/chitu.
- Ecosystem building: Qingcheng Jizhi has worked with MuXi, Suyuan, and other vendors to launch an "out-of-the-box" all-in-one inference machine, which further simplifies AI adoption for enterprises. The Chitu team has also partnered with several domestic chip makers to open up code contribution channels and shorten the hardware adaptation cycle.
Significance and Impact of Chitu
- Promoting the development of domestic AI chips: The Chitu inference engine reduces reliance on NVIDIA and other foreign vendors for AI chips and supports wider adoption of domestic AI chips and the ecosystem around them.
- Reducing enterprise deployment costs: Through innovations in the underlying technology and its optimization techniques, the Chitu engine lowers the barrier and cost of deploying AI models for enterprises while improving performance.
- Accelerating the spread of AI technology: The Chitu engine's full-scenario scalability and long-term stability allow it to be used across many fields, helping AI technology spread more widely.
