
What's s1?
s1 is a reasoning model with strong inference capabilities, introduced by Fei-Fei Li's team. Despite a very low training cost (less than $50), it achieves performance comparable to cutting-edge reasoning models such as OpenAI o1 and DeepSeek-R1.
The s1 model was distilled from Google's Gemini 2.0 Flash Thinking Experimental model and optimized through supervised fine-tuning (SFT) and test-time scaling. It performs strongly on math and coding benchmarks, bringing a new low-cost, high-efficiency approach to the AI field.
s1 R&D background and characteristics
- R&D background: The s1 model was introduced in response to the high cost of technology development in artificial intelligence. High costs often keep small and medium-sized enterprises and startup teams out of the field, leading to further industry concentration and technological barriers. In response, a team of researchers from Stanford University and the University of Washington set out to develop a low-cost, high-efficiency AI model.
- Core features: The s1 model employs "distillation", a method of extracting the reasoning capability of a stronger model by training on its answers. The successful application of this technique allows s1 to achieve strong reasoning performance at very low cost.
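Conceptually, distillation via SFT means fine-tuning the student on the teacher's reasoning traces and answers. The sketch below only illustrates that data format; the field names, prompt wording, and example problem are assumptions for illustration, not the actual s1K data or training code.

```python
# Hedged sketch: packing a teacher model's reasoning trace into an SFT
# target for a student model. All strings here are illustrative placeholders.

def build_sft_example(question, teacher_trace, teacher_answer):
    """Pair a question with the teacher's trace + answer as the training target."""
    prompt = f"Question: {question}\nThink step by step."
    target = f"<think>{teacher_trace}</think>\nAnswer: {teacher_answer}"
    return {"prompt": prompt, "completion": target}

example = build_sft_example(
    "What is 7 * 8?",
    "7 * 8 = 7 * (10 - 2) = 70 - 14 = 56.",
    "56",
)
```

Fine-tuning the student to reproduce `completion` given `prompt` is what transfers the teacher's reasoning behavior.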
s1 technical details
- Training costs: The s1 model was trained at an extremely low cost, costing less than $50 in cloud computing. Training took less than 30 minutes with only 16 Nvidia H100 GPUs. This cost is much lower than the development cost of traditional AI models, demonstrating an extremely efficient use of resources.
- Dataset: The training dataset for the s1 model was carefully curated to contain 1,000 high-quality problems covering a wide range of domains, including math competitions, PhD-level science questions, and Olympiads. Each problem comes with a reasoning trajectory and an answer, and all were vetted against three criteria: difficulty, diversity, and quality.
- Training process: The s1 model was distilled from Google's reasoning model Gemini 2.0 Flash Thinking Experimental. During reasoning, s1 uses a self-checking mechanism that inserts "Wait" to extend its thinking and improve answer accuracy. In addition, s1 was trained with supervised fine-tuning (SFT) on this small dataset, imitating the teacher's outputs to further improve performance.
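The "wait" mechanism described above can be sketched as a decoding loop that suppresses the model's attempt to stop thinking and appends "Wait" instead, forcing further reasoning. The `toy_generate` stub below stands in for a real language model call; the token strings and the minimum-continuation count are illustrative assumptions.

```python
# Hedged sketch of budget forcing: when the model emits the end-of-thinking
# marker too early, replace it with "Wait" so reasoning continues.

END_OF_THINKING = "</think>"

def toy_generate(prompt):
    """Placeholder for an LLM decoding step; always tries to stop at once."""
    return END_OF_THINKING

def reason_with_budget(prompt, min_continues=2):
    trace = ""
    forced = 0
    while True:
        step = toy_generate(prompt + trace)
        if step == END_OF_THINKING and forced < min_continues:
            trace += "Wait"   # suppress the stop token, force more thinking
            forced += 1
            continue
        trace += step         # budget met: let the model finish
        return trace

out = reason_with_budget("2+2=?")
```

With a real model, each "Wait" prompts the model to re-examine its partial reasoning, which is where the accuracy gain comes from.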
s1 performance
- Math and programming skills: In tests of mathematical and programming ability, the s1 model performed at a level comparable to the industry's top reasoning models, such as OpenAI's o1 and DeepSeek's R1, demonstrating the strength of its reasoning ability.
- Test-time scaling: The s1 model also performs well when scaling compute at test time. By controlling how much computation the model spends at inference, s1 improves answer accuracy while remaining efficient.
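Test-time scaling treats the length of the reasoning trace as a tunable compute knob: a small budget cuts thinking short for speed, a large one lets it run to completion for accuracy. A minimal sketch, assuming the trace is a list of reasoning steps (toy data, not real model output):

```python
# Hedged sketch: capping the reasoning trace at a test-time compute budget.
# `trace` is a fixed toy list standing in for model-generated thinking tokens.

def apply_compute_budget(thinking_steps, max_steps):
    """Truncate the reasoning trace to the test-time compute budget."""
    used = thinking_steps[:max_steps]
    return used, len(used)

trace = ["decompose", "compute", "check", "revise", "conclude"]
cheap, n_cheap = apply_compute_budget(trace, 2)       # low budget: 2 steps
careful, n_careful = apply_compute_budget(trace, 10)  # high budget: all 5
```

Sweeping this budget upward is what produces the accuracy-vs-compute curves reported for test-time scaling.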
s1 Impact and significance
- Technology Popularization: The successful launch of the s1 model has promoted the popularization of AI technology. Its low-cost and high-efficiency features have enabled more enterprises and research institutions to venture into the AI field, promoting the further development of the technology.
- Market competition: The emergence of the s1 model has intensified competition in the AI industry. Achieving strong reasoning performance at very low cost challenges the competitive advantage of large technology companies, while also offering other teams lessons and reference points, promoting technological innovation and cooperation across the industry.
Paper address: https://arxiv.org/abs/2501.19393
Open source address: https://github.com/simplescaling/s1
Relevant Navigation

Meta open-sources a revolutionary single-image 3D generation model that generates high-fidelity, interactive 3D models from 2D photos in one click, covering objects, human bodies, and scenes, helping e-commerce, AR/VR, film and television, and other industries cut costs and improve efficiency.

Elephant
A lightweight large model with 100 billion parameters, focused on high token efficiency and low latency. It excels at code completion, long-document processing, and lightweight agent interaction, with controllable costs, making it suitable for high-frequency calls and scenario-based tasks.

KittenTTS
An open-source lightweight text-to-speech model under 25 MB that runs in real time on ordinary CPUs, supports a variety of natural voices, and can be used offline.

EmaFusion
Ema introduces a hybrid expert modeling system that dynamically combines multiple models to accomplish enterprise-class AI tasks at low cost and high accuracy.

Gemini 2.0 Flash
Google's new-generation AI model, supporting multimodal input and output and natively integrating intelligent tools to give developers powerful, flexible assistant capabilities.

ERNIE
Baidu's industrial-grade, knowledge-enhanced large models, with industry-leading natural language understanding and generation capabilities, are widely used across natural language processing and generation tasks, helping enterprises upgrade intelligently.

Blue Heart Large Model
Vivo's self-developed general-purpose large model matrix, comprising several in-house models covering core scenarios and providing intelligent assistance, dialogue bots, and other functions with strong language understanding and generation capabilities.

Zen Browser
An open-source desktop browser based on the Firefox engine, featuring vertical tabs, workspaces, and split-screen views, emphasizing privacy protection and a modern browsing experience focused on efficiency and concentration.
