
What's s1?
The s1 model, introduced by Fei-Fei Li's team, is an AI reasoning model with strong reasoning capabilities. It was trained at an extremely low cost (less than $50) yet achieves performance comparable to cutting-edge reasoning models such as OpenAI o1 and DeepSeek-R1.
The s1 model was distilled from Google's Gemini 2.0 Flash Thinking Experimental model and optimized through supervised fine-tuning (SFT) and test-time scaling. It performs strongly on math and coding benchmarks, offering the AI field a new low-cost, high-efficiency approach.
s1 R&D background and characteristics
- R&D Background: The s1 model was created in response to the high cost of technology development in artificial intelligence. These costs often keep small and medium-sized enterprises and startup teams out of the field, leading to further industry concentration and technological barriers. In response, a team of researchers from Stanford University and the University of Washington set out to develop a low-cost, high-efficiency AI model.
- Core Features: The s1 model relies on "distillation", a method that extracts the reasoning ability of a stronger model by training on its answers. The successful application of this technique allows s1 to achieve strong reasoning performance at very low cost.
s1 technical details
- Training Costs: The s1 model was trained for less than $50 in cloud compute. Training took under 30 minutes on just 16 NVIDIA H100 GPUs, far below the development cost of conventional AI models and an example of extremely efficient use of resources.
- Dataset: The training dataset for s1 was carefully curated down to 1,000 high-quality problems spanning domains such as math competitions, PhD-level science questions, and Olympiad problems. Each problem comes with a reasoning trace and an answer, and all were validated against three criteria: difficulty, diversity, and quality.
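The three-criteria filter described above can be sketched as follows. This is a minimal illustration, assuming hypothetical field names (`difficulty`, `trace_verified`, `domain`), thresholds, and a per-domain cap; none of these specifics come from the s1 paper:

```python
# Hypothetical sketch of selecting problems by difficulty, diversity, and quality.
# Field names and thresholds are illustrative assumptions, not from the paper.

def select_problems(candidates, max_keep=1000):
    """Keep up to max_keep problems passing difficulty, quality, and diversity checks."""
    per_domain = {}
    selected = []
    # Consider harder problems first.
    for p in sorted(candidates, key=lambda p: -p["difficulty"]):
        if p["difficulty"] < 0.7:            # difficulty: drop easy problems
            continue
        if not p["trace_verified"]:          # quality: reasoning trace must validate
            continue
        count = per_domain.get(p["domain"], 0)
        if count >= max_keep // 10:          # diversity: cap any one domain's share
            continue
        per_domain[p["domain"]] = count + 1
        selected.append(p)
        if len(selected) == max_keep:
            break
    return selected
```

In this sketch the diversity criterion is approximated by a per-domain quota; the actual s1 selection procedure is more involved.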
- Training Process: The s1 model is distilled from Google's reasoning model Gemini 2.0 Flash Thinking Experimental and fine-tuned with supervised fine-tuning (SFT) on the small curated dataset. At inference time, s1 also uses a self-check mechanism: inserting "Wait" during reasoning prompts the model to keep thinking and double-check itself, improving answer accuracy.
s1 performance
- Math and Programming Ability: In tests of mathematical and programming ability, the s1 model performed at a level comparable to the industry's top reasoning models, such as OpenAI's o1 and DeepSeek's R1, demonstrating the strength of its reasoning ability.
- Test-Time Scaling: The s1 model also performs well at scaling during testing. By controlling the amount of computation the model spends at test time, s1 can improve answer accuracy while maintaining efficiency.
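This test-time control can be illustrated with a toy decoding loop: when the model tries to stop thinking before a minimum reasoning budget is used, the end-of-thinking marker is suppressed and "Wait" is appended, forcing more reasoning. The `toy_model` and token strings below are stand-in assumptions for illustration; the real s1 applies this to an LLM's decoding loop:

```python
# Toy sketch of test-time scaling: suppress the end-of-thinking marker and
# append "Wait" until a minimum reasoning budget has been spent.
# toy_model is a hypothetical stand-in for an actual language model.

def toy_model(trace):
    """Pretend model: emits one reasoning step, then tries to stop."""
    return "</think>" if trace.count("step") >= 1 else "step"

def generate_with_budget(model, min_steps):
    trace = []
    while True:
        token = model(trace)
        if token == "</think>":
            if len(trace) >= min_steps:
                break                  # budget spent: allow thinking to end
            trace.append("Wait")       # force the model to keep reasoning
        else:
            trace.append(token)
    return trace
```

Raising `min_steps` lengthens the reasoning trace, which is the lever that trades extra test-time compute for accuracy.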
s1 Impact and significance
- Technology Popularization: The successful launch of the s1 model has helped popularize AI technology. Its low cost and high efficiency allow more enterprises and research institutions to enter the AI field, driving the technology forward.
- Market Competition: The emergence of s1 has intensified competition in the AI industry. Achieving strong reasoning performance at very low cost challenges the competitive advantage of large technology companies, while also providing lessons and a reference for other teams, spurring innovation and cooperation in the industry.
Paper address: https://arxiv.org/abs/2501.19393
Open source address: https://github.com/simplescaling/s1