Alibaba open-sources Qwen3-Coder: 480 billion parameters, agentic scores surpass Kimi K2, training details disclosed
Moments ago, Alibaba's Qwen team open-sourced the latest generation of its flagship programming model, Qwen3-Coder-480B-A35B-Instruct. The Qwen team says it is their most powerful open-source agentic programming model to date, with 480B total parameters, 35B activated parameters, native support for a 256K context that can be extended by extrapolation to 1 million tokens of context (input), and a maximum output of 65K tokens.
In benchmarks, Qwen3-Coder performs well on programming and agentic tasks. On Agentic Coding, Agentic Browser-Use, and Agentic Tool-Use, it achieves open-source SOTA, surpassing open-source models such as Kimi K2 and DeepSeek V3 as well as closed-source models such as GPT-4.1, and is comparable to Claude Sonnet 4, a model known for its programming capabilities.
Qwen3-Coder will be available in several sizes, and this open-source release is its most powerful variant. Its parameter count exceeds the 235B (235 billion) of Alibaba's flagship model Qwen3 and is smaller than Kimi K2's 1T (1 trillion). According to Alibaba, with Qwen3-Coder's help a programmer new to the profession can finish in a day what would take a senior programmer a week, and a branded website can be generated in as little as 5 minutes.

In addition to the model, Qwen has open-sourced Qwen Code, a command-line tool for agentic programming forked from Gemini CLI. The tool is adapted with customized prompts and function-call protocols to more fully unleash Qwen3-Coder's capabilities on agentic programming tasks.
The model is now live on Bailian, Alibaba Cloud's large-model service platform. Its API uses tiered billing, with prices adjusted by the amount of token input. In the 256K-1M tier, its input price is $6 per million tokens and its output price is $60 per million tokens. For comparison, Claude Sonnet 4's input and output prices are $3 and $15 per million tokens respectively, the same as Qwen3-Coder's 128K-256K tier.
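The tiered billing described above can be sketched as a small cost calculator. The two tiers and their prices come from the article; the article gives no prices for tiers below 128K, and the rule that a request is billed entirely at the rate of the tier its input length falls into is my assumption, not something the article confirms.

```python
# Hedged sketch of the tiered billing described in the article.
# Prices are USD per million tokens. Tiers below 128K input are
# omitted because the article does not give their prices; for
# simplicity the 128K-256K rate is applied to any input <= 256K.
TIERS = [
    # (max input tokens for tier, input $/M tokens, output $/M tokens)
    (256_000, 3.0, 15.0),    # 128K-256K tier (per the article)
    (1_000_000, 6.0, 60.0),  # 256K-1M tier (per the article)
]

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate one request's cost under the assumed tier rule:
    the whole request is billed at the rate of the tier that the
    input token count falls into."""
    for max_input, in_price, out_price in TIERS:
        if input_tokens <= max_input:
            return (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    raise ValueError("input exceeds the largest documented tier (1M tokens)")

# A 300K-token input with a 10K-token output falls in the 256K-1M tier:
# 300_000 * $6/M + 10_000 * $60/M = $1.80 + $0.60 = $2.40
print(estimate_cost(300_000, 10_000))
```

Under these assumptions, the jump from the 128K-256K tier to the 256K-1M tier doubles the input rate and quadruples the output rate, so long-context requests dominate the bill.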

Qwen3-Coder is also now available in the web version of Qwen Chat, free for users to try. In addition, its 480B version has been released for download and local deployment on open-source communities such as Hugging Face and ModelScope. Qwen has also shared the model's technical details in a blog post.
Model open-source address: https://huggingface.co/Qwen
Qwen Code open-source address: https://github.com/QwenLM/qwen-code
Blog address: https://qwenlm.github.io/blog/qwen3-coder/
01. Quietly launched on Qwen Chat late at night, overseas netizens go wild
Before the Qwen team officially announced Qwen3-Coder, the model had already quietly gone live on the Qwen Chat website, and quick-fingered overseas netizens contributed a batch of hands-on test cases.
One case had Qwen3-Coder build a Wordle word game, in which the player must guess a 5-letter word within six attempts. In the end, Qwen3-Coder delivered the game page and source code below.
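The heart of the Wordle rules just described is the per-letter feedback each guess receives. As a rough illustration of that logic (a minimal sketch of my own, not Qwen3-Coder's actual generated code):

```python
from collections import Counter

def score_guess(secret: str, guess: str) -> list:
    """Wordle-style feedback: 'green' = right letter in the right spot,
    'yellow' = letter occurs elsewhere in the word, 'gray' = absent.
    Duplicate letters are handled by budgeting unmatched occurrences."""
    assert len(secret) == len(guess) == 5
    feedback = ["gray"] * 5
    remaining = Counter()
    # First pass: mark exact matches and count unmatched secret letters.
    for i, (s, g) in enumerate(zip(secret, guess)):
        if s == g:
            feedback[i] = "green"
        else:
            remaining[s] += 1
    # Second pass: mark misplaced letters while the budget allows.
    for i, g in enumerate(guess):
        if feedback[i] == "gray" and remaining[g] > 0:
            feedback[i] = "yellow"
            remaining[g] -= 1
    return feedback

print(score_guess("crane", "caner"))
```

The two-pass structure matters: greens must be claimed first so that a duplicate letter is not wrongly marked yellow when all its occurrences are already matched.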

The netizen who shared the case said that Qwen3-Coder is amazingly good at instruction following, UI design, and animation, and that most test results ran on the first try, with no reasoning step at all. However, for the Wordle design task, instead of using a word parser and citing sources, the model chose to enumerate all 5-letter words on its own.
In a spot-the-difference game development example, Qwen3-Coder is noticeably better in aesthetics and polish than Qwen3-235B-A22B-2507, which was released the day before.

Qwen3-Coder was asked to build a Chinese-English vocabulary book supporting the basic create, read, update, and delete functions. Its development speed is palpably fast: since reasoning is not enabled, it produced preliminary results in just over 20 seconds, and further modifications to the results it generated were also quick.

The final result is indeed clean and attractive from a UI perspective, and the functionality runs normally, though it did not follow the prompt's instruction to use PHP + MySQL for development. As a functional demo or prototype the deliverable is entirely sufficient, but its scalability would need further optimization for real deployment scenarios.

Qwen3-Coder was also given a 3D HTML development problem: create a showcase of a 3D rotating cube, with six faces in different colors, automatic rotation, lighting effects, shadows, and so on. The delivered result has a good degree of completion, realizing the main features, with the rotation animation and shadows handled well.

Beyond programming, Qwen3-Coder offers many other capabilities, including image generation and video generation, and supports uploading documents, images, video, audio, and other content, which may be achieved through tool calls.

After the official release, Qwen also provided some official use cases for Qwen3-Coder.
For example, it can build a physics-based chimney-demolition simulation with controlled explosions.

Create an interactive solar system simulation with largely accurate relationships between planets.

Or develop a web mini-game with a good level of polish.

02. Pre-training still has room to scale, reinforcement learning in 20,000 parallel environments
The Qwen team shared some of Qwen3-Coder's training details in a technical blog post, and the team believes current pre-training still has room for further scaling.
In the pre-training phase, Qwen3-Coder used 7.5 trillion tokens of data, of which 70% is code; as a result, the model excels at programming while retaining general and mathematical capabilities.
In terms of context, Qwen3-Coder natively supports 256K and can be extended to 1M via YaRN. It is optimized for repository-scale and dynamic data (e.g., pull requests), adapting it to agentic programming scenarios.
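YaRN extends the context window by rescaling rotary position embeddings by a fixed factor. A rough sketch of the arithmetic, written in the style of a Hugging Face `transformers` `rope_scaling` configuration entry; treating "256K" as 262,144 tokens and the exact field names are my assumptions, not details from the blog:

```python
# Hedged sketch: YaRN stretches RoPE positions so a model trained at a
# native context length can attend over a longer one. Lengths follow
# the article (256K native -> 1M extended); the config field names
# below mimic the transformers rope_scaling style and are assumptions.
native_ctx = 262_144       # assumed native 256K context, in tokens
target_ctx = 1_000_000     # 1M extended context per the article

factor = target_ctx / native_ctx  # position-scaling factor, ~3.81x

# An illustrative transformers-style rope_scaling entry:
rope_scaling = {
    "rope_type": "yarn",
    "factor": factor,
    "original_max_position_embeddings": native_ctx,
}
print(round(factor, 2))
```

The point of the sketch is only the ratio: extending 256K to 1M needs roughly a 4x position rescale, which YaRN applies without retraining the model at the longer length.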
Qwen2.5-Coder, Qwen3-Coder's previous-generation model, was used to scale synthetic data: specifically, Qwen2.5-Coder cleaned and rewrote noisy data to improve overall data quality.
In the post-training phase, the Qwen team concluded that, unlike the common focus on competition-level code generation, all code tasks are naturally suited to execution-driven, large-scale reinforcement learning. The team scaled up code reinforcement learning training on a broader set of real-world programming tasks.
By automatically scaling test cases for diverse programming tasks, the Qwen team created high-quality training instances, further unlocking the potential of reinforcement learning. This not only improved code execution success rates but also benefited other tasks.

This also inspired the team to further explore task types that are hard to solve but easy to verify, which promise to be fertile ground for reinforcement learning.
In real-world software engineering tasks (e.g., SWE-Bench), Qwen3-Coder must engage in multiple rounds of interaction with the environment, involving planning, tool use, receiving feedback, and decision making. In Qwen3-Coder's post-training phase, the Qwen team introduced long-horizon reinforcement learning (agent RL), encouraging the model to solve real-world tasks through multi-turn interaction using tools.
The key challenge in agent RL is environment scaling. To address this, the team built a scalable system capable of running 20,000 independent environments in parallel. This infrastructure provides the feedback needed for large-scale reinforcement learning and supports large-scale evaluation.
As a result, Qwen3-Coder achieves the best performance among open-source models on SWE-Bench Verified without test-time scaling.

Also open-sourced is Qwen Code, a command-line interface (CLI) tool for research purposes, developed on the basis of Gemini CLI with enhanced parser and tool support for Qwen-Coder models.
In addition to Qwen Code, Qwen3-Coder can also be used with Claude Code: simply request an API key on the Dashscope platform and install Claude Code to start programming.
03. Conclusion: more sizes coming soon, exploring coding-agent self-improvement
At a time when Cursor has cut off the supply of programming-oriented models such as Claude, this open-source release of Qwen3-Coder gives domestic developers a fresh alternative.
The Qwen team revealed that they are still working to improve the coding agent's performance, aiming to free up human productivity by having it take on the complex and tedious tasks in software engineering.
More model sizes of Qwen3-Coder, balancing deployment cost and performance, are coming soon. In addition, the team is exploring whether the coding agent can self-improve.
(Text: Zhidongxin)
© Copyright notes
The copyright of the article belongs to the author, please do not reprint without permission.