
What is Seed-OSS?
Seed-OSS is a series of 36-billion-parameter large language models open-sourced by ByteDance under the Apache-2.0 license, permitting free research and commercial use. Its headline feature is native support for a 512K-token long context, enough to handle book-length documents and legal contracts; it also offers a "thinking budget" mechanism that lets developers control reasoning length and improve efficiency. Seed-OSS ships a base version, an instruction-tuned version, and a research version trained without synthetic instruction data, meeting the different needs of enterprise applications and academic research, and suits long-document analysis, complex reasoning, coding assistance, and multilingual scenarios.
The series includes three editions:
- Seed-OSS-36B-Base: base model, pre-trained with synthetic instruction data included;
- Seed-OSS-36B-Base-woSyn: base model without synthetic instruction data, for clean research baselines;
- Seed-OSS-36B-Instruct: instruction-tuned for downstream task execution.
Each model has approximately 36B (i.e., 36 billion) parameters and has the following technical highlights:
- Native support for very long contexts: up to 512K tokens, which excels at long documents and long chains of logical reasoning;
- Controllable thinking budget: developers can flexibly cap the model's reasoning length to improve inference efficiency; setting it in multiples of 512 (e.g., 512, 1024, 2048) is recommended, and 0 means direct generation with no reasoning phase;
- The architecture is a causal LM with RoPE, GQA attention, RMSNorm, and SwiGLU, with 64 layers and a vocabulary of about 155K tokens;
- Optimized reasoning and agent performance, excelling in reasoning, coding, and agent tasks;
- Versions with and without synthetic instruction data are available to meet the different needs of researchers regarding the impact of training data;
- Optimized for internationalization (i18n) with good multilingual support.
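As a sketch of the budget recommendation above, a small helper (hypothetical, not part of any Seed-OSS API) can normalize a requested thinking budget to the nearest multiple of 512, with 0 meaning direct generation:

```python
def normalize_thinking_budget(requested: int, step: int = 512) -> int:
    """Round a requested thinking budget to the nearest multiple of `step`.

    0 (or any non-positive value) means "answer directly, no reasoning phase";
    any positive request is rounded and floored at one full step.
    """
    if requested <= 0:
        return 0
    return max(step, round(requested / step) * step)


print(normalize_thinking_budget(1000))  # 1024
print(normalize_thinking_budget(100))   # 512
print(normalize_thinking_budget(0))     # 0
```

Rounding client-side keeps requests aligned with the recommended 512-token granularity before they ever reach the model.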
Key Features of Seed-OSS
- Extremely long contextual processing capabilities: 512K token context support, allowing the model to process very long text (e.g., books, legal documents, long inference chains, etc.) more smoothly and reduce truncation problems.
- Thinking budget control mechanism: you can set an inference budget and track token usage during reasoning until the budget is depleted and a final answer is generated. This dynamic control improves efficiency and bounds how much work the model does.
- Excellent reasoning and agent performance: Seed-OSS-36B-Instruct matches or exceeds open-source SOTA on several public benchmarks covering math, reasoning, question answering, code generation, and agent tasks, e.g., AIME24 (91.7), LiveCodeBench v6 (67.4), and RULER at 128K context (94.6).
- Research friendly: the paired versions with and without synthetic instruction data let researchers study the impact of training data in a transparent, controlled way.
- Open license: the Apache-2.0 license supports commercial use without field-of-use restrictions, suitable for enterprise integration and shipping products.
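The budget bookkeeping described above can be mimicked with a toy tracker. This simulates only the accounting (the class name and the reflection interval are invented for illustration; it is not ByteDance's implementation):

```python
class ThinkingBudgetTracker:
    """Toy bookkeeping for a fixed reasoning-token budget.

    Counts tokens as they are "generated" and reports the checkpoints at
    which a model would pause to reflect on its remaining budget.
    """

    def __init__(self, budget: int, reflect_every: int = 128):
        self.budget = budget
        self.reflect_every = reflect_every
        self.used = 0

    @property
    def remaining(self) -> int:
        return max(0, self.budget - self.used)

    def consume(self, n_tokens: int = 1) -> bool:
        """Record generated reasoning tokens; False once the budget is spent."""
        self.used += n_tokens
        return self.remaining > 0

    def should_reflect(self) -> bool:
        """True at every `reflect_every`-token checkpoint while budget remains."""
        return self.remaining > 0 and self.used % self.reflect_every == 0


tracker = ThinkingBudgetTracker(budget=512, reflect_every=128)
checkpoints = []
while tracker.consume():
    if tracker.should_reflect():
        checkpoints.append(tracker.remaining)
print(checkpoints)  # [384, 256, 128]
```

Once `consume` returns `False`, a real system would stop the reasoning phase and switch the model to producing the final answer.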
Seed-OSS usage scenarios
- Long document processing and analysis: e.g., legal contracts, academic papers, e-books, technical documents, etc., utilizing 512K long contexts to process full-text content.
- Complex multi-step reasoning tasks: e.g., math problems, logical reasoning, case studies, or chain-of-thought solutions that control the model's reasoning steps through a thought budgeting mechanism.
- Agent systems and tool invocation: Seed-OSS has demonstrated strong capabilities in agent tasks, such as building knowledge Q&A bots, automated tool invocation, and multi-task collaborative agents.
- Code Generation and Programming Assistance: Excellent performance in LiveCodeBench v6 and other benchmarks, suitable for IDE smart-completion, code generation, bug fixing and other scenarios.
- Language Learning and Translation Tasks: Optimized for internationalization, suitable for NLU, translation, cross-language applications with multi-language support.
How to use Seed-OSS?
1. Model selection
   - If performance is the priority: choose Seed-OSS-36B-Base (trained with synthetic instruction data) or Seed-OSS-36B-Instruct (instruction-tuned);
   - If you need a clean research baseline: choose Seed-OSS-36B-Base-woSyn.
2. Getting the model
   - The models are released open source on platforms such as Hugging Face (e.g., Seed-OSS-36B);
   - Download the weights directly or load them through an existing LLM inference framework.
3. Reasoning and thinking budget control
   - Use the <seed:think> and <seed:cot_budget_reflect> tags to specify and monitor the inference budget; a multiple of 512 is recommended.
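As plain string assembly, the tag usage can be sketched as follows. The real formatting comes from the model's chat template, so the layout below (function name included) is only an illustrative guess at where the markers sit:

```python
def sketch_budget_prompt(question: str, budget: int) -> str:
    """Illustrative only: show where the <seed:think> and
    <seed:cot_budget_reflect> markers could appear in a reasoning trace.
    The authoritative layout is defined by the Seed-OSS chat template."""
    lines = [
        f"User: {question}",
        "<seed:think>",
        f"<seed:cot_budget_reflect>Budget: {budget} tokens remaining."
        "</seed:cot_budget_reflect>",
        "... model reasoning goes here ...",
        "</seed:think>",
    ]
    return "\n".join(lines)


prompt = sketch_budget_prompt("What is 17 * 23?", budget=512)
print(prompt)
```

The reflection marker is what lets the model (and the caller) monitor how much of the budget remains mid-generation.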
4. Resource requirements
   - FP16 inference needs roughly 72 GB of VRAM, INT8 about 36 GB, and INT4 about 18-20 GB;
   - Inference frameworks that support partial offloading (such as vLLM or llama.cpp) can reduce VRAM pressure.
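Those VRAM figures follow from a simple weights-only estimate (parameters × bytes per parameter). The helper below is an illustration of that arithmetic and deliberately ignores activations, the KV cache, and framework overhead:

```python
def estimate_weight_vram_gb(params_billion: float, bits_per_param: int) -> float:
    """Weights-only VRAM estimate in GB: parameters x bytes per parameter.

    Real usage is higher: activations, the KV cache (large at 512K context),
    and framework overhead all add on top of this floor.
    """
    return params_billion * bits_per_param / 8


for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {estimate_weight_vram_gb(36, bits):.0f} GB")
# 16-bit -> 72 GB, 8-bit -> 36 GB, 4-bit -> 18 GB
```

The INT4 figure lands at the low end of the 18-20 GB range quoted above; quantization metadata accounts for the rest.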
5. Sampling settings
   - Recommended: temperature = 1.1 and top_p = 0.95 to balance generation diversity with quality.
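To make those two knobs concrete, here is a minimal, framework-free sketch of temperature plus nucleus (top-p) sampling over raw logits; it is a didactic stand-in, not the sampler any particular inference framework uses:

```python
import math
import random


def top_p_sample(logits, temperature=1.1, top_p=0.95, rng=None):
    """Sample a token index using temperature scaling + nucleus (top-p) filtering."""
    rng = rng or random.Random(0)
    # Temperature scaling, then a numerically stable softmax.
    scaled = [l / temperature for l in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the smallest set of tokens whose cumulative probability >= top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Renormalize over the nucleus and draw one index.
    draw = rng.random() * mass
    acc = 0.0
    for i in kept:
        acc += probs[i]
        if draw <= acc:
            return i
    return kept[-1]


# With a tiny top_p the nucleus collapses to the single most likely token.
print(top_p_sample([0.0, 5.0, 1.0], top_p=0.01))  # 1
```

A temperature slightly above 1 flattens the distribution for diversity, while top_p = 0.95 trims the unreliable low-probability tail.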
6. Deployment and commercialization
   - Under the Apache-2.0 license, the model can be integrated into commercial products;
   - Refer to the model README and LICENSE to confirm the exact terms of use.
Seed-OSS project address
- GitHub repository: https://github.com/ByteDance-Seed/seed-oss
- Hugging Face model collection: https://huggingface.co/collections/ByteDance-Seed/seed-oss-68a609f4201e788db05b5dcd
