Seed-OSS


ByteDance's open-source 36-billion-parameter long-context large language model supports a 512K-token context window and a controllable thinking budget, excels at reasoning, code, and agent tasks, and is free for commercial use under the Apache-2.0 license.


What is Seed-OSS?

Seed-OSS is a series of 36-billion-parameter large language models open-sourced by ByteDance under the Apache-2.0 license, free for both research and commercial use. Its biggest highlight is native support for a 512K-token context, which lets it handle long documents such as entire books and legal contracts. It also provides a "thinking budget" mechanism that lets developers control the length of reasoning and improve efficiency. Seed-OSS ships in a base version, an instruction-tuned version, and a research version trained without synthetic instruction data, covering the different needs of enterprise applications and academic research, and it is well suited to long-document analysis, complex reasoning, programming assistance, and multilingual scenarios.

The series includes three editions:

  • Seed-OSS-36B-Base: base model, pre-trained with synthetic instruction data included;
  • Seed-OSS-36B-Base-woSyn: base model without synthetic instruction data, providing a cleaner baseline for research;
  • Seed-OSS-36B-Instruct: instruction-tuned for downstream task execution.

Each model has approximately 36B (i.e., 36 billion) parameters and has the following technical highlights:

  • Native support for very long contexts: up to 512K tokens, which excels at handling long documents and long chains of logical reasoning;
  • Controllable thinking budget: developers can flexibly cap the model's reasoning length to improve inference efficiency; budgets are recommended in multiples of 512 (e.g. 512, 1024, 2048), and 0 means direct generation with no thinking phase;
  • The architecture is a causal LM with RoPE, GQA attention, RMSNorm, and SwiGLU, with 64 layers and a vocabulary of about 155K (the sketch after this list shows how to read these values from the published config);
  • Optimized reasoning and agent performance, with strong results on reasoning, coding, and agent tasks;
  • Versions with and without synthetic instruction data are provided, meeting researchers' different needs for studying the impact of training data;
  • Optimized for internationalization (i18n) with good multilingual support.
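The architecture figures above can be checked directly from the published model config. The sketch below assumes the Hugging Face repo id is ByteDance-Seed/Seed-OSS-36B-Instruct and reads standard transformers config fields; treat both the repo id and the expected values as assumptions to verify against the model card.

    from transformers import AutoConfig

    # Repo id is an assumption; check the ByteDance-Seed org on Hugging Face.
    # Older transformers versions may additionally need trust_remote_code=True.
    config = AutoConfig.from_pretrained("ByteDance-Seed/Seed-OSS-36B-Instruct")

    print(config.num_hidden_layers)        # expected 64 per the list above
    print(config.vocab_size)               # expected ~155K
    print(config.max_position_embeddings)  # expected 524288 (512K tokens)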

Key Features of Seed-OSS

  1. Extremely long context processing: 512K-token context support lets the model handle very long text (e.g., books, legal documents, long reasoning chains) more smoothly and reduces truncation problems.
  2. Thinking budget control mechanism: you can set a reasoning budget, and the model tracks token usage during inference until the budget is exhausted and an answer is generated. This dynamic control improves efficiency and bounds the model's workload (a small budget-rounding sketch follows this list).
  3. Excellent reasoning and agent performance: Seed-OSS-36B-Instruct matches or exceeds open-source SOTA on several public benchmarks covering math, reasoning, question answering, code generation, and agent tasks, for example AIME24 (91.7), LiveCodeBench v6 (67.4), and RULER at 128K (94.6).
  4. Research friendly: the versions with and without synthetic instruction data let researchers study the impact of training data in a more transparent and controlled way.
  5. Open license: the Apache-2.0 license supports commercial use without field-of-use restrictions, suitable for enterprise integration and productization.
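As a small illustration of the budget convention described in point 2 (multiples of 512, with 0 meaning direct generation), the helper below rounds a requested budget accordingly. The function name is ours, for illustration only, and not part of any official Seed-OSS API.

    def normalize_thinking_budget(budget: int) -> int:
        """Round a requested thinking budget to a multiple of 512.

        A non-positive budget means no thinking phase, i.e. direct
        generation, following the recommendation in the list above.
        """
        if budget <= 0:
            return 0
        return max(512, round(budget / 512) * 512)

    print(normalize_thinking_budget(0))     # 0    -> direct generation
    print(normalize_thinking_budget(700))   # 512
    print(normalize_thinking_budget(1500))  # 1536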

Seed-OSS usage scenarios

  • Long document processing and analysis: e.g., legal contracts, academic papers, e-books, and technical documentation, using the 512K context to work over full-text content in a single prompt (see the token-count sketch after this list).
  • Complex multi-step reasoning tasks: e.g., math problems, logical reasoning, case analysis, or chain-of-thought solutions, with the thinking budget mechanism controlling how many reasoning steps the model takes.
  • Agent systems and tool invocation: Seed-OSS has demonstrated strong capabilities in agent tasks, such as building knowledge Q&A bots, automated tool invocation, and multi-task collaborative agents.
  • Code Generation and Programming Assistance: Excellent performance in LiveCodeBench v6 and other benchmarks, suitable for IDE smart-completion, code generation, bug fixing and other scenarios.
  • Language Learning and Translation Tasks: Optimized for internationalization, suitable for NLU, translation, cross-language applications with multi-language support.
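For the long-document scenario, a quick sanity check is to count a file's tokens with the model's tokenizer and confirm it fits in the 512K window. The repo id is assumed and "contract.txt" is a placeholder file name.

    from transformers import AutoTokenizer

    # Repo id is an assumption; "contract.txt" is a placeholder document.
    tokenizer = AutoTokenizer.from_pretrained("ByteDance-Seed/Seed-OSS-36B-Instruct")

    with open("contract.txt", encoding="utf-8") as f:
        document = f.read()

    n_tokens = len(tokenizer.encode(document))
    print(f"{n_tokens} tokens; fits in the 512K window: {n_tokens <= 512 * 1024}")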

How to use Seed-OSS?

  1. Model Selection

    • If the main focus is on performance: select Seed-OSS-36B-Base (with synthetic data) or Seed-OSS-36B-Instruct (instruction-tuned);

    • If you need a clean research baseline: choose Seed-OSS-36B-Base-woSyn.

  2. Getting the model

    • The model has been released open source on platforms such as Hugging Face (e.g., Seed-OSS-36B);

    • Download or load through an existing LLM inference framework.
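    A minimal loading sketch with Hugging Face transformers follows; the repo id is assumed to be ByteDance-Seed/Seed-OSS-36B-Instruct, and the dtype/device settings are just one reasonable choice given the memory figures in step 4.

      import torch
      from transformers import AutoModelForCausalLM, AutoTokenizer

      model_id = "ByteDance-Seed/Seed-OSS-36B-Instruct"  # assumed repo id

      tokenizer = AutoTokenizer.from_pretrained(model_id)
      model = AutoModelForCausalLM.from_pretrained(
          model_id,
          torch_dtype=torch.bfloat16,  # half precision needs roughly 72 GB (see step 4)
          device_map="auto",           # spread layers across available GPUs
      )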

  3. Reasoning and Thinking Budget Control
    Use <seed:think> tags and <seed:cot_budget_reflect> markers to delimit and monitor the reasoning budget, for example:

    <seed:think>...</seed:think>
    <seed:cot_budget_reflect>I have used X tokens, Y remaining</seed:cot_budget_reflect>

    A multiple of 512 is recommended.
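    Putting this together, the sketch below passes a thinking budget through the chat template and then generates. Exposing the budget as a `thinking_budget` keyword on apply_chat_template is an assumption based on the markers above; the model card is the authoritative reference for the exact parameter name.

      import torch
      from transformers import AutoModelForCausalLM, AutoTokenizer

      model_id = "ByteDance-Seed/Seed-OSS-36B-Instruct"  # assumed repo id
      tokenizer = AutoTokenizer.from_pretrained(model_id)
      model = AutoModelForCausalLM.from_pretrained(
          model_id, torch_dtype=torch.bfloat16, device_map="auto"
      )

      messages = [{"role": "user", "content": "Prove that the sum of two even numbers is even."}]

      # thinking_budget as a template kwarg is an assumption; 512 keeps it a
      # multiple of 512, and 0 would mean direct generation with no thinking.
      inputs = tokenizer.apply_chat_template(
          messages,
          add_generation_prompt=True,
          thinking_budget=512,
          return_tensors="pt",
      ).to(model.device)

      outputs = model.generate(inputs, max_new_tokens=1024)
      print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))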

  4. Resource allocation requirements

    • FP16 reasoning requires ~72GB VRAM, INT8 requires ~36GB, and INT4 about 18-20GB;

    • Inference frameworks that support partial offloading (such as vLLM or llama.cpp) can be used to reduce GPU memory pressure.
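    If a single GPU cannot hold the FP16 weights, quantized loading is one option. The sketch below uses transformers with bitsandbytes 4-bit quantization, which lands roughly in the 18-20 GB range quoted above; vLLM and llama.cpp provide their own quantization and offloading mechanisms, covered in their documentation.

      from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

      model_id = "ByteDance-Seed/Seed-OSS-36B-Instruct"  # assumed repo id

      # 4-bit quantization (requires the bitsandbytes package and a CUDA GPU).
      quant_config = BitsAndBytesConfig(load_in_4bit=True)

      tokenizer = AutoTokenizer.from_pretrained(model_id)
      model = AutoModelForCausalLM.from_pretrained(
          model_id,
          quantization_config=quant_config,
          device_map="auto",  # layers that do not fit on the GPU spill over to CPU RAM
      )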

  5. Sampling Setup Recommendations

    • Recommended settings are temperature = 1.1 and top_p = 0.95, balancing generation diversity and quality.
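    With transformers, these sampling settings can be captured in a GenerationConfig and reused across generate() calls. The values match the recommendation above; max_new_tokens is an arbitrary illustrative cap, not an official recommendation.

      from transformers import GenerationConfig

      gen_config = GenerationConfig(
          do_sample=True,       # enable sampling so temperature/top_p take effect
          temperature=1.1,
          top_p=0.95,
          max_new_tokens=2048,  # illustrative cap
      )

      # outputs = model.generate(inputs, generation_config=gen_config)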

  6. Deployment and commercialization

    • Based on the Apache-2.0 license, you can integrate it in commercial products;

    • It is recommended to refer to the model README and LICENSE to clarify the terms of use.

