Google open-sources Gemma 3, a multimodal large model with 128K input support, free for commercial use


The Gemma series is Google's family of open-source lightweight large models. Just a few days ago (March 12, 2025), Google open-sourced the third generation of the series, which consists of four versions at different parameter scales. The Gemma 3 models are multimodal large models (the smallest, the 1-billion-parameter Gemma 3-1B, is the text-only exception).

  • Introduction to the Gemma 3 series models and their features
  • Benchmark performance of the Gemma 3 series models
  • Gemma 3 open source

Introduction to the Gemma 3 series models and their features

The Gemma series shares its underlying technology with Google's Gemini models but is open-sourced under a free commercial license. The first generation was released in February 2024 with only two versions, the 2-billion-parameter Gemma 2B and the 7-billion-parameter Gemma 7B, and a context length of just 8K. In May 2024, Google open-sourced the Gemma 2 series, expanding the lineup to three versions at 2B, 9B, and 27B parameters.

Today, ten months later, Google has open-sourced the third-generation Gemma 3 series, expanding the lineup to four versions at 1B, 4B, 12B, and 27B parameters and upgrading it from a pure large language model to a multimodal one that accepts image and video input in addition to text.

Among these, Gemma 3-27B was trained on 14 trillion tokens, Gemma 3-12B on 12 trillion tokens, and the remaining two versions on 4 trillion and 2 trillion tokens, respectively.

With a vocabulary of 262K tokens, Gemma 3's tokenizer offers very strong text representation.

Gemma 3 is a considerable upgrade; the highlights are summarized below:

  • The Gemma 3 models support context windows of up to 128K tokens (the 1-billion-parameter Gemma 3-1B supports only 32K)
  • The Gemma 3 series supports over 140 languages
  • The Gemma 3 models support multimodal input: text, image, and video
  • The Gemma 3 series supports function calling / tool use

Benchmark performance of the Gemma 3 series models

The Gemma 3 series consists of four sizes, each released in two variants: a pre-trained base version (the "pt" suffix, for pre-trained) and an instruction-tuned version (the "it" suffix, for instruction-tuned), for a total of eight open-sourced models.

The largest model, Gemma 3-27B IT, is 54.8GB at fp16 precision. After INT8 quantization it takes about 27GB, which fits across two RTX 4090s; after INT4 quantization it requires roughly 14GB of VRAM, which fits comfortably on a single RTX 4090.
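The VRAM figures above follow directly from the parameter count times the bytes per parameter. A rough sketch of the arithmetic (it ignores activation and KV-cache memory, which add to the real footprint, and uses decimal gigabytes):

```python
# Back-of-envelope estimate of the memory needed just to hold model weights
# at a given numeric precision. Real VRAM use is higher: activations and the
# KV cache are not counted here.

def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Weight storage in gigabytes (10^9 bytes) for a given precision."""
    return params_billions * 1e9 * bits_per_param / 8 / 1e9

# Gemma 3-27B at different precisions:
print(weight_memory_gb(27, 16))  # fp16 -> 54.0 GB
print(weight_memory_gb(27, 8))   # INT8 -> 27.0 GB
print(weight_memory_gb(27, 4))   # INT4 -> 13.5 GB
```

The INT4 figure of about 13.5GB matches the roughly 14GB quoted above, which is why the quantized model fits on a single 24GB RTX 4090.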

This version also benchmarks very well: it scored 1,338 points (as of March 8, 2025) on the Chatbot Arena anonymous large-model leaderboard, ranking 9th in the world, behind the o1-2024-12-17 model but ahead of models such as Qwen 2.5-Max and DeepSeek V3.

[Figure: Chatbot Arena leaderboard standings]

It performs well on the other standard benchmarks too, outperforming Qwen 2.5-72B and coming very close to DeepSeek V3 and others.

[Figure: benchmark comparison results]

According to Google's official statement, this generation is a significant upgrade: the Gemma 3-4B model approaches the level of Gemma 2-27B, while Gemma 3-27B comes close to Gemini 1.5 Pro.

Gemma 3 open source

All eight Gemma 3 models are released under the Gemma open-source license, which permits commercial use free of charge.

Ecosystem support was ready at launch: Hugging Face, Ollama, Vertex AI, and llama.cpp all support Gemma 3.

This article was written by DataLearner. Source: DataLearner. Original title: "Heavyweight! Google open-sources the Gemma-3 model: multimodal support, 128K input, the 27B version outperforms DeepSeek V3 in the large-model anonymous arena, free commercial license".
