AI large models: open source or closed source?
In the vast world of Artificial Intelligence (AI), large models play a pivotal role. These large, complex models exhibit remarkable intelligence and potential through deep learning and training on massive data.
However, there is no single path for developing large models: open source and closed source have become the two main trends. This article explores the characteristics of each approach and their impact.
Open-source large models
01 What are open-source large models?
Open-source large models are large AI models developed, maintained, and shared by open-source communities or organizations. Their core features are openness and accessibility, which provide rich resources and flexibility for research and applications in artificial intelligence and machine learning. They have the following characteristics:
- Open source: The model's source code is publicly available, so anyone can view, copy, modify, and distribute it. This openness promotes technical exchange and innovation.
- Large scale: These models often have large parameter counts and complex structures, capable of handling large-scale data and complex tasks.
- Community support: An active developer community works together on model development, maintenance, and optimization.
- Customizability: Users can customize and optimize the model to meet the requirements of specific scenarios.
- Free or low cost: Open-source large models are often free to use or available at low cost, lowering the barrier to adoption.
02 Typical examples
With core values of openness, sharing, and collaboration, open-source large models have driven the rapid development of AI technology. Within the Transformer model family, the openly released GPT models (e.g., GPT-2) and BERT are outstanding early representatives of open-source large models.
LLaMA 3
- Released by Meta (formerly Facebook).
- Available in 8B and 70B parameter versions; among the strongest open-source models.
- Fine-tuned with reinforcement learning from human feedback (RLHF), it demonstrates performance comparable to top closed-source models.
- Suited to scenarios such as chatbots, natural language generation, and programming tasks.
- Its open-source nature gives developers the freedom to customize and optimize.
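To make "customize" concrete, the sketch below hand-builds a single-turn prompt in the Llama 3 instruct format. The special tokens follow Meta's published model card at release, but this is only an illustration; in real use, the tokenizer's built-in chat template (e.g., `tokenizer.apply_chat_template` in Hugging Face transformers) should be treated as authoritative.

```python
def build_llama3_prompt(system: str, user: str) -> str:
    """Format a single-turn chat in the Llama 3 instruct template.

    Special tokens follow Meta's published model card; the tokenizer's
    built-in chat template is the authoritative source in practice.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # The prompt ends at the assistant header, so the model
        # generates the assistant's reply next.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt("You are a helpful assistant.", "What is RLHF?")
```

Because the weights and tokenizer are open, developers can inspect or modify exactly this kind of formatting logic, which is impossible with a closed-source hosted API.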
Phi-3
- Released by Microsoft.
- Known for its compactness and high performance.
- Available in Mini, Small, and Medium versions; even the smallest, Phi-3-Mini, has 3.8B parameters and performance comparable to much larger models.
- Its small size and efficiency make it ideal for resource-constrained environments, such as mobile devices or edge computing, while maintaining high performance.
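Why model size matters for resource-constrained deployment can be seen with simple back-of-the-envelope arithmetic: weight memory is roughly the parameter count times the bytes per parameter. The sketch below is an estimate only; it ignores activation memory, KV cache, and framework overhead.

```python
def estimate_model_memory_gb(num_params: float, bytes_per_param: float = 2) -> float:
    """Rough weight-memory estimate in GiB: parameters x bytes per parameter.

    bytes_per_param: 4 for fp32, 2 for fp16/bf16, 1 for int8, 0.5 for 4-bit.
    Ignores activations, KV cache, and framework overhead.
    """
    return num_params * bytes_per_param / 1024**3

# Phi-3-Mini (3.8B parameters) vs. a 70B model, both stored in fp16:
phi3_mini = estimate_model_memory_gb(3.8e9)   # roughly 7 GiB
llama3_70b = estimate_model_memory_gb(70e9)   # roughly 130 GiB
```

At fp16, a 3.8B-parameter model fits in the memory budget of a high-end phone or a single consumer GPU, while a 70B-parameter model needs server-class hardware, which is the practical argument for small models at the edge.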
BERT (Bidirectional Encoder Representations from Transformers)
- Released by Google in 2018.
- A bidirectional language representation model based on the Transformer architecture.
- Pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers.
- A major innovation in Natural Language Processing (NLP) that set new benchmarks for numerous NLP tasks.
- Widely used across natural language processing tasks, and the basis for many other open-source, freely available pre-trained models.
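Part of what open sourcing exposed is BERT's input pipeline: it splits words into subwords with a greedy longest-match-first WordPiece algorithm, marking word-internal pieces with `##`. The sketch below illustrates that matching rule with a tiny made-up vocabulary; the real model ships a vocabulary of roughly 30,000 entries.

```python
def wordpiece_tokenize(word: str, vocab: set) -> list:
    """Greedy longest-match-first subword tokenization, BERT-style.

    Toy illustration with an invented vocabulary; word-internal
    pieces are prefixed with '##' as in BERT's WordPiece scheme.
    """
    tokens, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            sub = word[start:end]
            if start > 0:
                sub = "##" + sub       # mark continuation pieces
            if sub in vocab:
                piece = sub            # longest match found
                break
            end -= 1                   # shrink and retry
        if piece is None:
            return ["[UNK]"]           # no piece matched at all
        tokens.append(piece)
        start = end
    return tokens

vocab = {"un", "##aff", "##able", "play", "##ing"}
pieces = wordpiece_tokenize("unaffable", vocab)  # ['un', '##aff', '##able']
```

Subword units like these let BERT handle rare and unseen words by composition rather than mapping them all to a single unknown token.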
Falcon 180B
- Released by the Technology Innovation Institute (TII).
- Notable for its massive size of 180 billion parameters and strong performance.
- Outperforms LLaMA 2 and GPT-3.5 on multiple NLP tasks.
- Requires significant computational resources to run, making it better suited to research and commercial applications than to individual developers.
BLOOM
- Released by the BigScience research workshop.
- Known for its 176 billion parameters and multilingual support.
- Can generate coherent, accurate text in 46 natural languages and 13 programming languages.
- Transparency is a central feature: both the source code and the training data are accessible, making the model easy to operate, study, and improve.
- Well suited to internationalization projects that require multilingual support.
Each of these open-source large models has its own strengths, realized through different technical paths and architectures, and together they provide strong support for the development and application of AI technology. Their open-source nature also promotes the popularization and advancement of AI, driving progress and innovation across the whole community.
Closed-source large models
01 Model overview
Compared with open-source large models, closed-source large models focus more on commercialization and intellectual property protection. Well-known technology companies such as OpenAI, Google, and Microsoft have launched their own closed-source large-model products. These models have the following characteristics:
- Closed source: The model's source code and internal implementation details are not publicly available and are held only by a specific organization or company.
- Large scale: Like their open-source counterparts, they have complex structures and capabilities for handling large amounts of data and tasks.
- Exclusive rights: Developed, owned, and maintained by one organization or company, and typically made available through commercial licensing or authorization.
- Confidentiality: Because the source code is not public, users cannot directly inspect or modify the model's internals, which preserves the model's confidentiality and stability.
02 Typical examples
GPT-3 (Generative Pre-trained Transformer 3)
- Released by OpenAI.
- With up to 175 billion parameters, it was one of the largest natural language generation models at release.
- Generates coherent, natural text across a wide range of natural language processing tasks, including dialogue, text summarization, and question answering.
- Thanks to large-scale pre-training, GPT-3 can handle a wide range of natural language tasks without extensive fine-tuning.
- Its strong performance and broad applicability have made GPT-3 the model of choice for many companies and research organizations.
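The practical difference with a closed-source model is how it is consumed: users never load weights locally, they send requests to a hosted API. The sketch below only builds a request body in the shape of OpenAI's (now-legacy) text completions endpoint, using the standard library; field names follow the public API documentation, and no request is actually sent.

```python
import json

def build_completion_request(prompt: str,
                             model: str = "text-davinci-003",
                             max_tokens: int = 64) -> str:
    """Build (but do not send) a JSON body for a hosted completion API.

    The caller names a hosted model; the weights themselves stay
    behind the provider's API boundary.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.7,   # sampling temperature, per the API docs
    }
    return json.dumps(payload)

body = build_completion_request(
    "Summarize the difference between open- and closed-source models."
)
```

This request/response boundary is exactly what gives the provider confidentiality and control, and what denies users the deep customization that open weights allow.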
T5 (Text-to-Text Transfer Transformer)
- Released by Google.
- Uses an encoder-decoder architecture capable of handling a wide range of text generation tasks.
- Achieved excellent performance on several natural language processing benchmarks, including translation, summarization, and question answering.
- Google provides multiple versions of T5 to accommodate different application scenarios and compute budgets.
- The release of T5 further advanced the development and application of natural language processing technology.
BART (Bidirectional and Auto-Regressive Transformers)
- Released by Facebook AI Research.
- Uses a sequence-to-sequence architecture that performs well on both text generation and text comprehension tasks.
- Has shown strength on several natural language processing benchmarks, including text summarization and machine translation.
- An open-source version of BART is also available, providing a valuable resource for researchers and developers.
BERT (Bidirectional Encoder Representations from Transformers)
- While BERT is available as an open-source release, Google has also offered closed-source, commercialized versions of it.
- As a bidirectional Transformer model, BERT captures the contextual information of text, improving accuracy on natural language processing tasks.
- Closed-source versions may contain additional optimizations and features aimed at commercial applications.
Discussion: open source or closed source?
Open-source and closed-source large models each have advantages and disadvantages, and the right choice depends on the specific application scenario and requirements. Open-source large models offer high transparency, collaboration, and flexibility, but may also bring problems such as security risks and copyright disputes.
Closed-source large models, on the other hand, focus more on commercialization and intellectual property protection, but can lead to problems such as technology monopolies and slower innovation.
Openness:
- Open-source large models: open and transparent; encourage community participation and collaboration.
- Closed-source large models: closed and proprietary; available only to specific users or organizations.
Accessibility:
- Open-source large models: widely accessible, lowering technical barriers.
- Closed-source large models: limited accessibility; require specific licenses or authorizations.
Transparency:
- Open-source large models: code and algorithms are transparent, making them easy to understand and trust.
- Closed-source large models: internal workings are kept secret, which may lead to user distrust.
Customizability:
- Open-source large models: allow users to customize and optimize the model.
- Closed-source large models: users usually cannot customize the model in depth.
Innovation and improvement:
- Open-source large models: community engagement enables rapid technical iteration and innovation.
- Closed-source large models: the pace of innovation and improvement may be limited by the development team's capacity and resources.
Cost:
- Open-source large models: usually free or low cost.
- Closed-source large models: may require purchasing a license or paying usage fees.
Legal and compliance:
- Open-source large models: governed by the terms of a specific open-source license.
- Closed-source large models: subject to strict legal and contractual terms.
Summary
In the future, as AI technology continues to develop and application scenarios expand, the line between open-source and closed-source large models may blur. Some companies may adopt a hybrid approach, protecting their technical secrets and intellectual property while still drawing on the resources and strengths of the open-source community.
At the same time, as technology progresses and open-source culture spreads, more open-source large models will emerge, driving even faster development of AI technology.
© Copyright notice
Copyright belongs to the author; please do not reproduce without permission.