OpenAI Releases Codex, an Automated Software Engineering Agent Designed for Developers

OpenAI today announced Codex, a cloud-based programming agent that can work on multiple tasks in parallel, powered by the codex-1 model. It is available today for ChatGPT Pro, Team, and Enterprise users, and is coming soon for Plus users.

It is a cloud-based software engineering agent that can handle multiple development tasks in parallel and help developers complete their programming work efficiently.

The UI is deliberately minimal. Codex provides an input box and two buttons, "Ask" and "Code". All you have to do is describe the task clearly and it will start executing.

An excited Sam Altman posted several times in a row on X, saying:

It's amazing and exciting how much software one person can develop with tools like these. "You can just do things" is one of my favorite memes; I had no idea that it would apply to AI itself and its users in such a significant way so soon.

Codex can do more than just write functional code; it can understand the structure of the code, answer questions about the codebase, fix bugs, and even directly submit reviewable Pull Requests.

Each task is executed in a separate sandbox environment in the cloud, automatically loading the user's code repository. Running time ranges from 1 to 30 minutes, depending on the complexity of the task, and the user can view the progress of the task in real time.
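
To make the idea of "one isolated environment per task, many tasks at once" concrete, here is a minimal local sketch in Python. It is not how Codex is implemented: a temporary directory is only a stand-in for a real cloud sandbox, and the function names (`run_in_isolated_copy`, `run_tasks_in_parallel`) and the "write one file" step are hypothetical placeholders for the agent's actual work.

```python
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_in_isolated_copy(repo: Path, task_name: str, content: str) -> str:
    """Copy the repo into a fresh temp directory, change it there, and report back.

    Illustrative only: a temp directory is not a security sandbox.
    """
    with tempfile.TemporaryDirectory() as workdir:
        copy = Path(workdir) / "repo"
        shutil.copytree(repo, copy)  # each task gets its own private copy
        # Stand-in for the agent's edits: write one file named after the task.
        (copy / f"{task_name}.txt").write_text(content)
        files = sorted(p.name for p in copy.iterdir())
    return f"{task_name}: {files}"


def run_tasks_in_parallel(repo: Path, tasks: dict[str, str]) -> list[str]:
    """Run every task concurrently; the original repo is never modified."""
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(run_in_isolated_copy, repo, name, text)
            for name, text in tasks.items()
        ]
        # Collect results in submission order.
        return [future.result() for future in futures]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as demo_dir:
        demo_repo = Path(demo_dir) / "demo"
        demo_repo.mkdir()
        (demo_repo / "README.md").write_text("demo project")
        for line in run_tasks_in_parallel(
            demo_repo, {"fix_bug": "patch", "add_test": "test"}
        ):
            print(line)
```

Because every task works on its own copy, one task's changes cannot interfere with another's, which is the property the per-task sandbox is meant to provide.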

Once the task is completed, Codex outputs a full set of traceable execution results, including terminal logs, test records, and other validation information. You can review the results, make suggestions, and even initiate PRs directly on GitHub or merge them into your local project.

With an AGENTS.md file in your project, you can also customize Codex's behavior to better fit your project's development conventions and testing standards.
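
As a rough illustration of what such a file might hold, the Python sketch below writes an example AGENTS.md into a repository root. Everything in it is an assumption made for illustration: the headings, the commands (`pytest -q`, `black`), the naming rule, and the helper name `write_agents_file` are invented here, and the announcement does not specify a required layout.

```python
from pathlib import Path

# Hypothetical contents: every heading, command, and rule below is invented
# for illustration and should be replaced with the project's real conventions.
AGENTS_MD = """\
# Project conventions (example)

## Testing
- Run `pytest -q` before proposing changes.

## Style
- Format Python code with `black`.
- Use snake_case for function and variable names.
"""


def write_agents_file(repo_root: Path) -> Path:
    """Write the example AGENTS.md into the repository root and return its path."""
    target = repo_root / "AGENTS.md"
    target.write_text(AGENTS_MD, encoding="utf-8")
    return target


if __name__ == "__main__":
    print(write_agents_file(Path(".")))
```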

The core model of Codex is codex-1, a version of the OpenAI o3 series fine-tuned for software engineering.

Benchmark results show that codex-1 scored 72.1% on SWE-Bench, outperforming Claude 3.7 and o3-high on paper.

The training approach is also very hands-on. Trained with reinforcement learning in real development environments, Codex generates code that more closely matches human coding style and code review preferences, follows instructions strictly, and keeps running tests until they pass.
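
The "run the tests until they pass" behavior can be pictured as a simple retry loop. The Python sketch below is only an illustration of that loop, not OpenAI's training or inference code; `run_tests`, `propose_fix`, and `max_attempts` are hypothetical names, and the two callables stand in for a real test runner and a real code-editing step.

```python
from typing import Callable


def iterate_until_tests_pass(
    run_tests: Callable[[], bool],
    propose_fix: Callable[[int], None],
    max_attempts: int = 5,
) -> int:
    """Run the tests; after each failure apply another fix and try again.

    Returns the number of test runs it took to pass, and raises RuntimeError
    if the tests are still failing after max_attempts runs.
    """
    for attempt in range(1, max_attempts + 1):
        if run_tests():
            return attempt
        if attempt < max_attempts:
            propose_fix(attempt)  # change the code before the next run
    raise RuntimeError(f"tests still failing after {max_attempts} runs")


if __name__ == "__main__":
    # Toy stand-in: the "codebase" has two bugs and each fix removes one.
    state = {"bugs": 2}

    def fake_tests() -> bool:
        return state["bugs"] == 0

    def fake_fix(attempt: int) -> None:
        state["bugs"] -= 1

    print(iterate_until_tests_pass(fake_tests, fake_fix))
```

The cap on attempts is a design choice in this sketch: without it, a task whose tests can never pass would loop forever instead of handing the problem back to the developer.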

Starting today, Codex will be available to ChatGPT Pro, Enterprise and Team users, with support for Plus and Edu users coming online soon.

On security, Codex is deliberately restrained. While performing a task, it accesses only the codebase and preconfigured dependencies you provide, with no network access and no calls to external APIs, and when it runs into something it is unsure about, it stops and prompts you to deal with it.

Moreover, Codex is specifically trained to recognize and refuse malicious requests so that it cannot easily be abused for malware development.

OpenAI already uses Codex internally to help engineers with repetitive tasks such as refactoring, renaming, and writing tests, which significantly improves development efficiency. Some external teams, such as Cisco and Temporal, have also given positive feedback, saying that Codex has accelerated their development and debugging.

In a late-night livestream, OpenAI employees demonstrated more of Codex's practical capabilities:

It understands the structure of the entire codebase, automatically locates and fixes bugs, and handles common problems such as timeout settings or spelling errors. Even when errors are reported on the command line, Codex analyzes the cause of the error and automatically generates a fix script and corresponding test cases.

In addition, it has a code review feature that combs through every change and flags risks that could cause test failures. In other words, Codex is evolving into a genuinely usable programming collaborator.

OpenAI employees also described using Codex to manage large code changes that were tested and merged without ever being run in their local environment.

Well-known tech writer Dan Shipper got early access to Codex and shared his experience in a blog post.

In his view, Codex lets users assign tasks as if they were managing a team, without having to write code themselves. Codex is particularly well suited to senior developers: it produces clean, efficient code changes as it works through their tasks and automatically generates pull requests to submit to GitHub.

However, Codex also has some limitations, such as being less friendly to novice engineers, not being very good at handling follow-up changes and additions, and not yet being fully integrated into mainstream development platforms such as GitHub and Slack.

Codex is designed primarily for professional developers rather than for those who like to program while they chat. Its core strength is raising the productivity of senior developers by letting them manage multiple tasks at once, which speeds up the development process.

If you're a technical lead who needs to add features or fix bugs in an existing project, Codex is a tool you'll use often, but if you're building a "one-person, billion-dollar SaaS" from scratch, you probably won't need it.

Simply put, a "one-person, billion-dollar SaaS" is a website or tool that you build by yourself and sell by subscription, with monthly revenue reaching several million and annual revenue passing ten million or even more than one hundred million.

In addition to the main model in the cloud, OpenAI has also launched codex-mini-latest, a lightweight model optimized for the command line, to help developers quickly bring AI into their local environment.

The Codex CLI login process is simple and straightforward: you sign in directly with your ChatGPT account and receive a basic free API quota. It is currently available to Pro, Enterprise, and Team users globally, with support for Plus and Edu users coming in the next few weeks.

For developers with access to codex-mini-latest, the model can be called via the Responses API for $1.50 per million input tokens and $6 per million output tokens, with a prompt caching discount of up to 75% to further reduce the cost of calls.
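
To see what those prices mean in practice, here is a small back-of-the-envelope estimator in Python. The two per-million prices come from the figures above. How the cache discount is applied is an assumption: this sketch charges cached input tokens at the input price reduced by the discount and leaves output tokens at full price. The function name `estimate_cost` and its parameters are hypothetical.

```python
INPUT_PRICE_PER_MILLION = 1.50   # USD per million input tokens (from the article)
OUTPUT_PRICE_PER_MILLION = 6.00  # USD per million output tokens (from the article)
MAX_CACHE_DISCOUNT = 0.75        # "up to 75%"; applying it to cached input is an assumption


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    cached_input_tokens: int = 0,
    cache_discount: float = MAX_CACHE_DISCOUNT,
) -> float:
    """Estimate the USD cost of one call under the assumptions stated above."""
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_input_tokens
    # Cached tokens are billed at the discounted rate, the rest at full price.
    billable_input = uncached + cached_input_tokens * (1 - cache_discount)
    input_cost = billable_input * INPUT_PRICE_PER_MILLION / 1_000_000
    output_cost = output_tokens * OUTPUT_PRICE_PER_MILLION / 1_000_000
    return input_cost + output_cost


if __name__ == "__main__":
    print(estimate_cost(1_000_000, 1_000_000))
    print(estimate_cost(1_000_000, 1_000_000, cached_input_tokens=1_000_000))
```

Under these assumptions, one million input tokens plus one million output tokens costs $7.50 with no caching, and $6.375 if every input token is served from the cache at the full 75% discount.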

OpenAI's long-term vision for Codex is clear:

It is not only a tool for writing code but also a prototype of a future collaboration model. Multi-agent work, asynchronous execution, and automatic progress reporting may eventually be embedded in IDEs, Git tools, and even Slack, making Codex a real "co-pilot" for developers.

Codex is still in the research preview phase, and advanced features such as image input are not yet available.

But all those past visions of AI programming assistants, such as automated code writing, opening PRs, and fixing bugs, have finally come to fruition in Codex as usable tools that can actually be put to work in real workflows.