OpenAI fires the first shot in L3-level AI agents! Operator controls a computer to carry out tasks autonomously, and can book tickets and shop online for you

Newsflash · Updated 5 months ago · AiFun

A little over two weeks ago, in a blog post reflecting on the second anniversary of ChatGPT, OpenAI CEO Sam Altman predicted that 2025 would be the "big year" for AI agents, with the first of them likely to "join the workforce" and significantly change business output. Now OpenAI has fired the first shot by launching an L3-level AI agent.

On Thursday, January 23, Eastern Time, OpenAI announced the launch of its first AI agent, called Operator. It works through the web and can independently carry out a variety of tasks for the user without human intervention: "just give it a task and it will perform it."

Operator can use the Internet to perform a variety of tasks just like a human would, by opening a browser, clicking buttons on a page and typing in content. The things that a human user would do online, such as booking a flight, making a hotel reservation, planning a shopping order and completing an online purchase, can be done by Operator.

As shown in the screenshot below, Operator's interface offers a variety of task categories for users to choose from, including Shopping, Delivery, Dining, Travel, and News, each of which supports a different type of automated task.

[Screenshot: Operator's interface showing its task categories]

OpenAI worked with a number of companies, including Instacart, OpenTable, Uber, and StubHub, to develop Operator and ensure that the service would run smoothly on the sites of these partners, according to Yash Kumar, OpenAI's head of Operator product and engineering.

Altman Says Operator Is the Beginning of a Journey into Level 3 AI

Operator means that OpenAI, following rivals such as Microsoft and Anthropic, has also entered the era of AI agents. Agents are the third level (Level 3) of the AI development scale that OpenAI defined last year.

Sam Altman, CEO of OpenAI, said on Thursday after giving a demo of Operator, "This is the beginning of our journey into Level 3."

OpenAI has defined a system of AI capability levels in order to track its progress toward human-level AI. The five levels of the system are:

Level 1 (lowest): Chatbots, AI that can interact with humans in conversational language.
Level 2: Reasoners, AI that can solve problems at a human level.
Level 3: Agents, AI systems that can take actions.
Level 4: Innovators, AI that can aid in invention.
Level 5 (highest): Organizations, AI that can do the work of an organization.

Combines GPT-4o's vision capabilities with advanced reasoning, no API required!

OpenAI describes Operator's software as combining some of its computer vision capabilities with multi-step problem-solving abilities designed to mimic human reasoning. Powering Operator is a model called CUA, short for Computer-Using Agent, which combines the vision capabilities of OpenAI's flagship model, GPT-4o, with advanced reasoning acquired through reinforcement learning.

CUA is trained to interact with graphical user interfaces (GUIs), the buttons, menus, and text fields people see on a screen, just as a human would. It can therefore perform digital tasks flexibly, "without the need to use an operating system-specific or web-based API (application programming interface)."
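The observe, decide, act cycle described above can be illustrated with a toy loop. This is only a sketch under loud assumptions, not OpenAI's implementation: `FakeScreen`, `Action`, `scripted_policy`, and `run_agent` are hypothetical names, the "screenshot" is a plain dictionary rather than pixels, and a hard-coded rule stands in for the CUA model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Action:
    """One GUI action (hypothetical schema): "type", "click", or "done"."""
    kind: str
    target: str = ""
    text: str = ""


@dataclass
class FakeScreen:
    """Stand-in for a web page with one text field and one submit button."""
    fields: dict = field(default_factory=lambda: {"search": ""})
    submitted: Optional[str] = None

    def screenshot(self) -> dict:
        # A real agent would receive pixels; here the observation is a dict.
        return {"fields": dict(self.fields), "submitted": self.submitted}

    def apply(self, action: Action) -> None:
        # Execute the action the way a mouse or keyboard event would.
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = self.fields["search"]


def scripted_policy(task: str, observation: dict) -> Action:
    """Placeholder for the model: choose the next action from what is on screen."""
    if observation["submitted"] is not None:
        return Action("done")
    if observation["fields"]["search"] != task:
        return Action("type", target="search", text=task)
    return Action("click", target="submit")


def run_agent(task: str, screen: FakeScreen, max_steps: int = 10) -> list:
    """Observe the screen, pick an action, execute it, and repeat until done."""
    history = []
    for _ in range(max_steps):
        action = scripted_policy(task, screen.screenshot())
        history.append(action.kind)
        if action.kind == "done":
            break
        screen.apply(action)
    return history


if __name__ == "__main__":
    screen = FakeScreen()
    print(run_agent("flights to Tokyo", screen))
    print(screen.submitted)
```

The point of the sketch is that the agent touches the page only through what it sees and the same inputs a person would use, which is why no site-specific API is needed.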

OpenAI says that CUA scored higher than the previous state-of-the-art (SOTA) models on both computer-use and browser-use benchmarks.


For browser use, CUA achieved a success rate of 58.1% on WebArena, which uses offline, self-hosted open-source websites to simulate real-world scenarios such as e-commerce, online store content management (CMS), and social forum platforms. On WebVoyager, which tests performance on live websites such as Amazon, GitHub, and Google Maps, CUA achieved a success rate of 87%. Most WebVoyager tasks are relatively simple, while WebArena tasks are more complex. The previous SOTA success rates on WebArena and WebVoyager were 36.2% and 56%, respectively, for computer-use agents, and 57.1% and 87%, respectively, for web-browsing agents.


For computer use, on the OSWorld benchmark, which evaluates a model's ability to control full operating systems such as Ubuntu, Windows, and macOS, CUA achieved a success rate of 38.1%, compared with 22.0% for the previous SOTA. OpenAI notes that CUA's performance improves as test time is extended, that is, when more steps are allowed. Compared with the human success rate of 72.4% on the same test, CUA still has a lot of room for improvement.
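To put the reported scores side by side, the short calculation below works out CUA's margin over the previous computer-use SOTA on each benchmark and the remaining gap to the human score on OSWorld. It uses only the figures quoted in this article, read as percentages (58.1, 87, and 38.1 for CUA; 36.2, 56, and 22.0 for the previous computer-use SOTA; 72.4 for humans); the dictionary layout and function names are illustrative assumptions.

```python
# Success rates in percent, as quoted in the article.
CUA = {"WebArena": 58.1, "WebVoyager": 87.0, "OSWorld": 38.1}
# Previous state of the art for computer-use agents on the same benchmarks.
PREVIOUS_SOTA = {"WebArena": 36.2, "WebVoyager": 56.0, "OSWorld": 22.0}
HUMAN_OSWORLD = 72.4


def margin_over_previous(benchmark: str) -> float:
    """Percentage-point lead of CUA over the previous computer-use SOTA."""
    return round(CUA[benchmark] - PREVIOUS_SOTA[benchmark], 1)


def gap_to_human_osworld() -> float:
    """Percentage points by which CUA trails the human score on OSWorld."""
    return round(HUMAN_OSWORLD - CUA["OSWorld"], 1)


if __name__ == "__main__":
    for name in CUA:
        print(f"{name}: +{margin_over_previous(name)} points over previous SOTA")
    print(f"OSWorld gap to human: {gap_to_human_osworld()} points")
```

On these numbers the lead on OSWorld (16.1 points) is less than half of the remaining gap to human performance (34.3 points), which is the sense in which there is still room for improvement.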


Research preview goes live first in the US for ChatGPT Pro users

On Thursday, OpenAI launched Operator as a research preview. It is going live first in the U.S., where users can access it through the ChatGPT Pro plan, which costs $200 per month.

The research preview of Operator can be accessed at operator.chatgpt.com. OpenAI says it hopes to incorporate Operator into all of its customer-facing ChatGPT services.

OpenAI said it plans to eventually roll out Operator's features to its ChatGPT Plus, Team, and Enterprise editions. CEO Altman added that Operator's features "will be rolled out in other countries soon. Unfortunately, it will take a while [in] Europe."

OpenAI also warns that Operator "is still learning and evolving and may make mistakes." For example, it currently struggles with complex interfaces, such as creating slideshows or managing calendars.

Some hail the arrival of a wave of expert agents, while others find it unappealing and would rather OpenAI concentrate on models

More than one media outlet had recently reported that OpenAI would be launching Operator. As Wall Street Journal mentioned earlier this week, media reports indicated that the upcoming Operator can automate tasks such as restaurant reservations and travel planning. Users can select different types of tasks, such as dining, shopping, and traveling, and watch the operation in progress on a small screen.

OpenAI's official announcement of Operator on Thursday drew mixed reactions on social media platform X. Some cheered it on outright, while others complained that it costs $200 a month to use.


Karim Beguir, CEO of InstaDeep, a startup that builds decision-making AI for enterprises, welcomed Operator. He commented that such AI can visit websites, take screenshots, and decide where to buy groceries or book a seat in a movie theater, all without special APIs. In his view, the era of AI agents has arrived, and hordes of expert agents will soon follow.


One user's comment, which received more than 1,000 likes, read: "Operator is not attractive at all, this stuff should be done by Apple's iOS, not OpenAI. OpenAI should focus on putting out powerful models instead of stealing food from the ecosystem."


This article is from WeChat "Hard AI"
