Meet OpenAI’s Operator, an AI agent that navigates the web for you




OpenAI is launching Operator, the first of its semi-autonomous AI agents, designed to “operate” a web browser like a human on the user’s behalf. The agent uses a cursor to point and click, types on its own, browses the web and performs actions on various websites, such as booking restaurant reservations through OpenTable and assembling orders on Instacart and DoorDash. Until now, OpenAI’s models have been confined to the ChatGPT interface or the company’s application programming interface (API).

“This product is the beginning of our move to agents,” said CEO and cofounder Sam Altman in a demo livestreamed on the company’s YouTube Channel today at 1 pm ET.

OpenAI president and fellow cofounder Greg Brockman wrote on X: “2025 is the year of agents.”

The preview, now available to US subscribers of OpenAI’s ChatGPT Pro plan ($200 per month), aims to demonstrate the potential of agentic AI while gathering critical feedback to refine its capabilities.

Operator does not take over your web browser, however. Instead, you visit a separate, new website – operator.chatgpt.com – and are presented with a prompt input box similar to ChatGPT’s.

Typing a request into this box – “find me tickets for tonight’s LA Lakers game” – prompts Operator to open a separate, virtual browser running in the cloud on OpenAI’s servers. From there, the agent can perform tasks such as filling out forms, managing online reservations, booking tickets to sporting events and concerts, and navigating other common workflows. The user watches the cursor move on its own in the cloud-based browser in real time. If the agent encounters a problem, it stops and messages the user via text output, similar to ChatGPT responses.

At the bottom of the virtual browser, the user can also see suggested actions that Operator can perform for them.

The user can take control at any time, however, much as with semi-autonomous driving systems in modern cars. Operator also asks the user to enter their own payment credentials when it reaches the purchase screen of a third-party website. Finally, users can save particular workflows they want to reuse and run them again later.

Operator is powered by what OpenAI calls computer-using agent (CUA) technology, a new variant of GPT-4o specifically trained to use computers.

Connecting AI and GUIs

Operator stands out from other automation tools by simulating human interaction with graphical user interfaces (GUIs).

Instead of relying on specialized APIs, the system takes screenshots as visual input and issues virtual mouse and keyboard actions to complete tasks.

The underlying CUA model combines the vision capabilities of GPT-4o with reinforcement learning, which enables the agent to understand, reason, and act on the screen.
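The screenshot-in, action-out cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not OpenAI’s implementation: `FakeBrowser`, `Action`, `run_agent` and `scripted_policy` are hypothetical names, and a hard-coded script stands in for the vision model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One GUI action chosen by the policy (hypothetical format)."""
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


class FakeBrowser:
    """Stand-in for a remote browser; it only records what it is asked to do."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def screenshot(self) -> bytes:
        # A real system would return pixels; here the frame encodes the step count.
        return f"frame-{len(self.log)}".encode()

    def click(self, x: int, y: int) -> None:
        self.log.append(f"click({x},{y})")

    def type_text(self, text: str) -> None:
        self.log.append(f"type({text!r})")


def run_agent(browser: FakeBrowser,
              policy: Callable[[bytes], Action],
              max_steps: int = 10) -> int:
    """Observe, decide, act, repeat. Returns the number of actions performed."""
    for step in range(max_steps):
        frame = browser.screenshot()      # visual input
        action = policy(frame)            # the model decides what to do next
        if action.kind == "done":
            return step
        if action.kind == "click":
            browser.click(action.x, action.y)
        elif action.kind == "type":
            browser.type_text(action.text)
        else:
            raise ValueError(f"unknown action kind: {action.kind}")
    return max_steps


def scripted_policy(frame: bytes) -> Action:
    """Hard-coded replacement for the vision model, for illustration only."""
    script = {
        b"frame-0": Action("click", x=120, y=40),
        b"frame-1": Action("type", text="Lakers tickets"),
    }
    return script.get(frame, Action("done"))
```

Calling `run_agent(FakeBrowser(), scripted_policy)` performs one click and one typing action before the policy signals that it is done. In a real agent, the policy would be a model call and the browser a remote session.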

This approach allows Operator to handle a variety of tasks, including ecommerce browsing, travel planning, and even repetitive chores such as creating playlists or managing shopping lists. Benchmark results illustrate its effectiveness:

87% success rate on WebVoyager, a test of live website navigation

58.1% success rate on WebArena, which simulates real-world ecommerce and content management scenarios

But there is already tough competition: just yesterday, Chinese tech firm ByteDance (the parent company of TikTok) launched its own AI agent for controlling web browsers and performing actions on a user’s behalf. Called UI-TARS, it is fully open source and boasts similarly impressive benchmark performance (though the two have not been directly compared on the same benchmarks). That means OpenAI’s Operator will need to be much better or more reliable to justify the relatively high cost ($200 per month) of accessing it through a ChatGPT Pro subscription.

Tested in business web navigation use cases

OpenAI has partnered with many businesses to ensure that Operator meets real-world needs. Companies including Instacart, DoorDash and Etsy are already testing the technology for use cases ranging from grocery delivery to personal shopping.

Brett Keller, CEO of Priceline, touted its use for travel planning, calling it “an important step in making travel more seamless and personal.”

For public sector applications, the City of Stockton is exploring ways to use Operator to simplify civic engagement. Jamil Niazi, the city’s director of information technology, emphasized AI’s potential to make it easier for residents to enroll in services.

However, there are limitations. Tech publication Every got an early preview, tested it last week, and found that:

“One of the unique features of Operator’s design is that it doesn’t use your browser. Instead, it uses a browser in one of OpenAI’s data centers that you can view and interact with remotely. The upside of this design decision is that you can use Operator anywhere and anytime – for example, on any mobile device.

“The downside is that many sites like Reddit block AI agents from browsing, so they cannot be accessed by Operator. In this research preview mode, Operator is also blocked by OpenAI from accessing certain resource-intensive sites such as Figma or competitor-owned sites such as YouTube for performance or legal reasons.”

Safety measures

Due to its ability to act for users, Operator has been developed with strong safety features:

User control: Operator asks for confirmation before sensitive actions, such as making purchases or sending emails.

Watch mode: Ensures user oversight of critical tasks, especially on sensitive sites such as email or financial platforms.

Misuse prevention: The system is trained to refuse malicious requests and includes protections against adversarial attacks, such as malicious prompts embedded in websites.
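The confirmation step in the first safeguard above amounts to a gate placed in front of certain actions. A minimal sketch follows, assuming a hypothetical set of sensitive action names drawn from the examples above; the function and its parameters are invented for illustration, and nothing here reflects how OpenAI actually implements the check.

```python
from typing import Callable

# Assumed categories, taken from the examples of sensitive actions above.
SENSITIVE_ACTIONS = {"purchase", "send_email"}


def execute(action: str,
            perform: Callable[[], str],
            confirm: Callable[[str], bool]) -> str:
    """Run `perform`, but ask the user first if the action is sensitive."""
    if action in SENSITIVE_ACTIONS and not confirm(action):
        return f"skipped {action}: user declined"
    return perform()
```

Here `confirm` would be whatever prompts the user in the interface. Non-sensitive actions such as scrolling bypass the prompt entirely, which keeps the agent from interrupting the user at every step.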

OpenAI also includes features to protect user privacy, including options to clear browsing data and opt out of data sharing for model improvement.

The business edition is coming

OpenAI envisions a broader role for Operator in both individual and business settings. Over time, the company plans to expand access to Plus, Team, and Enterprise users, eventually integrating Operator with ChatGPT.

There are also plans to make the underlying CUA technology available through an API, enabling developers to build their own custom computer-using agents.

Despite its potential, Operator remains a work in progress. OpenAI is clear about its limitations, such as difficulties with complex interfaces or unfamiliar workflows. Early user feedback plays an important role in improving the accuracy, reliability and safety of the system.

As OpenAI refines Operator through real-world use, it seeks to transform AI from a passive tool into an active participant in the digital ecosystem. Whether it’s simplifying daily tasks or revolutionizing business workflows, OpenAI is positioning Operator as the next step in making AI easy to use, practical and secure.


