OpenAI Operator can surf the web for you


OpenAI has begun previewing a new tool called Operator that can navigate within a web browser. According to a blog post published Thursdaythe software is run by what the company calls a Computer-Using Agent. “CUA is trained to interact with graphical user interfaces (GUIs) — the buttons, menus, and text fields that people see on a screen — as humans do,” OpenAI said of the model. . “It provides the flexibility to perform digital tasks without using OS- or web-specific APIs.”

The current release of Operator builds on OpenAI’s GPT-4o model. It combines the vision capabilities of that algorithm with “advanced reasoning” trained through reinforcement learning. The operator has the ability to “break down tasks into multi-step plans and adapt to self-correction when there are challenges.” According to OpenAI, that capability represents the next stage of AI development.

The operator can interact with various websites, including the Instacart ordering platform.  The operator can interact with various websites, including the Instacart ordering platform.

Instacart

As with previous research previews, OpenAI cautioned that Operator is “still early and has limitations,” and that it may not yet “reliably work in all scenarios.” For example, depending on the complexity of the task and interface involved, the agent would greatly benefit from the user spending a few extra moments writing a more detailed prompt. Per The VergeThe Operator gives control to the user when he is stuck on a task. It will also provide control whenever a website asks for sensitive information, including login credentials. The company says it designed the tool to “deny malicious requests and block unauthorized content.”

OpenAI made Operator available to users first for $200 per month ChatGPT Pro Subscription. It also cooperates with companies like Instacart to offer the agent on their platforms, although there you need a subscription to ChatGPT Pro to test the integration.

The operator joins a growing list of AI agents that can navigate a web browser or an entire operating system. Anthropic is the first to offer this release capability Claude 3.5 Sonnet October modelfollowed recently by Google with this Gemini 2.0 model and Project Mariner.

If you purchase something through a link in this article, we may earn a commission.



Source link

  • Related Posts

    Why is everyone in AI off the Dreeseek

    Join our daily and weekly newsletters for newest updates and exclusive content to cover the industry. Learn more As a few days ago, the most displeased nerds (I said it…

    Everyone announced on Xbox Developer Direct Showcase

    Xbox hosts it Directer Direct Show now, detail the development of three games we know and a perfect new title, Ninja Gaiden 4. If you can’t tune, here’s what you…

    Leave a Reply

    Your email address will not be published. Required fields are marked *