Today we’re releasing Operator(opens in a new window), an agent that can go to the web to perform tasks for you. Using its own browser, it can look at a webpage and interact with it by typing, clicking, and scrolling.
To ensure a safe and iterative rollout, we are starting small. Starting today, Operator is available to Pro users in the U.S. at operator.chatgpt.com
Operator is powered by a new model called Computer-Using Agent (CUA). Combining GPT-4o's vision capabilities with advanced reasoning through reinforcement learning, CUA is trained to interact with graphical user interfaces (GUIs)—the buttons, menus, and text fields people see on a screen.
Operator can “see” (through screenshots) and “interact” (using all the actions a mouse and keyboard allow) with a browser, enabling it to take action on the web without requiring custom API integrations.
ChatGPTMastermind