OpenAI Releases "Operator" AI Agent
Good morning. It’s Friday, January 24th.
Did you know: This week in 1996, the first version of the Java programming language was released. Java’s ability to “write once, run anywhere” made it ideal for Internet-based applications.
In today’s email:
OpenAI’s Operator
o4 Training Has Begun
ByteDance’s UI-TARS Agent
5 New AI Tools
Latest AI Research Papers
You read. We listen. Let us know what you think by replying to this email.
In partnership with HUBSPOT
Unlock the full potential of your workday with cutting-edge AI strategies and actionable insights, empowering you to achieve unparalleled excellence in the future of work. Download the free guide today!
Today’s trending AI news stories
OpenAI launches Operator—an agent that can use a computer for you
OpenAI has rolled out Operator, an AI agent that carries out tasks autonomously within a web browser. It is initially available to U.S. users on ChatGPT’s $200-per-month Pro plan, and OpenAI plans to expand it to other tiers. Operator automates tasks like booking travel, making reservations, and online shopping, using a dedicated browser interface that mimics human navigation.
Powered by the Computer-Using Agent (CUA) model, it interacts with websites as a human would, filling out forms and clicking buttons. While effective for routine tasks, it requires user supervision for sensitive actions such as banking or email. OpenAI says it is collaborating with companies like eBay and Uber so that Operator complies with their service agreements.
In a notable shift in its data retention policies, OpenAI revealed that it may retain deleted data from Operator for up to 90 days—longer than the 30-day retention period for ChatGPT. This policy aims to prevent abuse and improve fraud monitoring. While data may be accessed by authorized personnel for legal or security purposes, users retain control over their information.
Alongside the Operator release, Sam Altman announced on 𝕏 that ChatGPT's free tier will soon include o3-mini, with the Plus tier getting expanded usage. Read more.
OpenAI is already training o4 and expects "another big jump in capabilities"
OpenAI has begun training its next reasoning model, tentatively called "o4," Chief Product Officer Kevin Weil said in remarks at the World Economic Forum in Davos. The news follows the rapid development of its predecessor, o3, which took only about three months.
Weil said he expects a substantial jump in capabilities with o4, and even shorter iteration cycles for the models that follow. Read more.
ByteDance's UI-TARS can take over your computer, outperforms GPT-4o and Claude
ByteDance’s UI-TARS is an AI agent for PC and macOS that outperforms GPT-4o, Claude, and Gemini on GUI-centric tasks. Available in 7B and 72B parameter variants, it achieves state-of-the-art performance across 10+ benchmarks covering perception, contextual grounding, and sequential reasoning. Trained on 50 billion tokens, augmented by a screenshot-rich dataset, UI-TARS interprets multimodal inputs, autonomously completes complex workflows, and iteratively refines its outputs through error analysis.
AK (@_akhaliq), 4:57 AM, Jan 22, 2025: "ByteDance just dropped UI-TARS. Pioneering Automated GUI Interaction with Native Agents"
Equipped with memory systems, UI-TARS balances rapid, intuitive responses with deliberate, multi-step planning. It posts strong results on evaluations such as VisualWebBench, ScreenQA-short, and WebSRC, which test understanding of web and mobile GUIs. It also executes tasks in transparent, step-by-step fashion, so users can follow what the agent is doing. Read more.
Google releases free Gemini 2.0 Flash Thinking model, pressuring OpenAI's premium strategy
Musk Challenges Stargate Funding, Altman and Nadella Respond
OpenAI, SoftBank each commit $19 bln to Stargate AI data center, the Information reports
OpenAI Partners with Bertelsmann to Supercharge Global Brands
Anthropic CEO expects major AI breakthrough, plans to launch "virtual collaborators"
Grok 3 writes python script of a ball bouncing inside a tesseract
DeepMind’s new inference-time scaling technique improves planning accuracy in LLMs
Hugging Face open-sources world’s smallest vision language model
"Nerd sniping" and "think less" attacks emerge as AI models get more time to reason
Freepik Launches Google's Imagen 3 Integration with Free Trial Offer
KREA AI Launches Real-Time Custom AI Model Training for Personalized Styles
Google Labs shares a short film, KITSUNE. All visuals were generated with Veo 2 using VideoFX
China unveils Mach-4 commercial drone prototype, eyes supersonic passenger jet future
Today’s CEOs are the last to manage all-human workforces, says Marc Benioff
5 new AI-powered tools from around the web
Hire Ava, the AI SDR & Get Meetings on Autopilot
Ava automates your entire outbound demand generation process, including:
Intent-Driven Lead Discovery
High Quality Emails with Waterfall Personalization
Follow-Up Management
Free up your sales team to focus on high-value interactions and closing deals, while Ava handles the time-consuming tasks.
arXiv is a free online library where researchers share pre-publication papers.
Thank you for reading today’s edition.
Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.
Interested in reaching smart readers like you? To become an AI Breakfast sponsor, reply to this email or DM us on 𝕏!