Good morning. It’s Wednesday, October 1st.

On this day in tech history: In 2003, the DARPA-funded CALO (Cognitive Assistant that Learns and Organizes) project held its first cross-institution integration demo. CALO’s legacy code directly seeded Apple’s Siri years later. It was one of the first serious attempts to unify NLP, task planning, and context models in a single assistant.

In today’s email:

Sora 2 - The Best AI Video Generator Yet
DeepMind’s “Chain-of-Frames” Theory
Claude Sonnet 4.5’s 30hr Reasoning
5 New AI Tools
Latest AI Research Papers

You read. We listen. Let us know what you think by replying to this email.

8 Weeks. Actionable AI Skills. MBA-Style Networking.

The 8-week AI for Business & Finance Certificate Program helps you:

Build AI confidence with role-specific use cases
Learn how leaders are implementing AI strategies at top financial firms
Secure a lasting network that supports your career growth

Earn your certificate from Columbia Business School Executive Education—program starts November 10.

Enroll by Oct. 13 to get $200 off tuition + use code AIBREAKFAST for an additional $300 off.

Today’s trending AI news stories

OpenAI goes full platform with Sora 2 drop, copyright opt-out, and in-chat payments

Sora 2 is a major technical upgrade, with synced audio, better scene continuity, physics and realism that beat Google’s Veo 3, and support for everything from anime to cinematic shots. It can follow complex, multi-shot prompts without breaking continuity, keeping track of objects, momentum, and scene logic. A defining feature is “cameos,” where verified users can insert their likeness and voice into generated scenes, with strict opt-in consent, revocation rights, and cryptographic watermarking.

The companion Sora app mirrors TikTok with a vertical feed, remix tools, and social features. But unlike competitors, OpenAI has warned studios that future Sora versions may incorporate copyrighted material unless rightsholders explicitly opt out, a sharp inversion of the traditional consent model and a likely legal flashpoint. The app is invite-only in the U.S. and Canada, with Android and API support coming soon.

ChatGPT also added a shopping layer. “Instant Checkout” lets U.S. users buy Etsy and soon Shopify products directly in chat. Purchases run on the new Agentic Commerce Protocol (ACP), co-developed with Stripe and now being open-sourced. ACP uses encrypted, per-merchant payment tokens and step-by-step confirmations. Merchants can plug in via Stripe or token APIs, and OpenAI takes a transaction fee without touching pricing or rankings. Multi-item carts and international rollout are next.

Image: OpenAI

With Sora’s TikTok-style AI feed, default copyright opt-outs, and agent-driven payments, OpenAI is facing what critics call its “infinite slop” moment. They argue the company is collapsing search, ranking, and payments into a single black box controlled by one AI vendor, raising antitrust concerns, privacy alarms over behavioral data, and claims it’s drifting away from its safety mission. Read more.

DeepMind says video models are the next LLMs, powered by zero-shot “chain-of-frames”

A new Google DeepMind paper positions its Veo 3 video model as the visual counterpart to large language models. Trained with a continuation objective on web-scale data, Veo 3 performs zero-shot across more than 60 visual tasks, covering segmentation, detection, denoising, and super-resolution, without task-specific tuning.

The model also shows early signs of what researchers call chain-of-frames reasoning, using temporal cues across generated frames to solve visual logic tasks like mazes and symmetry. DeepMind expects inference costs to decline similarly to LLMs, suggesting generative video could replace specialized vision models over time.

Image: DeepMind

On the consumer front, Google is upgrading AI Mode in Image Search with conversational querying. Rather than filtering by static attributes, users can describe what they want in natural language, combine prompts with reference photos, and iteratively refine results, whether shopping for clothing or browsing interior design ideas. Powered by Gemini 2.5 and built on Google Lens, the system parses subtle visual context, secondary objects, and stylistic cues while surfacing shoppable links directly in Search.

Google Drive for desktop, meanwhile, is getting AI-based ransomware detection. A model trained on millions of real-world samples analyzes file behavior and halts syncing if it detects mass encryption or malicious modification. The system then alerts users via email and desktop notification and offers rollback to clean versions. The tool is entering open beta now, with general availability expected by year’s end. Read more.

Claude Sonnet 4.5 ships with 30-hour focus, Agent SDK, and no-scaffold coding preview

Anthropic just dropped Claude Sonnet 4.5, and it’s clearly gunning for GPT-5, not with flashy demos but with stamina and tooling. The model can stay locked onto a coding job for more than 30 hours without derailing, which is a big jump from even its own Opus 4.1. Benchmarks back it up: 77.2% on SWE-bench Verified, 50% on Terminal-bench, and a jump from 42.2% to 61.4% on OSWorld in just four months. That shows it isn’t just spitting out code; it can actually operate inside real computing environments.

Pricing hasn’t budged ($3 per million input tokens, $15 per million output tokens), even though GPT-5 is up to seven times cheaper. Anthropic’s betting that enterprises will pay for reliability, especially since it still holds 42% of the code-gen market and a $5B run rate, though most of that leans on Cursor and GitHub Copilot.

Anthropic also added checkpoints for pausing long tasks, a VS Code extension, better terminal controls, context editing, and a memory tool to reduce context blowouts. Security is tightened under ASL-3, with better guards against prompt injection and sensitive misuse. To test what’s next, Anthropic also launched a five-day preview called “Imagine with Claude” for Max users. It strips out prewritten functions entirely: Claude has to generate software logic from zero, in real time.

The company claims Sonnet 4.5 is also scoring higher in math, finance, cybersecurity, and domain reasoning. With GPT-5 pushing on price and scale, Anthropic is pitching endurance, autonomy, and developer fit as its differentiators. Read more.

5 new AI-powered tools from around the web

Integrity

Stop jumping between Notion, Miro, and ChatGPT. Integrity unifies structure, visual thinking, and AI so you can turn ideas into results faster.

integrity.sh

CursorClip

Record professional demos in minutes with CursorClip's smart auto-zoom. Native macOS app, lifetime license $47. No subscriptions, watermark-free exports.

cursorclip.com

WorkBeaver

WorkBeaver handles your repetitive tasks - so you don't have to. Show your task from start to finish just once, and let AI work for you.

workbeaver.com/en

Nexa SDK

Nexa SDK lets developers run LLMs, multimodal, ASR & TTS models across PC, mobile, automotive, and IoT. Fast, private, and production-ready on NPU, GPU, and CPU.

sdk.nexa.ai

Thesys

Frontend infrastructure for AI products. Build dynamic, real-time UIs with C1 Generative UI API.

www.thesys.dev

arXiv is a free online library where researchers share pre-publication papers.

📄 LongLive: Real-time Interactive Long Video Generation

📄 Quantile Advantage Estimation for Entropy-Safe Reasoning

📄 PromptCoT 2.0: Scaling Prompt Synthesis for Large Language Model Reasoning

📄 SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention

📄 Multiplayer Nash Preference Optimization

Thank you for reading today’s edition.

Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.

Interested in reaching smart readers like you? To become an AI Breakfast sponsor, reply to this email or DM us on 𝕏!
