Hello GPT-Realtime
Good morning. It’s Friday, August 29th.
On this day in tech history: In 1993, the Mbone (“multicast backbone”) quietly proved the Internet could move more than text. It streamed David Blair’s full-length film, Wax or the Discovery of Television Among the Bees, to thousands using IP multicast, tunneling packets through routers like a guerrilla overlay network. The demo was fragile, but its overlay design showed that large-scale media wasn’t a server problem; it was a distribution problem.
In today’s email:
GPT-Realtime
Google’s Multimedia AI
Musk’s Big Week
5 New AI Tools
Latest AI Research Papers
You read. We listen. Let us know what you think by replying to this email.
In partnership with Rube
Your AI now works with 500+ apps to actually get things done
Rube connects your AI to 500+ apps so you can:
Prep for meetings in seconds → “Hey Rube, what’s on my calendar with Acme Corp?”
Update projects automatically → “Add these notes to Notion under Q3 Roadmap.”
Cut busywork instantly → “Upload the new ☕️ emoji to Slack.”
Stay in flow → all without switching tabs, copying links, or chasing logins.
Why teams love it:
One login for every tool.
Share access across your team instantly.
Works inside VSCode, Claude, Cursor, and more
Thank you for supporting our sponsors!

Today’s trending AI news stories
OpenAI rolls out gpt-realtime and GPT-5 Codex but joint safety tests expose cracks
The new gpt-realtime model is now production-ready through the Realtime API. Unlike older systems, it processes speech directly without text conversion, cutting latency and capturing nuance like laughter, sighs, or accent shifts. It can even switch languages mid-sentence.
Benchmarks show major jumps: 82.8% on Big Bench Audio (vs. 65.6% prior) and 30.5% on MultiChallenge (vs. 20.6%). Developers also get SIP support for phone systems, MCP for secure tool access, image input for contextual analysis, and two new expressive voices (Cedar, Marin). Pricing dropped 20% to $32 per million audio input tokens.
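For a quick sanity check on that pricing line, here is a minimal Python sketch that backs out the price implied before a 20% cut and estimates a bill. The function names and the 250,000-token workload are hypothetical, chosen only for illustration; the $32 figure and the 20% drop are the ones quoted above.

```python
def price_before_cut(new_price: float, cut_fraction: float) -> float:
    """Return the price that, after a fractional cut, yields new_price."""
    if not 0 <= cut_fraction < 1:
        raise ValueError("cut_fraction must be in [0, 1)")
    return new_price / (1 - cut_fraction)


def input_cost(tokens: int, price_per_million: float) -> float:
    """Return the cost in dollars for a number of input tokens."""
    return tokens / 1_000_000 * price_per_million


NEW_PRICE = 32.0  # dollars per million input tokens, as quoted above
CUT = 0.20        # the reported 20% price drop

old_price = price_before_cut(NEW_PRICE, CUT)
print(f"Implied previous price: ${old_price:.2f} per million input tokens")

# Hypothetical workload (an assumption for illustration, not from the announcement)
tokens = 250_000
print(f"Cost at the new price: ${input_cost(tokens, NEW_PRICE):.2f}")
print(f"Cost at the implied old price: ${input_cost(tokens, old_price):.2f}")
```

The takeaway: a 20% cut that lands on $32 implies a starting point of about $40 per million input tokens, since 32 / 0.8 = 40.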
On the dev side, ChatGPT Codex now runs on GPT-5, integrated directly with GitHub for pull requests, branch management, and code reviews. A new IDE extension, upgraded CLI, one-shot task execution, and customizable “agents.md” files streamline automation.
We’re releasing new Codex features to make it a more effective coding collaborator:
- A new IDE extension
- Easily move tasks between the cloud and your local environment
- Code reviews in GitHub
- Revamped Codex CLI
Powered by GPT-5 and available through your ChatGPT plan.
— OpenAI Developers (@OpenAIDevs)
9:01 PM • Aug 27, 2025
But joint OpenAI–Anthropic tests using the SHADE-Arena sabotage framework flagged risks: GPT-4.1 and GPT-4o still leaked detailed misuse instructions under pressure, while Claude models leaned toward refusals but showed sabotage quirks. Both families exhibited sycophancy, validating unsafe user decisions. With GPT-5 evaluations ahead, audits and refusal testing remain non-optional. Read more.
From AI avatars to private blockchain, Google layers multimedia and fintech tools
Google Vids now supports AI avatars that generate videos from scripts with selectable voices and personas, automatic transcript trimming to remove filler words, and eight-second image-to-video clips powered by Veo 3. Complementing this, Flow offers five free Veo 3 Fast AI videos, or a single standard video, per month, with credits and per-second API pricing ($0.040 Fast, $0.0075 standard) for scalable production.
On the learning front, NotebookLM is testing Drive search and AI Flashcards. “Discover Sources” surfaces internal Docs and Slides alongside web content, while AI Flashcards generate study aids from documents, extracting key facts and questions. Public sharing of audio artifacts and upcoming video overviews deepen collaboration and multimedia integration.
BREAKING 🚨: Google is working on "Flashcards" for NotebookLM and a possibility to discover sources across your Google Drive!
"Generate AI flashcards based on your sources"
— TestingCatalog News 🗞 (@testingcatalog)
10:03 PM • Jun 30, 2025
Meanwhile, Google Cloud is quietly piloting its own blockchain, Google Cloud Universal Ledger (GCUL), a private, permissioned Layer 1 network for Python-based smart contracts, payment automation, and digital asset management. Initially tested with CME Group, GCUL offers a “credibly neutral” infrastructure for banks and fintechs, emphasizing compliance, API accessibility, and regulated adoption, though its private design has sparked debate over decentralization. Read more.
Musk’s week: a coding agent, a vision-trained robot, and a reusable Starship
Elon Musk’s empire just pulled off a triple tech swing that feels like a pressure test for scale. At xAI, the new grok-code-fast-1 model dropped: a lean, agentic coder built to churn through routine dev tasks quickly and cheaply. Unlike bloated LLMs, it’s tuned for compact execution and is already plugged into GitHub Copilot and Windsurf. It’s free to try (for now); the real question is whether it can evolve from quick snippets into full-stack automation that seriously rivals Microsoft’s Copilot and OpenAI’s Codex.
Tesla’s Optimus team, meanwhile, just ditched motion-capture suits for helmet-and-backpack rigs with five synced cameras, training the humanoids via vision-only feeds. It’s the Autopilot playbook: flood models with scaled human video instead of costly teleoperation. Gains: data density, speed, and lower cost. Risk: zero tactile feedback, which makes dexterity harder to learn. Leadership shuffled too, with AI chief Ashok Elluswamy folding Optimus under Tesla’s camera-first AI stack.
And then there’s SpaceX. Starship S37 pulled a precision splashdown after a 66-minute hop, validating reinforced heat tiles, engine-out recovery, and a satellite bay. Both stages survived, underscoring that fully reusable rocketry is shifting from theory to engineering cadence. Read more.

Anthropic’s settlement with authors may be the ‘first domino to fall’ in AI copyright battles
Meta races to patch Llama AI while Hypernova glasses stay in test mode
Anthropic users face a new choice – opt out or share your data for AI training
Google and Grok are catching up to ChatGPT, says a16z's latest AI report
Microsoft drops first in-house AI models, built for speed not show
Nous Research drops Hermes 4 AI models that outperform ChatGPT without content restrictions
MathGPT.ai, the ‘cheat-proof’ tutor and teaching assistant, expands to over 50 institutions
Scientists develop interface that ‘reads’ thoughts from speech-impaired patients
Watch: World's first flying car built by US firm to start operations at Silicon Valley airports
Over half of professionals think AI trainings feel like a second job, LinkedIn survey finds
PixVerse debuts AI video model V5 with free access week, surpasses 100M global users
Watch: China open-sources HunyuanVideo-Foley for high-fidelity TV2A audio
Chip giant Nvidia beats revenue expectations, defying fears of AI 'bubble'
US fighter pilots try taking directions from AI for the first time
Packaging with functional inks and AI recognition shows product condition in real time

5 new AI-powered tools from around the web

arXiv is a free online library where researchers share pre-publication papers.
📄 MIDAS: Multimodal Interactive Digital-human Synthesis via Real-time Autoregressive Video Generation


Thank you for reading today’s edition.

Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.
Interested in reaching smart readers like you? To become an AI Breakfast sponsor, reply to this email or DM us on 𝕏!