OpenAI's New Voice Mode

Good morning. It’s Friday, March 21st.

Today in tech history: On this day in 2006, Jack Dorsey sent the first tweet, marking the launch of Twitter.

In today’s email:

  • OpenAI’s New Voice Mode

  • Oracle’s No-Code Agent Tool

  • Pika Labs’ AI Video Editing

  • 4 New AI Tools

  • Latest AI Research Papers

You read. We listen. Let us know what you think by replying to this email.

Unlock the full potential of your workday with cutting-edge AI strategies and actionable insights, empowering you to achieve unparalleled excellence in the future of work. Download the free guide today!

Today’s trending AI news stories

OpenAI’s New AI Voice Model Turns Any Text App into a Voice-Powered AI in Seconds

OpenAI has launched three voice AI models (gpt-4o-transcribe, gpt-4o-mini-transcribe, and gpt-4o-mini-tts) built for high-fidelity transcription and customizable speech synthesis. Integrated into OpenAI’s API and accessible via OpenAI.fm, the transcription models reach a 2.46% English word error rate and add enhanced noise cancellation and semantic voice activity detection. They do not support speaker diarization, but they offer higher accuracy than Whisper across 100+ languages.
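For readers unfamiliar with the metric, word error rate (WER) is the number of word-level substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. A 2.46% WER means roughly 2.5 errors per 100 reference words. Below is a minimal Python sketch of the standard calculation; the function name and the lowercase, whitespace-split normalization are our own assumptions, not OpenAI's evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: normalization here is just lowercasing and
    splitting on whitespace, which is an assumption of this sketch.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word in a six-word reference.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Published WER figures depend heavily on the test set and on how text is normalized (punctuation, casing, numerals), so numbers from different vendors are not always directly comparable.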

With pricing set at $6 per million audio input tokens, OpenAI enters a competitive landscape dominated by ElevenLabs’ Scribe and Hume AI’s Octave TTS. Developers can embed voice functionality with minimal code via OpenAI’s Agents SDK.

Some critics argue OpenAI is deprioritizing real-time conversational AI, while others say this trajectory suggests a bigger play, one that extends beyond transcription into full-spectrum multimodal intelligence. Read more.

Oracle Lets Customers Build AI Agents with No-Code Studio

Oracle just put enterprise AI on autopilot with AI Agent Studio, a no-cost tool that lets users craft and refine AI agents inside its Fusion Cloud Application Suite. Featuring drag-and-drop customization, API-level access, and a library of prebuilt templates, the platform keeps automation tight with existing business logic. Users can tweak over 50 preconfigured agents, wire in third-party APIs, and swap between Llama, Cohere, OpenAI’s GPT, or other LLMs—all without starting from scratch.

The platform supports multi-agent orchestration, allowing agents to collaborate on tasks with checkpoints and approvals. While Fusion security policies extend to new agents, connecting to third-party APIs may require additional coding. Oracle leverages REST APIs for external integration, enhancing automation without disrupting existing systems. Read more.

Pika previews precision video editing—move objects without disrupting scenes

Pika has released a behind-the-scenes preview of its latest AI-powered video editing tool, allowing users to manipulate characters and objects within a scene while keeping the rest of the footage untouched. This precision-editing capability opens new creative possibilities, offering greater control without the usual artifacts or distortions.

The feature is currently available to Pika Creative Partners through exclusive early access, hinting at broader rollout plans. Read more.

4 new AI-powered tools from around the web

Latest AI Research Papers

arXiv is a free online library where researchers share pre-publication papers.

Thank you for reading today’s edition.

Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.

Interested in reaching smart readers like you? To become an AI Breakfast sponsor, reply to this email or DM us on 𝕏!