OpenAI's o3 Worse Than o3-preview?

Good morning. It’s Monday, April 28th.

On this day in tech history: 2003: Apple launched the iTunes Music Store, revolutionizing digital music distribution. It sold 1 million songs in its first week and reached 10 billion downloads by 2010.

In today’s email:

  • How did OpenAI’s o3-preview beat o3?

  • Google’s glimpse into the AI-driven future

  • RUKA robotic hand

  • New AI Tools

  • Latest AI Research Papers

You read. We listen. Let us know what you think by replying to this email.

In partnership with TAVUS

Introducing Hummingbird-0 — the lipsync that just works

Upload any MP3 + MP4, get up to 5 minutes of photorealistic, zero-shot lipsync.
No cloning. No training. Beats every model we’ve tested and costs less than SyncLabs.

What you get:

  • Hollywood-grade realism & accuracy

  • Perfect for use with Veo/Sora/Kling + ElevenLabs

  • Already powering creator & enterprise pipelines

Today’s trending AI news stories

Was OpenAI’s o3-preview more powerful than the full o3 model?

OpenAI’s o3 model outperforms the o1 model from fall 2024 by 20% on the ARC-AGI-1 benchmark, though it still lags behind the o3-preview results from December 2024. The chart illustrates the price-to-performance ratio. | Image: ARC Prize Foundation

OpenAI’s latest o3 model has fallen short of expectations in recent evaluations, particularly on reasoning tests. The ARC Prize Foundation found that o3 scored 41% at low compute and 53% at medium compute on ARC-AGI-1, well behind the 76% and 88% posted by the December 2024 preview version. The drop is linked to changes to the model itself: the released o3 is a smaller, multimodal version optimized for chat and product use rather than advanced reasoning.

Despite outperforming earlier models like o1, o3 still struggles on harder benchmarks, scoring under 3% on ARC-AGI-2. The o4-mini model, in contrast, delivers solid performance at a fraction of the price. The results underscore a persistent gap between AI and human reasoning, and show that more computational effort does not always translate into better results.

Meanwhile, CEO Sam Altman addressed growing criticism of GPT-4o's overly agreeable responses, calling the model "sycophant-y and annoying." Altman acknowledged the feedback and confirmed that OpenAI is working on immediate updates, with further adjustments expected over the coming week. The changes are also expected to give users more control over the model's conversational style. Read more.

Google Reveals 601 Real-World Generative AI Use Cases

Image: Google

Google has expanded its generative AI offerings, with 601 real-world use cases now showcased on Google Cloud, up from 101 last year. Major companies like Uber, Citi, and Mercedes-Benz are using Vertex AI and Gemini models for tasks ranging from customer service to healthcare diagnostics.

In addition, Google Photos has introduced a shortcut to bypass the slow “Ask Photos” feature. Users can now double-tap the search icon on Android devices to quickly switch to the classic search mode. This improvement addresses user concerns over the speed of “Ask Photos,” which uses Gemini AI for deeper natural language queries.

Alphabet’s Q1 2025 results exceeded expectations, with $90.23 billion in revenue. CEO Sundar Pichai highlighted AI-driven features like “AI Overviews,” used by 1.5 billion people monthly. Google is also experimenting with multimodal search and enhancing visual tools. Its focus on cost efficiency, anchored by custom Tensor Processing Units (TPUs), gives it a more affordable compute stack than Nvidia GPU-based rivals, and the company continues to prioritize flexibility and interoperability over the tighter integration of OpenAI’s partnership with Microsoft. Read more.

RUKA robotic hand offers 15 degrees of freedom and open-source design

NYU researchers have introduced RUKA, a tendon-driven, 3D-printable robotic hand that is open-source and costs under $1,300. It offers 15 degrees of freedom and can perform 31 of the 33 grasps in a standard grasp taxonomy. First-time builders can assemble the hand in under 7 hours, and it achieves precise control through a data-driven approach, using a MANUS glove to learn the mapping from fingertip positions to motor commands.

RUKA runs its controllers at 40 Hz and supports teleoperation at 25 Hz from devices like motion-capture gloves or VR headsets. A calibration script keeps behavior consistent across builds, and the project ships detailed assembly instructions to help the community replicate it. Read more.
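For a sense of how such a pipeline might fit together, here is a minimal Python sketch of a fixed-rate hand-control loop in the spirit of RUKA’s setup: glove readings arrive at 25 Hz, a learned fingertip-to-motor mapping runs at 40 Hz, and commands go out to the tendon motors. The function names, motor count, and the linear stand-in for the learned model are all hypothetical illustrations, not the project’s actual code.

```python
import time
import numpy as np

CONTROL_HZ = 40    # reported control rate
TELEOP_HZ = 25     # reported teleoperation rate
N_FINGERTIPS = 5   # one (x, y, z) position per fingertip
N_MOTORS = 11      # hypothetical motor count; the hand exposes 15 DoF

def read_glove_fingertips() -> np.ndarray:
    """Stand-in for a MANUS glove read: 5 fingertip (x, y, z) positions."""
    return np.zeros((N_FINGERTIPS, 3))

def send_motor_commands(commands: np.ndarray) -> None:
    """Stand-in for the write to the hand's tendon motors."""
    pass

def fingertips_to_motors(tips: np.ndarray, W: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Hypothetical learned mapping (a single linear layer here) from
    fingertip positions to motor position commands."""
    return tips.reshape(-1) @ W + b

def run(duration_s: float = 2.0) -> None:
    rng = np.random.default_rng(0)
    W = 0.01 * rng.normal(size=(N_FINGERTIPS * 3, N_MOTORS))  # placeholder weights
    b = np.zeros(N_MOTORS)

    tips = read_glove_fingertips()
    now = time.monotonic()
    next_ctrl, next_teleop, end = now, now, now + duration_s

    while time.monotonic() < end:
        now = time.monotonic()
        if now >= next_teleop:                 # refresh the glove target at 25 Hz
            tips = read_glove_fingertips()
            next_teleop += 1.0 / TELEOP_HZ
        if now >= next_ctrl:                   # command the motors at 40 Hz
            send_motor_commands(fingertips_to_motors(tips, W, b))
            next_ctrl += 1.0 / CONTROL_HZ
        time.sleep(0.001)                      # coarse yield between deadline checks

if __name__ == "__main__":
    run()
```

The two timers decouple the slower teleoperation stream from the faster control loop, so the hand keeps tracking the most recent glove reading between updates.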

New AI-powered tools from around the web

Latest AI Research Papers

arXiv is a free online library where researchers share pre-publication papers.

Thank you for reading today’s edition.

Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.

Interested in reaching smart readers like you? To become an AI Breakfast sponsor, reply to this email or DM us on 𝕏!