OpenAI: Hallucinations Will Never Go Away
Good morning. It’s Monday, September 8th.
On this day in tech history: In 2012, the Google Brain team led by Andrew Ng and Jeff Dean showed that a large-scale neural network could learn to recognize cats from unlabeled YouTube frames. It was one of the earliest striking demonstrations of unsupervised deep learning at scale: a 1-billion-connection sparse autoencoder trained on 16,000 CPU cores via DistBelief, Google's distributed framework that later evolved into TensorFlow.
In today’s email:
OpenAI Burning $115B
Veo 3 Gets 60% Cheaper
GPT-4V Reads Brain Scans
5 New AI Tools
Latest AI Research Papers
You read. We listen. Let us know what you think by replying to this email.
AI agents with too much access can leak data, perform unintended actions, or expose your platform to risk.
WorkOS helps engineering teams secure agent workflows by implementing:
Scoped access to limit what agents can touch
Least-privilege enforcement to prevent overreach
Auditability to observe and debug runtime behavior
Secure handling of secrets and credentials
Detection of risky or unusual activity
Built for fast-moving AI teams, WorkOS gives you the building blocks to ship secure, enterprise-ready agent workflows without building auth from scratch.
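To make "scoped access" and "least privilege" concrete, here is a minimal Python sketch of the general pattern rather than WorkOS's actual API (all names are hypothetical): each agent gets a deny-by-default allowlist of tools, and every call attempt is logged for auditing.

```python
# Illustrative sketch of scoped, auditable tool access for an agent.
# Generic pattern only, not WorkOS's API; all names are hypothetical.
from typing import Any, Callable

class ScopedToolbox:
    def __init__(self, tools: dict[str, Callable[..., Any]], allowed: set[str]):
        self._tools = tools
        self._allowed = allowed          # least privilege: deny by default
        self.audit_log: list[str] = []   # auditability: record every attempt

    def call(self, name: str, *args: Any, **kwargs: Any) -> Any:
        if name not in self._allowed:
            self.audit_log.append(f"DENIED {name}")
            raise PermissionError(f"agent lacks scope for tool '{name}'")
        self.audit_log.append(f"CALLED {name}")
        return self._tools[name](*args, **kwargs)

# An agent scoped to read-only search cannot touch the email tool.
toolbox = ScopedToolbox(
    tools={
        "search": lambda q: f"results for {q}",
        "send_email": lambda to, body: "sent",
    },
    allowed={"search"},
)
print(toolbox.call("search", "quarterly numbers"))   # allowed, and logged
# toolbox.call("send_email", "a@b.com", "hi")        # raises PermissionError
```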

Today’s trending AI news stories
OpenAI folds personality into training, concedes hallucinations, burns $115B
OpenAI has folded its Model Behavior team, which shaped ChatGPT’s tone and reduced sycophancy, into its Post Training division. This puts alignment and “personality” inside the core optimization loop. Former lead Joanne Jang now heads OAI Labs, tasked with designing new human–AI collaboration models beyond chat. The shift comes amid backlash over GPT-5’s colder tone, lawsuits over GPT-4o’s mishandling of sensitive conversations, and CEO Sam Altman’s provocative claim that the “dead internet theory” may hold weight, as LLM-driven accounts increasingly flood platforms.

The GPT-5 mini-thinking model is designed to admit uncertainty much more often than the o4-mini model. | Image: OpenAI
The company also concedes hallucinations will never disappear, since language models predict plausible word sequences rather than track truth. To mitigate, OpenAI is layering in retrieval pipelines, fact-checking modules, and reinforcement strategies that teach systems to admit uncertainty instead of guessing. The deeper challenge, however, is systemic: current benchmarks reward confident fabrications over "I don't know." OpenAI is pushing for new metrics that score restraint, with early models already declining unsolved problems rather than fabricating answers.
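To see the benchmark problem concretely, here is a toy scoring sketch; the 30% accuracy figure and the penalty weight are illustrative assumptions, not OpenAI's actual metric. Under plain accuracy, an honest "I don't know" scores the same as a wrong guess, so guessing always dominates; penalizing confident errors flips that incentive.

```python
# Toy comparison of accuracy-only grading vs. a restraint-aware metric.
# The penalty weight and accuracy figures below are illustrative assumptions.

def restraint_score(correct: bool, abstained: bool, wrong_penalty: float = 1.0) -> float:
    if abstained:
        return 0.0                                 # abstaining is neutral
    return 1.0 if correct else -wrong_penalty      # confident fabrication is penalized

p = 0.3  # assume the model can genuinely solve 30% of problems
# Guesser answers everything; abstainer says "I don't know" on the other 70%.
guesser   = p * restraint_score(True, False) + (1 - p) * restraint_score(False, False)
abstainer = p * restraint_score(True, False) + (1 - p) * restraint_score(False, True)
print(f"guesser: {guesser:.2f}, abstainer: {abstainer:.2f}")
# guesser: -0.40, abstainer: 0.30 -- under plain accuracy both would score 0.30,
# so only the restraint-aware metric rewards declining unsolved problems.
```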

These efforts coincide with a massive infrastructure and revenue gamble. OpenAI has revised its cash burn forecasts upward by $80 billion, projecting $115 billion in outflows by 2029 as compute, training, and talent costs surge. Nearly $100 billion is earmarked for custom chips and data centers to reduce dependence on outside cloud providers. Despite the spend, revenue is expected to reach $200 billion by 2030, with ChatGPT alone generating $90 billion and free users monetized through ads and commissions. Investors remain optimistic, lifting the firm’s valuation from $300 billion to $500 billion.
OpenAI is also extending into creative industries by backing Critterz, an AI-assisted animated feature, which will debut at Cannes in 2026. With a budget of under $30 million, well below typical animated feature costs, the film will combine human voice acting and artist sketches with AI-powered enhancements from OpenAI’s generative image models. Read more.
Veo 3 gets 60% cheaper and Gemini limits finally revealed
Veo 3 video generation models just got up to 60% cheaper: standard Veo 3 now runs $0.40/sec with audio ($0.20 without), while Veo 3 Fast drops to $0.15/$0.10. That slashes a 5-minute clip with audio from $225 to $120. Both versions output 720p/1080p at 24 fps with optional synced audio, plus a new image-to-video feature that blends stills with prompts for consistent motion.
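For the arithmetic behind those figures, a quick sketch; the $0.75/sec rate is the prior with-audio price implied by the old $225 total for five minutes.

```python
# Back-of-envelope check on the new per-second Veo 3 pricing quoted above.
RATES = {                       # USD per second of generated video
    "veo3_audio": 0.40,
    "veo3_silent": 0.20,
    "veo3_fast_audio": 0.15,
    "veo3_fast_silent": 0.10,
}

def clip_cost(seconds: int, tier: str) -> float:
    return seconds * RATES[tier]

five_min = 5 * 60  # 300 seconds
print(f"${clip_cost(five_min, 'veo3_audio'):.2f}")   # $120.00
print(f"old price: ${five_min * 0.75:.2f}")          # $225.00 at the prior $0.75/sec
```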

In Search, Google’s “AI Mode” is set to become the default, turning queries into AI Overviews with follow-up chat. Already live in 180 countries, it now adds transactional agents that can handle bookings and purchases inside the interface. Google’s lawyers argue in antitrust court that “the open web is in rapid decline,” even as AI Mode keeps users locked inside Google’s own ecosystem.

On the Gemini front, Google finally got specific about usage caps: free accounts get 5 prompts, 5 Deep Research reports, and 100 images a day; Pro gets 100 prompts and 1,000 images; Ultra jumps to 500 prompts with the same image cap. Finally, Google is embedding Gemini directly into Google Sheets with a lightweight function designed for non-technical users. Type =AI in any cell followed by a natural-language instruction, something like =AI("Categorize this feedback as positive or negative", A2), and it will categorize, summarize, or generate across ranges with no coding and no add-ons. Read more.
GPT-4V mirrors human social perception, down to the brain scans
AI is starting to look uncomfortably human. A new Imaging Neuroscience study tested GPT-4V on social perception, the subtle stuff we read off faces and gestures, like trustworthiness or dominance. The model annotated 138 traits across 468 images and 234 video clips, then was stacked up against 2,254 human raters. The kicker: GPT-4V's scores correlated at r = 0.79 with the human consensus and were often more consistent than individual raters. When researchers ran those annotations against fMRI scans from 97 subjects, the model's labels mapped onto the same brain regions humans use for decoding social cues. That means GPT-4V isn't just labeling emotions; it's modeling how our brains process them, which could blow open how psychology and neuroscience scale experiments.

Brain maps show GPT-4V and humans light up the same social perception circuits—LOTC, pSTS, aSTS, TPJ—with only subtle differences. | Image: MIT
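For context on what r = 0.79 means here, a minimal synthetic sketch, not the study's data or pipeline: Pearson's r measures how tightly the model's trait ratings track the human-consensus rating across stimuli, with 1.0 being perfect agreement.

```python
# Synthetic illustration of the agreement measure: Pearson correlation between
# per-stimulus model ratings and the human consensus. Data here is made up.
import numpy as np

rng = np.random.default_rng(0)
n_stimuli = 200
human_consensus = rng.uniform(0, 1, n_stimuli)                     # e.g., mean trustworthiness ratings
model_ratings = human_consensus + rng.normal(0, 0.25, n_stimuli)   # model tracks humans, with noise

r = np.corrcoef(human_consensus, model_ratings)[0, 1]
print(f"r = {r:.2f}")   # a value near the study's reported r = 0.79 means strong agreement
```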
Meanwhile, Simon Willison argues GPT-5 has quietly turned ChatGPT into a research-grade search engine. Its "Thinking" mode takes its time, sometimes minutes, chaining Bing queries, parsing PDFs, reading images, and even firing off Python scripts. In one run it verified Cambridge's full legal name after double-checking legal documents; in another, it identified a random building glimpsed from a train window. Willison calls it his "Research Goblin": relentless, nosy, sometimes overkill, but consistently better than manual Googling. Read more.

Meta compresses context 16×, speeds decoding 30×, and pledges $600B in US AI
This new framework lets LLM agents learn from experience, no fine-tuning required
Alibaba unveils Qwen3-Max-Preview, its largest language model yet
Apple is being sued by US authors who accuse it of using pirated books to train its AI models
New ‘benevolent hacking’ method could prevent AI models from giving rogue prompts
Tilde AI releases TildeOpen LLM: an open-source LLM with over 30 billion parameters
Snapchat's new Lens lets you create AI images using text prompts
LIGO and Google create a new AI tool to supercharge the hunt for gravitational waves
Microsoft's light-powered computer could run AI 100x faster and more efficiently
Google Gemini dubbed 'high risk' for kids and teens in new safety assessment
Base10-backed Throxy challenges AI sales giants with pay-per-meeting model
ASML becomes Mistral AI’s top shareholder after leading latest funding round, sources say
How San Jose’s mayor is using AI to speed up transportation, government
Palantir CEO: ‘Silicon Valley totally effed up’ on AI’s promise. And he’s right.
Anthropic to pay $1.5 billion to authors in landmark AI settlement

5 new AI-powered tools from around the web

Latest AI Research Papers

arXiv is a free online library where researchers share pre-publication papers.


Thank you for reading today’s edition.

Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.
Interested in reaching smart readers like you? To become an AI Breakfast sponsor, reply to this email or DM us on 𝕏!