OpenAI: Hallucinations Will Never Go Away
Good morning. It’s Monday, September 8th.
On this day in tech history: In 2012, the Google Brain team led by Andrew Ng and Jeff Dean showed that a large-scale neural network could learn to recognize cats from unlabeled YouTube frames. It was one of the earliest striking demonstrations of unsupervised deep learning at scale: a 1-billion-connection sparse autoencoder trained on 16,000 CPU cores via DistBelief, Google's distributed framework that later evolved into TensorFlow.
In today’s email:
OpenAI Burning $115B
Veo 3 Gets 60% Cheaper
GPT-4V Reads Brain Scans
5 New AI Tools
Latest AI Research Papers
You read. We listen. Let us know what you think by replying to this email.
AI agents with too much access can leak data, perform unintended actions, or expose your platform to risk.
WorkOS helps engineering teams secure agent workflows by implementing:
Scoped access to limit what agents can touch
Least-privilege enforcement to prevent overreach
Auditability to observe and debug runtime behavior
Secure handling of secrets and credentials
Detection of risky or unusual activity
Built for fast-moving AI teams, WorkOS gives you the building blocks to ship secure, enterprise-ready agent workflows without building auth from scratch.
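To make "scoped access" and "least privilege" concrete, here is a minimal Python sketch of the general pattern rather than WorkOS's actual API (all names are hypothetical): each agent gets a deny-by-default allowlist of tools, and every call attempt is logged for auditing.

```python
# Illustrative sketch of scoped, auditable tool access for an agent.
# Generic pattern only, not WorkOS's API; all names are hypothetical.
from typing import Any, Callable

class ScopedToolbox:
    def __init__(self, tools: dict[str, Callable[..., Any]], allowed: set[str]):
        self._tools = tools
        self._allowed = allowed          # least privilege: deny by default
        self.audit_log: list[str] = []   # auditability: record every attempt

    def call(self, name: str, *args: Any, **kwargs: Any) -> Any:
        if name not in self._allowed:
            self.audit_log.append(f"DENIED {name}")
            raise PermissionError(f"agent lacks scope for tool '{name}'")
        self.audit_log.append(f"CALLED {name}")
        return self._tools[name](*args, **kwargs)

# An agent scoped to read-only search cannot touch the email tool.
toolbox = ScopedToolbox(
    tools={
        "search": lambda q: f"results for {q}",
        "send_email": lambda to, body: "sent",
    },
    allowed={"search"},
)
print(toolbox.call("search", "quarterly numbers"))   # allowed, and logged
# toolbox.call("send_email", "a@b.com", "hi")        # raises PermissionError
```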

Today’s trending AI news stories
OpenAI folds personality into training, concedes hallucinations, burns $115B
OpenAI has folded its Model Behavior team, which shaped ChatGPT’s tone and reduced sycophancy, into its Post Training division. This puts alignment and “personality” inside the core optimization loop. Former lead Joanne Jang now heads OAI Labs, tasked with designing new human–AI collaboration models beyond chat. The shift comes amid backlash over GPT-5’s colder tone, lawsuits over GPT-4o’s mishandling of sensitive conversations, and CEO Sam Altman’s provocative claim that the “dead internet theory” may hold weight, as LLM-driven accounts increasingly flood platforms.

The GPT-5 mini-thinking model is designed to admit uncertainty much more often than the o4-mini model. | Image: OpenAI
The company also concedes hallucinations will never disappear, since language models predict plausible word sequences rather than track truth. To mitigate, OpenAI is layering in retrieval pipelines, fact-checking modules, and reinforcement strategies that teach systems to admit uncertainty instead of guessing. The deeper challenge, however, is systemic: current benchmarks reward confident fabrications over "I don't know." OpenAI is pushing for new metrics that score restraint, with early models already declining unsolved problems rather than fabricating answers.
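To see the benchmark problem concretely, here is a toy scoring sketch; the 30% accuracy figure and the penalty weight are illustrative assumptions, not OpenAI's actual metric. Under plain accuracy, an honest "I don't know" scores the same as a wrong guess, so guessing always dominates; penalizing confident errors flips that incentive.

```python
# Toy comparison of accuracy-only grading vs. a restraint-aware metric.
# The penalty weight and accuracy figures below are illustrative assumptions.

def restraint_score(correct: bool, abstained: bool, wrong_penalty: float = 1.0) -> float:
    if abstained:
        return 0.0                                 # abstaining is neutral
    return 1.0 if correct else -wrong_penalty      # confident fabrication is penalized

p = 0.3  # assume the model can genuinely solve 30% of problems
# Guesser answers everything; abstainer says "I don't know" on the other 70%.
guesser   = p * restraint_score(True, False) + (1 - p) * restraint_score(False, False)
abstainer = p * restraint_score(True, False) + (1 - p) * restraint_score(False, True)
print(f"guesser: {guesser:.2f}, abstainer: {abstainer:.2f}")
# guesser: -0.40, abstainer: 0.30 -- under plain accuracy both would score 0.30,
# so only the restraint-aware metric rewards declining unsolved problems.
```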

These efforts coincide with a massive infrastructure and revenue gamble. OpenAI has revised its cash burn forecasts upward by $80 billion, projecting $115 billion in outflows by 2029 as compute, training, and talent costs surge. Nearly $100 billion is earmarked for custom chips and data centers to reduce dependence on outside cloud providers. Despite the spend, revenue is expected to reach $200 billion by 2030, with ChatGPT alone generating $90 billion and free users monetized through ads and commissions. Investors remain optimistic, lifting the firm’s valuation from $300 billion to $500 billion.
OpenAI is also extending into creative industries by backing Critterz, an AI-assisted animated feature, which will debut at Cannes in 2026. With a budget of under $30 million, well below typical animated feature costs, the film will combine human voice acting and artist sketches with AI-powered enhancements from OpenAI’s generative image models. Read more.
Veo 3 gets 60% cheaper and Gemini limits finally revealed
Veo 3 video generation models just got up to 60% cheaper: standard Veo 3 now runs $0.40/sec with audio ($0.20 without), while Veo 3 Fast drops to $0.15/$0.10. That slashes a 5-minute clip with audio from $225 to $120. Both versions output 720p/1080p at 24 fps with optional synced audio, plus a new image-to-video feature that blends stills with prompts for consistent motion.
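For the arithmetic behind those figures, a quick sketch; the $0.75/sec rate is the prior with-audio price implied by the old $225 total for five minutes.

```python
# Back-of-envelope check on the new per-second Veo 3 pricing quoted above.
RATES = {                       # USD per second of generated video
    "veo3_audio": 0.40,
    "veo3_silent": 0.20,
    "veo3_fast_audio": 0.15,
    "veo3_fast_silent": 0.10,
}

def clip_cost(seconds: int, tier: str) -> float:
    return seconds * RATES[tier]

five_min = 5 * 60  # 300 seconds
print(f"${clip_cost(five_min, 'veo3_audio'):.2f}")   # $120.00
print(f"old price: ${five_min * 0.75:.2f}")          # $225.00 at the prior $0.75/sec
```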

In Search, Google’s “AI Mode” is set to become the default, turning queries into AI Overviews with follow-up chat. Already live in 180 countries, it now adds transactional agents that can handle bookings and purchases inside the interface. Google’s lawyers argue in antitrust court that “the open web is in rapid decline,” even as AI Mode keeps users locked inside Google’s own ecosystem.

On the Gemini front, Google finally got specific about usage caps: free accounts get 5 prompts, 5 Deep Research reports, and 100 images a day; Pro gets 100 prompts and 1,000 images; Ultra jumps to 500 prompts with the same image cap. Finally, Google is embedding Gemini directly into Google Sheets with a lightweight function designed for non-technical users. Type =AI in any cell followed by a natural-language instruction, something like =AI("Categorize this feedback as positive or negative", A2), and it will categorize, summarize, or generate across ranges with no coding and no add-ons. Read more.
GPT-4V mirrors human social perception, down to the brain scans
AI is starting to look uncomfortably human. A new Imaging Neuroscience study tested GPT-4V on social perception, the subtle stuff we read off faces and gestures, like trustworthiness or dominance. The model annotated 138 traits across 468 images and 234 video clips, then was stacked up against 2,254 human raters. The kicker: GPT-4V's scores correlated at r = 0.79 with the human consensus and were often more consistent than individual raters. When researchers ran those annotations against fMRI scans from 97 subjects, the model's labels mapped onto the same brain regions humans use for decoding social cues. That means GPT-4V isn't just labeling emotions; it's modeling how our brains process them, which could blow open how psychology and neuroscience scale experiments.

Brain maps show GPT-4V and humans light up the same social perception circuits—LOTC, pSTS, aSTS, TPJ—with only subtle differences. | Image: MIT
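For context on what r = 0.79 means here, a minimal synthetic sketch, not the study's data or pipeline: Pearson's r measures how tightly the model's trait ratings track the human-consensus rating across stimuli, with 1.0 being perfect agreement.

```python
# Synthetic illustration of the agreement measure: Pearson correlation between
# per-stimulus model ratings and the human consensus. Data here is made up.
import numpy as np

rng = np.random.default_rng(0)
n_stimuli = 200
human_consensus = rng.uniform(0, 1, n_stimuli)                     # e.g., mean trustworthiness ratings
model_ratings = human_consensus + rng.normal(0, 0.25, n_stimuli)   # model tracks humans, with noise

r = np.corrcoef(human_consensus, model_ratings)[0, 1]
print(f"r = {r:.2f}")   # a value near the study's reported r = 0.79 means strong agreement
```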
Meanwhile, Simon Willison argues GPT-5 has quietly turned ChatGPT into a research-grade search engine. Its "Thinking" mode takes its time, sometimes minutes, chaining Bing queries, parsing PDFs, reading images, and even firing off Python scripts. In one run it verified Cambridge's full legal name after double-checking legal documents; in another, it identified a random building glimpsed from a train window. Willison calls it his "Research Goblin": relentless, nosy, sometimes overkill, but consistently better than manual Googling. Read more.

Meta compresses context 16×, speeds decoding 30×, and pledges $600B in US AI
This new framework lets LLM agents learn from experience, no fine-tuning required
Alibaba unveils Qwen3-Max-Preview, its largest language model yet
Apple is being sued by US authors who accuse it of using pirated books to train its AI models
New ‘benevolent hacking’ method could prevent AI models from giving rogue prompts
Tilde AI releases TildeOpen LLM: an open-source LLM with over 30 billion parameters
Snapchat's new Lens lets you create AI images using text prompts
LIGO and Google create a new AI tool to supercharge the hunt for gravitational waves
Microsoft's light-powered computer could run AI 100x faster and more efficiently
Google Gemini dubbed 'high risk' for kids and teens in new safety assessment
Base10-backed Throxy challenges AI sales giants with pay-per-meeting model
ASML becomes Mistral AI’s top shareholder after leading latest funding round, sources say
How San Jose’s mayor is using AI to speed up transportation, government
Palantir CEO: ‘Silicon Valley totally effed up’ on AI’s promise. And he’s right.
Anthropic to pay $1.5 billion to authors in landmark AI settlement

5 new AI-powered tools from around the web

Latest AI Research Papers

arXiv is a free online library where researchers share pre-publication papers.


Thank you for reading today’s edition.

Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.
Interested in reaching smart readers like you? To become an AI Breakfast sponsor, reply to this email or DM us on 𝕏!