AI Breakfast
OpenAI Readies GPT-4.1 with 1M-token Context and Live Memory

AI Breakfast
April 11, 2025

Good morning. It’s Friday, April 11th.

On this day in tech history: In 2010, the first iPad went on sale.

In today’s email:

OpenAI Readies GPT-4.1 with 1M-token Context and Live Memory
Google’s AI Blitz
Here’s what’s in development on the humanoid front
New AI Tools
Latest AI Research Papers

You read. We listen. Let us know what you think by replying to this email.

In partnership with Nebius

Power Your AI with the Latest NVIDIA Blackwell GPUs

Be among the first to access NVIDIA’s most advanced AI hardware with Nebius, a leading AI cloud provider

Early access to NVIDIA Blackwell platforms: GB200 & HGX B200
On-demand H200 & H100 GPUs — ready when you are
Seamless cluster orchestration with Slurm on Kubernetes
High-performance storage optimized for AI workloads
Reserve your NVIDIA Blackwell cluster today — availability is limited!

Thank you for supporting our sponsors!

Today’s trending AI news stories

OpenAI Readies GPT-4.1 with 1M-token Context and Live Memory

OpenAI is preparing a multi-pronged rollout led by GPT-4.1, an enhanced multimodal model building on GPT-4o, with scaled-down variants including o4-mini, o4-mini-high, and nano versions. References to o3 have also surfaced in ChatGPT infrastructure. CEO Sam Altman cautioned that these models are not launching immediately, but the infrastructure references suggest a release is close, pending capacity relief. Alongside the models, Altman may also have hinted at “quasar alpha,” a context window upgrade reportedly supporting up to 1 million tokens.

In tandem, OpenAI has expanded ChatGPT’s memory to include entire conversation histories, enabling it to recall and adapt across past chats without prompt engineering. Noam Brown described memory in language models as more than a functional upgrade, framing it as a fundamental shift in user interaction.

Starting today, memory in ChatGPT can now reference all of your past chats to provide more personalized responses, drawing on your preferences and interests to make it even more helpful for writing, getting advice, learning, and beyond.
— OpenAI (@OpenAI)
5:06 PM • Apr 10, 2025

OpenAI has also introduced the Pioneers Program, a bid to redesign AI benchmarking by co-creating domain-specific evaluations with startups in fields like law, finance, and healthcare. These benchmarks—intended to guide reinforcement fine-tuning and model improvements—will be released publicly. Critics, however, warn that the company’s role as both model maker and evaluator risks blurring the line between progress and self-interest.

The backdrop to these launches is OpenAI’s countersuit against Elon Musk, which alleges a campaign to discredit the company as it seeks to finalize a $40 billion funding round and transition into a for-profit entity. Read more.

Google’s Cloud Next blitz: agentic dev kits, AI assistants, and fast, cheap models

Google has expanded its AI developer ecosystem with a suite of new tools and models focused on agentic computing, cost efficiency, and full-stack app development—highlighted at Cloud Next 2025.

Agent Development Kit (ADK) is now open source, providing a framework for building hierarchical, multi-agent systems. ADK supports modular design, dynamic routing, multimodal inputs, and integrations with Vertex AI and LiteLLM. It also offers built-in evaluation tools and simplifies deployment via Vertex AI Agent Builder or containers.
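For readers unfamiliar with the pattern, hierarchical routing can be sketched in a few lines of plain Python. This is an illustration of the general idea only, not the ADK API: the `Agent` class, its `can_handle` and `handle` fields, and the example agents are all hypothetical names invented for this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Agent:
    """Hypothetical agent node for illustration; not the ADK class."""
    name: str
    can_handle: Callable[[str], bool]              # does this agent claim the request?
    handle: Optional[Callable[[str], str]] = None  # local handler, if any
    sub_agents: List["Agent"] = field(default_factory=list)

    def route(self, request: str) -> str:
        # Delegate to the first sub-agent that claims the request.
        for child in self.sub_agents:
            if child.can_handle(request):
                return child.route(request)
        # No child matched: answer locally if this agent has a handler.
        if self.handle is not None:
            return f"{self.name}: {self.handle(request)}"
        return f"{self.name}: no agent available"


# Assumed example hierarchy: a root agent with two specialist children.
billing = Agent("billing", lambda r: "invoice" in r.lower(), lambda r: "looking up invoice")
support = Agent("support", lambda r: "error" in r.lower(), lambda r: "opening a ticket")
root = Agent("root", lambda r: True, lambda r: "answering directly", [billing, support])

print(root.route("I have an invoice question"))
print(root.route("I hit an error"))
print(root.route("Hello"))
```

In a real framework the keyword checks would typically be replaced by a model deciding which sub-agent should take over. The sketch only shows the tree structure and the delegation step.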

Gemini Code Assist enters preview with agentic capabilities. These agents can now autonomously translate code, generate apps from specs, manage Kanban-style workflows, run tests, conduct reviews, and handle migrations—moving toward more self-directed software engineering.

Firebase Studio, powered by Gemini and built on Code OSS, enables in-browser, no-setup app development. It supports multilingual frameworks, imports from Git repos, and natural language prototyping, with deployment via Firebase App Hosting and Cloud Run.

Meet Firebase Studio: A cloud-based, agentic dev environment powered by Gemini ✨💻✨
Find everything you need to prototype, build, and run production-quality full-stack AI apps quickly and safely.
Learn more about building AI apps with Firebase → goo.gle/4j3MS9v
— Firebase (@Firebase)
4:05 PM • Apr 9, 2025

Gemini 2.5 Flash, a reasoning-optimized model, balances performance with low latency and cost. Available soon in Vertex AI, it’s built for real-time, high-volume use cases like customer support. On-prem deployment via Google Distributed Cloud and Nvidia Blackwell will follow in Q3.

Google introduced Ironwood, its seventh-generation TPU, optimized for inference. Available in 256-chip and 9,216-chip clusters, Ironwood delivers a peak of 4,614 TFLOPs per chip. Each chip is equipped with 192GB of high-bandwidth memory and 7.4 Tbps bandwidth, and Google bills it as its most powerful and energy-efficient TPU. Ironwood will integrate with Google’s AI Hypercomputer for high-scale workloads. Read more.
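To put the cluster sizes in perspective, here is a back-of-the-envelope calculation. It assumes the 4,614 TFLOPs and 192GB figures are per chip and that both scale linearly with chip count, which ignores interconnect and utilization losses. The constant and function names are made up for this sketch.

```python
# Per-chip figures quoted above (assumed to be per chip, at peak).
TFLOPS_PER_CHIP = 4_614
MEMORY_GB_PER_CHIP = 192
CLUSTER_SIZES = (256, 9_216)


def cluster_totals(chips: int) -> tuple[float, float]:
    """Return (peak exaFLOPs, total memory in TB) under linear scaling."""
    exaflops = chips * TFLOPS_PER_CHIP / 1_000_000  # 1 exaFLOP = 1,000,000 TFLOPs
    memory_tb = chips * MEMORY_GB_PER_CHIP / 1_000  # decimal terabytes
    return exaflops, memory_tb


for chips in CLUSTER_SIZES:
    exaflops, memory_tb = cluster_totals(chips)
    print(f"{chips:>5} chips: {exaflops:.2f} exaFLOPs peak, {memory_tb:,.1f} TB memory")
```

Under these assumptions the 9,216-chip configuration works out to roughly 42.5 exaFLOPs of peak compute and about 1.77 PB of memory, while the 256-chip configuration comes to about 1.2 exaFLOPs.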

From Camera Rigs to Combat Moves, Humanoids Show What’s Next

Humanoid robots are finding new ground beyond labs and logistics. Boston Dynamics’ Atlas has entered film production, assisting with camera operations alongside WPP and Canon. Trained using synthetic data generated through Nvidia Cosmos simulations, Atlas can carry 20 kg and maintain stability in awkward positions—ideal for long, repeatable shots or filming in hard-to-reach environments.

Meanwhile, Unitree’s G1 robot, built with 43 actuated joints and trained via imitation learning, is branching from flips and kung fu to boxing, with a livestreamed robot fight in the works. The G1 also demonstrates agility on uneven terrain and resilience under physical disturbances. Priced at $16,000, it undercuts high-end models like Boston Dynamics' Atlas, offering a low-cost entry point. Read more.

3 new AI-powered tools from around the web

Improve your AI infrastructure - AI memory engine

Cognee is an open source AI memory engine. Try it today to find hidden connections in your data and improve your AI infrastructure.

www.cognee.ai

Hera - Motion design in seconds

Hera enables you to create motion designs in seconds just by describing them.

hera.video

Voicenotes Pages: Publish Voice Notes

Create your own mini-podcast effortlessly with Voicenotes Pages. Record and publish with a single tap—podcasting made as easy as chatting with a friend.

voicenotes.com/pages