Good morning. It’s Wednesday, March 5th.

On this day in tech history: In 1924, the Computing-Tabulating-Recording Company (CTR) officially rebranded as International Business Machines Corporation (IBM).

In today’s email:

Google’s Gemini w/ Vision Coming Soon
The Most Realistic AI Voice Yet
Altman Hints At Image Gen Upgrade
6 New AI Tools
Latest AI Research Papers

You read. We listen. Let us know what you think by replying to this email.

Today’s trending AI news stories

Google to launch Gemini with Vision in March for AI-powered live video analysis

Google is rolling out live video analysis and screen-sharing features for Gemini as part of the Google One AI Premium Plan. The update allows users to stream video from their smartphone cameras or share their screens for real-time AI-powered insights. Initially exclusive to Android devices, the features support multiple languages and enhance Gemini’s ability to interpret visual content.

This expansion aligns with Google’s broader vision for multimodal AI, leading up to "Project Astra," an assistant designed to process text, video, and audio in real time. While Astra’s full rollout remains uncertain, these incremental updates suggest Google is steadily embedding multimodal AI into everyday interactions.

Google has also added lockscreen widgets to its Gemini AI assistant on iOS and iPadOS, enabling instant access to key features like text prompts, live conversations, voice commands, and image analysis. With a full Siri overhaul still years away, Google is capitalizing on the gap. Read more.

Related Story: Google upgrades Colab with an AI agent tool

Sesame’s AI Voice Demo Stuns With Realism

Sesame AI’s Conversational Speech Model (CSM) delivers strikingly human-like voices, mimicking breath sounds, chuckles, and self-corrections. Built on Meta’s Llama architecture, it processes text and audio in a single-stage transformer model, enhancing realism beyond traditional text-to-speech.

The demo, featuring voices “Miles” and “Maya,” has impressed users while raising concerns over emotional attachment and deepfake risks. Blind tests show CSM’s speech rivals human recordings on isolated clips, though listeners still prefer real voices when they hear the surrounding conversational context. Its ability to roleplay dynamic personalities, including aggressive tones, sets it apart from competitors.

Sesame plans to expand language support, scale its models, and open-source key components. Read more.

Altman Hints at Major Image Generation Upgrade

OpenAI CEO Sam Altman announced that GPT-4.5 will roll out gradually to Plus-tier users over several days. He said that releasing it to everyone at once would have required stricter rate limits, since the team expects heavy usage.

In a separate response, Altman hinted at significant improvements to ChatGPT’s image generation. When a user complained about declining quality, he replied that they would soon be "wild with joy," suggesting an upcoming upgrade. OpenAI has not provided a timeline for these enhancements. Read more.

6 new AI-powered tools from around the web

Pieces for Developers — Long term memory for developer workflows

Pieces is your AI companion that captures live context from browsers to IDEs and collaboration tools, manages snippets and supports multiple LLMs.

pieces.app

Meet Opera’s AI Browser Operator

We're introducing an AI agent into the browser, making us the first major browser with AI-based agentic browsing.

blogs.opera.com/news/2025/03/opera-browser-operator-ai-agentics

SSSModel | One prompt, 2 AI results

SSSModel offers cutting-edge AI technology, providing 3 unique results from a single prompt for enhanced creative possibilities.

sssmodel.com

Characters | OpenArt

Free AI image generator. Free AI art generator. Free AI video generator. 100+ models and styles to choose from. Train your personalized model. Most popular AI apps: sketch to image, image to video, inpainting, outpainting, model fine-tuning, real-time drawing, text to image, image to image, image to text and more!

openart.ai/characters