
Good morning. It’s Friday, November 28th.

On this day in tech history: In 1999, MIT researchers released the RoboCup Simulation League platform, a 2D virtual soccer environment where autonomous agents learned strategy through multi-agent cooperation and competition. It looks quaint today, but it helped establish the study of emergent behavior in swarm robotics and game-theoretic AI. Concepts tested in simulated “soccer” later shaped reinforcement learning for logistics, traffic optimization, and even StarCraft-playing bots.

In today’s email:

Claude’s new framework bridges sessions, trims errors, and powers ahead in benchmarks
xAI charts the frontier of autonomous, pixel-driven AI
Karpathy’s weekend “vibe code” hack reveals the hidden layer of enterprise AI orchestration
5 New AI Tools
Latest AI Research Papers

You read. We listen. Let us know what you think by replying to this email.

A Better Way to Deploy Voice AI at Scale

Most Voice AI deployments fail for the same reasons: unclear logic, limited testing tools, unpredictable latency, and no systematic way to improve after launch.

The BELL Framework solves this with a repeatable lifecycle — Build, Evaluate, Launch, Learn — built for enterprise-grade call environments.

See how leading teams are using BELL to deploy faster and operate with confidence.


Today’s trending AI news stories

Claude’s new framework bridges sessions, trims errors, and powers ahead in benchmarks

Anthropic is leveling up Claude agents to work more like human software engineers. Multi-session projects used to trip them up: agents forgot previous work, overbuilt features, or marked tasks complete too early. The new two-agent system addresses this. An initializer agent sets up the project, seeds a git repository, and logs progress, while a coding agent tackles features step by step, produces clean, documented outputs, and runs end-to-end tests via Puppeteer. Early trials show agents can now sustain continuity across days-long projects, though Anthropic is still testing whether a setup with more agents could push performance even further.
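To make the handoff idea concrete, here is a minimal sketch of how an initializer and a coding session could share state through a progress file. It is not Anthropic's implementation: the file name `progress.json`, the functions `initialize` and `coding_session`, and the `implement` callback (a stand-in for "build the feature and run its end-to-end test") are all hypothetical, and the git repository and Puppeteer steps are left out.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, List, Optional

PROGRESS_FILE = "progress.json"  # hypothetical name for the shared progress log


def initialize(project_dir: Path, features: List[str]) -> None:
    """Initializer session: record every planned feature as not yet done."""
    state = {"features": [{"name": f, "done": False} for f in features], "log": []}
    (project_dir / PROGRESS_FILE).write_text(json.dumps(state, indent=2))


def coding_session(project_dir: Path, implement: Callable[[str], bool]) -> Optional[str]:
    """One coding session: resume from the log and attempt a single feature.

    A feature is marked done only if `implement` reports that its test passed.
    Returns the name of the feature attempted, or None when nothing is pending.
    """
    path = project_dir / PROGRESS_FILE
    state = json.loads(path.read_text())
    pending = next((f for f in state["features"] if not f["done"]), None)
    if pending is None:
        return None
    passed = implement(pending["name"])
    pending["done"] = passed
    state["log"].append(f"{pending['name']}: {'passed' if passed else 'failed'}")
    path.write_text(json.dumps(state, indent=2))
    return pending["name"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp)
        initialize(project, ["login", "search"])
        # Each call stands in for a fresh session with no memory of the last one.
        while (feature := coding_session(project, lambda name: True)) is not None:
            print("worked on", feature)
```

Because each session reads the file before doing anything, a fresh session knows what is finished, and a failed test leaves the feature pending instead of marking it complete too early.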


The results are showing up in benchmarks. Claude Opus 4.5 snagged top-two spots in web development challenges and top-three in text arenas, just behind Gemini 3 Pro and Grok 4.1.

Image: lmarena

Meanwhile, CEO Dario Amodei has been called to testify before the House Homeland Security Committee on December 17 about a Chinese state-linked cyber-espionage campaign that leaned heavily on Claude Code. This is the first documented case of an AI running a near-complete cyberattack. Google Cloud CEO Thomas Kurian and Quantum Xchange CEO Eddy Zervigon have also been asked to weigh in on how commercial AI can both fuel and defend against attacks that operate at machine speed. Read more.

xAI charts the frontier of autonomous, pixel-driven AI

Shen Zhuoran, a technical staff member at xAI, outlined an AI system that reads and understands computer interfaces from raw video, reasons under tight time constraints, and executes actions with precision, all without APIs and within 150 milliseconds.

OpenAI Five and DeepMind’s AlphaStar mastered Dota 2 and StarCraft II using direct API access, which gave them perfect game-state knowledge and superhuman precision. Grok 5, the new xAI system Shen Zhuoran describes, instead operates purely from raw video input: it reads the screen, reasons under tight time limits, and executes clicks and commands like a human, with the aim of generalizing across any computer interface without specialized APIs.


xAI is addressing the energy demands of its massive Colossus data center in Memphis by planning an 88-acre solar farm capable of generating roughly 30 megawatts, around 10% of the center’s power needs. Local authorities have temporarily permitted 15 turbines through January 2027. xAI previously announced a 100-megawatt solar-plus-battery project funded with a $414 million interest-free USDA loan. Read more.

Karpathy’s weekend “vibe code” hack reveals the hidden layer of enterprise AI orchestration

Andrej Karpathy, former Tesla AI lead and OpenAI researcher, released LLM Council, a simple orchestration framework where multiple large language models debate, critique, and synthesize answers under a “Chairman.” Built with FastAPI, React, JSON storage, and OpenRouter for API integration, it runs GPT-5.1, Google Gemini 3, Claude Opus 4.5, and Grok 4 as swappable components.
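For readers who want a feel for the pattern, here is a minimal sketch of a council loop: members answer, critique each other, and a chairman synthesizes. It is not Karpathy's code. The `run_council` function, the `Model` type, and the stub models are hypothetical, and the stubs stand in for real API calls (for example through OpenRouter) so the example runs offline.

```python
from typing import Callable, Dict

# Hypothetical abstraction: a "model" is any function from a prompt to a reply.
Model = Callable[[str], str]


def run_council(question: str, members: Dict[str, Model], chairman: Model) -> str:
    # Stage 1: every member answers the question independently.
    answers = {name: ask(question) for name, ask in members.items()}

    # Stage 2: every member critiques the answers written by the others.
    critiques = {}
    for name, ask in members.items():
        others = "\n".join(f"- {a}" for n, a in answers.items() if n != name)
        critiques[name] = ask(f"Critique these answers to '{question}':\n{others}")

    # Stage 3: the chairman sees all answers and critiques and writes the final reply.
    briefing = [f"Question: {question}"]
    for name in members:
        briefing.append(f"Answer: {answers[name]}")
        briefing.append(f"Critique: {critiques[name]}")
    return chairman("\n".join(briefing))


def make_stub(tag: str) -> Model:
    """Offline stand-in for a real model: reports how many prompt lines it saw."""
    return lambda prompt: f"[{tag}] saw {len(prompt.splitlines())} line(s)"


if __name__ == "__main__":
    council = {"a": make_stub("a"), "b": make_stub("b")}
    print(run_council("What is 2 + 2?", council, make_stub("chair")))
```

Since members are plain callables, swapping one model for another means changing a dictionary entry, which is the "swappable components" idea in miniature.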


The prototype proves multi-model orchestration is technically possible but highlights the missing operational essentials that keep commercial platforms in play. Frontier models are increasingly swappable, but orchestration, governance, and observability remain the differentiators that determine safe, scalable deployment.

Karpathy also turned to education, arguing that policing AI-generated homework is a losing battle. He recommends a “flipped classroom” approach aimed at dual competency: students must use AI effectively while retaining the ability to reason independently. Through his startup Eureka Labs, Karpathy is exploring AI-native classrooms. Read more.

5 new AI-powered tools from around the web

ORI

ORI is a BFSI-focused generative AI platform that powers compliant, multilingual voice/chat/email agents with BrandGPT guardrails, real-time analytics, sentiment detection, and seamless CRM/on-prem integrations for scalable, regulation-aligned automation.

oriserve.com

EveChange

EveChange is an AI-powered changelog management platform that automatically generates content and marketing materials for software updates.

www.evechange.com

Gaffa

Gaffa is a simple REST API that controls real browsers at scale, with no frameworks, proxies, or headless setup required. Automate scraping, screenshots, data extraction, and LLM-ready page processing with built-in observability and global access.

gaffa.dev

UTMGuard

UTMGuard automatically audits Google Analytics 4 for broken or inconsistent UTM parameters, running 89 rule checks with scoring, scheduled audits, and alerts—helping marketers prevent costly tracking errors and protect campaign attribution.

www.utmguard.com

Interachat

Interachat is a privacy-focused messaging app for individuals and teams, with built-in AI you can summon via @InterachatAI for summaries, answers, opinions, and search—enhancing chats without replacing real conversation.

interachat.interasoul.com