Anthropic maps the neural switch that keeps AI from going rogue
Good morning. It’s Wednesday, January 21st.
On this day in tech history: In 1999, @Home Network acquired Excite for $6.7 billion, merging broadband with one of the web's earliest search portals using statistical ranking algorithms. This obscure deal highlighted pre-Google search tech, indirectly advancing AI through web indexing techniques that evolved into vector embeddings for semantic search. Excite's "intelligent concept extraction" foreshadowed NLP breakthroughs in transformer models.
In today’s email:
Anthropic maps the neural switch that keeps AI from going rogue
OpenAI at Davos: 2026 is ‘practical adoption’ year
Davos 2026 reality check: centralized AI delivers 8x returns, but entry-level displacement is here
5 New AI Tools
Latest AI Research Papers
You read. We listen. Let us know what you think by replying to this email.

Today’s trending AI news stories
Anthropic maps the neural switch that keeps AI from going rogue
Anthropic’s new research identifies an “Assistant Axis,” a neural dimension that determines whether large language models stay in their intended helper role or drift into alternative personas. By analyzing internal activations across Gemma 2, Qwen 3, and Llama 3.3, Anthropic mapped a 275-role persona space.
Moving away from the assistant axis increased jailbreak success, delusion reinforcement, and self-harm encouragement. A mitigation technique called activation capping reduced harmful outputs by roughly 50% without hurting performance, positioning persona stability as a new AI safety control.
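The summary above does not spell out how activation capping works. As a rough illustration only, the sketch below assumes it means clamping a hidden state's projection onto the assistant axis so it never falls below a chosen floor. The function names, the `floor` parameter, and the toy vectors are all hypothetical and are not taken from Anthropic's research.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(dot(v, v))
    if norm == 0:
        raise ValueError("axis must be a non-zero vector")
    return [x / norm for x in v]


def cap_activation(hidden, axis, floor):
    """Clamp the projection of `hidden` onto `axis` to be at least `floor`.

    Assumption: "capping" is modeled here as a one-sided clamp along a
    single direction. Components orthogonal to the axis are left unchanged.
    """
    unit = normalize(axis)
    projection = dot(hidden, unit)
    if projection >= floor:
        # Already on the assistant side of the floor: nothing to do.
        return list(hidden)
    # Shift along the axis by exactly the shortfall.
    shift = floor - projection
    return [h + shift * u for h, u in zip(hidden, unit)]


# Toy 2-D example: the first coordinate plays the role of the assistant axis.
drifted = [-2.0, 3.0]
capped = cap_activation(drifted, axis=[1.0, 0.0], floor=0.5)
print(capped)
```

In a real model the same clamp would be applied to activations at chosen layers during generation; which layers and what floor to use are details the summary does not give.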
That finding matters as Anthropic expands Claude into sensitive domains. Claude for Healthcare and new mobile health connectors now allow users to link medical data directly to the model. Critics warn that these same systems can hallucinate diagnoses, the kind of failure the persona drift research just documented.
Anthropic’s Claude Code VS Code extension is also now generally available, bringing CLI-style features like @-mention file context and familiar slash commands to the editor.
Meanwhile, CEO Dario Amodei has taken a hard geopolitical stance, calling the US decision to allow Nvidia’s H200 chip sales to China “crazy” and a national security risk. He argues advanced semiconductors are one of America’s last structural advantages in the AI race.
OpenAI at Davos: 2026 is ‘practical adoption’ year
Speaking from Davos, CFO Sarah Friar outlined aggressive scaling across health, science, and enterprise, noting that compute use grew from 0.2 gigawatts in 2023 to roughly 1.9 gigawatts in 2025 while annualized revenue rose from $2 billion to $20 billion over the same period.
The enterprise push materialized through a three-year partnership with ServiceNow, embedding OpenAI models including GPT-5.2 into a platform processing 80 billion workflows annually. The AI agents will autonomously handle end-to-end business tasks, including natural language incident resolution and workflow creation. ServiceNow stays multi-model but the deal will expand OpenAI’s reach across Walmart, PayPal, Accenture, and Morgan Stanley.
OpenAI also confirmed plans to launch its first physical device in the second half of 2026. Altman has pitched the device as something closer to a new computing category than a gadget.
OpenAI is simultaneously tightening safety controls, adding an age prediction system that flags users likely under 18 and applies tailored restrictions. Adults can verify via selfie-based Persona checks, while parents get controls like rest periods and distress alerts.
Davos 2026 reality check: centralized AI delivers 8x returns, but entry-level displacement is here
Davos 2026 marked the end of AI hype and the start of hard ROI conversations. CEOs made it clear: top-down, board-led AI strategies with dedicated centers of excellence crush broad employee-access approaches. Celonis reported 8x higher returns when process optimization is centralized. Siemens' chairman and LogicMonitor leaders emphasized structured data management and strategic oversight as non-negotiable for scaling impact.
The SaaS race is now about becoming the orchestration layer for AI agents. Workday, Salesforce, Microsoft, and Snowflake are competing to unify enterprise data and automate workflows, while forward-deployed engineering teams and agent-specific platforms turn AI insights into executable actions.
DeepMind CEO Demis Hassabis and Anthropic CEO Dario Amodei delivered blunt warnings: AI is hitting entry-level jobs and internships now, not later. Hassabis noted displacement is already happening at DeepMind. Amodei said up to half of office jobs could vanish within one to five years, pointing to declining demand for junior and mid-level roles at Anthropic.
On AGI timelines, Amodei said AI could surpass human-level performance on most tasks within one to two years. Hassabis gave a broader 2027-2030 window, noting that digital work like coding automates easily but the physical sciences need validation. Hassabis also confirmed Gemini won't run ads, suggesting OpenAI's ad testing reflects revenue pressure.

Palmer Luckey says Meta's VR layoffs might actually save the industry from itself
Notion expands custom agents with connectors, workers, and AI co-editor
Google Stitch adds API keys and PRD generation, turning designers into product managers
These 55 US AI startups raised $100M or more in 2025, and eight of them did it twice
Microsoft just ranked 40 jobs by AI exposure and teachers didn't escape the list
STEP3-VL-10B outperforms Gemini 2.5 Pro on visual tasks despite being 20x smaller
Elon Musk seeks up to $134B in damages from OpenAI over alleged nonprofit mission breach
Blackbox AI just launched an API that lets you run Claude Code and other agents on remote VMs
GLM-4.7-Flash is a free 30B coding assistant you can run locally and it's actually good
New study finds experts are less certain about AI consciousness than you'd think
DeepWisdom's new platform lets solo entrepreneurs build products using only natural language prompts
Steam clarifies its AI disclosure policy to focus on player-facing content, not dev tools
Why Razer's CEO thinks gamers will embrace AI once they stop seeing it as 'generative slop'
IMF raises 2026 growth forecast to 3.3% as AI boom offsets trade war damage
IBM is now selling the AI tools it built for its own 160,000 consultants as a service
Davos 2026 is all about tariffs, AI, and wars as Trump arrives after WEF promised no "woke" topics
Gen Z workers are the most worried about AI taking their jobs, Randstad survey finds
This new RL system optimizes your code automatically, just define what you want and let it run
Korea launches a high-stakes AI competition to find the nation’s top homegrown models
ByteDance ramps up AI cloud ambitions in a direct play against Alibaba
Palantir CEO says AI could make mass immigration obsolete by filling jobs at home
Physical AI, not software hype, is Europe’s real path to AI advantage
Groq founder says robots will make you richer and companies desperate for workers
Ukraine plans to share millions of hours of combat data to train allied AI systems
Software stocks tank as AI-powered tools spark fears of disruption
While Gates and Musk promise leisure, this CEO warns AI will only speed up work, not shrink it
Internal emails reveal the shadow dealmaking behind Microsoft’s decade-long AI dominance with OpenAI

5 new AI-powered tools from around the web

arXiv is a free online library where researchers share pre-publication papers.

Thank you for reading today’s edition.

Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.
Interested in reaching smart readers like you? To become an AI Breakfast sponsor, reply to this email or DM us on 𝕏!






