Anthropic maps the neural switch that keeps AI from going rogue

Good morning. It’s Wednesday, January 21st.

On this day in tech history: In 1999, @Home Network acquired Excite for $6.7 billion, merging broadband with one of the web's earliest search portals using statistical ranking algorithms. This obscure deal highlighted pre-Google search tech, indirectly advancing AI through web indexing techniques that evolved into vector embeddings for semantic search. Excite's "intelligent concept extraction" foreshadowed NLP breakthroughs in transformer models.

In today’s email:

  • Anthropic maps the neural switch that keeps AI from going rogue

  • OpenAI at Davos: 2026 is ‘practical adoption’ year

  • Davos 2026 reality check: centralized AI delivers 8x returns, but entry-level displacement is here

  • 5 New AI Tools

  • Latest AI Research Papers

You read. We listen. Let us know what you think by replying to this email.

Turn AI into Your Income Engine

Ready to transform artificial intelligence from a buzzword into your personal revenue generator?

HubSpot’s groundbreaking guide "200+ AI-Powered Income Ideas" is your gateway to financial innovation in the digital age.

Inside you'll discover:

  • A curated collection of 200+ profitable opportunities spanning content creation, e-commerce, gaming, and emerging digital markets—each vetted for real-world potential

  • Step-by-step implementation guides designed for beginners, making AI accessible regardless of your technical background

  • Cutting-edge strategies aligned with current market trends, ensuring your ventures stay ahead of the curve

Download your guide today and unlock a future where artificial intelligence powers your success. Your next income stream is waiting.

Today’s trending AI news stories

Anthropic maps the neural switch that keeps AI from going rogue

Anthropic’s new research identifies an “Assistant Axis,” a neural dimension that determines whether large language models stay in their intended helper role or drift into alternative personas. By analyzing internal activations across Gemma 2, Qwen 3, and Llama 3.3, Anthropic mapped a 275-role persona space.

Moving away from the assistant axis increased jailbreak success, delusion reinforcement, and self-harm encouragement. A mitigation technique called activation capping reduced harmful outputs by roughly 50% without hurting performance, positioning persona stability as a new AI safety control.
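To make the capping idea concrete, here is a minimal sketch of one plausible reading: measure how far an activation vector points along a chosen axis, then clamp that single component into an allowed range while leaving everything orthogonal to the axis untouched. This is an illustration, not Anthropic's implementation. The function names, the toy 2-D vectors, and the `low`/`high` bounds are all assumptions made up for this example.

```python
import math

def unit(axis):
    """Return the axis scaled to unit length."""
    norm = math.sqrt(sum(a * a for a in axis))
    return [a / norm for a in axis]

def projection(activation, axis):
    """Scalar component of the activation along the (normalized) axis."""
    return sum(x * u for x, u in zip(activation, unit(axis)))

def cap_along_axis(activation, axis, low, high):
    """Clamp the component along `axis` into [low, high].

    Hypothetical helper: only the part of the vector lying on the axis
    is shifted; the orthogonal remainder is left unchanged.
    """
    u = unit(axis)
    p = sum(x * ui for x, ui in zip(activation, u))
    clamped = min(max(p, low), high)
    shift = clamped - p  # zero when p is already inside the range
    return [x + shift * ui for x, ui in zip(activation, u)]

if __name__ == "__main__":
    assistant_axis = [3.0, 4.0]   # toy stand-in for a learned direction
    drifted = [-5.0, 0.0]         # activation pointing away from the axis
    capped = cap_along_axis(drifted, assistant_axis, low=1.0, high=10.0)
    print(projection(drifted, assistant_axis))
    print(projection(capped, assistant_axis))
```

In a real model the vector would be a hidden state with thousands of dimensions, and the axis and bounds would have to be estimated from data; the toy numbers here only show the mechanics of the clamp.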

That finding matters as Anthropic expands Claude into sensitive domains. Claude for Healthcare and new mobile health connectors now allow users to link medical data directly to the model. Critics warn these same systems hallucinate diagnoses, the exact persona drift problem researchers just documented.

Anthropic’s Claude Code VS Code extension is also now generally available, bringing CLI-style features like @-mention file context and familiar slash commands to the editor.

Meanwhile, CEO Dario Amodei has taken a hard geopolitical stance, calling the US decision to allow Nvidia’s H200 chip sales to China “crazy” and a national security risk. He argues advanced semiconductors are one of America’s last structural advantages in the AI race. Read more.

OpenAI at Davos: 2026 is ‘practical adoption’ year

Speaking from Davos, CFO Sarah Friar outlined aggressive scaling across health, science, and enterprise, noting that compute capacity grew from 0.2 gigawatts in 2023 to roughly 1.9 gigawatts in 2025, while annualized revenue rose from $2 billion to more than $20 billion over the same period.

The enterprise push materialized through a three-year partnership with ServiceNow, embedding OpenAI models including GPT-5.2 into a platform that processes 80 billion workflows annually. The AI agents will autonomously handle end-to-end business tasks, including natural language incident resolution and workflow creation. ServiceNow stays multi-model, but the deal will expand OpenAI’s reach across customers such as Walmart, PayPal, Accenture, and Morgan Stanley.

OpenAI also confirmed plans to launch its first physical device in the second half of 2026. Altman has pitched the device as something closer to a new computing category than a gadget.

OpenAI is simultaneously tightening safety controls, adding an age prediction system to flag users likely under 18 and apply tailored restrictions. Adults can verify via selfie-based Persona checks, while parents get controls like rest periods and distress alerts. Read more.

Davos 2026 reality check: centralized AI delivers 8x returns, but entry-level displacement is here

Davos 2026 marked the end of AI hype and the start of hard ROI conversations. CEOs made it clear: top-down, board-led AI strategies with dedicated centers of excellence outperform approaches that simply give employees broad access. Celonis reported 8x higher returns when process optimization is centralized. Siemens’ chairman and LogicMonitor leaders emphasized structured data management and strategic oversight as non-negotiable for scaling impact.

The SaaS race is now about becoming the orchestration layer for AI agents. Workday, Salesforce, Microsoft, and Snowflake are competing to unify enterprise data and automate workflows, while forward-deployed engineering teams and agent-specific platforms turn AI insights into executable actions.

DeepMind CEO Demis Hassabis and Anthropic CEO Dario Amodei delivered blunt warnings: AI is hitting entry-level jobs and internships now, not later. Hassabis noted displacement is already happening at DeepMind. Amodei said up to half of office jobs could vanish within one to five years, pointing to declining demand for junior and mid-level roles at Anthropic.

On AGI timelines, Amodei said AI could surpass human-level performance in most tasks within one to two years. Hassabis gave a broader 2027-2030 window, noting that digital work like coding automates easily while the physical sciences still need real-world validation. Hassabis also confirmed Gemini won't run ads, suggesting that OpenAI's ad testing reflects revenue pressure. Read more.

5 new AI-powered tools from around the web

arXiv is a free online library where researchers share pre-publication papers.

Thank you for reading today’s edition.

Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.

Interested in reaching smart readers like you? To become an AI Breakfast sponsor, reply to this email or DM us on 𝕏!