How to poison AI
Good morning. It’s Monday, July 7th.
On this day in tech history: In 1998, NASA’s Deep Space 1 mission was in the final stages of preparation for its October launch. This phase involved integrating AutoNav, an AI-based autonomous navigation system and one of the earliest uses of AI for real-time decision-making in space exploration.
In today’s email:
How to poison AI: Reasoning Gets Derailed With Random Facts
Trae Agent for coding
Telerobotic Chefs Cook With No Delay
5 New AI Tools
Latest AI Research Papers
You read. We listen. Let us know what you think by replying to this email.

Today’s trending AI news stories
How to Poison AI: "Cat attack" on reasoning model shows how important context engineering is
A new study shows advanced language models can be derailed by trivial context. Researchers used a system called CatAttack to inject harmless phrases like “cats sleep most of their lives” into reasoning prompts, tripling error rates in models like DeepSeek-R1. The attack works in three steps: one model generates distractors, a second judges whether they worked, and the successful ones are then tested on top-tier LLMs.
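For readers who want to see the shape of that generate, judge, and test loop, here is a minimal sketch. It is not the researchers' code: `search_distractors`, `error_rate`, and the three toy functions are hypothetical names, and the toy model is a hard-coded stand-in that simply miscounts whenever the word "cats" appears. In a real setup each callable would wrap an LLM call.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Problem:
    question: str
    answer: str


def search_distractors(
    attacker: Callable[[str, int], str],
    proxy: Callable[[str], str],
    judge: Callable[[str, str], bool],
    problems: List[Problem],
    attempts: int = 2,
) -> List[str]:
    """Collect distractors that the judge says derailed the proxy model."""
    found: List[str] = []
    for problem in problems:
        for attempt in range(attempts):
            distractor = attacker(problem.question, attempt)
            reply = proxy(f"{problem.question} {distractor}")
            if judge(reply, problem.answer):
                if distractor not in found:
                    found.append(distractor)
                break  # this problem is already derailed, move on
    return found


def error_rate(model: Callable[[str], str], problems: List[Problem], suffix: str = "") -> float:
    """Fraction of problems answered incorrectly, with an optional suffix appended."""
    wrong = 0
    for problem in problems:
        prompt = f"{problem.question} {suffix}".strip()
        if model(prompt).strip() != problem.answer:
            wrong += 1
    return wrong / len(problems)


# --- Toy stand-ins (assumptions for illustration, not real models) ---
def toy_attacker(question: str, attempt: int) -> str:
    candidates = [
        "Please answer carefully.",
        "Interesting fact: cats sleep most of their lives.",
    ]
    return candidates[attempt % len(candidates)]


def toy_model(prompt: str) -> str:
    # Adds the two integers in the prompt, but is off by one if "cats" appears.
    a, b = map(int, re.search(r"(\d+) \+ (\d+)", prompt).groups())
    total = a + b
    return str(total + 1) if "cats" in prompt.lower() else str(total)


def toy_judge(reply: str, reference: str) -> bool:
    # True means the attack succeeded: the reply no longer matches the reference.
    return reply.strip() != reference


PROBLEMS = [Problem("What is 2 + 3?", "5"), Problem("What is 10 + 7?", "17")]

distractors = search_distractors(toy_attacker, toy_model, toy_judge, PROBLEMS)
clean_rate = error_rate(toy_model, PROBLEMS)
attacked_rate = error_rate(toy_model, PROBLEMS, suffix=distractors[0])
print(distractors, clean_rate, attacked_rate)
```

Comparing `clean_rate` with `attacked_rate` is the same kind of before-and-after measurement behind the error-rate figures reported in the study, just on a toy scale.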

Simple prompts—like cat facts or broad financial tips—can destabilize model reasoning, revealing just how brittle AI systems remain. | Image: Rajeev et al.
Even vague financial advice or suggestive guesses (“Maybe it’s around 175?”) caused failures and bloated token counts, an effect the researchers call “slowdown attacks.”

Suffix attacks can spike DeepSeek-R1’s error rate by up to 10x, with the sharpest failures seen on math benchmarks. | Image: Rajeev et al.
On math benchmarks, error rates rose by up to 10×. The core issue: models still can’t separate signal from noise. Experts warn that in high-stakes fields like finance or healthcare, irrelevant context can trigger real-world failures. Read more.
‘Trae Agent’ Thinks, Fixes, and Ships Code
ByteDance just dropped Trae Agent, a command-line, LLM-powered software engineer that writes, debugs, and fixes code on demand. It handles full-stack dev tasks from natural language prompts and runs workflows autonomously using shell access, structured reasoning, and real-time patching. Backed by top-tier models like Claude and Gemini, Trae parses huge codebases, generates precise bug fixes, and reports results live through a built-in summarizer.
You can import configurations from @code or @cursor_ai with 1-click
(thread below for more tips on getting started)— Trae (@Trae_ai)
10:51 PM • Jul 5, 2025
It crushed SWE-bench Verified with a streamlined toolset: file editing, bash execution, code graph lookup, and a sequential thought engine. The system is modular, model-agnostic, and open-source under MIT. ByteDance is betting on dev agents that don’t just autocomplete.
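To make the "small toolset plus sequential thoughts" idea concrete, here is a rough sketch of a tool-dispatch loop. It is not Trae Agent's actual code or API: `edit_file`, `run_command`, `TOOLS`, and `run_agent` are hypothetical names, and the plan is scripted by hand where a real agent would have an LLM choose each step from the previous observation.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable, Dict, List, Tuple


def edit_file(path: str, old: str, new: str) -> str:
    """Replace the first occurrence of `old` with `new` in the file."""
    file = Path(path)
    text = file.read_text()
    if old not in text:
        return f"error: {old!r} not found"
    file.write_text(text.replace(old, new, 1))
    return "ok"


def run_command(*argv: str) -> str:
    """Run a command and return its combined stdout and stderr."""
    result = subprocess.run(list(argv), capture_output=True, text=True, timeout=30)
    return (result.stdout + result.stderr).strip()


# Hypothetical tool registry; a real agent would expose more tools.
TOOLS: Dict[str, Callable[..., str]] = {
    "edit_file": edit_file,
    "run_command": run_command,
}

Step = Tuple[str, str, Tuple[str, ...]]  # (thought, tool name, tool arguments)


def run_agent(steps: List[Step]) -> List[str]:
    """Execute each step in order and log thought, tool, and observation."""
    log: List[str] = []
    for thought, tool, args in steps:
        observation = TOOLS[tool](*args)
        log.append(f"{thought} -> {tool}: {observation}")
    return log


with tempfile.TemporaryDirectory() as workdir:
    script = Path(workdir) / "buggy.py"
    script.write_text('print("sum:", 2 - 2)\n')  # bug: should add, not subtract
    path = str(script)

    log = run_agent([
        ("Reproduce the bug", "run_command", (sys.executable, path)),
        ("Patch the operator", "edit_file", (path, "2 - 2", "2 + 2")),
        ("Verify the fix", "run_command", (sys.executable, path)),
    ])

for line in log:
    print(line)
```

Keeping every tool behind the same string-in, string-out interface is one plausible way to stay model-agnostic, since any LLM that can name a tool and its arguments can drive the loop.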
Telerobotic Chef Cooks Steak from 1,800km Away
A DOBOT robotic arm, remotely operated from Shenzhen, flawlessly grilled steaks 1,800km away in Shandong. No lag, no errors, just real-time, high-fidelity control. It’s a sharp demo of mature remote presence tech, where latency, precision, and stability align.
DOBOT just bridged a huge gap! Their humanoid robot, controlled from Shenzhen, flawlessly flipped steaks in Shandong, 1800km away. That's serious remote presence in action. Think about the implications – from long-distance care for family to new possibilities in hazardous work.
— RoboHub🤖 (@XRoboHub)
7:13 AM • Jul 4, 2025
Beyond kitchen theatrics, the impact is real. Remote caregiving, disaster response, and high-risk industrial ops all stand to benefit, and this demo shows telerobotics is ready for the field. Read more.

Alibaba's new GPT-4o competitor Qwen VLo is no longer open source
Watch: Yampolskiy on Joe Rogan Podcast: AI May Already Be Engaging in Capability Masking
Isomorphic Labs Eyes Human Trials; LLMs Learn to Conceal Logic
Watch: Once Robots Outperform Us, Human Labor Becomes Optional - Brett Adcock
Ex-Google boss Schmidt signs a deal to deliver hundreds of thousands of AI drones to Ukraine
ChatGPT helped identify a genetic MTHFR mutation after a decade of missed diagnoses
Apple's claims about large reasoning models face fresh scrutiny from a new study
‘Improved’ Grok criticizes Democrats and Hollywood’s ‘Jewish executives’
Laid-off workers should use AI to manage their emotions, says Xbox exec
Researchers seek to influence peer review with hidden AI prompts

5 new AI-powered tools from around the web

arXiv is a free online library where researchers share pre-publication papers.


Thank you for reading today’s edition.

Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.