Researchers find new LLM Jailbreak
Good morning. It’s Monday, July 22nd.
Did you know: On this day in 1975, MITS signed an agreement with Bill Gates and Paul Allen to license their BASIC interpreter for use on the Altair 8800.
In today’s email:
Apple’s DataComp AI
NVIDIA’s China Chip
LLM Jailbreaks
5 New AI Tools
Latest AI Research Papers
You read. We listen. Let us know what you think by replying to this email.
Today’s trending AI news stories
Apple shows off open AI prowess: new models outperform Mistral and Hugging Face offerings: Apple has introduced its DataComp for Language Models (DCLM) project, releasing a new suite of open-source language models on Hugging Face. The collection includes a 7 billion parameter model and a 1.4 billion parameter model, both of which improve on comparable existing models. The 7B model, trained on 2.5 trillion tokens, achieved 63.7% accuracy on the MMLU benchmark, outperforming Mistral-7B and approaching the performance of other top models like Llama 3 and Gemma. It was also trained with 40% less compute than MAP-Neo, the previous best open-data model. The smaller 1.4B model also shows strong results, exceeding the performance of recent models such as SmolLM. Read more.
Nvidia preparing version of new flagship AI chip for Chinese market: Nvidia is attempting a delicate balancing act, developing a version of its new flagship AI chip, the Blackwell series, specifically for the Chinese market while adhering to U.S. export controls. The new chip, expected to be named "B20," will be distributed in China through Nvidia's partner Inspur. The move comes as the company seeks to reclaim lost market share in a nation where tech titans Huawei and Enflame are breathing down its neck. Nvidia has seen its revenue share decline from 26% to 17% over the past two years. The adaptation is also necessary because of recent U.S. sanctions aimed at preventing technological advancements that could benefit China's military capabilities. The Blackwell series, including the B200 model, features significant performance enhancements, such as a 30-fold increase in speed for certain tasks compared to previous models. Read more.
Mistral releases three new LLMs for math, code and general tasks: Mistral AI has released three new language models designed to enhance performance across various tasks. The Mathstral model, with 7 billion parameters, achieves top results in mathematical benchmarks like MATH (56.6%) and MMLU (63.47%), surpassing similarly sized models. The Codestral Mamba, an upgrade from the previous Codestral model, features the new Mamba2 architecture with an extended context window of up to 256,000 tokens, enabling efficient code generation and integration of large codebases. Additionally, the Mistral NeMo model, developed with NVIDIA, offers 12 billion parameters and a context window of up to 128,000 tokens, demonstrating strong capabilities in logic, world knowledge, and multilingual applications. Mistral maintains its position as a leading European AI company, supported by recent partnerships and a $600 million funding round, focusing on both specialized and general-purpose language models. Read more.
Researchers uncover an all-too-easy trick to bypass LLM safeguards: Researchers at EPFL have identified a critical security vulnerability in leading AI language models. By rephrasing malicious queries into the past tense, users can often bypass the models' safeguards, which are designed to block harmful content. The study found that this method evades protections in models such as GPT-4o and Llama-3 8B. For instance, a query about creating a Molotov cocktail, normally refused, is answered when asked in the past tense. On GPT-4o, the success rate for bypassing restrictions rose from 1% with direct requests to 88% after 20 past-tense reformulation attempts, with 100% success on sensitive topics like hacking. The study underscores the fragility of current alignment techniques like SFT and RLHF and suggests that these models require more evaluation and refinement. A potential mitigation involves fine-tuning GPT-3.5 with past-tense prompts so that the model learns to refuse them. Read more.
California is a battleground for AI bills, as Trump plans to curb regulation: In California, a contentious debate unfolds as federal and state AI regulations diverge. Republican delegates, aligned with former President Donald Trump, propose reducing federal AI restrictions and enhancing military AI capabilities. Meanwhile, California's Democratic-controlled legislature is considering a bill by State Senator Scott Wiener that would require extensive testing for "catastrophic" AI risks before public release. The bill seeks to address dangers such as weapon development and infrastructure attacks, but faces opposition from tech leaders who argue it could stifle innovation and create excessive bureaucracy. Critics, including Google and Meta, contend that the bill's provisions are technically unfeasible and could unjustly penalize developers. Supporters argue that the bill is essential for managing extreme risks and fostering public trust in AI. Read more.
CrowdStrike and Microsoft outage latest updates — aftermath of the biggest IT outage in history: On July 19, 2024, a critical IT issue impacted Windows machines globally, originating from a faulty update by cybersecurity firm CrowdStrike. This update caused numerous devices to enter a recovery boot loop, displaying the Blue Screen of Death (BSOD) and disrupting operations across various sectors including finance, aviation, and broadcasting. The issue led to significant delays and cancellations in flights and hindered services such as banking and health clinics. CrowdStrike has since identified and reversed the problematic update, though this fix prevents further crashes rather than repairing already affected systems. Additionally, Microsoft faced separate issues with Microsoft 365 apps due to a configuration change in Azure, which has now been resolved. In response, Microsoft has introduced a recovery tool for IT administrators to restore impacted machines. This tool facilitates system recovery by creating a bootable USB drive to bypass the damaged update and access necessary repair functions. Read more.
5 new AI-powered tools from around the web
Supermemory is the ultimate hub for organizing, searching, and utilizing saved information with powerful tools like a search engine, writing assistant, and canvas.
Cohesive AI integrates AI-powered web scraping, research, and email validation within Google Sheets, enhancing productivity and streamlining data analysis and personalization.
Discovery Outcomes is an AI-powered product management tool integrating insights, streamlined workflows, and strategic planning to boost productivity and revenue growth.
Xspiral is an online 3D visualization tool integrating 2D/3D hybrid design, real-time collaboration, and AI to enhance productivity and creativity.
fastn simplifies API integration, making it effortless and accessible for businesses, improving efficiency and overcoming integration challenges.
arXiv is a free online library where researchers share pre-publication papers.
Thank you for reading today’s edition.
Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.
Interested in reaching smart readers like you? To become an AI Breakfast sponsor, apply here.