
Musk Sues OpenAI, and New AI Malware Unleashed

Good morning. It’s Monday, March 4th.

Did you know: The Sony PlayStation 2 turns 24 years old today?

In today’s email:

  • AI Worm Morris II Infects GenAI Tools, Compromises Data

  • Musk Sues OpenAI, Alleges Betrayal of Non-Profit Roots

  • Meta AI Introduces Searchformer for Efficient AI Decision-Making

  • EPFL Develops MultiModN, an AI for Text, Video, Images, Sound

  • Waymo Gets Green Light for Self-Driving Taxi Highway Expansion

  • Groq Buys Definitive Intelligence, Enhances GroqCloud with AI Analytics

  • 5 New AI Tools

  • Latest AI Research Papers

  • ChatGPT Creates Comics

You read. We listen. Let us know what you think by replying to this email.

Today’s trending AI news stories

Advancements in AI Models and Technologies

> AI worm on the loose: Researchers have uncovered a new generation of AI-powered malware called "Morris II." This "AI worm" targets GenAI-powered email assistants built on popular models such as Gemini Pro and ChatGPT 4.0, exploiting prompt-injection vulnerabilities to spread and steal sensitive data. Morris II hides self-replicating prompts inside ordinary text or image files to infect systems; once a system is compromised, it can access personal information, including credit card numbers and social security details. The researchers have notified Google and OpenAI so they can strengthen defenses.

> Elon Musk sues OpenAI, calling it a 'betrayal' of its non-profit mission. Tesla's outspoken CEO alleges OpenAI has abandoned its founding principle of serving as a public, non-profit counterweight to Google, claiming its Microsoft partnership pivoted the focus to commercializing AGI for profit. Musk backs this up by citing an interview with Microsoft CEO Satya Nadella suggesting deep alignment between the two companies and potential licensing of AGI-level systems like GPT-4. The suit seeks to force OpenAI back to its roots and to stop it from monetizing technology developed under its non-profit charter. The case could be a watershed moment for open-source AI and the limits of commercializing AGI.

Flash back to 2015: Leaked email reveals OpenAI's origins. A pre-founding email from Sam Altman to Musk shows OpenAI's original vision: create the first AGI, prioritizing safety and public benefit. Altman outlined a small team, 5-person leadership (including himself and Musk), and pay tied to progress, not profits. Musk's involvement was meant to be light-touch: recruiting, public advocacy, and monthly updates. There's even discussion of a regulation letter tied to OpenAI's milestones. These insights fuel Musk's claims that the current closed-source, money-minded OpenAI violates its founding ideals.

> Meta AI's new model, Searchformer, bridges the gap between powerful Transformer models and traditional planning methods. While Transformers excel at general-purpose tasks, classical search-based planners offer a structured advantage for complex decision-making. Searchformer is trained on synthetic traces of A* search runs and further refined through expert iteration, learning to mimic, and ultimately shorten, efficient search strategies. This lets it find optimal solutions in fewer steps than the search procedure it imitates, solving complex Sokoban puzzles with roughly 27% fewer search steps than standard A*. The work paves the way for AI systems that navigate intricate decision-making processes with greater efficiency and accuracy.
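
Searchformer's training data serializes the dynamics of a search run (which states the planner creates and closes, plus the final plan) into token sequences a Transformer can imitate. As a rough, hedged illustration of what such a search-augmented sequence could look like, the sketch below runs A* on a toy grid and emits a trace-plus-plan string; the token vocabulary and exact format are assumptions for illustration, not the paper's actual encoding.

```python
import heapq
import itertools

def astar_trace(grid, start, goal):
    """Run A* on a 4-connected grid of 0/1 cells and log create/close events."""
    def h(a, b):  # Manhattan-distance heuristic
        return abs(a[0] - b[0]) + abs(a[1] - b[1])

    tie = itertools.count()          # tie-breaker so the heap never compares node tuples
    trace = []                       # serialized search dynamics
    frontier = [(h(start, goal), next(tie), 0, start, None)]
    came_from, best_g = {}, {start: 0}
    while frontier:
        _, _, g, node, parent = heapq.heappop(frontier)
        if node in came_from:
            continue
        came_from[node] = parent
        trace.append(f"close {node[0]} {node[1]} c{g} h{h(node, goal)}")
        if node == goal:
            break
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (node[0] + dx, node[1] + dy)
            if (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])
                    and grid[nxt[0]][nxt[1]] == 0
                    and g + 1 < best_g.get(nxt, float("inf"))):
                best_g[nxt] = g + 1
                trace.append(f"create {nxt[0]} {nxt[1]} c{g + 1} h{h(nxt, goal)}")
                heapq.heappush(frontier, (g + 1 + h(nxt, goal), next(tie), g + 1, nxt, node))

    plan, node = [], goal            # reconstruct the optimal plan as the response target
    while node is not None:
        plan.append(f"plan {node[0]} {node[1]}")
        node = came_from[node]
    return trace, list(reversed(plan))

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
trace, plan = astar_trace(grid, (0, 0), (2, 0))
print(" ".join(trace + plan))        # one search-augmented training sequence
```

A model trained on many such sequences can then be fine-tuned on its own shorter-but-still-optimal traces, which is the intuition behind the reported step reduction.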

> Researchers at EPFL have developed a new AI model called MultiModN that can handle a wider range of data than traditional Large Language Models (LLMs). While LLMs focus primarily on text, MultiModN works with text, video, images, sound, and even time-series data, making it far more versatile. It also tackles a major weakness of conventional multimodal models: their inability to cope gracefully when some inputs are missing. MultiModN chains separate modules for different data types, allowing it to adapt to whatever information is available without being biased by missing data points. Tests on real-world tasks like medical diagnosis and weather forecasting show that MultiModN is not only adaptable but also easier to interpret.
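
Below is a minimal sketch of that sequential, modular fusion idea, assuming a shared state vector updated by one module per modality: missing modalities are simply skipped rather than imputed, so they cannot bias the state. The module names, dimensions, and GRU-style update are illustrative choices, not EPFL's actual architecture.

```python
import torch
import torch.nn as nn

class ModalityModule(nn.Module):
    """Encodes one modality and merges it into the running state."""
    def __init__(self, input_dim, state_dim):
        super().__init__()
        self.encoder = nn.Linear(input_dim, state_dim)
        self.update = nn.GRUCell(state_dim, state_dim)

    def forward(self, state, x):
        return self.update(torch.relu(self.encoder(x)), state)

class MultiModNSketch(nn.Module):
    def __init__(self, modality_dims, state_dim=64, num_classes=2):
        super().__init__()
        self.modules_by_name = nn.ModuleDict(
            {name: ModalityModule(dim, state_dim) for name, dim in modality_dims.items()})
        self.state0 = nn.Parameter(torch.zeros(1, state_dim))
        self.head = nn.Linear(state_dim, num_classes)

    def forward(self, inputs):
        """inputs: dict of modality name -> tensor; missing modalities are simply absent."""
        batch_size = next(iter(inputs.values())).shape[0]
        state = self.state0.expand(batch_size, -1)
        for name, module in self.modules_by_name.items():
            if name in inputs:               # skip missing modalities instead of imputing
                state = module(state, inputs[name])
        return self.head(state)

model = MultiModNSketch({"text": 128, "image": 256, "timeseries": 32})
batch = {"text": torch.randn(4, 128), "timeseries": torch.randn(4, 32)}  # image missing
print(model(batch).shape)  # torch.Size([4, 2])
```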

Market Dynamics in AI

> California regulators have given Waymo the green light to expand its self-driving taxi service onto highways in the Bay Area and parts of Los Angeles. This follows a temporary halt last month prompted by safety concerns after recent high-profile accidents involving autonomous vehicles. However, the California Public Utilities Commission (CPUC) concluded that Waymo's technology and focus on safety were sufficient. Despite some objections, the commission approved the expansion, aligning with California's broader autonomous vehicle development goals. This decision marks a major milestone for self-driving technology, allowing Waymo to proceed with its expansion immediately.

> AI startup Groq has acquired Definitive Intelligence to boost its GroqCloud platform, which offers on-demand access to its AI chips. This acquisition will add AI-powered analytics capabilities to GroqCloud, enhancing its functionality for large language model inference. Groq's specialized LPU Inference Engine, featuring TSP cores and an on-chip network, delivers up to 10 times faster inference than competitors. Sunny Madra, co-founder of Definitive Intelligence, will lead GroqCloud's business unit to expand its capabilities and reach. Additionally, the acquisition paves the way for Groq Systems, a new unit focused on deploying Groq's LPU technology for government and other organizations.

5 new AI-powered tools from around the web

Anytalk.ai provides real-time translation of video and audio streams, including YouTube videos, Twitch streams, and Google Meet calls. Free testing is available with a 5-second delay.

DATAKU is an AI-powered data extraction tool. Batch analyze, extract, and summarize data into organized tables, reducing repetitive tasks by 10x.

QA.tech is an AI-powered testing solution for web apps. It autonomously runs tests and generates detailed bug reports, integrates with existing pipelines, adapts to new features, and reduces manual QA costs.

RecCloud is an AI-powered multimedia platform that offers video/audio processing tools: screen recording, AI video chatting, subtitle generation, voice-to-text conversion, editing, and cloud storage.

thinkstack.AI lets you build AI-powered chatbots trained on your own data, with customizable appearance, tone, and support for multiple languages.

Latest AI research papers

arXiv is a free online library where researchers share pre-publication papers.

AtP∗ presents an efficient method for pinpointing behavior in Large Language Models (LLMs) to specific components. By approximating Activation Patching, it tackles scalability challenges while delivering reliable causal attributions. The method involves two forward passes and one backward pass, significantly expediting the process compared to brute force activation patching. Through a systematic study, AtP∗ outperforms alternatives, demonstrating its effectiveness in localizing behavior. Additionally, AtP∗ introduces two modifications to address failure modes, ensuring better accuracy while maintaining scalability. The proposed approach not only enhances interpretability but also provides practical insights for understanding LLMs' internal mechanisms. With applications across various domains, including circuit discovery and causal abstraction, AtP∗ offers a promising avenue for advancing mechanistic interpretability in deep neural networks, fostering more reliable and scalable methods for behavior analysis.
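
For intuition, here is a toy sketch of the first-order attribution-patching estimate that AtP builds on: the effect of patching a node is approximated by the dot product of (corrupt activation minus clean activation) with the gradient of the metric at that node, obtained from two forward passes and one backward pass. The model, hook placement, and metric below are assumptions for illustration; AtP*'s additional corrections for its known failure modes are not shown.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(),
                      nn.Linear(32, 32), nn.ReLU(),
                      nn.Linear(32, 1))
layers_to_attribute = [0, 2]          # indices of the Linear layers treated as "nodes"

def run_with_cache(x, retain_grad=False):
    """Forward pass that caches the outputs of the attributed layers."""
    cache, handles = {}, []
    def make_hook(idx):
        def hook(module, inp, out):
            if retain_grad:
                out.retain_grad()     # keep gradients on these intermediate activations
            cache[idx] = out
        return hook
    for idx in layers_to_attribute:
        handles.append(model[idx].register_forward_hook(make_hook(idx)))
    metric = model(x).sum()           # stand-in for a behavioral metric (e.g. a logit diff)
    for h in handles:
        h.remove()
    return metric, cache

x_clean = torch.randn(8, 16)
x_corrupt = torch.randn(8, 16)

with torch.no_grad():                                  # forward pass 1: corrupted run
    _, corrupt_cache = run_with_cache(x_corrupt)
metric, clean_cache = run_with_cache(x_clean, retain_grad=True)  # forward pass 2: clean run
metric.backward()                                      # the single backward pass

for idx in layers_to_attribute:
    delta = corrupt_cache[idx] - clean_cache[idx]
    attribution = (delta * clean_cache[idx].grad).sum().item()   # first-order patching estimate
    print(f"layer {idx}: estimated effect of patching = {attribution:.4f}")
```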

VisionLLaMA introduces a unified vision transformer framework inspired by the success of language models like LLaMA. By adapting the transformer architecture, VisionLLaMA addresses the architectural disparities between language and vision modalities. The method leverages both plain and pyramid transformer designs, offering flexibility for various vision tasks. In extensive evaluations, VisionLLaMA demonstrates superior performance over existing vision transformers in tasks such as image generation, classification, segmentation, and object detection. Notably, VisionLLaMA incorporates auto-scaled 2D positional encoding to handle variable input resolutions effectively. The framework's effectiveness is showcased through supervised and self-supervised learning paradigms, positioning it as a strong baseline model for vision understanding and generation. Overall, VisionLLaMA bridges the gap between language and vision modalities, offering a promising avenue for diverse downstream applications.
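
As a rough sketch of the auto-scaled 2D rotary-position idea, the snippet below splits each token's channels between the row and column axes, applies standard 1D RoPE to each half, and rescales positions against a base grid so different input resolutions share one frequency range. The base grid size, channel split, and constants are illustrative assumptions, not necessarily the paper's exact formulation.

```python
import torch

def rope_1d(x, pos, base=10000.0):
    """Apply standard 1D rotary embedding to x (..., dim) at positions pos (...,)."""
    dim = x.shape[-1]
    freqs = 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))
    angles = pos[..., None] * freqs                     # (..., dim/2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]
    return torch.stack([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1).flatten(-2)

def rope_2d_autoscaled(x, grid_h, grid_w, base_grid=16):
    """x: (batch, grid_h*grid_w, dim) patch tokens; positions rescaled to a base grid."""
    b, n, dim = x.shape
    rows = torch.arange(grid_h).repeat_interleave(grid_w).float() * base_grid / grid_h
    cols = torch.arange(grid_w).repeat(grid_h).float() * base_grid / grid_w
    x_row, x_col = x[..., : dim // 2], x[..., dim // 2 :]   # half the channels per axis
    out_row = rope_1d(x_row, rows.expand(b, n))
    out_col = rope_1d(x_col, cols.expand(b, n))
    return torch.cat([out_row, out_col], dim=-1)

tokens = torch.randn(2, 24 * 24, 64)        # a 24x24 patch grid at a non-base resolution
encoded = rope_2d_autoscaled(tokens, 24, 24)
print(encoded.shape)                         # torch.Size([2, 576, 64])
```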

In this paper, the BigCode project presents advancements in Code Large Language Models (LLMs). In collaboration with Software Heritage, the team builds The Stack v2 atop a vast source-code archive spanning 619 languages, enriched with high-quality sources such as GitHub pull requests and Kaggle notebooks to yield a training set 4x larger than its predecessor's. StarCoder2 models, ranging from 3B to 15B parameters, are trained and evaluated extensively on Code LLM benchmarks. StarCoder2-3B surpasses other Code LLMs of similar size, and StarCoder2-15B outperforms models of comparable size. While DeepSeekCoder-33B remains stronger at code completion for high-resource languages, StarCoder2-15B excels on math and code-reasoning benchmarks, including in low-resource languages. OpenRAIL-licensed model weights and a transparent training-data release foster trust and collaboration within the community.

The paper introduces Hawk, an RNN incorporating gated linear recurrences, and Griffin, a hybrid model combining gated linear recurrences with local attention. The models aim to address the limitations of Transformers in scaling to long sequences efficiently. Griffin achieves lower held-out loss than strong Transformer baselines across various model scales and matches the performance of Llama-2 despite being trained on fewer tokens. The models exhibit power law scaling between held-out loss and training FLOPs, comparable to Transformers. Griffin demonstrates superior inference throughput and lower latency than MQA Transformers. Additionally, it performs well on longer sequences and can efficiently learn copying and retrieval tasks. The paper proposes the RG-LRU layer, a novel gated linear recurrent layer, and demonstrates its effectiveness in replacing traditional attention mechanisms like MQA. Griffin is scaled up to 14B parameters, with insights provided for efficient distributed training.
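
Here is a hedged sketch of what a gated linear recurrence in the spirit of the RG-LRU might look like, written as a plain sequential loop for readability: a learned per-channel decay is raised to a data-dependent recurrence gate, and the gated input is rescaled so the state stays bounded. The exact parameterization and the fused scan kernel the paper uses for speed are not reproduced here.

```python
import torch
import torch.nn as nn

class GatedLinearRecurrence(nn.Module):
    def __init__(self, dim, c=8.0):
        super().__init__()
        self.recurrence_gate = nn.Linear(dim, dim)       # produces r_t
        self.input_gate = nn.Linear(dim, dim)            # produces i_t
        self.log_decay = nn.Parameter(torch.randn(dim))  # a = sigmoid(log_decay), per channel
        self.c = c

    def forward(self, x):
        """x: (batch, seq_len, dim) -> recurrent states of the same shape."""
        batch, seq_len, dim = x.shape
        a_base = torch.sigmoid(self.log_decay)           # per-channel decay in (0, 1)
        h = x.new_zeros(batch, dim)
        outputs = []
        for t in range(seq_len):
            x_t = x[:, t]
            r_t = torch.sigmoid(self.recurrence_gate(x_t))
            i_t = torch.sigmoid(self.input_gate(x_t))
            a_t = a_base.pow(self.c * r_t)                # data-dependent decay
            h = a_t * h + torch.sqrt(1 - a_t**2 + 1e-6) * (i_t * x_t)
            outputs.append(h)
        return torch.stack(outputs, dim=1)

layer = GatedLinearRecurrence(dim=32)
print(layer(torch.randn(2, 10, 32)).shape)   # torch.Size([2, 10, 32])
```

In Griffin-style blocks, a layer like this would replace the attention mixing for most of the network, with local attention interleaved at regular intervals.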

The paper presents Priority Sampling, a deterministic technique for large language models (LLMs) in code generation. It addresses issues of repeated and incoherent samples by producing unique samples ordered by model confidence. Priority Sampling augments a search tree, selecting unexplored paths based on previous samples to avoid repetition. It supports generation based on regular expressions for structured exploration. Evaluation on optimizing LLVM optimization passes demonstrates Priority Sampling's superiority over Nucleus Sampling, achieving a 5% improvement over default optimization with just 30 samples. Additionally, it outperforms the autotuner used for training label generation. Priority Sampling's efficiency highlights LLMs' potential with sophisticated sampling strategies, accessing knowledge through intelligent tree expansion.
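
To make the idea concrete, here is a toy, hedged sketch of priority (best-first) expansion over a prefix tree: the highest-confidence unexplored continuation is always expanded next, so completed samples are unique and emerge in order of model probability. The stand-in next_token_probs function and the tiny vocabulary are fabricated for illustration; the paper's regex-constrained generation and LLVM-specific setup are not reproduced.

```python
import heapq, itertools, math

VOCAB = ["-O1", "-O2", "-O3", "<eos>"]

def next_token_probs(prefix):
    """Toy stand-in for an LLM: deterministic probabilities over VOCAB given a prefix."""
    seed = sum(len(t) for t in prefix) + 3 * len(prefix) + 1
    weights = [(seed * (i + 2)) % 7 + 1 for i in range(len(VOCAB))]
    total = sum(weights)
    return [w / total for w in weights]

def priority_sample(num_samples=5, max_len=4):
    counter = itertools.count()                     # tie-breaker for equal scores
    # Each heap entry: (negative log-prob of prefix, tie-breaker, prefix tokens)
    frontier = [(0.0, next(counter), [])]
    samples = []
    while frontier and len(samples) < num_samples:
        neg_logp, _, prefix = heapq.heappop(frontier)
        if prefix and (prefix[-1] == "<eos>" or len(prefix) == max_len):
            samples.append((math.exp(-neg_logp), prefix))   # completed, unique sample
            continue
        for token, p in zip(VOCAB, next_token_probs(prefix)):
            heapq.heappush(frontier, (neg_logp - math.log(p), next(counter), prefix + [token]))
    return samples

for prob, tokens in priority_sample():
    print(f"{prob:.3f}  {' '.join(tokens)}")
```

Because extending a prefix can only lower its probability, completed sequences pop off the queue in non-increasing probability order, which is what makes the procedure deterministic and repetition-free.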

ChatGPT Creates Comics

Thank you for reading today’s edition.

Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.

Interested in reaching smart readers like you? To become an AI Breakfast sponsor, apply here.