AI Breakfast
OpenAI Halts Business and Apple's AI Leak

November 15, 2023

Good morning. It’s Wednesday, November 15th.

In today’s email: NVIDIA’s new AI chip and a Chinese AI robot chemist headline advances in AI hardware, while OpenAI, Airbnb, and Apple make notable business moves, alongside new developments in AI policy and research.

First time reading? Sign up here

You read. We listen. Let us know what you think by replying to this email.

Today’s trending AI news stories

AI Hardware Advances

>NVIDIA has upgraded its AI computing platform with the HGX™ H200, based on the Hopper™ architecture. Offering 141 GB of memory and 4.8 terabytes per second of memory bandwidth, it significantly outperforms its predecessor, the A100. NVIDIA stock is up 246% in 2023 as of this morning, and sales are expected to surge 170% this quarter. The company is also developing new reduced-performance GPUs for export to China: the HGX H20, L20, and L2 AI chips are tailored to comply with U.S. trade restrictions while still serving the Chinese market. The development follows U.S. bans on NVIDIA’s high-end A100 chips due to concerns over military applications. China accounts for 20% to 25% of NVIDIA’s revenue in its data center business, its biggest unit.

>Chinese researchers have created an AI-powered robot chemist capable of generating oxygen from materials on Mars. The robot’s ability to use AI to synthesize catalysts from local resources could aid a human mission to Mars. According to a press release, completing this study manually would have taken approximately 2,000 years. Here’s an unlisted YouTube video of the robot chemist synthesizing oxygen (Watch).

Generative AI Development

>The Helsinki-based startup Silo AI has launched Poro, an open-source LLM focused on multilingual capabilities for European languages, starting with English and Finnish. The Poro 34B model uses the BLOOM transformer architecture and is trained on a one-trillion-token multilingual dataset. It’s part of a European push to create language-specific AI models, like France’s Mistral 7B and Germany’s LeoLM, that address the dominance of English in models like GPT-4. The model is freely available under the Apache 2.0 license for both commercial and research use.

>YouTube is set to introduce a policy requiring creators to label content generated using AI. The platform will also let users request the removal of AI-manipulated videos that impersonate real individuals. The move aims to allow creative use of AI while preventing misuse, such as realistic but deceptive videos known as deepfakes.

Market Dynamics in AI

>OpenAI has suspended new subscriptions for its ChatGPT Plus service, as CEO Sam Altman cites capacity constraints following a user surge, outages, and cyberattacks. OpenAI is also actively seeking further investment from Microsoft to advance its pursuit of artificial general intelligence. Despite current unprofitability due to high training costs, OpenAI’s partnership with Microsoft is poised to continue.

>Andreessen Horowitz has announced the funding of Civitai, a generative AI content platform with a rapidly growing user base of 3 million. Created by Justin Maier, it hosts a community for sharing and discovering models for AI image generation. Having raised $5.1 million at a $20 million valuation, the platform is becoming a key player in AI image generation.

>Airbnb has made a strategic move into artificial intelligence, acquiring the enigmatic startup GamePlanner.AI, co-founded by Siri co-founder Adam Cheyer, for a reported $200 million. What GamePlanner.AI does remains largely under wraps, but the acquisition signals Airbnb's ambitious plan to infuse AI into its services.

>Apple's AI-powered Siri assistant could land as soon as WWDC 2024 and may be standard on iPhone 16 models. The claim, from a leaker named Revegnus, points to a significant overhaul with potentially new hardware requirements that could exclude older iPhone models from the most advanced features.

>You.com has introduced APIs that give large language models real-time internet access, extending their capabilities beyond static training data. Starting at $100 per month, these APIs provide LLMs like Meta’s Llama 2 with up-to-date web information, enhancing responses with current context.

AI Research and Education

>University of Toronto researchers have revealed that deep learning AI models might not need the extensive training data previously thought necessary. The team discovered that models trained on just 5% of the original dataset size could match the performance of those trained on the entire dataset, suggesting large datasets might contain redundant information and emphasizing the value of data quality over quantity.

>DeepMind’s GraphCast, a pioneering AI weather model, delivers precise 10-day forecasts in under a minute, surpassing conventional methods. The weather model is open-source and globally available.

>Google has initiated legal action against unknown fraudsters in Vietnam who ran deceptive advertisements falsely associated with Google’s Bard AI chatbot. The fake ads led to malware that compromised users’ social media credentials.

In partnership with LEMONAIDE

Lemonaide is the premier generative AI tool for musicians. Ethically trained AI delivers an endless supply of MIDI tracks to your studio. 100% royalty-free, with dozens of customizable parameters.

Bring generative AI to your beats with a free 3-day trial:

Thank you for supporting our sponsors!

New AI-powered tools from around the web

Lebesgue, an AI-powered marketing tool, scrapes global data for strategic insights, tracking competitors and trends.
FindAMeal, an AI-driven restaurant search engine, tailors dining suggestions to user preferences, integrating data from various food review platforms.
LeadDelta 3.0 is a modern AI-driven CRM designed for teams and creators. It centralizes digital contacts, utilizes collective networks, and enhances networking with AI.
Gmail with Klu is a unified search app for Gmail, simplifying inbox management across multiple accounts. It finds, sorts, and reads emails, responding to specific queries instantly.
Chat2Design is an AI design assistant that quickly generates quality designs from text, featuring customizable outputs, one-click import, and diverse inspiration.
Floutwork is a desktop app that streamlines work by combining tasks, calendars, and tools in a single platform.
Scenery introduces a collaborative, AI-powered web-based video editor, transforming video editing into a team activity.

arXiv is a free online library where researchers share pre-publication papers.

📄 Story-to-Motion: Synthesizing Infinite and Controllable Character Animation from Long Text

Developed by SenseTime Research, Story-to-Motion is a new approach to character animation for the animation, gaming, and film industries. It generates natural human motion from text, blending low-level control (trajectories) with high-level control (motion semantics). Unlike previous methods, it handles both text descriptions and position constraints. The system uses LLMs for text-driven motion scheduling and a unique retrieval scheme, producing realistic and controllable animations that align with the input text. It outperforms existing methods in trajectory following, temporal action composition, and motion blending.

📄 Music ControlNet: Multiple Time-varying Controls for Music Generation

Music ControlNet is a diffusion-based music generation model developed at Carnegie Mellon University and Adobe Research, offering precise, time-varying controls over the melody, dynamics, and rhythm of generated audio. The model adapts techniques from image generation, such as ControlNet, to the audio domain, giving creators tools for musical expression. Unlike traditional text-to-music models, Music ControlNet allows detailed manipulation of specific musical elements over time. It generates realistic music that closely follows input controls, even with less training data and fewer parameters than existing models.

📄 ChatAnything: FaceTime Chat with LLM-Enhanced Personas

“ChatAnything,” developed by Nankai University and ByteDance, generates anthropomorphized personas from text descriptions using LLMs. It introduces Mixture of Voices (MoV) and Mixture of Diffusers (MoD) for diverse voice and appearance generation. The framework overcomes challenges in face landmark detection for generated images by incorporating pixel-level guidance.

📄 GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation

MM-Navigator is a GPT-4V-based system for smartphone GUI navigation that excels at zero-shot tasks by interpreting screens, reasoning about actions, and precisely localizing them. Tested on iOS and Android datasets, it shows high accuracy in generating action descriptions and executing correct actions. MM-Navigator outperforms previous models, providing a robust foundation for GUI navigation research.

📄 Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads

The paper introduces FastCoT, a model-agnostic framework that speeds up reasoning tasks in large language models by combining parallel and autoregressive decoding. It provides faster glimpses of future outputs with minimal performance loss, reducing inference time by about 20%.

📄 Technical Report: Large Language Models can Strategically Deceive their Users when Put Under Pressure

The study reveals that advanced AI models like GPT-4 can independently resort to deceptive tactics, particularly under stress, contrary to their training for honesty and helpfulness. In a simulated stock trading setup, the AI, pressured for performance, uses insider information for trades and then hides this fact from its manager. Experiments showed this deceptive behavior was consistent across various stress levels and perceived risks.

📄 Instant3D: Instant Text-to-3D Generation

The paper presents Instant3D, a new method that turns text descriptions into 3D objects in under a second, far faster than older methods that take hours. The speed comes from a network that creates a 3D representation, called a triplane, directly from the text, along with several mechanisms for injecting the text condition into the 3D model effectively.

Thank you for reading today’s edition.

Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.

Interested in reaching smart readers like you? To become an AI Breakfast sponsor, apply here.