AI Nurses for $9?

AI Breakfast
March 25, 2024

Good morning. It’s Monday, March 25th.

Did you know: The first major version of Mac OS X was released this week in 2001, on March 24. It sold for $129.

In today’s email:

Nvidia & Hippocratic AI: $9/hr AI nurses for US shortage
OpenAI's "Sora" text-to-video software shakes up Hollywood
UT Austin: Machine unlearning removes unwanted AI content
Stability AI CEO steps down, advocates decentralized AI
Study: Personalized chatbots more persuasive than humans
Apple in talks with Baidu for China AI integration
Google expands testing of AI search summaries in US
Anthropic lines up investors, rules out Saudi Arabia
Open Interpreter's O1 Lite: Open-source AI device
5 New AI Tools
Latest AI Research Papers

You read. We listen. Let us know what you think by replying to this email.

Today’s trending AI news stories

AI Innovations in Industry and Healthcare

> AI Nurses for $9: Nvidia and Hippocratic AI Tackle US Nurse Shortage: Nvidia teams up with Hippocratic AI to combat the US nurse shortage with $9-per-hour AI nurses. These AI assistants (Watch demo) manage various tasks - from screenings and medication management to chronic disease care. They cater to diverse needs, including pre-op prep, chronic condition support, nutritional guidance, and even remote patient monitoring. Patients can even choose their AI nurse's "bedside manner." While offering potential benefits and mirroring a trend of AI integration in healthcare (over 40 companies are testing similar tech), concerns linger about patient trust and job displacement. The true effectiveness of virtual nurses compared to human caregivers remains to be seen. Read more.

> AI Reimagines Filmmaking: OpenAI's "Sora" Arrives in Hollywood: OpenAI's Sora text-to-video generation software is shaking up Hollywood. Key players like studios and talent agencies will soon get a firsthand look at Sora's capabilities in crucial meetings. Producer Tyler Perry was so impressed that he paused his $800 million studio expansion. However, excitement is laced with concerns about worker protection in the face of this potentially disruptive technology. OpenAI plans a public release later in 2024, with ongoing discussions to ensure a smooth transition. The industry is waiting with bated breath to see the impact, recalling how prominently AI featured in last year's writers' strike. Read more.

> AI Learns to "Unlearn": Removing Unwanted Content Without Retraining: University of Texas at Austin researchers have developed a novel "machine unlearning" technique for image-based artificial intelligence. This method allows for the targeted removal of unwanted content, such as copyrighted material or violent imagery, without the need to retrain the entire AI model from scratch. Unlike traditional methods, this approach selectively "forgets" specific data while preserving the model's core knowledge base. This innovation addresses growing concerns around copyright and privacy in AI development, particularly relevant in light of recent legal disputes regarding copyrighted material used to train AI models. Read more.

AI Ethics and Governance

> Stability AI CEO Emad Mostaque steps down to focus on decentralized AI: Stability AI, the company behind the open-source image generation model Stable Diffusion, faces a leadership shakeup as CEO Emad Mostaque resigns to advocate for decentralized AI, expressing concerns about concentrated power in the industry. COO Shan Shan Wong and CTO Christian Laforte will serve as interim co-CEOs while the company seeks a permanent replacement. Amidst financial challenges and the departure of key researchers following the release of Stable Diffusion 3, Stability AI aims to monetize through API services with Intel's support while prioritizing transparent AI governance. The question remains whether Stability AI can maintain its competitive edge against behemoths like OpenAI and Google in the generative AI sector. Read more.

> A personalized chatbot is more likely to change your mind than another human, study finds: Researchers from EPFL and Fondazione Bruno Kessler found that GPT-4, when given personal information about its debate opponent, was more persuasive than human debaters: participants had up to 81.7% higher odds of moving toward their opponent's position than when debating a human. The study compared human vs. human and human vs. AI debates, highlighting the power of personalization. The AI used anonymized participant data to tailor its arguments, raising concerns about potential manipulation on online platforms. The authors call for countermeasures like AI systems that offer fact-based rebuttals. While the study acknowledges limitations, it emphasizes the need for platforms to address the growing influence of AI-driven persuasion tactics. Read more.

> Apple in Talks to Integrate Baidu's AI for China Devices: In a move to bolster its AI capabilities and navigate China's tightening regulations, Apple is reportedly in talks with Baidu, a Chinese tech giant, to integrate Baidu's government-approved generative AI models into Apple devices sold in China. This potential partnership comes as China requires approval for such models, with Baidu's Ernie Bot already receiving the green light. Facing stiff competition in the Chinese market, Apple seeks strategic partnerships like this to maintain its foothold. Talks with Google and OpenAI for additional AI model licensing further underscore Apple's commitment to AI innovation, particularly within the crucial Chinese market. Read more.

> Google Opens Up Testing of AI Search Summaries for Everyone in the US: Google is expanding its foray into AI-powered search in the United States. Even users not enrolled in its Search Generative Experience (SGE) can now access AI-generated summaries for complex search results. This broader rollout aims to gather user feedback from a wider audience and assess the technology's effectiveness. Google assures users that traditional ads will still be displayed alongside these AI summaries. The move reflects Google's response to competition from AI tools like ChatGPT and highlights its focus on responsible use of AI-generated content. While Google previously introduced SGE with features like interactive content and AI-generated answers, this wider testing hints at a potential future where search results become more personalized and interactive, similar to social media platforms. Read more.

> Anthropic is lining up a new slate of investors, but the AI startup has ruled out Saudi Arabia: AI startup Anthropic is attracting big investors. A $1 billion stake in the company, acquired by the now-bankrupt crypto exchange FTX, is on the market to repay FTX customers. Sovereign wealth funds are lining up for a chance to buy in, but Anthropic has rejected Saudi Arabia due to national security concerns. While Anthropic's founders can veto investors, they're not directly involved in the sale, which is being handled by Perella Weinberg. With other wealthy nations like the UAE (Mubadala) still interested, this sale marks a significant moment for AI investment. Read more.

AI Devices

> Unveiling the O1 Lite: The Future of Open Source AI Devices: In a nod to Linux's open-source ethos, Open Interpreter's O1 Lite disrupts AI interaction. This portable, voice-controlled device seamlessly integrates with home computers, tackling tasks like email and customizing commands without complex setup. The O1 Lite, Server, and OS form an open-source ecosystem fostering innovation. Plans for a computer-controllable language model further democratize AI development, potentially reshaping how we interact with computers daily. Read more.

5 new AI-powered tools from around the web

No-Code Leaderboard offers a platform for tracking and ranking no-code development activity worldwide. Users can engage, share, and climb ranks based on various development metrics.

DataMotto automates data preprocessing tasks using AI, saving time and effort. Cleansing and enriching raw data, it offers free options and enterprise-grade security.

Butternut AI redefines website creation with AI. Generate fully customized, multi-page websites in seconds. Offers advanced customization, responsive design, built-in SEO, and seamless integrations.

Name-Drop AI facilitates lead generation on social media by identifying relevant conversations across platforms. Offers keywords suggestions, fostering authentic interactions.

Alfred 5.5 introduces innovative Grid, Text, Image, and PDF views for enhanced result presentation. Integration with ChatGPT and DALL-E offers interactive conversational capabilities and image generation directly within Alfred.

Latest AI Research Papers

arXiv is a free online library where researchers share pre-publication papers.

📄 SiMBA: Simplified Mamba-based Architecture for Vision and Multivariate Time series

SiMBA innovates with EinFFT for stable channel modeling and Mamba for sequence modeling, overcoming challenges in state space models. Performance evaluations across vision and time-series benchmarks showcase SiMBA's superiority, bridging the gap with attention-based transformers. The proposed architecture introduces novel techniques, including residual connections and dropouts, enhancing stability and performance. EinFFT manipulates eigenvalues, ensuring stability, while Mamba addresses inductive bias and computational complexity. SiMBA's versatility allows exploration of alternative sequence and channel modeling approaches. It outperforms existing models on ImageNet and time-series datasets, demonstrating its effectiveness in diverse domains. Future work includes further exploration of architectural alternatives within the SiMBA framework, promising advancements in sequence and channel modeling techniques.

📄 Can large language models explore in-context?

Researchers at Microsoft Research and Carnegie Mellon University examine whether Large Language Models (LLMs) can explore in-context, using multi-armed bandit (MAB) problems as the test bed. They find that out-of-the-box LLM performance on exploration tasks depends heavily on prompt design. Certain configurations, particularly those involving GPT-4, show satisfactory exploration behavior, but success often hinges on external summarization of the interaction history. This raises concerns about how well LLMs will adapt to more complex environments where such summarization may be impractical. The study underscores the need for further research into algorithmic interventions to improve LLM-based decision-making agents, especially in decision-making settings beyond basic MAB setups.
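
For readers new to bandits, here is a minimal sketch of the setting. It is not the paper's code: it uses a classical epsilon-greedy policy in place of an LLM, and the function names (run_bandit, summarize_history) and the summary wording are assumptions made for illustration. The second function shows the kind of per-arm summary of the interaction history that the study says LLMs often depend on.

```python
import random


def run_bandit(arm_means, horizon, epsilon, seed=0):
    """Play a Bernoulli multi-armed bandit with an epsilon-greedy policy.

    arm_means: success probability of each arm (unknown to the policy).
    Returns (total_reward, counts, totals).
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms    # how many times each arm was pulled
    totals = [0.0] * n_arms  # summed reward collected from each arm
    total_reward = 0.0

    for _ in range(horizon):
        untried = [a for a in range(n_arms) if counts[a] == 0]
        if untried:
            arm = untried[0]                 # pull every arm once first
        elif rng.random() < epsilon:
            arm = rng.randrange(n_arms)      # explore: pick a random arm
        else:
            # exploit: pick the arm with the best average reward so far
            arm = max(range(n_arms), key=lambda a: totals[a] / counts[a])

        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
        total_reward += reward

    return total_reward, counts, totals


def summarize_history(counts, totals):
    """Condense the raw history into per-arm statistics (hypothetical format)."""
    lines = []
    for arm, (n, s) in enumerate(zip(counts, totals)):
        avg = s / n if n else 0.0
        lines.append(f"arm {arm}: pulled {n} times, average reward {avg:.2f}")
    return "\n".join(lines)


reward, counts, totals = run_bandit([0.3, 0.5, 0.7], horizon=200, epsilon=0.1)
print(summarize_history(counts, totals))
```

With epsilon set to 0 the policy never explores after its first pass over the arms, which is the failure mode a good explorer has to avoid when early rewards are misleading.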

📄 LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement

The paper presents a targeted and iterative data augmentation strategy to enhance the performance of large language models (LLMs) in low-data scenarios. The proposed LLM2LLM framework leverages a teacher LLM to augment a small seed dataset by generating synthetic data points aligned with challenging examples encountered during training. This iterative process involves fine-tuning a baseline student LLM, extracting incorrectly predicted data points, generating synthetic data with the teacher LLM, and iteratively refining the training dataset. Experimental results demonstrate significant performance improvements across various datasets, with enhancements up to 24.2% on GSM8K, 32.6% on CaseHOLD, 32.0% on SNIPS, 52.6% on TREC, and 39.8% on SST-2 over traditional fine-tuning approaches. LLM2LLM reduces dependency on labor-intensive data curation, making LLM solutions more scalable and effective in data-constrained domains.
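
The iterative loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the student and teacher are passed in as plain functions, the toy stand-ins below replace real LLM calls, and checking the student only against the original seed examples is an assumption of this sketch. All names (llm2llm_loop, toy_fine_tune, toy_teacher) are hypothetical.

```python
from collections import Counter
from typing import Callable, List, Tuple

Example = Tuple[str, str]       # (question, expected answer)
Student = Callable[[str], str]  # maps a question to a predicted answer


def llm2llm_loop(
    seed_data: List[Example],
    fine_tune: Callable[[List[Example]], Student],
    generate_similar: Callable[[Example], List[Example]],
    max_rounds: int = 3,
) -> List[Example]:
    """Grow a training set by augmenting the examples the student gets wrong."""
    train_data = list(seed_data)
    for _ in range(max_rounds):
        # 1. Fine-tune a fresh student on the current training set.
        student = fine_tune(train_data)
        # 2. Collect the seed examples the student still answers incorrectly.
        wrong = [(q, a) for q, a in seed_data if student(q) != a]
        if not wrong:
            break
        # 3. Ask the teacher for synthetic examples similar to each failure.
        for example in wrong:
            train_data.extend(generate_similar(example))
    return train_data


# Toy stand-ins so the sketch runs without any model (assumptions, not real LLMs).
def toy_fine_tune(data: List[Example]) -> Student:
    # Pretend the student only "learns" an answer it has seen at least twice.
    label_counts = Counter(answer for _, answer in data)
    known = {q: a for q, a in data if label_counts[a] >= 2}
    return lambda question: known.get(question, "unknown")


def toy_teacher(example: Example) -> List[Example]:
    # Pretend the teacher rewrites the failed question and keeps its answer.
    question, answer = example
    return [(f"{question} (rephrased)", answer)]


seed = [("2+2", "4"), ("3+1", "4"), ("5+5", "10")]
augmented = llm2llm_loop(seed, toy_fine_tune, toy_teacher)
print(augmented)
```

In this toy run only the "5+5" example is missed in the first round, so it is the only one that gets a synthetic companion. That mirrors the targeted nature of the approach: augmentation effort goes to the hard examples, not the whole dataset.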

📄 Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance

The paper introduces a novel methodology, Champ, for human image animation, employing a 3D parametric model within a latent diffusion framework to enhance shape alignment and motion guidance. Leveraging the Skinned Multi-Person Linear (SMPL) model for unified representation of body shape and pose, Champ captures intricate human geometry and motion from source videos. By incorporating depth images, normal maps, and semantic maps from SMPL sequences alongside skeleton-based motion guidance, the model enriches conditions to the diffusion model, enabling accurate shape alignment and pose guidance. A multi-layer motion fusion module with self-attention mechanisms integrates shape and motion latent representations, refining animation quality. Evaluation on benchmark datasets demonstrates Champ's superior ability in generating high-quality human animations with accurate pose and shape variations, showcasing its potential for digital content creation.

📄 VidLA: Video-Language Alignment at Scale

The paper proposes a novel approach addressing limitations in prior video-language alignment methods. By employing a simple two-tower architecture and hierarchical temporal attention, VidLA captures short and long-range dependencies in videos effectively. Leveraging large language models, it curates a dataset enriched with videos of varying durations, enhancing semantic alignment. Utilizing factorized space-time attention, VidLA integrates pretrained image-text models efficiently with temporal hierarchies. Empirical validation demonstrates superior performance on multiple retrieval benchmarks, particularly for longer videos, and competitive results on classification tasks. This work underscores the importance of high-quality, large-scale datasets, and scalable temporal architectures in advancing video-language alignment research.

Thank you for reading today’s edition.

Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.