Character.ai Surpasses ChatGPT in Mobile App Usage

AI Breakfast
September 13, 2023

Good morning. It’s Wednesday, September 13th.

Did you know: 20 years ago today, gaming client Steam officially launched.

In today’s email:

AI in Software Development & Management
AI in Social Media & Content Creation
AI in Workforce & Training
AI in Entertainment & Marketing
AI in Governance & Regulation
AI in Computer Vision & Research
5 New AI Tools
Latest AI Research Papers

You read. We listen. Let us know what you think of this edition by replying to this email, or DM us on Twitter.

Today’s trending AI news stories

AI in Software Development & Management

AI chatbots were tasked to run a tech company. They built software in under 7 minutes, for less than $1. AI chatbots powered by models like ChatGPT can run a software company efficiently and cost-effectively with minimal human intervention, according to a study by researchers from Brown University and Chinese universities. The AI bots were assigned roles and completed various software development tasks, demonstrating their ability to make logical decisions and troubleshoot bugs. On average, ChatDev, the hypothetical AI-powered company, completed software development in under seven minutes at a cost of less than one dollar. While the study identified some limitations, it highlights the potential for AI to assist in software development and other tasks across industries.

AI in Social Media & Content Creation

Instagram might be getting generative AI... panoramas? Instagram might be adding a generative AI panorama feature, as hinted by an update in the iOS app. This aligns with Instagram’s ongoing experimentation with AI-related features and could expand its offerings in AI-generated content. Instagram has not yet shared further details.

YouTube Announces AI-Powered Creative Guidance In Google Ads. This feature evaluates video ads against Google’s best practices and provides actionable suggestions for improvement. It focuses on elements like brand logo visibility, video duration, voiceover quality, and aspect ratio. AI-driven advertising solutions, including AI-powered video campaigns, have shown the potential to increase conversion rates and lower ad costs. However, it’s important to note that AI recommendations are based on historical data and may not align with every brand’s style and strategy.

Character.ai Surpasses ChatGPT in Mobile App Usage in the US. Character.ai, an AI app allowing users to create their own AI characters, is gaining traction in the US, with 4.2 million monthly active users, closely trailing ChatGPT’s 6 million users. Although its app launched only in May 2023, Character.ai has shown impressive growth and user retention, attributed to a mobile app focus that appeals to a younger demographic. While ChatGPT dominates the web, Character.ai’s success showcases its potential for further expansion. The startup’s Series A funding of $150 million suggests a promising future.

AI in Workforce & Training

These Prisoners Are Training AI. In Finnish prisons, inmates are becoming an unconventional workforce for AI training. Metroc, a Finnish startup, employs prisoners to label data, helping AI models understand construction-specific language. This initiative aims to provide inmates with cognitive stimulation and relevant skills for future employment. While some view it as a positive step, others raise concerns about the potential exploitation of vulnerable labor forces in the AI industry. Nonetheless, this creative approach addresses the scarcity of native Finnish-speaking data labelers and the need for language diversity in AI.

Salesforce embeds conversational AI across the platform with Einstein Copilot. Salesforce has unveiled Einstein Copilot, an AI tool that enables users to ask questions in natural language across its platform, aiming to enhance efficiency and productivity. This conversational AI assistant can provide information and perform tasks without the need for extensive user knowledge or clicks. Salesforce is addressing AI trust issues by linking Copilot to its Data Cloud and implementing a trust layer for security and privacy. While AI models like Copilot still have limitations, Salesforce aims to reduce issues like hallucinations. Copilot is currently in beta, and its general release and the Einstein Trust Layer are expected soon.

AI in Entertainment & Marketing

AI Robots Take Seats At NFL Game. AI robots took seats at SoFi Stadium during a Los Angeles Chargers game to promote Disney’s upcoming sci-fi film, “The Creator.” These robots were displayed on the stadium’s video screen, interacting with human fans. Disney has previously used the stadium for film promotions, leveraging its proximity to Hollywood. This event occurs amid labor tensions in the entertainment industry, where AI’s role and compensation for creative workers are debated. This marketing stunt follows similar initiatives, such as Paramount placing creepy actors at MLB stadiums to promote the horror film “Smile.”

Coke introduces a new AI-made mystery flavor called Y3000. This zero-sugar cola is part of Coca-Cola’s ongoing mystery flavor program, aiming to appeal to health-conscious consumers. While the exact extent of AI’s involvement in the flavor development process remains unspecified, Coca-Cola emphasizes its foray into AI as part of its strategy to stay innovative. The mystery flavor trend has resonated with younger consumers, although opinions on Y3000’s taste have been mixed.

AI in Governance & Regulation

Microsoft president and Nvidia chief scientist testify in Senate AI hearings as the US government grapples with AI regulation. Senator Richard Blumenthal called for a risk-based approach to AI regulation. Microsoft and Nvidia emphasized the importance of regulating high-risk AI while differentiating it from less capable systems. They also highlighted the need for enforcement mechanisms and addressed concerns about disinformation, data privacy, and age limits for AI usage during the hearings. Digital advocacy groups stressed that self-regulation by tech companies may not be sufficient.

AI in Computer Vision & Research

EfficientViT brings massive speedup to computer vision, benefiting applications like autonomous driving and medical AI. EfficientViT, based on the Vision Transformer, alters the attention mechanism to reduce computational complexity. Though it sacrifices some local information for efficiency, it compensates with additional components. In tests, EfficientViT processed high-resolution images up to nine times faster than less efficient models, enabling real-time segmentation for autonomous vehicles, medical imaging, VR, and edge applications. The code and model are available on GitHub.
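
For readers curious how changing the attention mechanism can cut computational cost, here is a minimal sketch of one well-known approach: linear attention with a ReLU feature map. The summary above does not say exactly which variant EfficientViT uses, so treat this as an illustration under that assumption, not as the authors' implementation. The function names are hypothetical, and the code is plain Python for readability rather than speed.

```python
def relu(vec):
    """Element-wise ReLU, used here as the attention feature map (an assumption)."""
    return [max(0.0, x) for x in vec]


def linear_attention(queries, keys, values, eps=1e-6):
    """Linear attention: summarize all keys/values once, then reuse per query.

    Cost grows linearly with the number of tokens instead of quadratically.
    """
    d_k, d_v = len(keys[0]), len(values[0])
    kv = [[0.0] * d_v for _ in range(d_k)]  # sum over tokens of relu(k)^T v
    k_sum = [0.0] * d_k                     # sum over tokens of relu(k)
    for k, v in zip(keys, values):
        fk = relu(k)
        for a in range(d_k):
            k_sum[a] += fk[a]
            for b in range(d_v):
                kv[a][b] += fk[a] * v[b]
    out = []
    for q in queries:
        fq = relu(q)
        denom = sum(fq[a] * k_sum[a] for a in range(d_k)) + eps
        out.append([sum(fq[a] * kv[a][b] for a in range(d_k)) / denom
                    for b in range(d_v)])
    return out


def quadratic_reference(queries, keys, values, eps=1e-6):
    """Same weights computed pair by pair (quadratic in tokens), for comparison."""
    d_v = len(values[0])
    out = []
    for q in queries:
        fq = relu(q)
        weights = [sum(x * y for x, y in zip(fq, relu(k))) for k in keys]
        denom = sum(weights) + eps
        out.append([sum(w * v[b] for w, v in zip(weights, values)) / denom
                    for b in range(d_v)])
    return out
```

Both functions return the same numbers up to rounding. The difference is that `linear_attention` never builds the full token-by-token weight table, which is where the savings on high-resolution images would come from.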

5 new AI-powered tools from around the web

Supademo 2.0 assists in product communication with its AI-powered interactive demos. It empowers users to effortlessly create demos, collaborate effectively, and add personalization through AI. Among new additions are Generative AI Text annotations and advanced analytics.

Pocket Hansei is an AI-powered personal assistant that facilitates learning by connecting users to trusted sources, including books, research papers, and articles. Users can ask questions and receive well-researched answers.

Maekersuite offers an AI-powered Video Script Editor that accelerates script creation for YouTube videos. Users can generate ready-to-record scripts for 1-15 minute videos in minutes, based on content briefs. The Content Explorer helps analyze top content in your niche, while the Ask-AI feature enables interactive script editing.

Somantic AI’s Maekersuite streamlines content creation, offering AI-powered solutions for various marketing needs. Generate SEO-optimized copy, images, and more with ease. Trusted by over 1,245 companies, Maekersuite simplifies content generation.

Knowlery AI offers a unique approach to knowledge-based question answering, focusing on helping users understand domain-specific concepts from lengthy documents. Features like contextual extraction with highlighted keywords and broader answers make it a valuable tool for enhancing comprehension.

arXiv is a free online library where scientists share their research papers before they are published. Here are the top AI papers for today.

📄 NExT-GPT: Any-to-Any Multimodal LLM

In the era of Multimodal Large Language Models (MM-LLMs), a critical limitation is their inability to produce content in multiple modalities, hindering human-level AI development. To address this gap, NExT-GPT is introduced as a versatile any-to-any MM-LLM system. It seamlessly handles input and output in text, images, videos, and audio by connecting established encoders, an LLM core, and multimodal diffusion decoders. NExT-GPT minimizes computational overhead and introduces modality-switching instruction tuning (MosIT) to empower cross-modal semantic understanding. This research opens doors to developing more human-like MM-LLMs capable of modeling universal modalities, marking a significant step towards AGI.

📄 InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation

The paper introduces InstaFlow, a novel text-to-image generation model based on Rectified Flow. Rectified Flow is a method for improving the sampling speed and computational efficiency of diffusion models, specifically for text-to-image generation. The core idea of Rectified Flow is to straighten the trajectories of probability flows, refine the coupling between noises and images, and facilitate the distillation process with student models. InstaFlow leverages this method to create an ultra-fast one-step text-to-image generator, achieving high image quality and surpassing previous state-of-the-art techniques in terms of Fréchet Inception Distance (FID). This approach reduces inference time and computational costs while maintaining image quality, making it a significant advancement in the field. The paper provides detailed insights into the methodology, including the reflow procedure and distillation process, and demonstrates its effectiveness through experiments on large-scale text-to-image generation datasets.
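
To see why straightening trajectories makes one-step generation plausible, here is a toy sketch with hand-written velocity fields in two dimensions. It is not the InstaFlow model: the real method learns its velocity field with a neural network and adds reflow and distillation, none of which appear here. The start point, target point, and function names are all made up for illustration.

```python
import math

TARGET = [0.0, 1.0]  # hypothetical "image" point the flow should reach


def euler_sample(x0, velocity, steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 using Euler steps."""
    x, h = list(x0), 1.0 / steps
    for i in range(steps):
        v = velocity(x, i * h)
        x = [xi + h * vi for xi, vi in zip(x, v)]
    return x


def curved_velocity(x, t):
    """Quarter-circle rotation carrying (1, 0) to (0, 1): a curved path."""
    return [-(math.pi / 2) * x[1], (math.pi / 2) * x[0]]


def straight_velocity(x, t):
    """Constant velocity along the straight line from (1, 0) to (0, 1)."""
    return [-1.0, 1.0]


def error(x):
    """Distance between a sample and the target point."""
    return math.dist(x, TARGET)


if __name__ == "__main__":
    start = [1.0, 0.0]  # hypothetical "noise" point
    for name, field in [("curved", curved_velocity), ("straight", straight_velocity)]:
        for steps in (1, 10, 1000):
            print(name, steps, error(euler_sample(start, field, steps)))
```

With the straight field, a single Euler step lands exactly on the target, while the curved field needs many small steps to get close. That gap is the intuition behind pairing straightened flows with one-step generation.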

📄 AstroLLaMA: Towards Specialized Foundation Models in Astronomy

In this study, the researchers present AstroLLaMA, a 7-billion-parameter language model fine-tuned from LLaMA-2 using over 300,000 astronomy abstracts from arXiv. AstroLLaMA achieves a 30% lower perplexity than LLaMA-2, showcasing its domain adaptation capabilities. Despite having relatively few parameters, AstroLLaMA generates more insightful and scientifically relevant text completions, serving as a robust, domain-specific model with broad fine-tuning potential. The researchers evaluated its performance against GPT-4 and LLaMA-2, demonstrating AstroLLaMA’s superior context awareness and ability to capture nuanced understanding in the astronomy domain. Additionally, AstroLLaMA’s text embeddings exhibit higher granularity, making it suitable for better document retrieval and semantic analysis in the field of astronomy.
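
For context on the perplexity figure, here is a minimal sketch of how perplexity is usually computed from per-token log-probabilities, and what a 30% relative reduction means. The helper names and the numbers are made up for illustration and do not come from the paper.

```python
import math


def perplexity(token_log_probs):
    """Perplexity is the exponential of the mean negative log-likelihood per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


def relative_reduction(base_ppl, new_ppl):
    """Fraction by which new_ppl is lower than base_ppl (0.3 means 30% lower)."""
    return (base_ppl - new_ppl) / base_ppl


if __name__ == "__main__":
    # Hypothetical models: one assigns each token probability 1/10, the other 1/7.
    base = perplexity([math.log(1 / 10)] * 5)
    tuned = perplexity([math.log(1 / 7)] * 5)
    print(base, tuned, relative_reduction(base, tuned))
```

Lower perplexity means the model is, on average, less surprised by the text it is tested on, which is why it serves as a rough signal of domain adaptation.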

📄 Uncovering Mesa-Optimization Algorithms In Transformers

In this paper, the authors propose a hypothesis suggesting that Transformers’ superior performance in deep learning may be attributed to a built-in architectural bias towards “mesa-optimization.” Mesa-optimization involves a learned process within the forward pass of the model, consisting of two steps: (i) constructing an internal learning objective and (ii) finding its corresponding solution through optimization. The authors demonstrate this hypothesis by reverse-engineering autoregressive Transformers trained on sequence modeling tasks, uncovering gradient-based mesa-optimization algorithms guiding prediction generation.

Furthermore, they introduce a novel self-attention layer called the mesa-layer, which can improve performance and in-context learning capabilities. Overall, this paper explores the hidden mesa-optimization aspect in trained Transformers’ weights and its potential significance.

📄 Learning Disentangled Avatars with Hybrid 3D Representations

The DELTA project, by researchers from the Max Planck Institute for Intelligent Systems and ETH Zürich, introduces a novel approach to creating animatable, photorealistic human avatars. Unlike traditional methods, DELTA leverages hybrid explicit-implicit 3D representations to disentangle different aspects of the avatar, such as face and hair or body and clothing. This disentanglement allows for more accurate modeling, animation, and transfer of various avatar components. DELTA’s innovative mesh-integrated volumetric rendering enables learning directly from monocular videos without 3D supervision. This research showcases the potential of hybrid 3D representations in human avatar modeling, offering promising results in disentangled reconstruction, clothing try-on, and hairstyle transfer. The code is available as an open-source resource for further exploration.

Thank you for reading today’s edition.

Your feedback is valuable.

Respond to this email and tell us how you think we could add more value to this newsletter.