Stability AI Launches AI Music Generator

Plus, Adobe’s Firefly generative AI tools are now commercially available

AI Breakfast
September 15, 2023

Good morning. It’s Friday, September 15th.

Did you know: There are job boards that host open positions exclusively in the AI field?

In today’s email:

AI Policy & Regulation
AI in Entertainment & Media
AI in Defense & Military
AI in Business & Market
AI in Healthcare
5 New AI Tools
Latest AI Research Papers

You read. We listen. Let us know what you think of this edition by replying to this email, or DM us on Twitter.

Today’s edition is brought to you by:

Things just got insanely easy for content creators

Invideo AI lets you create publish-ready videos with just text prompts. Just type in your idea, and it will generate a complete video with script, stock footage, voiceover & captions!

Unlike other AI video tools that only make short clips, Invideo AI gives you a fully packaged video. Plus, you can make quick changes by prompting the AI, just as you’d ask a human editor.

Test it out now for free!

Today’s trending AI news stories

AI Policy & Regulation

Tech executives warned of AI dangers and “superintelligence” in a closed-door AI forum with lawmakers, advocating for broad AI regulatory frameworks and warning of potential existential AI risks. Zuckerberg stressed safety and America’s leadership in AI, while Musk expressed concerns about “civilizational risk” and proposed the creation of a federal department of AI. Attendees differed on how AI should be regulated. Separately, Senators Richard Blumenthal and Josh Hawley recently proposed a framework requiring AI companies to apply for licenses managed by an independent oversight authority.

California lawmakers are taking action to protect actors and artists from being replaced by AI in the entertainment industry. Assemblymember Ash Kalra is introducing a bill, Assembly Bill 459, which aims to nullify provisions in contracts that allow studios to use AI to clone actors’ voices, faces, and bodies. The legislation would give actors and artists the ability to escape such contracts if they weren’t represented by a labor union or lawyer. This move comes as Hollywood faces ongoing strikes and concerns about the impact of AI on entertainment jobs.

The European Union plans to grant AI startups access to its high-performance computing (HPC) supercomputers to train AI models. This move aims to support responsible AI development. While details are limited, it signifies the EU’s commitment to fostering AI innovation and providing resources for AI research and development.

AI in Entertainment & Media

Stability AI has launched an AI-powered music generator that can produce songs and sound effects, expanding its offerings beyond image generation. The London-based startup previously introduced the open-source image-generating AI model Stable Diffusion. This move highlights the company’s ambition to broaden its AI capabilities into music generation.

The iPhone 15 opts for intuitive AI, not generative AI. While competitors focus on generative AI for chatbots and image generation, Apple showcased subtler AI features. These include voice isolation, camera improvements, and automated predictive text recommendations. Apple aims to make AI integral to everyday tasks like photography and phone calls, enhancing user experiences without overwhelming them with AI-driven features. While generative AI gains traction, Apple’s focus on user-friendly, intuitive AI sets it apart.

Adobe’s Firefly generative AI tools are now commercially available across Adobe Creative Cloud, Adobe Express, and Adobe Experience Cloud. This expansion brings generative AI capabilities, such as Illustrator’s vector recoloring, Express text-to-image effects, and Photoshop’s Generative Fill tools, to a wider user base. Adobe is also launching a standalone Firefly web app and introducing a credit-based system for accessing Firefly-powered workflows, along with a bonus scheme for Adobe Stock contributors.

AI in Defense & Military

The German military is plowing millions into an AI 'environment' for weapons tests that could change combat forever. Called GhostPlay, this “military metaverse” allows developers to test various weapons and systems within a risk-free environment. Funded by the German Defense Ministry, the platform aims to create unpredictable conditions to improve military planning and preparation. It uses “third-wave” AI algorithms for more human-like decision-making and recreates detailed environments for realistic simulations. The platform is exploring the optimization of swarm tactics, particularly for loitering munitions.

AI in Business & Market

Nvidia's new cloud business competes with AWS. This service, hosted by major cloud providers such as Microsoft, Google, and Oracle, offers companies infrastructure for training AI models without being tied to a specific cloud provider. Nvidia pays for the servers and rents them to AI developers. This move allows Nvidia to gain direct customer access, potentially impacting larger providers like AWS. The goal is to demonstrate optimal GPU server configurations, strengthening Nvidia’s position in the AI hardware market. Notable customers include Adobe, Getty Images, and ServiceNow.

EY has unveiled the fruits of its $1.4 billion artificial-intelligence investment. The firm has developed its own large language model, EY.ai EYQ, and plans to train its 400,000-employee workforce on AI. This investment follows similar announcements from other consulting firms like KPMG, Accenture, PwC, and Deloitte. EY’s AI platform includes new and existing products embedded with AI and provides guidelines for deploying AI at scale, aiming to help companies navigate the complex world of AI implementation. The firm also intends to address privacy and data concerns when using LLMs.

AI in Healthcare

An AI model called RETFound detects eye disease and the risk of Parkinson’s from retinal images. What makes RETFound unique is its use of self-supervised learning, similar to models like ChatGPT. Instead of relying on images manually labeled as “normal” or “not normal,” the model learned to predict missing portions of images from a multitude of retinal photos. This approach could be a significant breakthrough in medical imaging, reducing the need for labor-intensive labeling of data.

5 new AI-powered tools from around the web

ChartGen AI by Einblick AI offers instant chart creation with just one sentence. Upload your dataset, provide a prompt, and witness your data transformed into beautifully formatted charts from scatter plots to histograms.

Trickle employs AI to convert screenshots into searchable assets, aligning with a vision to simplify information handling and enhance memory recall. Their roadmap includes a forthcoming Mac client and iPhone app, featuring auto-syncing and on-device storage options for secure, efficient screenshot management and utilization.

Soilroad offers AI-powered sales role-play training, like a flight simulator for sales calls. Sales professionals use it to refine their pitches, receive customized feedback, and elevate their sales skills.

Melon is a personal AI learning companion, allowing you to create a ‘digital twin’ of your brain by feeding it knowledge from online sources.

timeOS AI assists in time management by integrating time-aware AI directly into daily workflows. This innovative tool transforms calendars into proactive, conversational assistants, streamlining tasks such as meeting note-taking, follow-up email drafting, and contextual meeting preparation. It can even attend less critical meetings on your behalf.

arXiv is a free online library where scientists share their research papers before they are published. Here are the top AI papers for today.

📄 Efficient Memory Management for Large Language Model Serving with PagedAttention

The paper discusses the challenges of high throughput serving for large language models (LLMs) caused by inefficient memory management, particularly the key-value cache (KV cache). To address this, the authors propose PagedAttention, inspired by virtual memory and paging techniques, and developed vLLM, an LLM serving system. vLLM minimizes KV cache memory waste and supports popular LLMs, significantly improving throughput compared to existing systems, especially with longer sequences and larger models. The research highlights the importance of efficient memory management in LLM serving systems, tackling issues like KV cache size, complex decoding algorithms, and dynamic input/output lengths.
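To make the paging idea concrete, here is a minimal sketch of a block-based KV cache allocator. It is a toy illustration, not vLLM's actual code: the class name `PagedKVCache`, its methods, and the block sizes are all hypothetical. It only shows how a per-sequence block table can map logical token positions to fixed-size physical blocks, so that wasted memory is limited to the unfilled tail of each sequence's last block.

```python
from typing import Dict, List, Tuple


class PagedKVCache:
    """Toy allocator (hypothetical): logical token slots -> fixed-size physical blocks."""

    def __init__(self, num_blocks: int, block_size: int) -> None:
        self.block_size = block_size
        self.free_blocks: List[int] = list(range(num_blocks))  # pool of physical block ids
        self.block_tables: Dict[str, List[int]] = {}  # seq_id -> physical block ids, in order
        self.seq_lens: Dict[str, int] = {}  # seq_id -> number of tokens stored

    def append_token(self, seq_id: str) -> Tuple[int, int]:
        """Reserve a slot for one new token and return (physical_block, offset)."""
        table = self.block_tables.setdefault(seq_id, [])
        length = self.seq_lens.get(seq_id, 0)
        # A new physical block is needed only when the current one is full.
        if length % self.block_size == 0:
            if not self.free_blocks:
                raise MemoryError("no free KV blocks")
            table.append(self.free_blocks.pop())
        self.seq_lens[seq_id] = length + 1
        return table[length // self.block_size], length % self.block_size

    def free(self, seq_id: str) -> None:
        """Return all blocks of a finished sequence to the pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)

    def wasted_slots(self) -> int:
        """Reserved-but-unused token slots across all live sequences."""
        reserved = sum(len(t) for t in self.block_tables.values()) * self.block_size
        return reserved - sum(self.seq_lens.values())
```

With a block size of 4, a sequence of 5 tokens occupies two blocks and leaves 3 slots unused, rather than reserving space for a maximum-length output up front.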

📄 Tree-Structured Shading Decomposition

The paper introduces a novel approach for inferring a tree-structured representation, referred to as a “shade tree,” from a single image in order to model object shading. Existing methods use either parametric or measured representations for shading, which lack interpretability and editability. The proposed shade tree representation combines basic shading nodes and compositing methods, enabling intuitive and efficient editing of object shading. The challenge lies in inferring both the discrete tree structure and the continuous node parameters. The solution involves a hybrid approach, combining auto-regressive inference and optimization. The experiments demonstrate the effectiveness of these methods, showing potential applications in material editing, vectorized shading, and relighting.

📄 DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models

DreamStyler introduces a novel framework for artistic image synthesis, excelling in text-to-image synthesis and style transfer. It overcomes the limitations of text-only descriptions by incorporating context-aware text prompts and multi-stage textual embedding. The model achieves superior performance in generating high-quality artistic images across various scenarios, promising potential in art creation. For style-guided text-to-image synthesis, DreamStyler balances text and style scores effectively, outperforming existing methods. In style transfer tasks, it achieves state-of-the-art results in terms of text and image scores, as well as user preference. DreamStyler’s flexibility also allows users to stylize their objects in their chosen styles, making it a versatile tool for artistic expression.

📄 MagiCapture: High-Resolution Multi-Concept Portrait Customization

MagiCapture is a novel method for generating high-resolution portrait images by integrating subject and style concepts. It addresses the challenge of personalization in text-to-image models and aims to create realistic images using limited subject and style references. The method includes two-phase optimization, masked reconstruction, composed prompt learning, and an Attention Refocusing loss. By optimizing text embeddings and model parameters, MagiCapture balances reconstruction loss against the disentanglement of identity and style information, while the composed prompt learning allows for the generation of multi-concept images. The Attention Refocusing loss prevents information leakage and enhances image quality. Post-processing steps further enhance image fidelity. MagiCapture outperforms existing methods in terms of identity similarity, style, and aesthetic quality.

📄 Statistical Rejection Sampling Improves Preference Optimization

The paper introduces a novel approach called Statistical Rejection Sampling Optimization (RSO) to improve the alignment of language models with human preferences. It addresses limitations in existing methods like Direct Preference Optimization (DPO) and Sequence Likelihood Calibration (SLiC) by sourcing preference data from the target optimal policy using rejection sampling. RSO also proposes a unified framework for enhancing the loss functions used in SLiC and DPO from a preference modeling standpoint. Extensive experiments across three tasks show that RSO consistently outperforms SLiC and DPO in evaluations from both LLMs and human raters. This approach offers a promising alternative to Reinforcement Learning from Human Feedback (RLHF) for model alignment.
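For readers unfamiliar with rejection sampling, here is a minimal sketch of the general technique. It is not the paper's exact algorithm. It assumes candidate responses were drawn from some proposal model, that each has a scalar reward, and that the target distribution is proportional to the proposal times exp(reward / beta). The function name `rejection_sample` and the `beta` parameter are illustrative assumptions.

```python
import math
import random
from typing import List, Sequence


def rejection_sample(
    candidates: Sequence[str],
    rewards: Sequence[float],
    beta: float,
    rng: random.Random,
) -> List[str]:
    """Keep candidate i with probability exp((r_i - r_max) / beta).

    Assumption: candidates come from a proposal distribution, and the
    target is proportional to proposal * exp(reward / beta). Using the
    batch maximum reward as the envelope keeps every acceptance
    probability in (0, 1].
    """
    r_max = max(rewards)
    accepted = []
    for cand, r in zip(candidates, rewards):
        accept_prob = math.exp((r - r_max) / beta)
        # rng.random() is in [0, 1), so the top-reward candidate is always kept.
        if rng.random() < accept_prob:
            accepted.append(cand)
    return accepted
```

A small `beta` keeps almost only the highest-reward candidates, while a large `beta` accepts nearly everything and stays close to the proposal distribution.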

Thank you for reading today’s edition.

Your feedback is valuable.

Respond to this email and tell us how you think we could add more value to this newsletter.