The Newest Text-to-Audio Tools

Good morning. It’s Monday, April 24th.

The music industry is grappling with how to handle the AI audio revolution. Some artists hate it, and some are embracing it.

In today’s email:

  • Musicians Embrace AI-Generated Music

  • 3D Rendering for Image Generation with Plask

  • Top 5 AI Research Papers

  • Trending AI Tools

You read. We listen. Share your feedback by replying to this email, or DM us on Twitter.

The Big Business of AI Audio

AI “voice cloning” has been at the center of creative controversy in the past month. At the forefront of this revolution is ElevenLabs, a software company founded by Piotr Dabkowski and Mati Staniszewski.

ElevenLabs specializes in AI-assisted text-to-speech software. Its browser-based platform lets users generate natural-sounding speech from text and upload custom voice samples to “clone” a user’s voice.

The technology's ability to closely copy real voices has led critics to liken it to deepfaking, raising concerns about its potential for abuse. The ethical challenges of AI voice cloning were further highlighted when a song featuring AI-generated vocals purporting to be Drake and the Weeknd went viral last week.

The song, titled "Heart on My Sleeve," was posted on TikTok by user Ghostwriter977 and quickly gained traction on streaming services.

Universal Music Group condemned the song as "infringing content created with generative AI" and had it removed from streaming platforms after it gained millions of views in just a few days.

However, other musical artists like Grimes (former partner of Elon Musk) have embraced AI-generated content. Grimes took to Twitter to offer a 50% royalty split with anyone who creates a successful AI-generated song using her voice.

Grimes expressed her enthusiasm for being "fused with a machine" and her support for open-sourcing art and "killing copyright."

Artists like Grimes who embrace AI-generated versions of themselves will likely win in the long run. Though they are essentially distributing creative control of their likeness to the masses, artists can now turn their fan base into collaborators with a win-win open licensing model.

Generating AI audio is getting easier every day. Here’s a look at the latest AI audio tool from Serp AI:

The Newest Text-to-Audio Tool: Bark

Bark, a cutting-edge text-to-audio model, builds on GPT-style models to generate highly realistic, multilingual speech, music, background noise, and simple sound effects.

What makes Bark unique is its ability to produce nonverbal communications such as laughter, sighs, and crying.

The highly expressive and emotive voices generated by Bark capture intricate nuances like tone, pitch, and rhythm, providing a fantastic listening experience akin to that of human speech.

Bark's multilingual capabilities enable speech generation in Mandarin, French, Italian, Spanish, and other languages with remarkable clarity and accuracy.

Innovations like Bark will be used to create high-quality voice content, including podcasts, audiobooks, video game sounds, and more.

How does Bark work?

GPT-Style Audio Generation: Bark utilizes GPT-style models to generate audio from scratch, embedding the initial text prompt into high-level semantic tokens without using phonemes. This allows for generalization to arbitrary instructions beyond speech found in the training data, such as music lyrics and sound effects.

Multilingual Support: Bark automatically detects and generates speech in various languages, attempting to employ the native accent for each language. While English quality is currently the best, the quality of other languages is expected to improve with scaling.

Music Generation: Bark can generate all types of audio, including music. Users can place music notes (♪) around their lyrics to prompt Bark to sing the text rather than speak it.

Voice and Audio Cloning: Bark can fully clone voices, including tone, pitch, emotion, and prosody. It also preserves music and ambient noise from input audio. However, to mitigate misuse, audio history prompts are limited to a set of Suno-provided, fully synthetic options for each language.

Speaker Prompts: Users can provide specific speaker prompts such as NARRATOR, MAN, WOMAN, etc., although these prompts may not always be respected if conflicting audio history prompts are given.
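
For readers who want to experiment, here is a rough sketch of how the prompting ideas above could be put together in Python. The helper names (as_lyrics, with_speaker, with_sound, synthesize) are hypothetical and made up for illustration, and the exact cue formats (music notes around lyrics, "WOMAN:" style labels, bracketed sounds like [laughs]) are assumptions about how Bark reads prompts, so check the project's README before relying on them. The synthesize function assumes the bark package is installed and downloads model weights on first use.

```python
# Hypothetical helpers for composing Bark-style text prompts.
# The cue formats below are assumptions, not a verified specification.

def as_lyrics(text: str) -> str:
    """Wrap text in music notes to hint that it should be sung."""
    return f"♪ {text.strip()} ♪"


def with_speaker(speaker: str, text: str) -> str:
    """Prefix text with an upper-case speaker prompt such as NARRATOR or WOMAN."""
    return f"{speaker.strip().upper()}: {text.strip()}"


def with_sound(text: str, sound: str) -> str:
    """Append a bracketed nonverbal cue such as [laughs] or [sighs]."""
    return f"{text.strip()} [{sound.strip().lower()}]"


def synthesize(prompt: str):
    """Generate audio for a prompt with Bark (requires the bark package).

    Returns a (sample_rate, audio_array) pair. The import is deferred so the
    prompt helpers above work even when Bark is not installed.
    """
    from bark import SAMPLE_RATE, generate_audio, preload_models

    preload_models()  # downloads and caches the model weights
    return SAMPLE_RATE, generate_audio(prompt)


if __name__ == "__main__":
    # Only builds and prints prompts; call synthesize(...) to produce audio.
    print(with_speaker("narrator", with_sound("That was a close one", "laughs")))
    print(as_lyrics("Good morning, it's Monday"))
```

The prompt helpers are kept separate from the audio call so the text cues can be tried and adjusted without loading the model each time.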

Text-to-audio tools like Bark open up a new world of AI audio generation. We may be in the last few months of AI-generated audio being distinguishable from authentic recordings.

sponsored post

Plask AI 3D Modeling

Plask AI is a powerful tool that allows users to create stunning images using 3D models, camera angles, and AI-powered pose recognition.

How Plask AI works:

Browse a Community: The Plask community is filled with posts that showcase beautiful images and 3D model parameters created by other users. If users find it challenging to extract and modify poses or set up camera angles, they can quickly find suitable references in the community by searching with keywords or browsing through available categories.

Web-based Workspace: When users come across a post in the community that catches their attention, they can click the "Bring to My Workspace" button. This action applies the associated pose, camera angle, and parameters to the user's workspace, allowing them to view and work with the selected content according to their preferences.

Render with Your Style: Users can create their own unique images by modifying the prompt and clicking the Render button. Once the images are generated, users can choose their favorite, save it, and share it with the community.

Decoding AI: A Non-Technical Explanation of Artificial Intelligence is now available as an ebook!

Chapter List

Introduction to Artificial Intelligence: What is AI, and Why Does it Matter
Defining Artificial Intelligence
The History of AI
Key Organizations in AI
Fundamentals of AI: Algorithms, Data, and Machine Learning
Deep Learning and Neural Networks
Reinforcement Learning
Types of AI: From Rule-Based Systems to Neural Networks
How AI Learns: The Training Process for Machine Learning Algorithms
Applications of AI: Real-World Examples of AI in Action
The Impact of AI: Pros and Cons of Artificial Intelligence
The Ethical and Societal Implications of AI
AI Regulation and Policy
AI in the Future: The Domino Effect
Narrow AI vs. Artificial General Intelligence (AGI)
How to Get Started with AI: Tips and Resources for Exploring Artificial Intelligence

Top 5 AI Research Papers This Week

Note: arXiv is a free online library where scientists share their research papers before they are published. These are the 5 most viewed papers related to AI in the last week.

  • Learning to Program with Natural Language
    Researchers have developed a method that allows LLMs to use natural language as a programming language to solve complex tasks, such as math problems, by generating and learning programs that are understandable to both humans and the models.

  • Learning to Compress Prompts with Gist Tokens
    Researchers have developed a method called "gisting" that allows language models to compress long prompts into shorter "gist" tokens, making the models more efficient and faster while maintaining the quality of their output.

  • Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes
    Researchers have developed a system called EVAPORATE that uses LLMs to automatically convert semi-structured documents into queryable tables. They also found a way to improve its accuracy while reducing the amount of data the model needs to process, making it more efficient and effective than current methods.

  • ChemCrow: Augmenting large-language models with chemistry tools
    ChemCrow enhances LLMs with expert-designed chemistry tools, enabling the models to tackle tasks in organic synthesis, drug discovery, and materials design. It has the potential to aid chemists, make chemistry more accessible to non-experts, and advance scientific research.

  • NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers
    Researchers developed a text-to-speech system called NaturalSpeech 2 that uses a latent diffusion model trained on a large dataset to generate diverse, high-quality speech, including singing. It outperforms previous systems, especially in zero-shot settings, where it generates speech for speakers it has not seen before.


Trending AI Tools

  • WonderChat
    Easily create and customize an AI chatbot by providing a link or PDF file from your knowledge base, allowing it to learn from your website's content. Have it ready in just 5 minutes, reflecting your brand identity through role and profile photo customization. Embed the chatbot on your site with a simple line of code, and monitor its performance through chatlogs, customer feedback, and analytics. Train your chatbot by providing model answers to improve its responses, ensuring a seamless customer experience.

  • TalkFace
    A 1-on-1 AI tutor for personalized language learning. Unlike traditional tutors, TalkFace offers a unique curriculum tailored to your needs.

  • ReviewHero
    ReviewHero is a tool that uses ChatGPT to summarize Amazon product reviews, helping both sellers and customers quickly understand the overall sentiment. It displays a review summary at the top of the Product Detail Page and supports various Amazon-supported languages, depending on the country.

  • Her
    Just like the movie. The virtual romantic partner app, available on the App Store.

3x the information, for less than $2/week

Stay informed, stay ahead: Your premium AI resource.

AI Breakfast Business Premium: a comprehensive analysis of the latest AI news and developments for business leaders and investors.

Email schedule:

Monday: All subscribers
Wednesday: Business Premium
Friday: Business Premium

Business Premium members also receive:

- Discounts on industry conferences like Ai4
- Discounts on AI tools for business (like Jasper)
- Quarterly AI State of the Industry report
- Free digital download of our book Decoding AI: A Non-Technical Explanation of Artificial Intelligence, available since April 18th

Thank you for reading today’s edition.

Your feedback is valuable.
Respond to this email and tell us how you think we could add more value to this newsletter.
