
AI's $1 Trillion Year and Multilingual Voice Cloning

Good morning. It’s Monday, May 1st.

Open-source language models continue to be deployed as the tech giants announce they’re all in on AI. The race between open and closed AI systems could define the next two years.

In today’s email:

  • 2023 Could be a $1T Year for AI

  • Cohesive AI

  • Multilingual Voice Cloning

  • Trending AI Tools

  • Top 5 Research Papers

You read. We listen. Share your feedback by replying to this email, or DM us on Twitter.


2023 Could be a $1T Year for AI

Leading tech giants Alphabet, Microsoft, and Meta have reported outstanding revenue growth in their recent earnings calls.

What's even more notable is the emphasis they all placed on advancing the frontiers of AI, which has led to hundreds of billions of dollars in market cap growth over the past six months.

Google, Microsoft, NVIDIA, and Meta have each had an explosive Q1

During their earnings calls, Alphabet mentioned AI an astonishing 50 times, while Meta and Microsoft were not far behind with 49 and 46 mentions, respectively.

The most prominent players in the industry have a significant advantage in driving innovation, given the high costs of compute and scale necessary to develop advanced language models. These tech giants have been utilizing AI to enhance their existing products and services, while also creating new ones.

Microsoft's massive bet of investing billions of dollars in OpenAI has paid off handsomely, with the company reporting revenue of $52.9 billion, a 7% increase.

Google has faced challenges keeping up with AI innovation, despite being the field's research leader for the past decade. The company will have to double down on its efforts to stay ahead in the AI race and refine its chatbot Bard to match or exceed the quality of ChatGPT. Investors still seem optimistic, as Google's reach remains unmatched, with more than 4 billion users.

Meta, on the other hand, is in a promising position, having recently released better-than-expected earnings that propelled the company's stock higher.

Focusing on efficiency, the company has put itself in a strong position to keep innovating in AI and to tie that work into its big bet on the Metaverse.

Check out this latest demo of AI-generated Meta Avatars that went viral over the weekend:

The recent earnings calls of Alphabet, Microsoft, and Meta have demonstrated that AI is at the forefront of their plans for future growth and success. These tech giants' continued investment in AI technology underscores the industry's potential to shape our lives and transform businesses in unprecedented ways.

Sponsored post

Cohesive AI: Custom AI Content Generation

This powerful AI editor enables you to create magical content at the speed of thought, making lightning-fast long-form content creation a reality in 10+ languages. You can also enrich your content with unique and stunning visuals. Cohesive AI offers powerful editing with ease, anytime, anywhere.

Powerful & Intuitive AI editor

Explore endless possibilities with Ask Cohesive AI, an intuitive prompt box that goes beyond summarizing, expanding, and rephrasing to help you create content 10x faster.

Cohesive AI supports long-form content and is the fastest way to create content of up to 1,000 words from just a few input words. Experience infinite wordplay at your fingertips.

Generate unique & stunning visuals with Cohesive AI

Craft captivating visuals and designs that seamlessly align with your content, making a lasting impression on your audience. Elevate your content using our innovative image-generation feature.

Multiple Language Support

From Japanese to Spanish, Cohesive AI supports 10+ languages. Set your default language, change it anytime, and translate content effortlessly.

Cohesive Mobile

Get the same powerful editing tools and user-friendly interface optimized for mobile devices. You can easily create on-the-go with the mobile-friendly experience of Cohesive Mobile.

Thank you for supporting our sponsors

Multilingual Voice Cloning: You Can Now Speak 7 More Languages

ElevenLabs has released a new speech synthesis model, Eleven Multilingual v1, which supports seven new languages: French, German, Hindi, Italian, Polish, Portuguese, and Spanish.

The model improves on its predecessor, adjusting delivery based on context and conveying intent and emotion hyper-realistically. This makes it well suited for content creators, game developers, publishers, and educational institutions looking to create immersive, localized experiences, produce audio content in multiple languages, and support people with visual impairments or learning difficulties.

Based on in-house research, the model can identify multilingual text and articulate it appropriately, preserving each speaker's unique voice characteristics. It is also compatible with other VoiceLab features, such as Instant Voice Cloning and Voice Design, and available on all subscription plans.

Despite its advanced technology, Eleven Multilingual v1 has some limitations, such as defaulting to English for numbers, acronyms, and foreign words when prompted in a different language. However, the company's main vision is to democratize voice and make human-quality AI voices available in every language to foster greater creativity, innovation, and diversity.

Multilingual Voice Cloning Opens Content to a Global Audience

Multilingual voice cloning technology has the potential to revolutionize the media landscape by making movies, videos, podcasts, and audiobooks accessible to a wider audience across the world.

With this technology, creators can produce content in different languages, spoken in their natural voice, enabling them to reach more people and bridge cultural divides. We can expect to see this develop into a more diverse and inclusive media landscape, with creators using it to break down language barriers and connect with audiences in new ways.

Trending AI Tools

NVIDIA releases an open-source 2B-parameter LLM that could potentially run on a mobile device in the near future.

Chat YouTube is a new tool that allows users to engage with any YouTube video by summarizing it, asking questions about it, and more.

FlowiseAI is an open-source visual UI tool for building customized LLM flows with LangchainJS, written in Node.js TypeScript/JavaScript.

Aragon creates AI-generated headshots for you.

MLC LLM enables language models to be deployed natively on diverse hardware backends and in native applications, optimized for specific use cases with no server support required, and accelerated by local GPUs on laptops and phones.

Top 5 AI Research Papers This Week

Note: arXiv is a free online library where scientists share their research papers before they are published. These are the 5 most viewed papers related to AI in the last week.

  • Speed Is All You Need: On-Device Acceleration of Large Diffusion Models via GPU-Aware Optimizations
    This paper discusses the challenges of deploying large AI models on mobile devices due to limited computational and memory resources, and presents a series of optimizations that make it possible to deploy them on these devices with benefits such as improved user privacy, offline functionality, and lower server costs.

  • We're Afraid Language Models Aren't Modeling Ambiguity
    This paper presents a benchmark called AmbiEnt, which contains annotated examples to evaluate the ability of pre-trained language models to recognize and disentangle different meanings. The study found that this is a challenging task, and demonstrates how ambiguity-sensitive tools can be used to detect misleading political claims.

  • Emergent autonomous scientific research capabilities of large language models
    This paper discusses how large language models that use transformers are advancing in machine learning research and can be applied in natural language, biology, chemistry, and computer programming, and presents an Intelligent Agent system that combines multiple large language models for autonomous scientific experiment design, planning, and execution with successful examples showcased, but with a discussion on safety implications and measures to prevent their misuse.

  • TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings
    This paper discusses the development of the TPU v4, Google's third supercomputer for machine learning models, which uses optical circuit switches to improve scalability, availability, utilization, modularity, deployment, security, power, and performance, and includes SparseCores to accelerate models using embeddings, resulting in significant improvements in speed and energy efficiency compared to other similar-sized systems.

  • Learning to Program with Natural Language
    This paper discusses the use of natural language as a new programming language for Large Language Models to better complete complex tasks and proposes the Learning to Program (LP) method to enable LLMs to learn natural language programs from a training dataset of complex tasks, which they can then use to guide their inferences, demonstrating the approach's effectiveness on the AMPS and Math datasets.

3x the information, for less than $2/week

Stay informed, stay ahead: Your premium AI resource.

AI Breakfast Business Premium: a comprehensive analysis of the latest AI news and developments for business leaders and investors.

Email schedule:

Monday: All subscribers
Wednesday: Business Premium
Friday: Business Premium

Business Premium members also receive:

-Discounts on industry conferences like Ai4
-Discounts on AI tools for business (Like Jasper)
-Quarterly AI State of the Industry report
-Free digital download of our upcoming book Decoding AI: A Non-technical Explanation of Artificial Intelligence

Thank you for reading today’s edition.

Your feedback is valuable.
Respond to this email and tell us how you think we could add more value to this newsletter.
