Nvidia's Music Generator Creates 'Never before heard sounds'
Good morning. It’s Wednesday, November 27th.
Did you know: We’re skipping the history facts today to recommend Marc Andreessen’s interview on The Joe Rogan Experience yesterday, a must-watch for technologists.
In today’s email:
Neuralink Trial
Runway’s Custom Worlds
NVIDIA’s Fugatto Music Generator
Text to 3D Assets
Perplexity’s Hardware Play
OpenAI Sora Leak
3 New AI Tools
Latest AI Research Papers
You read. We listen. Let us know what you think by replying to this email.
Today’s trending AI news stories
Musk's Neuralink to launch feasibility trial with brain implant, robotic arm
Neuralink, Elon Musk’s neurotech venture, is launching a feasibility trial that pairs its brain-computer interface with an assistive robotic arm. Building on its PRIME study, the trial focuses on patients with quadriplegia, who control devices through neural signals. Early U.S. participants have already demonstrated thought-driven gaming, web navigation, and 3D design, offering a glimpse of the technology's potential.
Across the border, Health Canada has greenlit Neuralink’s first international trial, recruiting six participants to test the implant’s safety and real-world utility. By bridging neural activity with external systems, the company aims to establish a blueprint for brain-machine interoperability. As Neuralink refines its tech stack and expands its trials, the implications could ripple far beyond assistive tech, positioning it at the nexus of neuroscience and engineering innovation. Read more.
Runway launches Frames — a new AI image generator that creates custom worlds
Runway’s latest foundation model, Frames, redefines image generation with precise stylistic control and heightened visual fidelity. By resolving the persistent challenge of maintaining consistency across creative outputs, it enables users to design immersive, cohesive visual worlds with remarkable accuracy.
Introducing Frames: An image generation model offering unprecedented stylistic control.
Frames is our newest foundation model for image generation, marking a big step forward in stylistic control and visual fidelity. With Frames, you can begin to architect worlds that represent… x.com/i/web/status/1…
— Runway (@runwayml)
2:01 PM • Nov 25, 2024
Available through Gen-3 Alpha and the Runway API, Frames demonstrates its capability in various applications, from retro album art to highly stylised compositions. Its combination of realism and aesthetic detail provides creative professionals with an advanced toolset for creating visually cohesive and engaging images. Read more.
Nvidia's new music generation model Fugatto creates 'never before heard sounds'
Nvidia's Fugatto, the Foundational Generative Audio Transformer Opus 1, pushes the boundaries of audio synthesis by blending and reinterpreting sound in ways previously unimagined. It doesn’t just generate music; it morphs existing audio—turning a piano melody into a human voice or transforming a recording's mood and accent. Fugatto’s ability to fuse distinct sounds, like a train’s rumble with orchestral music, produces truly original soundscapes.
🎵 ✨The world’s most flexible sound machine?
With text and audio inputs, this new #generativeAI model, named Fugatto, can create any combination of music, voices, and sounds.🎹
Read more in our blog by @richardkerris ➡️ blogs.nvidia.com/blog/fugatto-g…
#NVIDIAResearch
Note: Some… x.com/i/web/status/1…
— NVIDIA AI Developer (@NVIDIAAIDev)
2:21 PM • Nov 25, 2024
Although Fugatto was trained on millions of open-source samples, Nvidia is holding back public access, citing safety and copyright risks. Drawing comparisons to the disruptive influence of synthesizers, Fugatto is positioned as a powerful tool for reshaping not just music but also broader creative fields like gaming and media production. Its release, however, will be cautious, balancing innovation with responsibility. Read more.
Nvidia's Edify 3D turns text and images into 3D assets
Nvidia’s Edify 3D is a game-changer in asset creation, turning text or images into fully realized 3D models and textures in under two minutes. Using a diffusion model, it generates multiple views of an object, which a reconstruction model then weaves together into a polished, topologically sound 3D asset.
The results are high-quality meshes with UV maps, ready for refinement. It doesn’t stop at individual objects; Edify 3D can create entire 3D environments by stitching related assets into cohesive scenes. This promises to streamline workflows for industries like gaming, AR, and film production. While the technology impresses, Nvidia’s silence on public release raises questions about when, or if, this powerhouse tool will make its way to the masses. Read more.
Perplexity weighs a step into the hardware game
Perplexity, the AI-centric search engine, is toying with the idea of hardware: a compact, sub-$50 device for seamless voice-based Q&A. Aravind Srinivas, its co-founder and CEO, catalysed interest through a social media challenge, promising development if his post reached 5,000 likes.
This reflects the AI industry’s growing hardware fixation, from Midjourney’s team to OpenAI’s Jony Ive project. Yet pitfalls abound: Rabbit’s R1 quickly oversaturated its market, and Humane’s AI Pin faltered after poor sales and recalls.
While Perplexity’s coffers are flush—rumoured to be bolstered by a $500 million raise—success in this domain demands more than ambition. Historical missteps by others serve as a cautionary tale. Read more.
OpenAI's Sora video generator appears to have leaked
A group of early beta testers has leaked access to OpenAI’s Sora video generator, sparking a sharp critique of the company's early access practices. Through a frontend built on Hugging Face, users were able to generate short video clips from text prompts.
OMG OpenAI Sora has been leaked!
Free to use now on Huggingface, link in comment
It can be shut down anytime, try it now! It can generate 1080P and up to 10s video! And the results are incredible!
9 Examples:
— el.cine (@EHuanglu)
4:25 PM • Nov 26, 2024
In an open letter, the group accuses OpenAI of exploiting artists for promotional purposes, with creative freedoms severely limited. They claim that all outputs must receive OpenAI's approval before being shared, stifling genuine artistic expression.
An OpenAI spokesperson clarified that artists have no formal obligations beyond using Sora “responsibly” and maintaining confidentiality, though the company refrained from defining what constitutes responsible use or which details are considered confidential.
The leaked version of Sora appears to be a faster "turbo" variant, with signs of style control and limited customization options. Read more.
Luma expands Dream Machine AI video model into full creative platform, mobile app
Amazon's new Trainium2 AI chip aims to take on Nvidia with 4x speed and 3x memory boost
Baidu’s supercheap robotaxis should scare the hell out of the US
Researchers conduct systematic review on whether AI could help predict brain aneurysms
AI and astronomy: Neural networks simulate solar observations
New AI tool generates realistic satellite images of future flooding
Zoom 2.0 relaunches as an AI-first company without video in its name
New startup named /dev/agents led by Ex-Google, Meta tech leaders raises $56M for AI agents
Multilingual and open source: OpenGPT-X research project releases large language model
OLMo 2 from Ai2 Competes with Meta’s Llama with Open-Source Transparency
Stable Diffusion 3.5 Large Gets New ControlNets: Blur, Canny, and Depth
Uber is building a fleet of gig workers to label data for AI models
Google's Gemini AI takes a tiny step toward an all-purpose assistant with new Spotify integration
Inflection AI CEO says it's done trying to make next-generation AI models
Thomson Reuters’ CoCounsel redefines legal AI with OpenAI’s o1-mini model
3 new AI-powered tools from around the web
arXiv is a free online library where researchers share pre-publication papers.
Thank you for reading today’s edition.
Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.