- AI Breakfast
- Posts
- OpenAI's Sneak Peak of GPT-o1
OpenAI's Sneak Peak of GPT-o1
Good morning. It’s Friday, October 18th.
Did you know: On this day in 1985, Nintendo released the original Nintendo Entertainment System in New York City.
In today’s email:
OpenAI's Noam Brown presents "Learning to Reason with LLMs" video
Meta and OpenAI criticized for open-source and patent claims
Open Sora releases version 1.3 of video generation model
Nvidia's new AI model surpasses GPT-4
AI software corrects eye contact in videos for 10¢/min
Microsoft and OpenAI partnership shows strain
5 New AI Tools
Latest AI Research Papers
You read. We listen. Let us know what you think by replying to this email.
Today’s trending AI news stories
Video of a presentation by OpenAI's Noam Brown: "Learning to Reason with LLMs"
In "Learning to Reason with LLMs," OpenAI's Noam Brown presents the o1 model, which enhances reasoning in large language models through reinforcement learning that generates a hidden chain of thought. The model consistently outperforms prior state-of-the-art models in reasoning benchmarks, including mathematics and programming contests, with improved performance linked to increased compute resources. Brown discusses the implications of further scaling this approach.
Meta and OpenAI Face Criticism Over Open-Source Claims and Patent Pledges
Meta is under fire for labeling its Llama AI models as “open-source,” with the Open Source Initiative (OSI) accusing the company of obscuring the term's true meaning. OSI chief Stefano Maffulli highlighted that while developers can access model weights, crucial components remain proprietary, diverging from open-source principles of full access to software. Critics warn that such terminology dilution may stifle genuine innovation. Other tech giants like Google and Microsoft have adapted their language in response to these concerns, leaving Meta's approach under scrutiny.
In a separate development, OpenAI announced a pledge to avoid offensive patent usage, claiming a commitment to “broad access” and “collaboration.” The company stated it would utilize patents defensively unless threatened. However, experts have called this pledge a little more than ‘virtue signaling,’ suggesting it serves more as a public relations strategy than a genuine effort to enhance competition in the AI sector. Read more.
Open Sora Plan Has Released the 1.3 Version Of Their Video Generation Model
Open Sora Plan (not affiliated with OpenAI’s Sora) has launched version 1.3.0 of its video generation model, following the August release of v1.2.0, which adopted a 3D full attention architecture to improve spatial-temporal feature capture. However, the substantial computational demands and unclear training strategies hindered progress.
Open Sora Plan has released the 1.3 version of their video generation model.
github.com/PKU-YuanGroup/…
Love this #BlackMythWukong story illustrated by AI!
— Tiezhen WANG (@Xianbao_QIAN)
2:09 PM • Oct 16, 2024
Version 1.3.0 introduces five significant features: a more powerful and cost-efficient Wavelet VAE (WFVAE) that decomposes videos into sub-bands for improved learning; a Prompt Refiner, a large language model that enhances short text inputs; a high-quality data cleaning strategy that retains only 27% of the panda70m dataset; DiT with new sparse attention for efficient learning; and dynamic resolution and duration capabilities to optimize videos of varying lengths.
Open Sora Plan will be open-sourced, allowing the community access to all code, data, and models to advance video generation development. Read more.
Nvidia Drops New AI Model Beating GPT-4
Nvidia has quietly launched the Llama-3.1-Nemotron-70B-Instruct AI model, outperforming industry giants like OpenAI and Anthropic. Available on Hugging Face, this model has garnered attention for its impressive benchmark scores, including 85.0 on the Arena Hard test and 57.6 on AlpacaEval 2 LC.
By enhancing Meta's open-source Llama 3.1 model with advanced training techniques, Nvidia aims to provide businesses with a potent and cost-effective alternative for language processing tasks. The model stands out for its strong alignment capabilities, delivering precise, contextually relevant responses that enhance customer satisfaction.
Nvidia offers free hosted inference through its platform and provides an OpenAI-compatible API, broadening access to advanced AI solutions. However, enterprises must exercise caution, as the model is not optimized for specialized fields demanding high accuracy. Read more.
AI Software Fixes Eye Contact In Videos For 10 cents/min
Sieve, an AI startup, has launched an API that automatically corrects eye contact in videos, enhancing viewer engagement for 10 cents per minute. The technology uses an AI model to analyze the eye region and head position in three dimensions, adjusting gaze direction in real time to create the appearance of direct eye contact.
It processes the eye region through a neural network to estimate the viewing angle and modify eye positioning accordingly. The correction adapts based on head orientation and accounts for natural behaviors like blinking. Read more.
Microsoft and OpenAI’s Close Partnership Shows Signs of Fraying
Microsoft and OpenAI’s partnership is under strain as both companies navigate conflicting priorities. OpenAI, facing a projected $5 billion loss, has pushed for more computing power and reduced costs, while Microsoft has grown cautious about its dependence on the AI firm.
OpenAI negotiated a deal with Oracle for additional resources, signaling a shift from its exclusive reliance on Microsoft. Meanwhile, Microsoft has diversified by investing in rival AI talent, including hiring key staff from Inflection. Disagreements over development timelines and resource allocation have fueled tensions.
Despite the recent renegotiations, the partnership faces ongoing friction as both companies weigh their options for the future of AI. Read more.
Mistral releases new AI models optimized for laptops and phones
Combining next-token prediction and video diffusion in computer vision and robotics
Boston Dynamics teams with TRI to bring AI smarts to Atlas humanoid robot
Amazon goes nuclear, to invest more than $500 million to develop small modular reactors
Sam Altman's Worldcoin becomes World and shows new iris-scanning Orb to prove your humanity
Google restructures AI efforts, moves Gemini back to DeepMind
Archetype AI’s Newton model learns physics from raw data—without any help from humans
Tiny open-source image model Meissonic offers impressive image quality for its size
Apple's local AI agent framework paves the way for more useful Apple Intelligence
Autonomous AI agents may be available to Singapore firms by 2025
Engineering research discovers critical vulnerabilities in AI-enabled robots
Musk's X is changing its privacy policy to allow third parties to train AI on your posts
A data bottleneck is holding AI science back, says new Nobel winner
Meta is laying off employees at WhatsApp, Instagram, and more
5 new AI-powered tools from around the web
arXiv is a free online library where researchers share pre-publication papers.
Thank you for reading today’s edition.
Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.
Interested in reaching smart readers like you? To become an AI Breakfast sponsor, reply to this email or DM us on X!