AI Breakfast
Posts
OpenAI's Sneak Peak of GPT-o1

OpenAI's Sneak Peak of GPT-o1

AI Breakfast
October 18, 2024

Good morning. It’s Friday, October 18th.

Did you know: On this day in 1985, Nintendo released the original Nintendo Entertainment System in New York City.

In today’s email:

OpenAI's Noam Brown presents "Learning to Reason with LLMs" video
Meta and OpenAI criticized for open-source and patent claims
Open Sora releases version 1.3 of video generation model
Nvidia's new AI model surpasses GPT-4
AI software corrects eye contact in videos for 10¢/min
Microsoft and OpenAI partnership shows strain
5 New AI Tools
Latest AI Research Papers

You read. We listen. Let us know what you think by replying to this email.

Today’s trending AI news stories

Video of a presentation by OpenAI's Noam Brown: "Learning to Reason with LLMs"

In "Learning to Reason with LLMs," OpenAI's Noam Brown presents the o1 model, which enhances reasoning in large language models through reinforcement learning that generates a hidden chain of thought. The model consistently outperforms prior state-of-the-art models in reasoning benchmarks, including mathematics and programming contests, with improved performance linked to increased compute resources. Brown discusses the implications of further scaling this approach.

Meta and OpenAI Face Criticism Over Open-Source Claims and Patent Pledges

Meta is under fire for labeling its Llama AI models as “open-source,” with the Open Source Initiative (OSI) accusing the company of obscuring the term's true meaning. OSI chief Stefano Maffulli highlighted that while developers can access model weights, crucial components remain proprietary, diverging from open-source principles of full access to software. Critics warn that such terminology dilution may stifle genuine innovation. Other tech giants like Google and Microsoft have adapted their language in response to these concerns, leaving Meta's approach under scrutiny.

In a separate development, OpenAI announced a pledge to avoid offensive patent usage, claiming a commitment to “broad access” and “collaboration.” The company stated it would utilize patents defensively unless threatened. However, experts have called this pledge a little more than ‘virtue signaling,’ suggesting it serves more as a public relations strategy than a genuine effort to enhance competition in the AI sector. Read more.

Open Sora Plan Has Released the 1.3 Version Of Their Video Generation Model

Open Sora Plan (not affiliated with OpenAI’s Sora) has launched version 1.3.0 of its video generation model, following the August release of v1.2.0, which adopted a 3D full attention architecture to improve spatial-temporal feature capture. However, the substantial computational demands and unclear training strategies hindered progress.

Open Sora Plan has released the 1.3 version of their video generation model.
github.com/PKU-YuanGroup/…
Love this #BlackMythWukong story illustrated by AI!
— Tiezhen WANG (@Xianbao_QIAN)
2:09 PM • Oct 16, 2024

Version 1.3.0 introduces five significant features: a more powerful and cost-efficient Wavelet VAE (WFVAE) that decomposes videos into sub-bands for improved learning; a Prompt Refiner, a large language model that enhances short text inputs; a high-quality data cleaning strategy that retains only 27% of the panda70m dataset; DiT with new sparse attention for efficient learning; and dynamic resolution and duration capabilities to optimize videos of varying lengths.

Open Sora Plan will be open-sourced, allowing the community access to all code, data, and models to advance video generation development. Read more.

Nvidia Drops New AI Model Beating GPT-4

Nvidia has quietly launched the Llama-3.1-Nemotron-70B-Instruct AI model, outperforming industry giants like OpenAI and Anthropic. Available on Hugging Face, this model has garnered attention for its impressive benchmark scores, including 85.0 on the Arena Hard test and 57.6 on AlpacaEval 2 LC.

By enhancing Meta's open-source Llama 3.1 model with advanced training techniques, Nvidia aims to provide businesses with a potent and cost-effective alternative for language processing tasks. The model stands out for its strong alignment capabilities, delivering precise, contextually relevant responses that enhance customer satisfaction.

Nvidia offers free hosted inference through its platform and provides an OpenAI-compatible API, broadening access to advanced AI solutions. However, enterprises must exercise caution, as the model is not optimized for specialized fields demanding high accuracy. Read more.

AI Software Fixes Eye Contact In Videos For 10 cents/min

Sieve, an AI startup, has launched an API that automatically corrects eye contact in videos, enhancing viewer engagement for 10 cents per minute. The technology uses an AI model to analyze the eye region and head position in three dimensions, adjusting gaze direction in real time to create the appearance of direct eye contact.

Eye Contact 1.0: eye gaze redirection for developers

We discuss a new gaze redirection pipeline designed to make the eyes in talking head videos look directly at the camera.

www.sievedata.com/blog/eye-contact-correction-gaze-correction-api

It processes the eye region through a neural network to estimate the viewing angle and modify eye positioning accordingly. The correction adapts based on head orientation and accounts for natural behaviors like blinking. Read more.

Microsoft and OpenAI’s Close Partnership Shows Signs of Fraying

Microsoft and OpenAI’s partnership is under strain as both companies navigate conflicting priorities. OpenAI, facing a projected $5 billion loss, has pushed for more computing power and reduced costs, while Microsoft has grown cautious about its dependence on the AI firm.

OpenAI negotiated a deal with Oracle for additional resources, signaling a shift from its exclusive reliance on Microsoft. Meanwhile, Microsoft has diversified by investing in rival AI talent, including hiring key staff from Inflection. Disagreements over development timelines and resource allocation have fueled tensions.

Despite the recent renegotiations, the partnership faces ongoing friction as both companies weigh their options for the future of AI. Read more.

5 new AI-powered tools from around the web

Feta - Better stand-ups, retros, syncs and more

Feta helps product and engineering teams capture meeting context, automate post meeting tasks, and focus only on high-impact work.

feta.io

CodeAnt AI - AI Code Reviewer

AI code reviewer that helps you find and fix critical code quality issues and security vulnerabilities in 30+ languages. Start your 7-day free trial today!

www.codeant.ai

Convo - AI-led interviews & surveys

Convo is the AI-led interviews & surveys platform that helps you to make better decisions, faster. Create, share, and analyze surveys and interviews with AI-powered insights.

getconvo.ai

Prismy – Your product wordings translated on autopilot

Release your product in many languages, on auto pilot. Localisation solution, AI powered with your custom prompt, minimalist and delightful user experience, one-click install. Made for fast delivering product teams.

www.prismy.io

DataMonkey - Your GeoAI

The first solution to easily blend your own location-based data with public source data via natural language

datamonkey.tech/feature-eng