'o1 model' Reportedly Handles Images, 200K Tokens
Good morning. It’s Monday, November 4th.
Did you know: On this day in 1982, the Compaq Portable was announced.
In today’s email:
o1 model reportedly handles images and 200K tokens
Claude 3.5 Sonnet can now analyze PDFs and images
Perplexity elections tracker
Runway precise camera control
5 New AI Tools
Latest AI Research Papers
You read. We listen. Let us know what you think by replying to this email.
Today’s trending AI news stories
OpenAI's full o1 model reportedly handles images and 200K tokens
Reports on X indicate that users briefly accessed OpenAI's full o1 model via "chatgpt.com/?model=o1" before OpenAI restricted it. The full version reportedly accepts image input and has a limit of roughly 200,000 tokens (196,608 in the model description users saw), and OpenAI describes it as its most capable model, suited to tasks that require creativity and advanced reasoning.
Currently, only the scaled-down mini and preview versions are publicly available. OpenAI has not given a release date for the full model, but a debut later this year is anticipated, on a timeline separate from GPT-5's. Read more.
@legit_rumors @Jaicraft39 o1 - "Our most capable model, great for tasks that require creativity and advanced reasoning." with 196,608 max tokens and can accept only image files, for now
— Tibor Blaho (@btibor91)
10:18 AM • Nov 2, 2024
Anthropic's Claude 3.5 Sonnet can now analyze PDFs and images inside them
Anthropic’s Claude 3.5 Sonnet can now analyze PDFs, a feature currently in public beta. It handles both text and visuals such as charts and tables through a three-step process: text extraction, page-to-image conversion, and joint analysis of the extracted text and the page images.
Aimed at dense financial and legal documents, Claude’s PDF tool processes files under 32 MB and 100 pages, with typical token usage of 1,500 to 3,000 per page. The PDF capability also works alongside other model features such as tool use, which allows more precise data extraction.
Currently available in the Claude chat app and the Anthropic API, the feature is slated to expand to Amazon Bedrock and Google Vertex AI. For best results, Anthropic recommends clean, properly aligned pages, and advises splitting larger files and caching prompts when the same document is analyzed repeatedly. Read more.
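For a rough sense of what those limits mean in practice, here is a minimal Python sketch of the arithmetic. It uses only the figures reported above (32 MB, 100 pages, 1,500 to 3,000 tokens per page). The function names are our own and hypothetical, not part of any Anthropic library, and reading "32 MB" as 32 × 1024 × 1024 bytes with inclusive limits is an assumption.

```python
# Figures reported in the story above; how they are read is an assumption.
MAX_BYTES = 32 * 1024 * 1024      # assumption: "32 MB" read as 32 MiB
MAX_PAGES = 100                   # assumption: limit treated as inclusive
TOKENS_PER_PAGE = (1_500, 3_000)  # reported typical range per page


def fits_limits(size_bytes: int, pages: int) -> bool:
    """Return True if a PDF is within the reported size and page limits."""
    return size_bytes <= MAX_BYTES and pages <= MAX_PAGES


def estimate_tokens(pages: int) -> tuple[int, int]:
    """Return a (low, high) token estimate for a document of `pages` pages."""
    low, high = TOKENS_PER_PAGE
    return pages * low, pages * high


def split_page_ranges(pages: int, chunk: int = MAX_PAGES) -> list[tuple[int, int]]:
    """Return 1-based inclusive page ranges of at most `chunk` pages each."""
    return [
        (start, min(start + chunk - 1, pages))
        for start in range(1, pages + 1, chunk)
    ]


if __name__ == "__main__":
    # A hypothetical 250-page filing has to be split before analysis.
    for first, last in split_page_ranges(250):
        n_pages = last - first + 1
        print(first, last, estimate_tokens(n_pages))
```

By this estimate, a full 100-page file would use somewhere between 150,000 and 300,000 tokens for the pages alone, which is one reason splitting large documents and caching prompts can matter.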
It's a big day for Claude's PDF capabilities.
We're rolling out visual PDF support across claude dot ai and the Anthropic API.
Let me explain:
— Alex Albert (@alexalbert__)
4:55 PM • Nov 1, 2024
Perplexity launches an elections tracker
Perplexity AI has launched a U.S. election tracker, powered by data from The Associated Press and Democracy Works, to provide live updates and key insights on presidential, Senate, and House races. Accessible through a dedicated hub, the tool also covers voting requirements and polling times, and offers AI-generated summaries of candidates, policy stances, and ballot measures.
This approach contrasts with competitors like OpenAI and Anthropic, which avoid election result predictions to prevent misinformation. Perplexity's tracker is framed as a reliable entry point for election information. Read more.
It’s almost election day! While the presidency gets the most attention, Perplexity’s here to help you make an informed vote on all ballot items, including statewide and local.
We’ve curated a trusted set of sources to answer all election-related queries: perplexity.ai/elections
— Perplexity (@perplexity_ai)
9:19 PM • Nov 1, 2024
Runway adds precise camera control to Gen-3 Alpha Turbo AI video generator
Runway has introduced precise camera control in its Gen-3 Alpha Turbo AI video generator, enhancing user command over camera movement within AI-generated scenes. The update enables creators to dictate the direction, speed, and type of camera movements, including panning, zooming, and forward/backward shifts, to construct intricate visual narratives.
These controls can be combined to produce complex movement sequences, offering video creators a nuanced level of customization in content generation. Runway states that these new features provide heightened creative precision, allowing users to fine-tune AI-generated video aesthetics. This capability is now accessible to all users of the Gen-3 Alpha Turbo model. Read more.
Python becomes most-used programming language on GitHub amid AI surge
Open-source Moonshine speech recognition model is up to five times faster than OpenAI's Whisper
Chinese researchers develop AI model for military use on back of Meta's Llama
New AI web navigation system uses world models to predict outcomes and boost success rates
New memory chip controlled by light and magnets could one day make AI computing less power-hungry
AI-generated game Oasis now turns images into playable 3D worlds
Revealing causal links in complex systems: New algorithm reveals hidden influences
Walt Disney forms business unit to coordinate use of AI, augmented reality
5 new AI-powered tools from around the web
arXiv is a free online library where researchers share pre-publication papers.
Thank you for reading today’s edition.
Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.
Interested in reaching smart readers like you? To become an AI Breakfast sponsor, reply to this email or DM us on X!