Claude 3 Surpasses GPT-4's Capabilities
Good morning. It’s Wednesday, March 6th.
Did you know? On this day in 1975, the Homebrew Computer Club held its first meeting. Steve Wozniak would later demonstrate a prototype of the Apple I at the club in 1976.
In today’s email:
Claude 3 AI models surpass GPT-4 on benchmarks
Copilot for OneDrive summarizes files, generates content
ChatGPT gets "Read Aloud" feature in 37 languages
Emails show Musk supported for-profit OpenAI
Nvidia developing 1,000W GPU, may not need liquid cooling
Perplexity AI nears $1B valuation, challenges Google search
Vimeo launches AI video hub to improve work collaboration
OpenAI hires quantum computing expert with DoD patent
AI helps find new patterns in elliptic curves
Claude-3 achieves 100+ IQ score, leading AI race
4 New AI Tools
Latest AI Research Papers
You read. We listen. Let us know what you think by replying to this email.
Today’s trending AI news stories
Advancements in AI Models and Tools
> Anthropic has introduced Claude 3, a family of AI models with varying capabilities that surpass GPT-4 in most benchmark evaluations. Users can choose between Haiku, Sonnet, and Opus, offering different balances of intelligence and cost. Opus, the flagship model, outperforms competitors on industry benchmarks, demonstrating near-human comprehension. Sonnet prioritizes speed for tasks like customer support, while Haiku excels in responsiveness. Across the board, Claude 3 models offer improved accuracy, fewer refusals, and enhanced vision capabilities. Anthropic emphasizes responsible design, working to address biases and safety concerns. The models are now available through the Claude API and cloud platforms; a minimal API example follows this list of stories. | Read more
> Copilot for OneDrive will fetch your files and summarize them, including documents, presentations, and spreadsheets. Users can ask Copilot questions about file contents and request customized summaries. The tool will also generate outlines, tables, and lists based on existing documents. Early access for OneDrive users begins this month, with full availability for Microsoft 365 work and school customers expected in late April. | Read more
> OpenAI has introduced a "Read Aloud" feature for ChatGPT, which lets users hear the chatbot's responses in five different voices through both the web and mobile app versions. The feature supports 37 languages and automatically detects the language of the text. The capability follows a similar feature recently added to Anthropic's AI models and fits OpenAI's broader push toward multimodal AI experiences, further enhancing accessibility and flexibility for users. | Read more
> OpenAI has released emails showing Elon Musk supported creating a for-profit company, a key point of contention in his lawsuit against the AI startup. Musk alleges that OpenAI's partnership with Microsoft violates an agreement to make AI advancements freely available. However, OpenAI founders claim Musk wanted to raise billions of dollars and move to a for-profit model to accelerate AI development for humanity's benefit. Musk's proposed terms, which included majority ownership and a CEO role for himself, were rejected due to concerns about concentrated control. This conflict highlights governance challenges within OpenAI despite its commercial success. | Read more
> Dell exec reveals Nvidia has a 1,000-watt GPU in the works. Nvidia's upcoming B100 GPU, with its anticipated 1,000-watt power consumption, raises concerns about energy use and cooling. However, Dell's COO, Jeff Clarke, hints that traditional liquid cooling might not be necessary for this powerful accelerator. There's speculation that the B200 could be the GB200 Superchip, merging Nvidia's Grace CPU and B100 GPU. This aligns with Nvidia's shift to a one-year GPU release cycle. While power consumption remains a concern, Nvidia's roadmap promises advancements in networking, highlighting their commitment to AI infrastructure. The B100 is expected in late 2024, following the earlier launch of H200 GPUs, amidst ongoing supply chain constraints. | Read more
> Perplexity AI is nearing a $1 billion valuation as it closes a funding round, putting it on the cusp of "unicorn" status. This rapid doubling of its value reflects strong investor belief in its AI technology. Perplexity AI seeks to disrupt web search with ad-free, AI-generated answers, challenging established giants like Google. This news aligns with a wider surge in AI startup investments, including companies like Lambda and Sierra. | Read more
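For readers who want to try the new Claude 3 models mentioned at the top of this list, here is a minimal sketch of querying Claude 3 Opus through the Anthropic Python SDK. It assumes the `anthropic` package is installed and an `ANTHROPIC_API_KEY` is set in the environment; the prompt and token limit are illustrative choices.

```python
# Minimal sketch: querying Claude 3 Opus via the Anthropic Messages API.
# Assumes `pip install anthropic` and ANTHROPIC_API_KEY in the environment.
import anthropic

client = anthropic.Anthropic()  # picks up ANTHROPIC_API_KEY automatically

response = client.messages.create(
    model="claude-3-opus-20240229",  # flagship Opus model
    max_tokens=512,
    messages=[
        {"role": "user",
         "content": "Summarize the key differences between Haiku, Sonnet, and Opus."}
    ],
)

print(response.content[0].text)
```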
Industry Initiatives and Partnerships
> Vimeo takes on Teams with the introduction of its AI-powered video hub. Addressing the challenges of screen fatigue in hybrid and virtual workplaces, Vimeo Central offers centralized, accessible video tools with features like AI-generated summaries, highlights, and chapters. These tools are designed to enhance productivity and engagement. IDC research confirms the increasing use of video for collaboration and employee engagement. Vimeo Central targets companies seeking a video-first solution for improved knowledge sharing, with early adopters like Starbucks and eBay already leveraging the platform. | Read more
> OpenAI just hired this photonic quantum computing researcher, and his recent Air Force/DoD-sponsored patent is extremely interesting. Reddit user MassiveWasabi points out that OpenAI has hired Ben Bartlett, a researcher specializing in photonic quantum computing. Bartlett's recent work, described in a patent titled "Deterministic photonic quantum computation in a synthetic time dimension," introduces several innovations in scalability and efficiency. Notably, the patent outlines a design using a single atom to control multiple photonic qubits, a breakthrough for scalable quantum computing; the use of a synthetic time dimension is also key to the approach. The hire hints at OpenAI's exploration of quantum computing to accelerate advancements in AI. | Read more
Scientific and Technological Discoveries
> Elliptic Curve ‘Murmurations’ Found With AI Take Flight. Mathematicians from Europe and North America have discovered unexpected patterns in elliptic curves, fundamental to cryptography and number theory. Utilizing statistical techniques and AI, they found patterns initially dubbed "murmurations" for their resemblance to starling flock formations. Further investigation revealed the underlying causes and demonstrated these patterns occur broadly in elliptic curves. Crucially, AI algorithms were able to identify the patterns, whose high dimensionality had obscured them from direct human perception. This discovery impacts various mathematical areas, and ongoing research is generating new insights and tools, demonstrating the powerful synergy between AI and mathematical exploration. | Read more
> AIs ranked by IQ: an AI passes 100 IQ for the first time with the release of Claude-3. On a version of the IQ test adapted so AI models can take it, Claude-3 achieved a score exceeding the human average – a major milestone in AI development. Lott's analysis establishes Claude as the current leader in AI intelligence, followed by ChatGPT and Bing Copilot. His study highlights the extraordinary pace of AI advancement, prompting urgent questions about society's preparedness for the implications of increasingly intelligent artificial systems. | Read more
4 new AI-powered tools from around the web
OSO AI offers a censorship-free AI search engine and chat platform. Their AI scans the internet for information, providing summaries and enabling unrestricted access.
D-ID Agents redefine digital connections, blending advanced LLM intelligence with face-to-face interaction. Utilizing voice cloning, video generation, and RAG tech, Agents provide personalized, engaging, and human-like experiences.
Corgea automatically fixes code vulnerabilities, reducing engineer workload by up to 80% and enhancing application security.
Parallel AI crafts tailored AI employees infused with knowledge from your documents, ensuring accuracy and security. Integrates with Slack and Google Docs.
arXiv is a free online library where researchers share pre-publication papers.
Researchers have taken an innovative approach to humanoid robot control by treating it as a language modeling problem. Using a transformer model trained on sensor data, they predict the next action a robot should take. This approach works with various data sources, including motion capture and even YouTube videos. Surprisingly, the model allowed a full-sized humanoid robot to walk in San Francisco with minimal real-world training data. It even adapted to new commands, such as walking backward. This research suggests a powerful new way to teach robots complex, real-world tasks using generative modeling.
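To make the idea concrete, below is a heavily simplified PyTorch sketch of the core mechanism described above: a causal transformer that consumes an interleaved history of sensor readings and actions and regresses the next action. The layer sizes, tokenization scheme, and regression head are illustrative assumptions, not the paper's actual architecture.

```python
# Illustrative sketch (not the paper's architecture): control as next-token
# prediction over interleaved observation/action vectors.
import torch
import torch.nn as nn

class NextActionTransformer(nn.Module):
    def __init__(self, obs_dim=64, act_dim=19, d_model=256, n_layers=4, n_heads=4, ctx=128):
        super().__init__()
        self.obs_proj = nn.Linear(obs_dim, d_model)   # embed continuous sensor readings
        self.act_proj = nn.Linear(act_dim, d_model)   # embed past actions
        self.pos = nn.Parameter(torch.zeros(1, 2 * ctx, d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, act_dim)       # regress the next action

    def forward(self, obs_seq, act_seq):
        # obs_seq: (B, T, obs_dim), act_seq: (B, T, act_dim)
        B, T, _ = obs_seq.shape
        # Interleave observation and action tokens per timestep: o_1, a_1, o_2, a_2, ...
        tokens = torch.stack([self.obs_proj(obs_seq), self.act_proj(act_seq)], dim=2)
        tokens = tokens.reshape(B, 2 * T, -1) + self.pos[:, : 2 * T]
        # Causal mask so each position only attends to the past
        mask = torch.triu(torch.full((2 * T, 2 * T), float("-inf")), diagonal=1)
        h = self.encoder(tokens, mask=mask)
        return self.head(h[:, -1])  # predicted next action

model = NextActionTransformer()
obs = torch.randn(2, 16, 64)   # dummy sensor history
act = torch.randn(2, 16, 19)   # dummy action history
next_action = model(obs, act)  # shape (2, 19)
```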
This paper explores how scaling factors impact the fine-tuning of large language models (LLMs), examining model size, pretraining data size, PET (parameter-efficient tuning) parameter size, and fine-tuning data size. The study reveals a multiplicative joint scaling law, indicating that fine-tuning data size interacts in a power-law fashion with each of the other scaling factors. Results show that scaling the LLM model size benefits fine-tuning more than scaling pretraining data, while scaling PET parameters is generally ineffective. The best fine-tuning method (full-model vs. PET) depends on the task and available data. Interestingly, fine-tuning can also improve zero-shot generalization to related tasks, with PET methods often demonstrating an advantage in this scenario.
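As a rough illustration of what a multiplicative joint scaling law looks like, the sketch below evaluates a loss surface of the hypothetical form L(X, D_f) = A / (X^α · D_f^β) + E, where X is one scaling factor (e.g. model size) and D_f is the fine-tuning data size. The functional form follows the summary above; the coefficient values are made-up placeholders, not numbers from the paper.

```python
# Hypothetical multiplicative joint scaling law: L(X, D_f) = A / (X**alpha * D_f**beta) + E.
# X = one scaling factor (e.g. model parameters), D_f = number of fine-tuning examples.
# All coefficients below are illustrative placeholders, not fitted values from the paper.

def finetune_loss(X, D_f, A=1.0e3, alpha=0.30, beta=0.15, E=1.5):
    return A / (X ** alpha * D_f ** beta) + E

# Compare doubling one factor at a time; with this multiplicative form, the factor
# with the larger exponent gives the larger loss reduction.
base = finetune_loss(X=1e9, D_f=1e5)          # 1B-parameter model, 100k fine-tuning examples
bigger_model = finetune_loss(X=2e9, D_f=1e5)  # double the model size
more_ft_data = finetune_loss(X=1e9, D_f=2e5)  # double the fine-tuning data
print(f"base={base:.3f}  2x model={bigger_model:.3f}  2x fine-tuning data={more_ft_data:.3f}")
```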
The RT-Sketch approach introduces a unique method for specifying goals in imitation learning using hand-drawn sketches. Sketches offer a balance between the ambiguity of natural language and the excessive detail of images, making them an ideal middle ground for goal specification. The RT-Sketch architecture builds upon the RT-1 transformer to process visual goals, enabling flexible use of sketches, images, or other visual representations. Researchers created a dataset by translating existing demonstrations into sketches. Experiments demonstrate that RT-Sketch performs on par with existing methods in simple scenarios, while outperforming them when faced with ambiguous language or distracting visual elements. Additionally, RT-Sketch can effectively handle sketches with varying levels of detail.
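The following is a minimal, illustrative sketch of a goal-conditioned policy in the spirit described above: the current camera observation and a rasterized goal sketch are encoded by a shared image encoder and fused by a transformer that outputs discretized actions. It is not the actual RT-1/RT-Sketch architecture; all dimensions and module choices are assumptions for illustration.

```python
# Illustrative goal-conditioned policy sketch (not the actual RT-Sketch / RT-1 model):
# the goal can be a hand-drawn sketch or a goal image -- both are just images to the encoder.
import torch
import torch.nn as nn

class GoalConditionedPolicy(nn.Module):
    def __init__(self, d_model=256, n_action_bins=256, n_action_dims=7):
        super().__init__()
        # Shared CNN encoder for the current observation and the visual goal
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, d_model),
        )
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.fuse = nn.TransformerEncoder(layer, num_layers=2)
        # One classification head per action dimension over discretized action bins
        self.heads = nn.ModuleList(
            [nn.Linear(d_model, n_action_bins) for _ in range(n_action_dims)]
        )

    def forward(self, observation, goal):
        # observation, goal: (B, 3, H, W); the goal may be a sketch or a goal image
        tokens = torch.stack([self.encoder(observation), self.encoder(goal)], dim=1)
        fused = self.fuse(tokens).mean(dim=1)
        return [head(fused) for head in self.heads]  # per-dimension action logits

policy = GoalConditionedPolicy()
obs = torch.randn(1, 3, 128, 128)
sketch_goal = torch.randn(1, 3, 128, 128)  # rasterized hand-drawn sketch
action_logits = policy(obs, sketch_goal)
```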
Researchers at Amazon address the lack of diverse conversational data for large language models (LLMs) by introducing MAGID, an automated system for generating synthetic multimodal datasets. MAGID enhances text-based dialogues with high-quality images generated using a diffusion model, ensuring the images align with the conversation. A quality assurance module safeguards the process, evaluating image-text alignment, aesthetics, and safety. The pipeline leverages an LLM to identify suitable phrases, utilizes prompt engineering techniques, and allows control over LLM output through HTML-like formatting. MAGID demonstrates success in generating high-quality datasets, comparable or superior to existing ones. Future work will address copyright concerns, expand to other modalities, improve image consistency, and refine the quality assurance module for even more realistic results.
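Below is a schematic sketch of the kind of pipeline described: an LLM picks utterances that warrant an image, a diffusion model generates a candidate, and a quality-assurance step scores the image before it is attached. The helper callables (`llm_select_utterances`, `generate_image`, `qa_score`) are hypothetical placeholders standing in for MAGID's actual components.

```python
# Schematic sketch of a MAGID-style synthetic multimodal dialogue pipeline.
# llm_select_utterances, generate_image, and qa_score are hypothetical placeholders
# for the LLM selector, the diffusion-based generator, and the QA module.

def augment_dialogue(dialogue, llm_select_utterances, generate_image, qa_score,
                     threshold=0.75, max_retries=2):
    """Attach generated images to the utterances an LLM deems image-worthy."""
    augmented = []
    image_worthy = llm_select_utterances(dialogue)  # e.g. {turn index: image prompt}
    for i, utterance in enumerate(dialogue):
        turn = {"text": utterance, "image": None}
        if i in image_worthy:
            prompt = image_worthy[i]  # prompt-engineered description from the LLM
            for _ in range(max_retries + 1):
                image = generate_image(prompt)        # diffusion model
                score = qa_score(image, utterance)    # image-text alignment / aesthetics / safety
                if score >= threshold:
                    turn["image"] = image
                    break  # accept; otherwise regenerate
        augmented.append(turn)
    return augmented
```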
This study introduces EasyQuant, a new method to compress Large Language Models (LLMs) without sacrificing performance. Unlike other methods, EasyQuant doesn't require any training data or complex calculations. It works by adjusting the numerical range used to represent the model's internal values (weights) and carefully handling unusual values (outliers). EasyQuant can quickly compress even huge LLMs (over 100 billion parameters) in minutes, making it more efficient than data-dependent methods. Importantly, EasyQuant maintains the model's original performance, demonstrating the value of data-free methods for preserving accuracy. The researchers suggest future work to explore further improvements and address any limitations.
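As a rough illustration of the two ideas in the summary (tuning the quantization range and treating outliers separately), here is a small NumPy sketch that keeps the largest-magnitude weights in full precision and grid-searches a clipping range for the rest, with no calibration data. It is a toy stand-in, not EasyQuant's actual algorithm.

```python
# Toy sketch of data-free weight quantization in the spirit of the summary:
# (1) keep outlier weights in full precision, (2) grid-search a clipping range
# that minimizes reconstruction error for the rest. Not the actual EasyQuant algorithm.
import numpy as np

def quantize_row(w, bits=4, outlier_frac=0.01, n_grid=50):
    k = max(1, int(len(w) * outlier_frac))
    outlier_idx = np.argsort(np.abs(w))[-k:]      # largest-magnitude weights stay as-is
    mask = np.ones_like(w, dtype=bool)
    mask[outlier_idx] = False
    body = w[mask]

    qmax = 2 ** (bits - 1) - 1
    best_err, best_scale = np.inf, None
    for frac in np.linspace(0.5, 1.0, n_grid):    # candidate clipping ranges
        scale = (np.abs(body).max() * frac) / qmax
        q = np.clip(np.round(body / scale), -qmax - 1, qmax)
        err = np.sum((q * scale - body) ** 2)     # data-free: only weight reconstruction error
        if err < best_err:
            best_err, best_scale = err, scale

    dequant = w.copy()                            # outliers keep full precision
    dequant[mask] = np.clip(np.round(body / best_scale), -qmax - 1, qmax) * best_scale
    return dequant, best_scale, outlier_idx

w = np.random.randn(4096) * 0.02
w[np.random.choice(4096, 8, replace=False)] *= 25  # inject a few outliers
w_hat, scale, outliers = quantize_row(w)
print("reconstruction MSE:", float(np.mean((w - w_hat) ** 2)))
```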
ChatGPT Creates Comics
Thank you for reading today’s edition.
Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.
Interested in reaching smart readers like you? To become an AI Breakfast sponsor, apply here.