See These AI Powered "Smart Glasses"
Plus, a surprise investment from Microsoft
Good morning. It’s Wednesday, February 28th.
Did you know: On this day in 1998, Apple discontinued the Newton line of products, just 5 years after the first model was introduced in 1993.
In today’s email:
Advancements in AI Models and Platforms
Ethical and Regulatory Issues in AI Development
Global Collaborations and Strategic Partnerships
5 New AI Tools
Latest AI Research Papers
ChatGPT Creates Comics
You read. We listen. Let us know what you think by replying to this email.
In partnership with BITGRIT
BitGrit: Where AI meets Web3 in a groundbreaking online competition platform for data scientists
Bitgrit is democratizing the field with a blockchain-powered ecosystem that rewards innovation, collaboration, and expertise. Join a global movement that brings together data scientists, businesses, and data providers in a transparent, unified platform. It's not just about participating; it's about leading the charge in integrating AI into our lives and work.
Why join bitgrit?
Transformative Community: Dive into competitions, connect with peers, and turn your AI solutions into opportunities.
Collaborative Marketplace: Access an expansive network where your skills can solve real-world problems and be crowd-funded by businesses and the community alike.
Empowerment and Innovation: Leverage our platform to showcase your talents, engage with cutting-edge challenges, and monetize your contributions.
Today’s trending AI news stories
Advancements in AI Models and Platforms
> Smart Glasses: Oppo has revealed a new prototype of its Air Glass smart glasses. The Air Glass 3 features Oppo's own AI assistant, powered by its AndesGPT model, enabling voice control and access to various services. The glasses also utilize video vision transformers for seamless interaction. Oppo emphasizes improved display quality, brightness, and audio technology in this prototype. While still in development, the Air Glass 3 demonstrates Oppo's commitment to AI-powered features and suggests a promising future for smart glasses in the mainstream market.
> Paris-based startup Mistral AI introduces Mistral Large, a new large language model designed to rival GPT-4 and Claude 2. The company is shifting from its open-source roots and will offer Mistral Large through a usage-based API model, priced below GPT-4 Turbo. Additionally, Mistral AI is launching Le Chat, a ChatGPT-like chatbot with initial free access and plans for enterprise versions. A partnership with Microsoft will integrate Mistral models into Azure, potentially expanding its reach. While Mistral AI claims superior benchmark performance, its true capabilities will require independent testing.
Microsoft is also expanding its AI portfolio through a partnership with and minor investment in Mistral, valuing the company at €2 billion. This marks Microsoft's second significant AI collaboration beyond OpenAI.
> GitHub Copilot Enterprise is now available, promising to boost developer productivity with AI assistance tailored to an organization's codebase and workflows. It integrates generative AI into the code editor, aiding code navigation, comprehension, and feature implementation. Copilot Enterprise assists developers of all experience levels, streamlines pull requests with AI-generated summaries, and integrates Bing search for external information. Priced at $39 per user per month, Copilot Enterprise deeply integrates with GitHub platforms, promoting collaboration and innovation in software development.
> Quantum memory breakthrough may lead to a quantum internet: Scientists have developed "quantum memories" that store and retrieve photonic qubits, the units of quantum information, at room temperature. Quantum memory offers denser information storage than traditional networks, potentially leading to faster and more secure communication. This achievement, in collaboration with OpenAI, brings us closer to realizing a scalable quantum internet infrastructure, revolutionizing communication and computing.
> Klarna, a Swedish payment provider, reports that its AI assistant, powered by OpenAI, managed two-thirds of customer service chats in a month, equivalent to the work of 700 full-time employees. Handling 2.3 million conversations, the AI maintains customer satisfaction comparable to human agents while reducing query resolution errors by 25%. Customers now resolve issues in under two minutes, down from eleven minutes. Available in 23 markets and supporting over 35 languages, the assistant handles tasks from customer service to refunds. The AI is expected to contribute $40 million to Klarna's profits in 2024.
> Former Twitter engineers are building Particle, an AI-powered news reader: The platform seeks to address concerns about AI's impact on news ecosystems by compensating authors and publishers fairly. Despite similarities to previous ventures like Artifact, Particle's experienced founding team and innovative approach set it apart. However, questions remain about its business model and the potential implications for publishers.
Ethical and Regulatory Issues in AI Development
> Google to re-launch Gemini AI Image Generator after receiving criticism for inaccuracies. The tool, part of Google's suite of AI models, was taken offline due to controversies surrounding its historical inaccuracies. Led by Google DeepMind CEO Demis Hassabis, efforts are underway to address these issues and restore functionality. The controversy underscores the importance of ethical considerations in AI development and the need for ongoing refinement and improvement in this evolving field.
> OpenAI accuses The New York Times of "hacking" ChatGPT and other AI systems to fabricate evidence for a copyright lawsuit. OpenAI claims the Times used deceptive prompts to reproduce its material, violating terms of use. The Times rejects the hacking accusation, asserting it used OpenAI's products to uncover evidence of copyright infringement. The lawsuit alleges OpenAI and Microsoft used NYT articles to train chatbots without permission. Tech companies argue that training AI systems on copyrighted material is fair use and vital for industry growth, while courts grapple with the fair-use question. OpenAI asserts the Times needed numerous attempts to generate anomalous results and predicts AI companies will prevail in their fair-use cases.
Global Collaborations and Strategic Partnerships
> Pika introduces Lip Sync, a new feature enabling users to add spoken dialog to AI-generated videos with synchronized mouth movements. Powered by ElevenLabs, Lip Sync supports text-to-audio and uploaded audio tracks, expanding creative possibilities. Early access is available to Pika Pro users and Super Collaborators. While Pika's AI video quality may trail OpenAI's Sora and Runway, Lip Sync disrupts traditional filmmaking barriers, enhancing narrative film creation. Meanwhile, Runway updates its Multi Motion Brush with region detection, simplifying object selection for motion effects.
> Apple has scrapped its long-running Project Titan, ending its pursuit of an autonomous electric car to focus on generative AI. Since its launch in 2014, the ambitious project faced technical setbacks and leadership changes. Apple's COO Jeff Williams and VP Kevin Lynch announced the decision internally, redirecting part of the car team to the AI division under John Giannandrea. Investors reacted positively, pushing Apple's shares up by 1%. This move aligns with industry trends as EV demand plateaus, prompting automakers to refocus on hybrid vehicles. With heavy R&D investments, notably $113 billion over five years, Apple acknowledges AI's long-term profitability potential.
> Google is reportedly paying news publishers to use an unreleased generative artificial intelligence (AI) platform through its Google News Initiative (GNI). The program offers beta access to a suite of gen AI tools, and in return, publishers are expected to produce a fixed volume of content for 12 months. The AI platform allows under-resourced publishers to efficiently aggregate content by summarizing and publishing reports from other organizations. Critics express concerns about potential negative impacts on original sources and question whether such practices align with the mission of GNI. The move has prompted debate about how Google extracts revenue from the publishing world.
5 new AI-powered tools from around the web
Saner.AI enhances note-taking with instant saving, auto-organizing, and semantic search. With AI assistance, users capture, find, and develop ideas, boosting productivity and work quality.
DaLMatian streamlines data analysis by instantly answering business stakeholders’ ad-hoc questions. Analysts can focus on high-impact analysis, boosting productivity. On-prem/local deployment, quick setup in 5 minutes.
SkimAI is an AI-powered email assistant streamlining email management. With AI-generated summaries, draft responses, and inbox sorting, professionals can quickly identify crucial messages, saving time.
Thinkbuddy AI for macOS transforms user experience with smooth integration of ChatGPT, offering voice/text commands, customized prompts, and AI model selection.
Post Cheetah utilizes AI to generate comprehensive SEO strategies by merging Google Data with automated analysis, empowering users to optimize websites effortlessly. Integrates with popular CMS platforms.
arXiv is a free online library where researchers share pre-publication papers.
MobiLlama presents a fully transparent, efficient Small Language Model (SLM) for resource-constrained devices. By employing a shared feed-forward network (FFN) design across transformer blocks, it reduces model parameters while maintaining accuracy and performance. Evaluated on nine benchmarks, MobiLlama surpasses existing SLMs, offering a notable reduction in training cost and model size. Future enhancements may focus on improving context comprehension and addressing potential biases to bolster the model's robustness. This initiative addresses the need for accessible and efficient language models, particularly in scenarios where computational resources are limited. MobiLlama's approach marks a significant step towards democratizing access to advanced language processing capabilities while ensuring transparency and performance.
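To get a feel for why sharing one feed-forward network across transformer blocks shrinks a model, here is a rough back-of-the-envelope sketch. It is not MobiLlama's code: the function `count_params`, the three-matrix FFN assumption, and the layer sizes are all hypothetical choices made for illustration, and the count ignores embeddings, normalization layers, and biases.

```python
def count_params(d_model, d_ff, n_layers, share_ffn, ffn_matrices=3):
    """Rough weight count for a stack of transformer blocks.

    Assumptions (illustrative, not taken from the paper): attention uses
    four d_model x d_model projections, and each FFN uses `ffn_matrices`
    weight matrices of shape d_model x d_ff.
    """
    attn_per_layer = 4 * d_model * d_model        # Q, K, V, output projections
    ffn_per_copy = ffn_matrices * d_model * d_ff  # one full feed-forward network
    # With sharing, every block reuses the same FFN weights, so only one copy is stored.
    ffn_copies = 1 if share_ffn else n_layers
    return n_layers * attn_per_layer + ffn_copies * ffn_per_copy


if __name__ == "__main__":
    # Hypothetical sizes chosen only to show the scale of the effect.
    sizes = dict(d_model=2048, d_ff=5632, n_layers=22)
    separate = count_params(**sizes, share_ffn=False)
    shared = count_params(**sizes, share_ffn=True)
    print(f"separate FFN per block: {separate:,} weights")
    print(f"one shared FFN:         {shared:,} weights")
    print(f"saved by sharing:       {separate - shared:,} weights")
```

Under these assumptions the FFN accounts for most of each block's weights, so storing it once instead of once per block removes a large share of the total.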
This paper investigates whether Large Language Models (LLMs) engage in latent multi-hop reasoning. The authors explore how LLMs handle prompts requiring multiple deductions, examining how the models recall and use knowledge. Findings reveal evidence of latent multi-hop reasoning, particularly in specific fact composition types, though its prevalence varies across prompts. Notably, there is substantial evidence for the first hop of reasoning, with larger model sizes correlating with enhanced performance. However, evidence for the second hop and multi-hop traversal is less consistent. The study offers insights into LLM capabilities and proposes avenues for future research to bolster latent reasoning abilities, crucial for parameter efficiency and controllability.
This paper from Playground AI presents significant advancements in text-to-image generative models, focusing on enhancing aesthetic quality. Three critical aspects are addressed: improving color and contrast, accommodating various aspect ratios, and refining human-centric fine details. By refining the noise schedule and adopting the EDM framework, the model achieves vibrant color, faithful prompt-image alignment, and the ability to produce images with pure-colored backgrounds. Furthermore, it demonstrates versatility in generating images across different aspect ratios, ensuring consistency and fidelity. Through extensive user studies and benchmarking against state-of-the-art models, it establishes itself as a leading contender in text-to-image generation, offering valuable insights for researchers and practitioners.
The paper introduces Sora, a video generation model demonstrating remarkable geometric consistency. In the absence of established metrics for quantitatively evaluating its fidelity to real-world physics, the authors propose a benchmark assessing video quality based on adherence to physical principles. Leveraging 3D reconstruction, they gauge geometric fidelity by transforming Sora's generated videos into 3D models. Comparisons with baselines, Pika and Gen2, reveal Sora's significant advantages in geometry consistency. The paper presents detailed methods for 3D reconstruction, sparse matching, and evaluation metrics. Experimental results demonstrate Sora's superiority in geometric quality, supported by visualizations and matching analyses.
The paper introduces a novel approach to enhancing the proficiency of large language models (LLMs) in interpreting and utilizing structured data sources like tables, graphs, and databases. Despite the demonstrated capabilities of LLMs in plain text tasks, their performance in structured data tasks remains limited. To address this, the authors present StructLM, a series of models trained on a comprehensive instruction tuning dataset comprising 1.1 million examples. The models, ranging from 7B to 34B parameters, surpass task-specific models on various structured knowledge grounding (SKG) tasks, establishing new state-of-the-art achievements. Notably, StructLM demonstrates exceptional generalization across novel SKG tasks and reveals insights into the impact of model size on performance, suggesting that structured knowledge grounding remains a challenging task requiring innovative solutions.
ChatGPT Creates Comics
Thank you for reading today’s edition.
Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.
Interested in reaching smart readers like you? To become an AI Breakfast sponsor, apply here.