Wearable AI Smart Glasses
Good morning. It’s Monday, February 12th.
Did you know: On this day in 2008, Yahoo rejected Microsoft's bid to purchase the company, saying that the $44.6 billion offer "substantially undervalues" the company. Eight years later, Verizon agreed to acquire Yahoo's core business for $4.83 billion.
In today’s email:
AI Advancements and Developments
AI Applications and Products
AI Policy and Investment
5 New AI Tools
Latest AI Research Papers
You read. We listen. Let us know what you think by replying to this email.
Today’s trending AI news stories
AI Advancements and Developments
> Brilliant Labs introduces the Frame smart glasses, priced at $349 and billed as having "AI superpowers." These open-source glasses offer AI translations, web search, and visual analysis directly in front of your eyes. The glasses are controlled by voice commands, and users can identify landmarks, search for items, and access nutrition info through the lens overlay. Frame comes in black, gray, or clear, with an option for prescription lenses at $448. It pairs with the Noa app, which provides OpenAI visual analysis and Whisper translation. Frame has a 640 x 400 pixel OLED display, a 1280 x 720 camera, and a Lua-based operating system.
> Google DeepMind has developed a grandmaster-level chess AI built on a language model architecture. The model uses a Transformer and plays without the explicit search algorithms that conventional chess engines rely on. Trained through supervised learning on a dataset of 10 million chess games, it reached a grandmaster-level Elo rating of 2895 in blitz games against human opponents. It also outperforms AlphaZero's policy and value networks when those are run without search. Despite some limitations, such as its inability to retain game history, this research challenges established perceptions of Transformer models and shows their versatility beyond linguistic domains.
> A wheeled robot developed at Carnegie Mellon University, first trained through imitation learning, teaches itself to open doors, cabinets, and drawers. The robot adapts to objects it has not seen before and reaches a success rate of about 95%. It needed 30 minutes to an hour of practice to learn each object, and it was able to handle heavy doors. The system costs $25,000, less than other adaptive learning robots. This development marks progress toward more general robotic manipulation systems.
> AI companies Stability, Midjourney, Runway, and DeviantArt are seeking the dismissal of a class-action copyright case filed by artists. The dispute centers on the alleged infringement of copyright through the use of artists' works in training AI models. The companies assert that their models create entirely new products and that the artists fail to show replication by third parties. DeviantArt emphasizes that it doesn't produce AI and challenges its inclusion in the case. Runway counters the claim that it stores copies, citing the limited access provided. Stability contends that AI models are not infringing works.
> Nvidia's CEO, Jensen Huang, argued that every country needs to build its own sovereign AI infrastructure to capture the economic potential of AI while safeguarding its cultural identity. Speaking at the World Government Summit in Dubai, Huang emphasized Nvidia's role in 'democratizing' AI access through efficiency gains in AI computing. He dismissed exaggerated fears about AI dangers and urged nations to take the initiative in building AI infrastructure. Despite recent U.S. restrictions, Nvidia collaborates with China and the Middle East for export compliance. The company, valued at $1.73 trillion, remains a dominant force in high-end AI chips.
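For readers curious what "chess without search" in the DeepMind item above could look like in practice, here is a minimal Python sketch. It assumes the model scores each legal move once and plays the highest-scoring one, with no look-ahead. The function `pick_move`, the `win_probability` callback, and the toy score table are all hypothetical stand-ins for illustration, not DeepMind's code or data.

```python
from typing import Callable, Dict, Iterable


def pick_move(
    position: str,
    legal_moves: Iterable[str],
    win_probability: Callable[[str, str], float],
) -> str:
    """Return the legal move with the highest predicted win probability.

    No game tree is explored: each candidate move is scored exactly once.
    """
    moves = list(legal_moves)
    if not moves:
        raise ValueError("no legal moves in this position")
    return max(moves, key=lambda move: win_probability(position, move))


# Hypothetical stand-in for a trained network: a fixed lookup table.
TOY_SCORES: Dict[str, float] = {"e2e4": 0.54, "d2d4": 0.55, "g1f3": 0.52}


def toy_model(position: str, move: str) -> float:
    """Assumed scoring function; a real model would read the position."""
    return TOY_SCORES.get(move, 0.0)


best = pick_move("startpos", TOY_SCORES.keys(), toy_model)
print(best)  # d2d4, the highest-scoring entry in the toy table
```

A conventional engine would instead explore many future positions before choosing. In this sketch, all of the playing strength would have to come from the quality of the scoring function.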
AI Applications and Products
> Google CEO Sundar Pichai has revealed that the number of Google One subscribers has soared past 100 million, alongside the debut of the AI Premium Plan. Priced at $19.99 per month, this new offering introduces advanced features powered by Google's Gemini AI model, including Gemini Advanced, with integration into Gmail, Docs, and more on the horizon. Alongside this premium tier, Google One offers Basic, Standard, and Premium plans ranging from $1.99 to $9.99 per month. These developments underscore Google's commitment to enhancing user experiences through AI-driven innovations and expanding its subscription offerings.
> Microsoft is offering a glimpse into the AI-infused future of Windows 11, particularly with its evolving Copilot feature. The recent testing of a "new experience" reveals an animated Copilot in the taskbar, dynamically reacting when users copy text or images. This revamp includes an icon that transforms and animates, signaling Copilot's potential assistance. Hovering over the icon provides a menu of actions, such as summarizing copied text. Microsoft envisions Copilot as a significant AI brand, aligning with its broader strategy for an AI-focused PC landscape. The company is discreet about its plans, but 2024 is expected to mark the "year of the AI PC."
AI Policy and Investment
> Nvidia is reportedly launching a new division dedicated to designing custom chips, including advanced AI processors, for cloud computing companies. According to nine sources, Nvidia aims to tap into the rapidly expanding market for bespoke AI chips while protecting its dominant position in the high-end AI chip market, where its share currently stands at 80%. The company's major clients, such as OpenAI, Microsoft, Alphabet, and Meta Platforms, are increasingly developing their own chips, prompting Nvidia to strategize its response.
> Google has pledged 25 million euros ($26.98 million) to enhance AI skills in Europe, emphasizing its commitment to addressing potential inequalities. The tech giant, in collaboration with social enterprises and nonprofits, will fund programs targeting those who stand to benefit the most from AI training. Google also plans to conduct "growth academies" supporting AI-driven companies and expand its free online AI training courses to 18 languages. Adrian Brown, executive director of the Centre for Public Impact, stated that the initiative, introduced alongside a comprehensive investment strategy for Germany, is designed to empower individuals and prevent the exacerbation of current economic inequalities resulting from AI development.
> Perplexity AI has partnered with web development platform Vercel to extend the reach of its AI search and native discovery engine. Developers using Vercel will be able to integrate Perplexity's large language models into their applications as a knowledge support system. This collaboration is part of Perplexity's broader efforts to establish itself as a major player in the AI domain. Vercel, formerly known as ZEIT, provides cloud platform services to help developers build, deploy, and host web applications, including those focused on AI, such as recommendation systems and chatbots.
Today’s edition is brought to you by:
Our book, Decoding AI: A Non-Technical Explanation of Artificial Intelligence is on sale for just $2.99 today only!
(with a 100% money-back guarantee)
Decoding AI breaks down the complexities of AI into digestible concepts, walking you through its history, evolution, and real-world applications.
We'll introduce you to the key players in the AI field, as well as explain the underlying algorithms, data, and machine learning concepts that power AI systems. You'll gain a deeper understanding of deep learning, neural networks, and reinforcement learning, and we'll explore various types of AI, from rule-based systems to probabilistic networks and beyond.
The goal was to make this book an approachable discovery of how AI works.
It discusses a wide range of applications AI has in areas like natural language processing, computer vision, robotics, and predictive analytics. It also delves into the regulatory landscape and policy issues surrounding AI, as well as the potential future developments in AI, such as its applications in healthcare, education, transportation, and even space exploration.
You'll also learn the difference between narrow AI and Artificial General Intelligence (AGI), and how to get started with using AI through tips and resources.
Price goes back to $9.99 after today’s sale. 100% money-back guarantee if not satisfied.
5 new AI-powered tools from around the web
3DAiLY Beta is a GenAI-fueled platform for the 3D industry. It offers an intuitive 3D editor, an “AI+ Artist collaboration” feature that pairs AI with the expertise of vetted 3D artists, custom asset creation, and content management.
Jan.ai is an offline, open-source ChatGPT alternative with a user-friendly interface, cross-platform compatibility, support for multiple LLMs, and ChatGPT integration. Committed to democratizing AI, it is 100% free and open source, with no paywalls.
Huudle AI Project Assistant guides your team’s projects like GPS. It connects meetings, tracks key points, supports video message updates, and lets AI manage follow-ups for seamless collaboration.
Pathway presents an event-processing engine tailored for the AI era. Craft stream processing pipelines in Python for real-time analytics and AI insights, integrating smoothly with machine learning libraries.
MathGPTPro is an AI-driven learning platform for personalized, interactive experiences. It features on-demand tutoring, instant assistance, Socratic questioning, and a knowledge graph with a heatmap to track learning progress.
Latest AI Research Papers
arXiv is a free online library where researchers share pre-publication papers.
Keyframer, an LLM-powered animation design tool, innovates in animation creation through natural language prompting and direct editing. Users adopt a 'decomposed' prompting strategy, iteratively refining goals with the LLM's output. Unlike one-shot prompting, Keyframer's human-centered approach allows users, both beginners and professionals, to collaboratively construct animations. The tool's flexibility supports users in adapting goals and exploring possibilities through small, incremental steps. User insights highlight the effectiveness of LLMs for animation creation without prior experience. Key challenges include encouraging use of design variants and enhancing interpretability through visual highlighting. The study contributes valuable findings for LLM-powered design tools and animation creation.
The paper from Meta introduces a groundbreaking method for animating stickers using video diffusion models. By adapting Emu's text-to-image model and employing a two-stage fine-tuning process, the model successfully closes the gap between natural videos and sticker animations. Through human-in-the-loop (HITL) training and optimizations, it achieves rapid generation of high-quality, contextually relevant motion in under one second per batch. The study addresses shortcomings of prior models by enhancing motion quality, consistency, and relevance. Future directions include expanding frame output and improving motion smoothness. The research underscores the effectiveness of the ensemble-of-teachers approach and offers avenues for further advancement.
📄 ViGoR: Improving Visual Grounding of Large Vision Language Models with Fine-Grained Reward Modeling
The paper introduces a novel framework named ViGoR (Visual Grounding Through Fine-Grained Reward Modeling) that significantly enhances the visual grounding of large vision language models (LVLMs). By leveraging fine-grained reward modeling, ViGoR addresses issues such as hallucinations, missing scene elements, and inaccurate object relationships in LVLM-generated text descriptions of images. The framework efficiently improves LVLMs using human evaluations and automated methods, demonstrating marked advancements over pre-trained baselines like LLaVA. Additionally, the paper introduces a comprehensive dataset, MMViG, for validating LVLMs' visual grounding capabilities and plans to release human annotation data for further research community contributions.
The paper introduces InternLM-Math, an open-source math reasoning large language model (LLM) based on InternLM2. It unifies various reasoning techniques like chain-of-thought reasoning, reward modeling, and formal reasoning in a seq2seq format, enhancing its abilities as a math reasoner, verifier, prover, and augmenter. InternLM-Math achieves state-of-the-art performance in informal and formal math reasoning benchmarks, surpassing existing models like Minerva and Llemma. It utilizes LEAN for translation and reasoning, explores reasoning interleaved with coding (RICO), and integrates verification and proof capabilities. The model is released along with code and data for further research.
The Aya Dataset is introduced as a groundbreaking initiative to address the lack of multilingual data in natural language processing (NLP). Authored by a diverse team, it spans 65 languages, aiming to bridge the language gap in instruction fine-tuning (IFT). The dataset, comprising 204,114 instances, is collaboratively curated, ensuring linguistic diversity and cultural representation. The accompanying Aya Collection expands multilingual coverage to 114 languages, totaling 513 million instances. Additionally, the Aya Evaluation Suite offers a comprehensive evaluation framework. By open-sourcing these resources, including the Aya Annotation Platform, the initiative fosters collaborative efforts to enhance multilingual NLP research and applications worldwide.
ChatGPT Creates Comics
Thank you for reading today’s edition.
Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.
Interested in reaching smart readers like you? To become an AI Breakfast sponsor, apply here.