BrainGPT and the Dawn of Interactive AI

AI Breakfast
December 27, 2023

Good morning. It’s Wednesday, December 27th.

Did you know: On this day in 1982, Time Magazine named the personal computer "Machine of the Year."

In today’s email:

AI Innovations and Technological Advances
AI and Ethics/Legal Issues
AI Research, Perception, and Policy
5 New AI Tools
Latest AI Research Papers

You read. We listen. Let us know what you think by replying to this email.

Interested in reaching 47,304 smart readers like you? To become an AI Breakfast sponsor, apply here.

Today’s trending AI news stories

AI Innovations and Technological Advances

> BrainGPT? Australian researchers developed DeWave, a non-invasive AI translating thoughts to text via EEG-recorded brainwaves. Tested on subjects reading silently, DeWave achieved over 40% accuracy, aiming for 90%. Unlike invasive or MRI-based methods, it’s practical for daily use, potentially aiding stroke victims and controlling bionic devices. DeWave uniquely encodes raw EEG into language without eye-tracking, a significant neuroscience and AI advancement. Watch a demo.

> Interactive AI is expected to outshine current generative AI models like ChatGPT in 2024. Experts predict these advanced AI systems will interact in a more human-like way, enabling deeper conversations and decision-making abilities. Interactive AI will learn from user feedback, adapting to preferences for more personalized experiences. It will play a key role in enhancing customer service, sales, and marketing by offering tailored communication. The technology is poised to handle complex tasks involving interactions with humans, websites, and other chatbots. Companies like Google DeepMind are pioneering this evolution, with co-founder Mustafa Suleyman highlighting conversation as the future AI interface.

> Jony Ive, Apple's former chief design officer, and OpenAI CEO Sam Altman are reportedly collaborating on a new AI hardware device. While specifics are unclear, the project explores "new hardware for the AI age." Their previous collaborations include Meyerhoffer's retina-scanning Orb for Worldcoin. Altman has worked with Humane on a wearable AI device, and SoftBank's Masayoshi Son is also involved in discussions. The device's nature and manufacturing details remain unspecified.

> Leonardo AI’s latest innovation allows users to transform static images into animated motion videos, marking a significant advancement in the realm of visual content creation. This feature, particularly valuable for creative professionals, offers a range of effects to enhance digital imagery dynamically. Aimed at elevating the experience of visual storytelling, the tool provides users, especially those in marketing and design, a new avenue to captivate their audience. While the transformation costs 25 credits per video, users on the free plan receive a daily allowance for creating multiple motion videos.

> Microsoft’s AI chatbot tool, Copilot, has been introduced to Android devices, offering functionality similar to its desktop version. Powered by OpenAI’s GPT-4 and DALL-E 3, Copilot can assist with tasks like coding and email drafting, and can generate images from text descriptions. The app is available on the Google Play Store for free, without requiring a Microsoft account. Its quiet release on Android precedes an iOS version, emphasizing Microsoft’s expansion strategy for Copilot, which includes new features like video summarization and song creation.

AI and Ethics/Legal Issues

> Australian courts are considering integrating AI to enhance efficiency and minimize unconscious bias, particularly in bail decisions. The Australasian Institute of Judicial Administration supports this adoption and has issued new guidelines urging courts to embrace new technologies. However, judges in New South Wales emphasize the need to rigorously examine any AI tools before implementing them in legal proceedings, to ensure their reliability and fairness.

> A new bill proposed by Reps. Anna Eshoo and Don Beyer, called the AI Foundation Model Transparency Act, aims to regulate AI companies' use of copyrighted training data. The bill, which calls for FTC and NIST involvement, requires companies to disclose training data sources and computational details, address model limitations, and ensure alignment with federal AI standards. It emphasizes the need for transparency in how AI models are trained, particularly concerning the copyright issues highlighted by recent lawsuits against AI firms.

> OpenAI's transition from plugins to GPTs for ChatGPT development has sparked concerns among developers. The shift, aimed at making AI more accessible to consumers, could potentially alienate the developers crucial to OpenAI's progress. GPTs, unlike plugins, operate within ChatGPT's chat interface and are seen as less versatile but more user-friendly. OpenAI's move echoes a broader trend towards consumer-centric approaches in technology, potentially impacting the developer community's role in AI development.

> Entrupy, a technology company, has developed an AI tool capable of authenticating luxury items like handbags and sneakers with a 99.1% accuracy rate. This AI authenticator, popular among vintage resellers, can verify products from high-end brands such as Louis Vuitton and Chanel. Entrupy’s service, which requires users to take detailed photos using a specialized device, generates an official certificate for authenticated items. The tool is currently limited to major brands and is gaining traction, especially following a partnership with TikTok to identify fake products on TikTok shops.

AI Research, Perception, and Policy

> A study by the University of Melbourne and the University of Western Australia, published in Frontiers in Psychology, reveals that ChatGPT’s advice is deemed more balanced, comprehensive, empathetic, and helpful than that of human advice columnists. The study involved 404 participants evaluating responses from ChatGPT and columnists to 50 social dilemma questions. Although ChatGPT’s advice was preferred in about 70-85% of cases, 77% of participants still favored human responses, indicating a cultural or social preference rather than a reflection of the advice’s quality.

> Meta’s Chief AI Scientist Yann LeCun dismisses the idea that terrorists could harness open-source AI for global domination. He highlights the immense resources required to exploit AI at such a scale, noting that even powerful nations face limitations due to the U.S. AI chip export bans. LeCun downplays fears of AI as an existential threat, underlining Meta’s commitment to open-source AI development and its limited risk of misuse in malicious hands.

In partnership with CODEMAKER AI

Attention Developers:

CodeMaker AI allows you to process and generate code for entire source call hierarchies. Watch the demo here:

As a result, the solution was able to generate the source code of a multi-layered code base and generate code for an entire application.

This capability works with an existing code base, and the only prerequisite is to have files that define stubs of classes, structures, methods, or functions.

CodeMaker AI can detect files that are already implemented - skipping them during processing and leaving them untouched. Each source file will be used automatically in the generation of any dependent files.

This is a development version of CodeMaker AI, with more features and improvements becoming available in the future.

Best of all…

Thank you for supporting our sponsors!

5 new AI-powered tools from around the web

Zocket is a generative AI advertising powerhouse streamlining ad creation and launch on social platforms. With easy-to-use features and real-time insights, it revolutionizes the advertising game, making complex processes effortless.

Convertlyio enables creation of AI-powered quiz funnels in seconds. Designed for effective prospect qualification, it offers a personalized, interactive experience and simplifies capturing high-intent, ready-to-buy leads with an intuitive quiz builder and easy website embedding.

Inbox Zero is an open-source email app offering AI-driven inbox management and AI analytics. It streamlines email handling with AI assistance for subscription cleanup, automated replies, and archiving. Users can view the code, self-host for privacy, or contribute to development.

BabyStoryAI offers personalized audiobooks for children, crafted using AI to align with moral values and parental goals. It features custom music and voice options, with plans for future enhancements like video content and voice cloning.

Clearword is an AI-powered meeting assistant providing real-time summaries, action items, and CRM updates during calls. It offers automatic note-taking, follow-up emails, a searchable knowledge base, privacy controls, and seamless integration with other platforms.

Jellypod converts email subscriptions into personalized daily podcasts, summarizing content in an audio format. It offers adjustable playback speeds, various voices, offline mode, customizable schedules, ad-free listening, email forwarding, and privacy-centric features.

Design Buddy is a Figma plugin that acts like a collaborative teammate, providing critical feedback on UI, design elements like layout, color, typography, and accessibility. It also identifies potential design flaws.

arXiv is a free online library where researchers share pre-publication papers.

📄 InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks

InternVL introduces a groundbreaking new vision-language foundation model. InternVL upscales the vision foundation model to a staggering 6 billion parameters and seamlessly aligns it with large language models (LLMs) for diverse visual-linguistic tasks. Utilizing web-scale image-text data, this model excels in tasks like zero-shot image/video classification, image/video-text retrieval, and multi-modal dialogue systems. Its robust capabilities position it as a formidable alternative to existing models like ViT-22B.

📄 Parrot Captions Teach CLIP to Spot Text

The paper addresses CLIP’s text-spotting bias in vision-language applications. Analysis of the LAION-2B dataset reveals over 50% of images with embedded text, leading to ‘parrot’ captions that mimic this text. Training CLIP models on LAION subsets curated by text-embedded criteria shows that parrot captions induce text-spotting bias but hinder genuine vision-language learning. This study underlines the need to reevaluate CLIP-like models and image-text dataset curation processes.

📄 DreamDistribution: Prompt Distribution Learning for Text-to-Image Diffusion Models

The paper focuses on enhancing Text-to-Image (T2I) diffusion models to generate diverse images with reference visual attributes. The study introduces DreamDistribution, a method that learns a set of soft prompts to personalize T2I models at a conceptual level. It adapts from a set of reference images, creating new instances with sufficient variations. This approach offers text-guided editing and flexibility in controlling variations. The effectiveness is demonstrated through various applications, including text-to-3D generation and synthetic dataset creation, supported by quantitative and human assessments.

📄 PlatoNeRF: 3D Reconstruction in Plato’s Cave via Single-View Two-Bounce Lidar

“PlatoNeRF,” by Meta researchers, is a pioneering method for 3D scene reconstruction from a single view, leveraging two-bounce signals from single-photon lidar. This technique, drawing inspiration from Plato’s Cave allegory, effectively discerns both visible and occluded geometries. Unlike traditional methods, it doesn’t depend on data priors or shadow detection, which are often hindered by ambient light or low albedo. PlatoNeRF’s robustness against diverse lighting and surface conditions makes it highly adaptable, especially for consumer devices with single-photon lidars, offering significant advancements in extended reality and autonomous vehicle technology.

📄 YAYI 2: Multilingual Open-Source Large Language Models

YAYI 2, a multilingual open-source LLM with a 30 billion parameter base, stands out for its focus on Chinese contexts and superior performance in multilingual scenarios. It is trained on a 2.65 trillion-token corpus, significantly addressing the Chinese language gap in existing models like Llama 2 and Falcon. With advanced techniques like FlashAttention 2 and MQA, and a rigorous pre-training data processing pipeline, YAYI 2 achieves remarkable results in benchmarks like MMLU and CMMLU. Its alignment through supervised fine-tuning and reinforcement learning from human feedback ensures robustness in various tasks, including knowledge understanding, logical reasoning, and programming. Despite its achievements, YAYI 2’s development continues, focusing on safety and reducing hallucinations.

ChatGPT + DALL-E 3 Attempts Comics

Thank you for reading today’s edition.

Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.