Scientists Uncover Biological Echoes in Powerful AI Transformer Models
Good morning. It’s Monday, August 21st.
Did you know: 30 years ago today, NASA lost contact with the Mars Observer spacecraft?
In today’s email:
UK invests $130M in AI chips
FDA approves AI for X-ray analysis
Allen Institute launches Dolma dataset
AI-created art not copyrightable
ElevenLabs partners with ScienceCast for summaries
Google's AVIS improves image search
Nestle and Unilever use generative AI
AI transformers likened to astrocyte-neuron networks
NCSoft debuts AI suite for gaming
Journals reevaluate AI-assisted papers
IBM: 40% workers to re-skill due to AI
5 New AI Tools
Latest AI Research Papers
You read. We listen. Let us know what you think of this edition by replying to this email, or DM us on Twitter.
Today’s edition is brought to you by:
Tired of explaining the same thing over and over again to your colleagues?
It’s time to delegate that work to AI.
Guidde is a GPT-powered tool that helps you explain the most complex tasks in seconds with AI generated documentation.
Turn boring documentation into stunning visual guides
Save valuable time by creating video documentation 11x faster
Share or embed your guide anywhere for your team to see
Simply click capture on our browser extension and the app will automatically generate step-by-step video guides complete with visuals, voiceover and call to actions.
The best part? The Guidde extension is 100% free.
Today’s trending AI news stories
UK to spend $130M on AI chips amid scramble to buy up computing power: The UK has allocated £100 million ($130 million) for AI chip purchases to establish an AI Research Resource by 2024, aiming to address a global computing power shortage. The government is sourcing chips from NVIDIA, Intel, and AMD, but the funding falls short of its ambitions, likely adding pressure for further investment. A report highlights companies’ struggles to deploy AI due to resource constraints, and S&P Global’s AI trend reports emphasize the significance of adequate computing power in AI development.
FDA clears AI-powered software that pinpoints suspicious findings on chest X-rays: The FDA has approved Imidex Inc’s AI-powered software, VisiRad XR, to detect suspicious findings on chest X-rays. Developed using curated global datasets and advanced machine learning, the tool aids radiologists by flagging potentially missed nodules and masses. VisiRad XR integrates into existing imaging workflows, supporting radiologists and enhancing patient care. In large-scale studies, the software achieved a sensitivity of 83% and could potentially identify hundreds of lung nodules or masses at hospitals performing numerous chest X-rays annually.
Allen Institute Unveils Dolma, Largest Open Training Dataset for Large Language Models: The Allen Institute introduces Dolma, a colossal open-source dataset with 3 trillion tokens for training AI language models. Dolma aims to balance transparency, size, and risk mitigation, setting itself apart by offering accessibility while addressing concerns about large-scale model datasets. Its unique licensing terms allow access while preventing misuse. The Allen Institute plans to expand Dolma with more data sources and languages, fostering openness in AI research.
AI-Created Art Isn't Copyrightable, Judge Says in Ruling That Could Give Hollywood Studios Pause: A federal judge upheld the U.S. Copyright Office’s finding that a piece of art created by AI is not eligible for copyright protection. The judge’s order denied a challenge by Stephen Thaler, CEO of neural network firm Imagination Engines, who claimed an AI system he developed called the Creativity Machine should be recognized as the sole creator of an artwork. The ruling highlighted that copyright law protects only works of human creation and emphasized the importance of human authorship in copyrightability. This decision has implications for AI-generated creative works in various industries, including entertainment.
ElevenLabs’ Collaboration with ScienceCast and arXiv Generates Digestible Videos for Open Access Research: ElevenLabs has partnered with the open-access science video platform ScienceCast to create narrated summaries of scientific papers. ElevenLabs’ voiceover technology now powers ScienceCast’s automatic video tool, condensing research papers from arXiv into 3 to 5-minute “elevator pitches.” This collaboration enhances accessibility to scientific research, making content more digestible for researchers with limited time and visually impaired individuals. The partnership exemplifies AI’s role in enhancing accessibility to scientific results and promoting open science efforts.
AVIS showcases Google's progress in AI-powered image search: Google’s AVIS, an Autonomous Visual Information Seeking method, integrates large language models with computer vision, web search, and image search tools to autonomously answer complex questions about images. AVIS’s components include a planner for deciding which action to take next, a working memory for accumulating results, and a reasoner for processing tool output, with actions adapted based on real-time feedback. AVIS achieved high accuracy on benchmark datasets, outperforming fine-tuned models. The team plans to explore its framework for other reasoning tasks and to experiment with lighter language models.
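The planner / working-memory / reasoner loop described above can be sketched roughly as follows. This is a hypothetical illustration of the control flow only: the tool names, planner, and reasoner stand-ins are invented for the example and are not Google's implementation.

```python
def answer_visual_question(question, image, tools, planner, reasoner, max_steps=5):
    memory = []                              # working memory of (tool, result) pairs
    for _ in range(max_steps):
        action = planner(question, memory)   # planner picks the next tool, or stops
        if action == "answer":
            break
        result = tools[action](image, question)  # e.g. object detection, web search
        memory.append((action, result))
    return reasoner(question, memory)        # reasoner synthesizes a final answer

# Toy stand-ins to exercise the loop (purely illustrative):
tools = {"detect": lambda img, q: "a red bird", "search": lambda img, q: "cardinal"}
planner = lambda q, mem: ["detect", "search", "answer"][len(mem)]
reasoner = lambda q, mem: mem[-1][1]
answer = answer_visual_question("What species is this?", None, tools, planner, reasoner)
print(answer)  # prints: cardinal
```

In the real system the planner and reasoner are LLM calls and the tools are vision and search APIs; the point here is just the feedback loop between planning, tool use, and memory.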
From Mad Men to machines? Big advertisers shift to AI: Major advertisers like Nestle and Unilever are embracing generative AI software to enhance productivity and cost-efficiency in their advertising campaigns. Nestle is exploring the use of ChatGPT 4.0 and DALL-E 2 to enhance its marketing efforts, according to Aude Gandon, its Global Chief Marketing Officer. The AI engine generates ideas aligned with brand strategy, which are then refined by the creative team for content production. Unilever also employs generative AI for product descriptions, but concerns about biases and intellectual property persist. WPP, the world’s largest advertising agency, is already leveraging generative AI for campaigns, expecting significant savings.
Scientists Uncover Biological Echoes in Powerful AI Transformer Models: New research suggests that the AI transformer architecture and biological astrocyte-neuron networks share surprising similarities. A collaborative study by MIT, MIT-IBM Watson AI Lab, and Harvard Medical School proposed that astrocytes, brain cells that support neurons, could replicate transformers’ core computations. The study explains how astrocytes’ signal integration mirrors the spatial and temporal memory required for self-attention, a key aspect of transformers. While these insights bridge neuroscience and AI, understanding human cognition’s intricacies remains a challenge, requiring interdisciplinary effort.
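The transformer computation the study maps onto astrocyte signaling is self-attention. A minimal single-head sketch of the standard formulation (random toy weights, not anything from the paper):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention: every token attends to every other token."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # scaled pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted mix of values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                          # 4 tokens, 8-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # prints: (4, 8)
```

The study's proposal is that astrocytes, by integrating signals across many synapses over time, could hold the kind of spatial and temporal state this weighted all-to-all mixing requires.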
NCSoft's new AI suite is trained to streamline game production: NCSoft, the South Korean game developer, has introduced a suite of AI large language models (LLMs) called VARCO to enhance game development. VARCO includes LLM, Art, Text, and Human models for generating text and images and managing digital humans. The LLM will be released in Korean initially, with English and bilingual versions to follow. The models are intended to let human designers focus on complex tasks while AI handles repetitive ones. NCSoft also plans to apply AI to fields beyond gaming, such as fashion, health, robotics, and content.
Academic journals scramble to adapt as AI-assisted papers sneak into publications: Publishers and researchers are developing policies to address concerns about credibility and plagiarism. While AI tools can aid non-native English speakers and improve academic writing, they also raise the risk of “paper mills” producing low-quality research. The rise of multimodal models, such as Google DeepMind’s Gemini, could further impact the generation and analysis of various types of data. Striking a balance between policy and technology integration remains a challenge for publishers.
40% of workers will have to reskill in the next three years due to AI, says IBM study: IBM’s study reveals that 40% of workers, around 1.4 billion individuals, need to reskill within three years due to AI’s impact on job roles. However, IBM emphasizes that generative AI will likely enhance roles rather than replace them, with 87% of surveyed executives sharing this sentiment. The study indicates that adaptability and people skills are now prioritized over technical skills, marking a shift in the workforce’s skill paradigm.
🎧 Did you know AI Breakfast has a podcast read by a human? Join AI Breakfast team member Luke (an actual AI researcher!) as he breaks down the week’s AI news, tools, and research: Listen here
5 new AI-powered tools from around the web
Transform research papers into usable code. Engage in insightful conversations with papers, automate method translation to Python, and run example cases. Designed for engineers, researchers, and academics to understand, implement, and innovate faster.
AI Judge is a groundbreaking platform using AI to deliver unbiased verdicts from conflicting parties’ arguments. Submit your dispute, and AI meticulously assesses evidence and legal principles for a fair decision, considering argument coherence, evidence strength, and legal consistency. Opt for legal expert review for added rigor.
AI Music Generator: Songburst empowers everyone with AI song creation. Users can craft music for content, podcasts, or mixes: describe the desired sound and let AI compose unique tracks. Refine prompts with Songburst’s prompt enhancer, and download unlimited wav/mp3 files.
TabHub is your next-gen tab manager. Effortlessly save, organize, and collaborate on tabs while optimizing memory. Enhance productivity with Time Tracker and Explore Community features. Experience intentional browsing, from managing windows to sharing links.
AffordHunt empowers indie hackers and SMBs with budget-friendly AI & SaaS alternatives. Navigate the rising costs of tech tools with curated options on AffordHunt. Discover quality tools without compromising innovation.
arXiv is a free online library where scientists share their research papers before they are published. Here are the top AI papers for today.
The paper introduces a groundbreaking method to enhance Automatic Speech Recognition (ASR) tasks using text injection. Leveraging unpaired text data, the study employs the Joint End-to-End and Internal Language Model Training (JEIT) algorithm to train an ASR model for auxiliary tasks like capitalization and turn-taking prediction. This approach fills the gap where E2E ASR models lack access to plentiful text-only data. Results demonstrate remarkable improvements in capitalization accuracy, especially for rare words, and enhanced turn-taking prediction recall. The research suggests a promising avenue to enrich auxiliary task performance and bring about a paradigm shift in the integration of text injection with ASR systems.
This paper presents an innovative approach to enhance text-to-video synthesis using diffusion models. Addressing the challenge of maintaining content consistency and motion dynamics in generated videos, the proposed Dual-Stream Diffusion Net (DSDN) introduces separate streams for video content and motion. These streams operate independently while being aligned through a cross-transformer interaction module, resulting in smoother and more visually continuous videos. Motion decomposer and combiner components further enhance video motion operations. The research showcases qualitative and quantitative results demonstrating improved video generation quality, showcasing its potential to advance text-to-video synthesis. The study bridges the gap between text and dynamic visual content, contributing to the evolving landscape of AI-generated multimedia.
The paper introduces Revive-2I, a novel method for zero-shot image-to-image translation that transforms fossil images into images of living animals based on text prompts. The work focuses on translating skulls into living animals (Skull2Animal) across large domain gaps. Traditional methods, including unguided GANs, struggle with this task because they lack domain understanding. Instead, Revive-2I employs guided diffusion models and text prompts, demonstrating successful translation results by encoding target-domain knowledge. The introduced Skull2Animal dataset comprises skull and living-animal images, emphasizing a constrained long-distance image-to-image (longI2I) setting. The study’s contributions include proposing the Skull2Animal task, benchmarking existing methods, and introducing Revive-2I for zero-shot longI2I with promising results.
In this paper, the authors investigate the in-context learning capabilities of retrieval-augmented encoder-decoder language models, particularly focusing on the ATLAS model. They identify limitations in ATLAS’ in-context learning due to a mismatch between pretraining and testing and a restricted context length. To address these issues, they propose RAVEN, a model that combines retrieval-augmented masked language modeling and prefix language modeling. They introduce Fusion-in-Context Learning to enhance few-shot performance and allow the model to leverage more in-context examples. Through experiments, RAVEN outperforms ATLAS in various settings and showcases potential for in-context learning with retrieval-augmented encoder-decoder language models.
The paper proposes Link-Context Learning (LCL) as a novel approach to enhance the learning capabilities of Multimodal Large Language Models (MLLMs). While current MLLMs struggle with recognizing unseen images and understanding novel concepts, LCL strengthens the causal relationship between the support set (demonstration) and the query set, enabling MLLMs to grasp both analogies and underlying causal associations. A new dataset, ISEKAI, is introduced for evaluation. Unlike traditional In-Context Learning (ICL), LCL fosters the ability to learn and retain acquired knowledge for accurate question-answering. The paper details the distinction between ICL and LCL and presents experimental results demonstrating the superior performance of LCL-MLLM on the ISEKAI dataset.
Thank you for reading today’s edition.
Your feedback is valuable.
Respond to this email and tell us how you think we could add more value to this newsletter.