Google’s AI assistant can now read your emails

Plus, OpenAI releases new GPT-3.5 Instruct model

Good morning. It’s Wednesday, September 20th.

Did you know: On this day in 1989, Apple released the Macintosh Portable laptop.

Today’s trending AI news stories

AI Product Updates & Features

Google’s AI assistant can now read your emails, plan trips, and “double-check” answers with various Google apps like Gmail, Docs, Drive, Maps, YouTube, and more. A new “double-check button” allows users to verify Bard’s responses against web content to address accuracy issues. The update also introduces Bard Extensions, enabling the AI to access Google’s services, such as reading emails in Gmail or providing real-time flight information. Google emphasizes privacy, assuring that content from Gmail, Docs, and Drive will not be viewed by human reviewers or used for advertising.

OpenAI is gearing up to release its multi-modal language model, Gobi, ahead of Google’s Gemini. Multi-modal models like Gobi combine text and visual elements for enhanced capabilities, such as generating stories from images and answering visual questions without OCR. OpenAI had initially limited access due to misuse concerns but is now set to launch GPT-Vision more widely. Google’s Gemini boasts proprietary data, potentially giving it an edge. OpenAI’s CEO hinted at GPT-4 improvements, with Gobi’s potential as GPT-5 still uncertain. This competition parallels the iPhone versus Android rivalry in the AI world, leaving users eager to see the outcome.

Suno AI's new text-to-music model generates impressive songs. Chirp v1 can turn prompts naming genres such as rock, pop, and K-pop, or style descriptions such as melodic or fast, into music. It also offers the ability to structure lyrics using commands like [verse] and [chorus]. The song creation process is integrated with Discord, and Suno provides 250 free credits per month for users to try it out. Additional credit plans are available, with a Pro plan offering 1000 credits for $10 per month.

OpenAI releases new language model InstructGPT-3.5 designed to efficiently handle instructions. It replaces several previous instruct models and will retire others by January 4, 2024. This new model performs at the cost and performance level of GPT-3.5 models with 4K context windows and was trained similarly to previous instruct models. OpenAI claims that GPT-4 performs better in following complex instructions and producing higher-quality output compared to GPT-3.5 while being faster and more cost-effective. The gpt-3.5-turbo-instruct model is optimized for direct question answering and text completion, not for chat applications.

TikTok debuts new tools and technology to label AI content. Creators can now label their AI-created videos and the platform is testing automatic AI content labeling. This move aims to prevent confusion and misleading content on TikTok. The company will also rename effects that use AI, and educational resources will be provided to help users understand AI. TikTok’s efforts align with industry trends promoting AI content labeling for transparency and responding to concerns about deepfakes and manipulated media.

What to expect from Microsoft’s ‘special’ Surface and AI event: the company is expected to unveil three new Surface devices and AI-powered features for Windows, Office, Bing, and more. This event comes shortly after the resignation of Panos Panay, former head of Windows and Surface. The product announcements include a rumored Surface Laptop Studio 2 and updates to the Surface Go and Surface Laptop Go. Microsoft will also showcase AI-powered features in Windows and Surface, potentially transforming the way users interact with their devices. Additionally, Microsoft’s Copilot plans for Office apps and Bing Chat Enterprise may be on the agenda.

AI Security & Privacy Concerns

Microsoft AI researchers accidentally exposed terabytes of internal sensitive data, including private keys and passwords, while publishing open-source training data on GitHub. Cloud security startup Wiz discovered a GitHub repository belonging to Microsoft’s AI research division that exposed 38 terabytes of sensitive information, including personal backups, Microsoft Teams messages, and more. The exposure occurred due to a misconfigured URL with overly permissive access, allowing potential data tampering. Microsoft has addressed the issue and expanded GitHub’s secret scanning service to prevent such incidents. No customer data or internal services were compromised.

OpenAI launches a red teaming network to make its models more robust and to improve AI model risk assessment and mitigation. Red teaming helps identify and address biases and safety issues in AI models. The network aims to deepen collaboration with scientists, research institutions, and civil society organizations. While some argue for “violet teaming” to address AI’s potential harm, red teaming remains a significant step toward ensuring AI model robustness. OpenAI seeks a diverse group of experts, including those without prior AI experience, to participate in the network. The future of AI safety may rely on such collaborative efforts.

AI in Healthcare

Oracle integrates generative AI in healthcare, unveils new clinical digital assistant, aiming to reduce manual work and enhance patient care. This solution leverages generative AI and lets patients perform self-service actions, such as scheduling appointments, using simple voice commands. It aims to simplify administrative tasks for physicians, enabling them to focus more on patient interactions. The digital assistant automates note-taking, proposes context-aware next actions, and responds to conversational voice commands from providers, enhancing the overall healthcare experience.

DeepMind’s New AI Can Predict Genetic Diseases with 90% accuracy. Missense variants are single-letter DNA changes that can result in different amino acids being produced, potentially leading to genetic diseases. DeepMind’s AlphaMissense was trained on human and primate biology language and assigns a "pathogenicity score" to missense variants, aiding researchers in identifying disease-causing mutations faster and improving our understanding of genetic variants. DeepMind worked with Genomics England to verify the model’s predictions. While promising, AlphaMissense should be used to guide further research, not as a standalone diagnostic tool.

Google and the Department of Defense are building an AI-powered microscope to help doctors spot cancer. The Augmented Reality Microscope (ARM), powered by artificial intelligence, overlays cancer locations and severity indicators for pathologists, improving accuracy and efficiency. With 13 ARMs in existence, initial research shows promise, particularly for smaller labs and remote locations facing workforce shortages. While not meant to replace digital pathology systems, the ARM offers an affordable alternative, costing health systems between $90,000 and $100,000. Further testing and scaling are underway to enhance cancer diagnosis and patient care.

AI in Global Politics & Strategy

China Aims To Replicate Human Brain in Bid To Dominate Global AI, aiming to outpace the West in the development of AGI technologies. Unlike the West, China’s efforts are more centralized, with substantial state funding and support. Chinese scientists are working diligently on AGI, which goes beyond narrow AI systems, with the goal of achieving systems that can outthink humans in many tasks. The Chinese government sees AGI as a strategic asset, emphasizing its importance in its national plans and highlighting its potential to provide a competitive advantage.

AWS joins the UN summit on generative AI discussing how generative AI can contribute to achieving the UN Sustainable Development Goals (SDGs). AWS Vice President Swami Sivasubramanian highlighted the importance of democratizing AI access and the quality of information. U.S. Secretary of State Antony Blinken stressed protecting human progress and welfare. The event emphasized international cooperation for AI’s responsible use and its potential to address societal challenges.

5 new AI-powered tools from around the web

Klu AI offers unified and interactive data management allowing users to connect apps like Slack, Notion, and Google Drive. With lightning-fast search capabilities and AI-powered chat, Klu streamlines data retrieval and communication. Users can chat with their data and enjoy an all-in-one platform.

Querio 1.0 offers self-service AI analytics designed for non-tech users, simplifying data retrieval across various platforms without requiring SQL or Excel expertise. Users can connect apps like HubSpot and databases and converse with their data using AI-driven chat. Querio prioritizes data security and privacy and is eager to receive user feedback for continuous improvement.

FusionArt AI combines art and AI to create captivating spirals, illusions, and patterns. This innovative tool pushes artistic boundaries and offers a unique creative experience. Users can explore the fusion of art and technology, embracing AI’s potential for artistic expression. FusionArt AI promises endless artistic possibilities and invites users to join its creative journey.

PlotGPT is a Data Analyst tool designed to visualize and analyze ChatGPT data efficiently. It offers intuitive visualizations and customizable analysis options, making data interpretation more accessible for users.

MindMeldCanvas AI is a fact-checking copy creator that empowers users with robust AI writing capabilities. This tool offers academic writing with citations, images, code, and expert AI chatbots, making it a comprehensive AI writer. Explore the potential of your mind with this innovative writing tool.

arXiv is a free online library where scientists share their research papers before they are published. Here are the top AI papers for today.

The research investigates the effects of continued pre-training on large language models using domain-specific corpora. It discovers that this training approach, while imparting domain knowledge, hampers the model’s performance in question answering. To mitigate this issue, a novel method is proposed, transforming raw corpora into reading comprehension texts enriched with various tasks related to content. This approach consistently enhances model performance across tasks in biomedicine, finance, and law domains. Impressively, the adapted language model achieves competitive results against domain-specific models of much larger scales. Additionally, this technique shows promise for developing general models across diverse domains.

This paper from Microsoft Research provides a comprehensive survey of multimodal foundation models, focusing on their evolution from specialist models to general-purpose assistants in computer vision and vision-language domains. It categorizes research into specific-purpose models, including visual understanding and generation, and general-purpose assistants, which aim to perform diverse vision tasks. The paper discusses trends in building general-purpose AI agents with multi-modality, alignment with human intents, and capabilities such as planning, memory, and tool use. It highlights the rapid evolution in this field and the potential to advance AI systems for real-world applications by bridging computer vision and broader AI communities.

In a comprehensive investigation, researchers explored the relationship between predictive models and lossless compression. They scrutinized the offline compression capabilities of large language models, demonstrating their competence not only in text but also across various data modalities, such as images and audio. These models outperformed specialized compressors like PNG and FLAC. Additionally, the study shed light on scaling laws, indicating that compression performance is constrained by dataset size and model parameters. Tokenization techniques were also explored, revealing their impact on compression. The research highlights the intertwined nature of model size and dataset size in achieving optimal compression results.
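The link between prediction and lossless compression can be illustrated with a small sketch. It is not the paper's setup: it assumes a toy adaptive byte-frequency model in place of a language model, and the function names are hypothetical. The idea it shows is the standard one that an ideal arithmetic coder spends about -log2 p bits on a symbol the model predicted with probability p, so better predictions mean fewer bits.

```python
import math
from collections import Counter


def ideal_code_length_bits(data: bytes, alphabet_size: int = 256) -> float:
    """Bits an ideal arithmetic coder would spend under an adaptive order-0 model.

    Each byte costs -log2 p(byte | counts so far). Add-one smoothing keeps the
    probability of unseen bytes above zero. This toy model is an assumption
    standing in for a learned predictor.
    """
    counts = Counter()
    total_bits = 0.0
    for seen, byte in enumerate(data):
        # Probability the model assigned to this byte before seeing it.
        p = (counts[byte] + 1) / (seen + alphabet_size)
        total_bits += -math.log2(p)
        counts[byte] += 1  # update the model after "encoding" the byte
    return total_bits


def uniform_code_length_bits(data: bytes) -> float:
    """Baseline: a model with no predictive power spends 8 bits per byte."""
    return 8.0 * len(data)


if __name__ == "__main__":
    sample = b"ab" * 128
    print("adaptive model bits:", round(ideal_code_length_bits(sample), 1))
    print("uniform model bits: ", uniform_code_length_bits(sample))
```

On repetitive input the adaptive model needs fewer bits than the uniform baseline, because its predictions improve as it sees more data.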

This study explores the Distributed Shampoo Optimizer’s PyTorch implementation for training neural networks at scale, focusing on performance optimizations. Shampoo is an algorithm that strikes a balance between diagonal and full-matrix preconditioning for adaptive gradient methods. It reduces memory and computational costs, making it practical for large-scale deep learning. The paper describes the implementation’s key features, including the distribution of memory and computation, which keeps the slowdown in per-step wall-clock time to at most 10% compared to standard diagonal scaling-based methods. They validate their approach through an ImageNet ResNet50 ablation study, demonstrating Shampoo’s effectiveness with minimal hyperparameter tuning.
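For readers who want the balance between diagonal and full-matrix preconditioning made concrete, here is the update rule from the original Shampoo formulation for a single weight matrix. It is a sketch of the base algorithm only: the symbols (learning rate \eta, damping \epsilon) are notation chosen here, and the distributed implementation adds details such as blocking that are not shown.

```latex
% One weight matrix W_t \in \mathbb{R}^{m \times n} with gradient G_t.
% Two small preconditioners replace one (mn) \times (mn) full-matrix preconditioner.
\begin{aligned}
L_0 &= \epsilon I_m, \qquad R_0 = \epsilon I_n, \\
L_t &= L_{t-1} + G_t G_t^{\top}, \\
R_t &= R_{t-1} + G_t^{\top} G_t, \\
W_{t+1} &= W_t - \eta \, L_t^{-1/4} \, G_t \, R_t^{-1/4}.
\end{aligned}
```

The preconditioners store m^2 + n^2 entries, which sits between the mn entries of a diagonal method and the (mn)^2 entries of a full-matrix one.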

The paper introduces MINDAGENT, an infrastructure for evaluating Large Language Models (LLMs) in multi-agent collaboration across various gaming domains. It focuses on assessing LLMs’ planning and coordination abilities within a virtual kitchen scenario called CUISINEWORLD. The infrastructure allows LLMs to coordinate multiple agents to complete tasks, with a Collaboration Score (CoS) metric quantifying their efficiency. By simplifying complex tasks into a game, this research aims to advance LLMs’ multi-agent planning capabilities, potentially contributing to the development of more engaging and interactive AI-driven gaming systems and seamless human-AI collaboration.

