Google's Next-Gen 'Gemini' AI to go Head to Head With ChatGPT

Good morning. It’s Wednesday, August 30th.

Did you know: On this day in 1997, Netflix was founded in Scotts Valley, CA?

In today’s email:

  • AI Models & Innovations

  • Ethical & Legal Issues

  • Tech & Infrastructure Investments

  • AI in Business Applications

  • 5 New AI Tools

  • Latest AI Research Papers

You read. We listen. Let us know what you think of this edition by replying to this email, or DM us on Twitter.

Today’s trending AI news stories

AI Models & Innovations

AI Wars: Google's Next-Gen 'Gemini' AI Goes Head to Head With ChatGPT Google’s next-gen AI model, Gemini, is claimed to surpass GPT-4 in terms of computing power. Despite the claim, OpenAI CEO Sam Altman questioned the idea that greater computing power automatically leads to better AI performance. Google is collaborating with companies like Meta and Anthropic to bring AI models to Google Cloud. The Gemini model is expected to be released in 2023 and is designed to handle various modalities, while ChatGPT is text-based. The competition between AI models is intensifying, making 2023 a crucial year for large language models. However, Google also faces a lawsuit over the alleged misuse of personal data for AI training.

Google DeepMind Launches SynthID, a Groundbreaking Watermarking Tool for AI-Generated Images Google DeepMind has introduced SynthID, a novel watermarking tool for AI-generated images. SynthID embeds invisible watermarks directly into pixels of AI-generated images, making them imperceptible to the human eye yet detectable by software. It helps tag images as AI-created even after cropping, color adjustments, or compression. This tool aims to identify synthetic media and counter misinformation. While some experts are cautiously optimistic, they highlight that no watermarking technique is foolproof. Google plans to gather feedback during the beta phase, with potential expansion. SynthID represents a step toward reliable AI content attribution, despite certain limitations.
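
DeepMind has not published how SynthID actually embeds its signal, so the snippet below is purely a conceptual sketch of what a pixel-space watermark looks like in general: a keyed, low-amplitude pseudorandom pattern is added to the pixel values (invisible to the eye) and later detected by correlation. Unlike SynthID, this naive scheme would not survive cropping or compression; the function names and parameters are invented for illustration.

```python
import numpy as np

def embed_watermark(image, key, strength=2.0):
    """Add a keyed, low-amplitude pseudorandom pattern to the pixel values."""
    rng = np.random.default_rng(key)
    pattern = rng.choice([-1.0, 1.0], size=image.shape)
    return np.clip(image + strength * pattern, 0, 255)

def detect_watermark(image, key, strength=2.0):
    """Correlate pixels against the keyed pattern; strong correlation => watermarked."""
    rng = np.random.default_rng(key)
    pattern = rng.choice([-1.0, 1.0], size=image.shape)
    score = np.mean((image - image.mean()) * pattern)
    return score > strength / 2   # decision threshold halfway between 0 and `strength`

img = np.random.randint(0, 256, size=(256, 256, 3)).astype(float)
marked = embed_watermark(img, key=42)
print(detect_watermark(marked, key=42))   # True  -> watermark found
print(detect_watermark(img, key=42))      # False -> clean image
```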

Google launches BigQuery Studio, a new way to work with data Google has introduced BigQuery Studio, a new tool for working with data that enhances data analysis and insights. As companies increasingly invest in big data and AI, BigQuery Studio provides a more streamlined and efficient way to manage and extract insights from large datasets. This tool demonstrates Google’s commitment to providing advanced data analytics solutions and catering to the growing demand for data-driven decision-making.

Introducing ChatGPT Enterprise OpenAI introduces ChatGPT Enterprise, offering advanced security, unlimited high-speed GPT-4 access, longer context windows, data analysis capabilities, customization, and more. Over 80% of Fortune 500 companies have adopted ChatGPT, with industry leaders like Block, PwC, and Zapier using it to enhance productivity. ChatGPT Enterprise prioritizes data privacy, offers admin controls, and provides powerful features like unlimited GPT-4 access and advanced data analysis. It has proven to increase productivity in various tasks, from coding to data analysis. OpenAI plans to offer more features, including customization and solutions for specific roles.

Watch out, Midjourney! Ideogram launches AI image generator with impressive typography Ideogram, a new generative AI image startup founded by former Google Brain researchers, has entered the scene with a $16.5 million seed funding round led by a16z and Index Ventures. What sets Ideogram apart is its focus on reliable text generation within images, particularly for typography and lettering. The startup offers various preset image generation styles, including 3D rendering, painting, fashion, and more. While Ideogram has gained attention for its typographic capabilities, it currently lacks some features seen in rival image generators and is still in beta.

Ethical & Legal Issues

US Copyright Office wants to hear what people think about AI and copyright The US Copyright Office is seeking public comments on AI and copyright issues, addressing concerns around AI-generated content. It’s asking questions about how AI models should use copyrighted data in training, whether AI-generated material can be copyrighted without human involvement, and how copyright liability works with AI. The agency is also interested in AI’s potential violation of publicity rights. The copyright status of AI training data and generative AI output has become a pivotal topic for regulation and litigation, prompting discussions about intellectual property use in AI models. Comments are due by October 18th.

Behind the AI boom, an army of overseas workers in 'digital sweatshops' In the Philippines, a vast and largely unregulated workforce of over 2 million people performs labor-intensive tasks to annotate and refine data for AI models used by American companies. This workforce, referred to as “taskers,” is spread across the global south and often experiences low pay, delayed payments, and exploitation. Scale AI, a San Francisco startup, owns the platform Remotasks, where many Filipino taskers work. While Scale AI claims to pay a living wage, reports reveal significant payment issues. The situation raises ethical concerns about labor exploitation in the AI industry’s underbelly.

Tech & Infrastructure Investments

Tesla's New Supercomputer Accelerates Its Ambition to Be an AI Play Alongside Nvidia Tesla’s new $300 million AI computing cluster, equipped with 10,000 Nvidia H100 GPUs, aims to accelerate the development of self-driving car technology. This move highlights Tesla’s commitment to AI advancement and its plan to spend over $2 billion on AI training in 2023 and 2024. The cluster underscores the evolution of the AI ecosystem, with Nvidia’s hardware and software being essential for AI computing. Tesla’s investment aims to enhance computing capabilities for its full self-driving technology and capitalize on the potential of AI-powered applications like self-driving cars.

Intel says new 'Sierra Forest' chip to more than double power efficiency Intel has unveiled its upcoming data center chip, the “Sierra Forest,” promising over 240% better performance per watt than its current generation of data center chips. The push for energy efficiency comes as technology firms, including AMD and Ampere Computing, respond to the need to reduce electricity consumption in data centers. Intel’s “Sierra Forest” is slated for release next year, as the company splits its data center chips into two lines focused on performance and efficiency, respectively. The move aims to address the growing demand for more computing work done per chip in the era of high-power data centers.

AI in Business Applications

Uber Eats is reportedly developing an AI chatbot that will offer recommendations, speed up ordering Uber Eats is working on an AI chatbot that will offer users recommendations and streamline the ordering process, as discovered by developer Steve Moser in the app’s hidden code. The chatbot will inquire about budget and food preferences to facilitate orders, aligning with Uber CEO Dara Khosrowshahi's earlier mention of an AI chatbot. This move follows a trend of delivery apps integrating AI, with DoorDash unveiling voice ordering tech, Instacart deploying AI search tools, and Uber’s chatbot poised to cater to users’ delivery preferences.

China's medical AI sector booms with LLM-based applications China’s medical AI sector is experiencing rapid growth, with over 18 medical models based on LLMs already developed. The sector is projected to expand by about 40% annually from 2020 to 2025, reaching a market size exceeding $4.1 billion by 2025. Notably, MedGPT, an AI application by Medlinker, achieved impressive diagnostic scores in a competition, indicating its potential for use in diagnostics, telemedicine, and decision support. While AI is advancing, medical professionals emphasize the importance of combining AI with human expertise, particularly in fields like radiology.

🎧 Did you know AI Breakfast has a podcast read by a human? Join AI Breakfast team member Luke (an actual AI researcher!) as he breaks down the week’s AI news, tools, and research: Listen here

5 new AI-powered tools from around the web

Notably is an AI-powered platform that helps users conduct quick and accurate research. It offers a research repository, video transcription, cluster analysis, and digital sticky notes. With AI templates, users uncover insights from data swiftly. Access a knowledge base, blog, and roadmap for a comprehensive research experience.

Brainwave streamlines customer service with AI automation. Automate responses, lead collection, and meeting scheduling to address over 70% of inquiries. Crafted for fast-growing companies, Brainwave leverages generative AI for efficient resolution.

Gradient is a platform for AI innovation designed for developers, data scientists, and AI enthusiasts. Its API lets users craft personalized, private LLMs. Fine-tune with a single API call, leveraging cutting-edge open-source models like Llama 2. Maintain data ownership and security while developing seamlessly.

Buildinpublic.ai is a platform designed to elevate your indie journey. Embrace progress sharing, engage a feedback-rich community, and foster a pre-launch audience. Navigate product development, amplify connections, and accelerate your path to product-market fit.

AutoApplyAI by Wonsulting streamlines job applications. It automates form filling and uses advanced AI to craft personalized responses, optimizing efficiency while maintaining a personal touch. With AutoApplyAI, you can simplify your job search and focus on meaningful opportunities.

arXiv is a free online library where scientists share their research papers before they are published. Here are the top AI papers for today.

The paper introduces a method to mitigate computation costs in Vision Transformers for video tasks. Leveraging temporal redundancy, the proposed “Eventful Transformers” selectively update tokens that have changed significantly over time, reducing computation while maintaining accuracy. The approach enables adaptive control of compute costs at runtime, enhancing resource efficiency. This method offers substantial computational savings in video recognition tasks, making Vision Transformers more viable for resource-constrained scenarios without compromising accuracy. The paper contributes to video processing by reusing intermediate computations across frames or clips.
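
The paper's own code is not reproduced here; the snippet below is a minimal sketch of the gating idea it describes: between frames, only tokens whose embeddings changed beyond a threshold are pushed through the expensive block, while the rest reuse cached outputs. The `block` here is a toy linear layer standing in for a real attention or MLP module, and the threshold value is arbitrary.

```python
import numpy as np

def gated_block_update(tokens, prev_tokens, prev_outputs, block, threshold=0.1):
    """Recompute `block` only for tokens that drifted since the previous frame."""
    change = np.linalg.norm(tokens - prev_tokens, axis=-1)  # per-token L2 change
    stale = change > threshold                 # tokens that must be recomputed
    outputs = prev_outputs.copy()              # start from cached results
    if stale.any():
        outputs[stale] = block(tokens[stale])  # pay compute only for stale tokens
    return outputs, int(stale.sum())

# Usage with a toy "block" (a fixed linear layer standing in for attention/MLP).
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
block = lambda x: x @ W

frame0 = rng.standard_normal((196, 64))        # 196 tokens, 64-dim embeddings
frame1 = frame0.copy()
frame1[:20] += 0.5                             # only the first 20 tokens change

out0 = block(frame0)
out1, n_updated = gated_block_update(frame1, frame0, out0, block)
print(f"recomputed {n_updated} of {frame1.shape[0]} tokens")  # -> 20
```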

The paper presents the MedAlign dataset, curated by 15 clinicians from diverse specialties. This benchmark dataset comprises 983 real-world clinical instructions and responses, offering insights into language model (LM) performance for healthcare tasks. Leveraging 276 longitudinal Electronic Health Records (EHRs), the dataset evaluates LM-generated responses using clinician rankings and automated NLG metrics. The work addresses the gap in evaluating LM utility for complex clinical tasks and facilitates research in healthcare NLG by providing an authentic and comprehensive instruction-response dataset. The dataset’s development process and evaluation strategies are detailed, aiming to transform EHR interactions with LLMs.

OmniQuant revolutionizes LLM quantization with its novel approach. This technique freezes the full-precision weights while integrating a limited set of learnable quantization parameters, combining the benefits of quantization-aware training (QAT) and post-training quantization (PTQ). Two innovative components, Learnable Weight Clipping (LWC) and Learnable Equivalent Transformation (LET), address challenges in both weight and activation quantization. OmniQuant excels in diverse settings and outperforms existing methods, achieving quantization-aware training performance with PTQ efficiency. Notably, OmniQuant introduces no extra computational cost or parameters after quantization, making it highly applicable for real-world LLM deployment.
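
As a rough illustration of why learnable weight clipping helps: clipping the quantization range below the outlier magnitude spends the limited 4-bit levels on the bulk of the weights. The toy sketch below is not the paper's method; OmniQuant learns the clipping threshold by gradient descent, whereas this version simply grid-searches a clipping factor to show the effect on quantization error.

```python
import numpy as np

def quantize(w, gamma, bits=4):
    """Uniform symmetric quantization of w, clipped to gamma * max|w|."""
    clip = gamma * np.abs(w).max()
    levels = 2 ** (bits - 1) - 1
    scale = clip / levels
    q = np.clip(np.round(w / scale), -levels, levels)
    return q * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(4096)
w[:4] *= 50.0                                   # a few outlier weights, as seen in LLMs

best = min(np.linspace(0.05, 1.0, 20),
           key=lambda g: np.mean((w - quantize(w, g)) ** 2))
print(f"best clipping factor ~ {best:.2f}")
print("MSE without clipping:", np.mean((w - quantize(w, 1.0)) ** 2))
print("MSE with clipping   :", np.mean((w - quantize(w, best)) ** 2))
```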

This study presents a comprehensive benchmark evaluation of Large Language Models (LLMs) for the Text-to-SQL task. The authors explore various prompt engineering strategies, including question representation, example selection, and organization. They propose an integrated solution named DAIL-SQL, achieving 86.6% execution accuracy on the Spider leaderboard. The study emphasizes token efficiency and explores the potential of open-source LLMs for the task, showing promising results. The work contributes insights into prompt engineering and offers a systematic comparison of LLMs for Text-to-SQL, inspiring further research in this domain.
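
The exact DAIL-SQL prompt format and its similarity-based example selection are described in the paper; the sketch below only illustrates the three ingredients the study varies: how the schema and question are represented, which demonstrations are included, and how they are organized. The schema, example pairs, and `build_text_to_sql_prompt` helper are invented for illustration.

```python
def build_text_to_sql_prompt(schema, question, examples, k=2):
    """Assemble a Text-to-SQL prompt: schema, k demonstration pairs, then the target question."""
    parts = [f"-- SQLite schema\n{schema}"]
    # Example selection: here simply the first k pairs; DAIL-SQL instead ranks
    # candidates by similarity of both the questions and their SQL queries.
    for q, sql in examples[:k]:
        parts.append(f"-- Question: {q}\n{sql}")
    parts.append(f"-- Question: {question}\nSELECT")
    return "\n\n".join(parts)

demos = [("How many singers are there?", "SELECT count(*) FROM singer;"),
         ("List the names of all singers.", "SELECT name FROM singer;")]
schema = "CREATE TABLE singer (singer_id INT, name TEXT, age INT);"
print(build_text_to_sql_prompt(schema, "What is the average age of all singers?", demos))
```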

The study introduces Open-vocabulary Responsible Visual Synthesis (ORES), a challenging task aiming to generate images while avoiding specific concepts yet preserving user inputs. The proposed Two-stage Intervention (TIN) framework addresses this problem by combining rewriting with learnable instruction through a large-scale language model (LLM) and synthesizing with prompt intervention. The approach demonstrates effectiveness in reducing risky image generation and showcases the potential of LLMs in responsible visual synthesis. A benchmark, dataset, and baseline models are provided for evaluation. The paper contributes a novel method to responsible visual synthesis and offers insights into the role of LLMs in this domain.
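
How the two stages fit together can be sketched roughly as below. `call_llm` and `generate_image` are hypothetical stand-ins (stubbed out here so the snippet runs), and the framework's learnable instruction for the rewriting LLM is simplified to a fixed template; this is an illustration of the idea, not the authors' implementation.

```python
def call_llm(instruction):
    """Hypothetical stand-in for an LLM call; a real system would query a model here."""
    return "a lively birthday party with sparkling fruit juice on the table"

def generate_image(prompt, negative_prompt=""):
    """Hypothetical stand-in for an image generator; returns a description instead of pixels."""
    return f"<image from: '{prompt}' | steered away from: '{negative_prompt}'>"

def responsible_synthesis(user_prompt, forbidden_concept):
    # Stage 1 -- rewriting: an LLM restates the prompt so the forbidden concept
    # is avoided while the rest of the user's request is preserved.
    safe_prompt = call_llm(
        f"Rewrite this image prompt so it does not depict '{forbidden_concept}' "
        f"but keeps everything else: {user_prompt}"
    )
    # Stage 2 -- prompt intervention: synthesize from the rewritten prompt while
    # additionally steering generation away from the concept.
    return generate_image(safe_prompt, negative_prompt=forbidden_concept)

print(responsible_synthesis("a birthday party with champagne", "alcohol"))
```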

Thank you for reading today’s edition.

Your feedback is valuable.


Respond to this email and tell us how you think we could add more value to this newsletter.