Sites scramble to block ChatGPT web crawler

Good morning. It’s Monday, August 14th.

Did you know: Google DeepMind’s CEO says their new AI model "Gemini" will far surpass the capabilities of GPT-4?

In today’s email:

  • AI Regulation and Governance

  • AI Innovations and Breakthroughs

  • AI in Nature

  • AI Business Developments

  • 5 New AI Tools

  • Latest AI Research Papers

You read. We listen. Let us know what you think of this edition by replying to this email, or DM us on Twitter.

Today’s trending AI news stories

AI Regulation and Governance

FEC moves toward potentially regulating AI deepfakes in campaign ads: The move aims to combat the spread of misleading information that could manipulate voters’ perceptions. As AI-powered tools make it easier to create convincing deepfakes, concerns over the potential impact on elections have risen. The FEC’s procedural vote signals its intent to address the issue, with public comments expected before any rules are established. While this approach may address some challenges, it may not cover all aspects of deceptive content in the digital age.

DOD Announces Establishment of Generative AI Task Force: The U.S. Department of Defense (DoD) has announced the establishment of a generative AI task force, Task Force Lima, to responsibly harness the power of AI. Led by the Chief Digital and Artificial Intelligence Office (CDAO), the task force will analyze and integrate generative AI tools, such as large language models, across the DoD. The initiative aims to enhance national security while adopting cutting-edge AI technologies. Task Force Lima will assess and employ generative AI capabilities, while also considering potential risks and disruptions posed by adversaries. The DoD recognizes AI’s potential to improve operations, emphasizing responsible implementation.

AI Innovations and Breakthroughs

Prototype 'Brain-like' chip promises greener AI, says tech giant: A prototype “brain-like” chip developed by IBM could revolutionize AI by making it more energy-efficient. The chip, inspired by human brain connections, could lead to battery-saving AI chips for smartphones and vehicles, while reducing energy costs and carbon footprints for cloud providers. Using components called memristors, the chip can store a range of numbers, emulating the brain’s analogue functions. While IBM’s breakthrough holds promise, challenges remain for widespread adoption and cost-effective manufacturing. This innovation could transform AI applications, extending battery life and potentially reducing energy consumption in data centers.

Microsoft Research says GPT-4 is good enough for medical tasks: Microsoft Research finds that GPT-4 holds potential for medical tasks by efficiently structuring large unstructured data, such as clinical trials and patient information. Despite being trained on generic internet data, GPT-4 outperforms task-specific models in structuring complex clinical studies. Microsoft envisions “precision health copilots” using language models like GPT-4 to accelerate medical care, research, and link clinical observation with decision-making. The model’s capabilities offer promise for improving medical processes including drug development and patient data analysis.

Watch "The Last Artist," a sci-fi short generated with Pika Labs' AI video platform: Machine_Mythos explores AI-driven video creation using text-to-video models like Runway and Pika Labs. The AI director shares insights on the workflow, utilizing music, images, and text prompts to craft cohesive scenes for AI-generated films. The YouTube channel’s recent AI short film, “The Last Artist,” is entirely generated with Pika Labs, a text-to-video platform controlled via Discord. The director anticipates a surge in hybrid movies combining filmed and AI-generated scenes, ultimately envisioning high-quality AI-generated content becoming prominent in the filmmaking landscape.

Supertone AI is an expressive text-to-audio platform that brings back Freddie Mercury's voice: Supertone AI introduces Nuvo, an expressive text-to-audio platform, pushing the boundaries of text-to-speech synthesis with emotion capture. The Controllable Voice Conversion (CVC) technology allows real-time voice transformation, catering to various characters’ voices. Ideal for audiobooks and beyond, Supertone’s suite includes tools like GOYO Voice Separator. Access is restricted to authorized partners submitting business proposals. Founded in 2020, Supertone aims to revolutionize voice content production, offering innovative capabilities for creators in South Korea.

AI in Nature

‘Only AI made it possible’: scientists hail breakthrough in tracking British wildlife: Researchers have employed AI-controlled cameras and microphones to track and identify various species in the wild, offering a breakthrough in biodiversity preservation. AI monitors captured sounds and images at Network Rail-owned test sites, enabling the identification of animals and birds, including dozens of bird species, foxes, deer, hedgehogs, and bats. The technology, which analyzes thousands of hours of data, is critical in managing vegetation, assessing specific movements due to climate change, and protecting biodiversity.

AI Business Developments

Sites scramble to block ChatGPT web crawler after instructions emerge: OpenAI’s web crawler GPTBot, used to train AI models like ChatGPT, faces resistance as websites rush to block its access. OpenAI recently revealed details about GPTBot in its documentation, stating that crawled webpages may enhance future AI models. However, some sites have decided to block GPTBot, though this comes too late to affect existing training data. OpenAI claims safeguards exist to avoid certain content, but the potential impact on AI model training remains unclear. While some sites are eager to block GPTBot, the choice presents complexities, potentially impacting knowledge dissemination and future AI interfaces.
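For readers who run a site: the blocking mentioned above is typically done with a robots.txt rule that targets the "GPTBot" user agent. Below is a minimal sketch in Python that checks what such a rule would allow, using the standard library's robots.txt parser. The example.com URL, the "SomeOtherBot" name, and the helper function is_allowed are placeholders for illustration, not anything from OpenAI's documentation.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt rule that disallows the GPTBot user agent site-wide.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com and SomeOtherBot are placeholders.
gptbot_allowed = is_allowed(ROBOTS_TXT, "GPTBot", "https://example.com/article")
other_allowed = is_allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.com/article")
print(gptbot_allowed, other_allowed)
```

Keep in mind that robots.txt is a request that well-behaved crawlers honor, not an access control, and it has no effect on pages that were already collected.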

Google-backed Anthropic raises $100m from South Korea's SK Telecom: South Korea’s SK Telecom plans to invest $100 million in US-based Anthropic, an AI firm specializing in building foundational models. Anthropic, backed by investors including Google and Spark Capital, aims to enhance its telecommunications-driven AI business. The collaboration aims to develop a global telecommunications-oriented multilingual large language model and an AI platform. SK Telecom’s previous investment in Anthropic remains undisclosed.

Meetkai creates a digital twin of sprawling $2B Silicon Box chip packaging factory: Meetkai, owned by tech business group billionaires Weili Dai and Sehat Sutardja, creates a digital twin of a $2 billion Silicon Box chip packaging factory. Leveraging metaverse and AI technologies, Meetkai’s virtual replica aids in training, recruitment, and visualizing factory operations. Digital twins, similar to the industrial metaverse, enable companies to design and perfect factories before construction. Meetkai expands its AI-powered virtual solutions for improved engagement in customer, client, and employee interactions, bridging the consumer and enterprise metaverse. The platform aims to facilitate autonomous agent interactions, enhancing the metaverse’s functionality.

🎧 Did you know AI Breakfast has a podcast read by a human? Join AI Breakfast team member Luke (an actual AI researcher!) as he breaks down the week’s AI news, tools, and research: Listen here

5 new AI-powered tools from around the web

ChatGPT for Excel is an AI-powered add-in revolutionizing productivity. Automate tasks, gain insights and save time. Create content, cleanse, format, and extract from unstructured sources, and translate. Free and compatible with Microsoft Excel.

Quivr is your cloud-based second brain connected to your data that can interact with the world on your behalf in a knowledgeable and trustworthy way. Store and retrieve unstructured information effortlessly. Handle text, images, and code snippets. It is open source and free.

PromptWave is your free hub to save, organize, and share AI prompts. Immerse in prompt exploration, seamless sharing, and nurturing of your ideas.

Clay Nexis is your AI navigator for seamless network navigation. With flawless context recall, Clay offers instant leverage, saving time and reducing the stress of it all. Discover opportunities and connections effortlessly and streamline your network experience.

Vaizz is an AI platform poised to transform content creation with seamless generation of captivating stories, videos, and voices. Vaizz empowers creators by enhancing creativity and streamlining the process. From films to game development, its flexible plans cater to diverse content creation ambitions.

arXiv is a free online library where scientists share their research papers before they are published. Here are the top AI papers for today.

OpenProteinSet, a collaborative effort involving Harvard and Columbia Universities, introduces a revolutionary open-source dataset of 16 million Multiple Sequence Alignments (MSAs). These MSAs hold vital biological insights for protein design and structure prediction. Unlike prior resources, OpenProteinSet mirrors AlphaFold2’s training set, enhancing the potential for breakthroughs. With structural homologs from the Protein Data Bank and AlphaFold2 predictions, it’s a valuable tool for diverse protein-focused tasks and multimodal machine learning research. Its availability under a Creative Commons license promises to democratize bioinformatics, enabling researchers to unravel complex protein structures and functions.

The PIPPA dataset is poised to reshape the landscape of conversational AI systems. The brainchild of PygmalionAI, this innovative resource confronts the limitations of existing datasets by offering a rich tapestry of over a million interactions, distributed across 26,000 sessions. These carefully crafted exchanges are the result of a collaborative effort, harnessing the creativity of role-play enthusiasts. PIPPA’s inception in 2022 marked a pivotal moment in AI evolution, setting a new standard for persona-driven, contextually rich conversations.

Microsoft Research, in collaboration with Boston University and Rice University, presents a transformative solution to streamline network management complexities. Traditional hurdles like steep learning curves, errors, and data privacy concerns are addressed by this novel approach. The system leverages LLMs to seamlessly generate task-specific code from natural language queries, enabling network operators to scrutinize and implement code solutions. Benchmark applications showcase the accuracy, efficiency, and potential for further refinements.

Meta AI has unveiled an innovative strategy, dubbed “instruction backtranslation,” to enhance instruction-following language models. The method involves a seed model that augments its training data by generating instruction prompts for web content, followed by a curation process to select high-quality examples. Through iterative refinement, this technique produces a model named Humpback, showcasing superior performance on the Alpaca leaderboard compared to other non-distilled models. This approach demonstrates Meta AI’s strides in advancing language models’ ability to comprehend and execute instructions effectively, presenting promising implications for the future of AI-driven language understanding and application.

Researchers from Google DeepMind, IRIT, and University of Toulouse have unveiled a groundbreaking paradigm in neural network advancement. Addressing the formidable compute and time demands of modern neural networks, the study introduces six versatile transformations that enable incremental expansion while safeguarding functionality. This innovation allows for the efficient scaling of transformer-based models, eliminating the need to restart training from scratch. By establishing exact function preservation under minimal initialization constraints, the research ushers in a new era of efficient training pipelines and heightened model capacities.
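To give a feel for what "function preservation" means, here is a toy sketch in plain Python. It widens the hidden layer of a tiny two-layer ReLU network and zero-initializes the new output weights, so the expanded network returns exactly the same outputs as before. This is a generic illustration of the idea under our own assumptions (network shape, function names, and weights are all made up), not a reproduction of the six transformations in the paper.

```python
import random


def relu(v):
    return [max(0.0, x) for x in v]


def matvec(W, x):
    # Multiply matrix W (list of rows) by vector x.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def mlp(W1, W2, x):
    # Two-layer network without biases: W2 @ relu(W1 @ x).
    return matvec(W2, relu(matvec(W1, x)))


def expand_hidden(W1, W2, extra, rng):
    """Add `extra` hidden units while keeping the network's function unchanged."""
    in_dim = len(W1[0])
    # New hidden units get arbitrary incoming weights...
    new_rows = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(extra)]
    new_W1 = [row[:] for row in W1] + new_rows
    # ...but zero outgoing weights, so they contribute nothing to the output yet.
    new_W2 = [row[:] + [0.0] * extra for row in W2]
    return new_W1, new_W2


rng = random.Random(0)
W1 = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]  # 3 inputs -> 4 hidden
W2 = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]  # 4 hidden -> 2 outputs
x = [0.5, -1.0, 2.0]

before = mlp(W1, W2, x)
big_W1, big_W2 = expand_hidden(W1, W2, extra=3, rng=rng)
after = mlp(big_W1, big_W2, x)
print(before == after)
```

The larger network starts from the same behavior as the smaller one, and training can then put the extra capacity to use instead of starting over from scratch.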

Thank you for reading today’s edition.

Your feedback is valuable.


Respond to this email and tell us how you think we could add more value to this newsletter.