AI Text Detector with 99% accuracy
Plus, Romania Puts AI on Government Staff
Good morning. It’s Monday, June 12th.
It’s happening: The Romanian Government has developed an AI named “ION” that will be on staff, in charge of monitoring social channels to gauge citizen sentiment on key political issues and advise leadership.
In today’s email:
AI Text Detector with 99% Accuracy?
Palantir Rejects AI Pause
Altman Calls For China Collab
Meta’s AI Music Generator
Microsoft Moves AI Research Out of China
AI church service in Germany draws crowd
Romania Appoints AI on Gov Staff
5 New AI Tools
Latest AI Research Papers
You read. We listen. Let us know what you think by replying to this email, or DM us on Twitter.
🎧 Did you know AI Breakfast has a podcast read by a human? Join AI Breakfast team member Luke (an actual AI researcher!) as he breaks down the week’s AI news, tools, and research: Listen here
Today’s trending AI news stories
Researchers Develop AI Text Detector With 99% Accuracy
Researchers from the University of Kansas have developed a method to differentiate text generated by ChatGPT from academic content written by human scientists. The method uses well-known, widely accessible supervised classification techniques and yields results far superior to other AI content detectors.
Their approach consists of identifying features that distinguish human writing from AI-generated text. For instance, human scientists often write lengthy paragraphs and use nuanced language, frequently incorporating words like "but," "however," and "although." Using these and 17 other distinct features, the team has built a model that classifies whether a given text was authored by a human or an AI with over 99% accuracy.
The methodology is scalable and can be further optimized and refined by anyone with basic skills in supervised classification. This opens the door to a wide array of highly accurate, targeted models for detecting AI-generated content in academic writing and other fields, which will likely be a crucial tool for combating over-reliance on AI by students and employees.
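For readers who want a feel for the idea, here is a minimal sketch of feature-based supervised classification in Python. It is not the Kansas team's model: the two features (word count and a count of "but," "however," and "although"), the nearest-centroid classifier, the function names, and the four training paragraphs are all our own illustrative assumptions, standing in for the fuller feature set described above.

```python
import re

# Contrast words the story names as typical of human scientists.
CONTRAST_WORDS = {"but", "however", "although"}


def extract_features(paragraph):
    """Return [word count, contrast-word count] for one paragraph."""
    words = re.findall(r"[a-z']+", paragraph.lower())
    contrast = sum(1 for word in words if word in CONTRAST_WORDS)
    return [float(len(words)), float(contrast)]


def train_centroids(samples):
    """Average the feature vectors of each label (nearest-centroid training)."""
    sums, counts = {}, {}
    for text, label in samples:
        feats = extract_features(text)
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, value in enumerate(feats):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, paragraph):
    """Pick the label whose centroid is closest in squared Euclidean distance."""
    feats = extract_features(paragraph)

    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(feats, centroids[label]))

    return min(centroids, key=distance)


# Hypothetical toy training set, invented for this sketch only.
TOY_SAMPLES = [
    ("We expected the signal to rise, but it fell sharply after the second run. "
     "However, the control group behaved as predicted, although the variance "
     "was larger than in earlier trials.", "human"),
    ("Although the assay is widely used, it remains sensitive to temperature, "
     "but careful calibration reduces the error. The effect, however, does not "
     "vanish entirely in our hands.", "human"),
    ("This study examines the role of proteins in cell signaling.", "ai"),
    ("The results highlight the importance of further research in this field.", "ai"),
]

centroids = train_centroids(TOY_SAMPLES)

print(predict(centroids, "Machine learning is an important tool for science."))
```

A real detector would need far more labeled paragraphs, many more features, and a held-out test set before any accuracy figure could be trusted. The sketch only shows the general shape: turn each paragraph into numbers, then let a supervised model learn where the two classes sit.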
Palantir CEO Rejects Calls to Pause AI Development
Palantir CEO Alex Karp dismissed calls to pause AI development, arguing that those calling for a pause lack actual AI products. Karp emphasized the importance of maintaining the West’s commercial and military advantages in AI. Meanwhile, China is taking a prominent role in AI regulation, and debate continues over the benefits and risks of slowing development in the West when such a call would likely be ignored in the East. Read More
OpenAI CEO Calls for Collaboration With China to Counter AI Risks
OpenAI CEO Sam Altman advocates for collaboration between the United States and China to address the risks associated with AI development. Despite U.S. sanctions aimed at restraining China’s progress (including a ban on sales of the NVIDIA A100 chip to China), Altman emphasizes engagement as a more effective approach. He delivered the opening keynote at a conference hosted by the Beijing Academy of Artificial Intelligence. Read More (paywall)
Meta's Open Source AI MusicGen Turns Text and Melody Into New Songs
Meta has developed an open-source AI model called MusicGen, capable of generating new music pieces based on text prompts and existing melodies. MusicGen, based on a Transformer model, predicts the next section of a music piece similar to how a language model predicts the next characters in a sentence. Meta has released the code and models as open source for research and commercial uses. Read More
Microsoft Evacuates Top AI Experts From China to New Lab in Canada
Microsoft is relocating some of its top AI researchers from China to Canada, a move that could impact China’s tech talent pool. Microsoft Research Asia (MSRA) is seeking visas for its AI experts to move from Beijing to Vancouver, aiming to establish a new lab staffed by global experts. The move is seen as a response to geopolitical tensions between the US and China and an attempt to prevent domestic tech groups from poaching top talent. The decision, named the “Vancouver Plan,” could strain relations with Beijing, which has been actively attracting Chinese researchers back to the mainland. Read More
AI Church Service in Germany Draws Crowd
An AI church service led by the ChatGPT chatbot took place in Nuremberg, Germany, as part of the convention of Protestants in Bavaria. The service, created by ChatGPT and a theologian from the University of Vienna, featured an AI-generated sermon, prayers, music, and blessings. The event drew significant interest, with attendees forming a long queue before the service began. While some found the AI service intriguing, others felt it lacked emotion and spirituality. Read More
Romania's PM Hires AI Government Adviser
Romania’s prime minister has introduced the world’s first AI government adviser named Ion. Ion will analyze the opinions of Romanian citizens, monitor social media, and provide real-time feedback to assist the government. Read More
Sponsored post
Humata: The Premier AI Document Analyzer
Humata.ai is an AI-driven tool that is designed to help users analyze and understand their documents more efficiently. It offers an impressive range of features that are intended to speed up research, learning, and report creation.
It's like having an intelligent assistant at your disposal, offering instant Q&A, automatically generating summaries of complex technical papers, and creating content for reports and tasks in an instant.
This tool truly understands context, nuance, and detail. And above all, it takes data security seriously, encrypting your documents in secure cloud storage.
Its free version is robust and offers a great starting point with a 60-page limit, while the Pro Plan unlocks even more powerful features, like querying across multiple documents simultaneously.
Having used all of the GPT-PDF analyzers available, I find that Humata stands out as the best one.
Thank you for supporting our sponsors
5 new AI-powered tools from around the web
No-code API maker: Backengine
Backengine is a no-code AI-powered API platform, offering a code-free workspace, secure endpoints, team collaboration, and storage. Users can swiftly create, test, and deploy backend APIs with sophisticated application logic in under a minute, with a free sign-up for two AI-powered endpoints. Try it here
AI Language Tutor: Giglish
Giglish is a cost-effective AI language teaching tool that enhances fluency and confidence in multiple languages. It offers features like speech recognition, adjustable speed, grammar feedback, translations, and engaging suggestions. It supports various languages, including English, Dutch, French, German, Hungarian, Italian, Norwegian, Portuguese, Russian, Spanish, and Swedish. Try it here
AI Text-based RPG: Saga
Saga is a web application for creating and playing text-based adventures with AI-enhanced characters. Enjoy dynamic conversations, free-form writing, and AI-generated illustrations. It has customizable settings and cross-platform compatibility. Try it here
Github workflow tracker: Pipeline
Pipeline.green is a platform that offers enhanced observability for GitHub workflows. It provides features such as easy integration, slow job detection, visual insights, data range summary, and job breakdown to streamline workflows and improve performance. With direct links to GitHub, track pull request status and access detailed information. Try it here
Course Creator: Notion4Teachers
Notion4Teachers’ Course Creator is a tool for creating educational courses in minutes. It offers modular course building, educational course creation, eBook compilation, resource library organization, prompt guide design, product tutorials, and customizable templates. Try it here
arXiv is a free online library where scientists share their research papers before they are published. Here are the top AI papers for today.
MUSICGEN is a powerful Language Model (LM) designed for conditional music generation. It incorporates token interleaving patterns and a transformer-based architecture to produce exceptional music samples based on text or melodic features. Comprehensive evaluations have shown its superiority over baseline models in terms of quality and controllability. You can access the source code, pre-trained models, and music samples on GitHub, making it an invaluable resource for music enthusiasts and researchers alike.
GANeRF is a novel approach that enhances the realism of Neural Radiance Fields (NeRFs) for 3D scene reconstruction and novel view synthesis. It incorporates generative adversarial networks (GANs) to improve rendering constraints and refine the output images. By leveraging an adversarial loss formulation and a patch discriminator, GANeRF mitigates imperfections caused by limited observations, lighting changes, or reflective regions. Experimental results demonstrate significant improvements in rendering quality compared to prior works, making GANeRF a promising method for realistic 3D scene reconstruction and virtual/mixed reality applications.
This paper proposes a framework for evaluating the social impact of generative AI systems across different modalities such as text, image, video, and audio. The goal is to provide a comprehensive approach for researchers, auditors, and policymakers to assess the effects of these systems on people and communities. The framework covers various aspects of the system’s lifecycle, from training to deployment, and emphasizes the evaluation of marginalized communities and potential harms. The paper acknowledges the challenges in evaluating social impact and calls for a broader and more standardized evaluation suite to address these complex issues.
MIND2WEB is a dataset designed for developing and evaluating generalist agents for the web. It contains over 2,000 open-ended tasks from 137 real-world websites spanning 31 domains. The dataset addresses the limitation of existing datasets by offering diverse domains, real-world websites, and a wide range of user interaction patterns. The authors propose a solution using large language models (LLMs) for building generalist web agents, demonstrating decent performance even on unseen websites. The dataset, model implementation, and trained models are open-sourced to facilitate further research in this area.
In this work, the authors propose a benchmark dataset called CORR2CAUSE to evaluate the causal inference skills of large language models (LLMs). They curate a dataset of over 400K samples and evaluate seventeen existing LLMs on the task. The results show that LLMs have a limited ability to perform causal inference, achieving performance close to random. Finetuning helps to some extent, but generalization to out-of-distribution settings remains a challenge. The authors suggest that CORR2CAUSE can guide future research in improving LLMs’ causal reasoning skills and generalizability.
Thank you for reading today’s edition.
Your feedback is valuable.
Respond to this email and tell us how you think we could add more value to this newsletter.