Could Midjourney Face Lawsuit for Stealing Artist Data?
Good morning. It’s Wednesday, January 3rd.
Did you know: Bitcoin turned 15 years old today?
In today’s email:
AI in Technology and Business
AI in Healthcare and Medicine
AI in Legal, Ethical, and Social Contexts
6 New AI Tools
Latest AI Research Papers
ChatGPT + DALLE 3 Attempt Comics
You read. We listen. Let us know what you think by replying to this email.
Interested in reaching 47,612 smart readers like you? To become an AI Breakfast sponsor, apply here.
Today’s trending AI news stories
AI in Technology and Business
> Midjourney faces criticism for using a database of 16,000 artists, including a six-year-old, Hyan Tran, to train its AI image generator. The list, which circulated on X and Bluesky, shows that Midjourney drew on artists’ works across styles, genres, and time periods for AI training, sparking legal concerns. It includes notable names like Warhol, Kahlo, and Picasso. The practice raises questions about scraping artwork for AI training, an issue at the heart of a class-action lawsuit against Stability AI, Midjourney, and DeviantArt. Amid rising fears about AI’s impact on artistic careers, a digital tool called Nightshade from the University of Chicago aims to disrupt AI’s learning from massive image sets. Separately, a US Copyright Office Review Board ruling last September deemed Midjourney-generated images non-copyrightable because of how they are produced.
> Cloudflare’s Workers AI, launched as an AI inference-as-a-service platform in 2023, lets organizations deploy generative AI at the edge with minimal coding, using GPUs across Cloudflare’s global network. The service, designed for workloads too large for devices but not requiring cloud server farms, includes pretrained models like Meta’s Llama 2 and OpenAI’s Whisper, with plans to support customer models. Prioritizing privacy, Workers AI doesn’t train on customer data and includes a vector database, Vectorize, for user interactions. Cloudflare’s AI Gateway improves application resilience and cost management, while the company’s expansive GPU deployment addresses demand amid GPU shortages.
> NVIDIA Senior Research Scientist Jim Fan foresees 2024 as a watershed year for robotics, likening it to the “ChatGPT moment” for physical AI agents. He argues that the AI community is making significant strides toward overcoming Moravec's paradox. Key developments include multimodal LLMs integrating robotics, advanced algorithms bridging high-level reasoning and low-level control, robust hardware progress by major tech players, and collaborative efforts in data curation like the RT-X dataset. Additionally, simulation and synthetic data are playing crucial roles in enhancing robot dexterity.
> IDC Global, a leading market intelligence provider, predicts that by 2026, GenAI-powered skills development will catalyze $1 trillion in productivity gains. This shift, fueled by 35% of global enterprises using GenAI for digital co-creation, is expected to double their revenue growth compared to non-users. Focused on enhancing sales, IT, and finance, IDC’s survey indicates a strategic move from cost-cutting to revenue expansion through GenAI, altering core business goals and customer solutions. This forward-looking approach will be discussed at IDC Directions 2024 in Dubai.
> Rabbit Inc. is set to unveil its Rabbit R1 assistant at CES 2024 on January 9. The device, built on Rabbit OS and powered by a Large Action Model (LAM), is all about flipping the script: technology adapts to us, not the other way around. The Rabbit R1 differs from existing assistants like Google Assistant or Siri by executing tasks, not just responding to queries. The product launch, accompanied by significant funding and expertise, signals a step forward in making AI more integrated and responsive in daily life.
> Samsung is set to reveal its latest Galaxy S24 series at the Galaxy Unpacked event on January 17, 2024, in San Jose, California, with a livestream on YouTube. The event teases the introduction of Samsung’s own artificial intelligence, hinting at advancements in on-device AI technology. The Galaxy S24 lineup, including the Ultra model with a titanium frame and Snapdragon 8 Gen 3 chip, is expected to feature enhanced AI capabilities, such as AI-powered photo editing and live translation.
> MyShell, a Canadian AI startup, introduces OpenVoice, an open-source voice cloning model developed with researchers at MIT and Tsinghua University. OpenVoice allows rapid, precise voice cloning with control over tone, emotion, and accent using minimal audio samples. Unlike existing models, OpenVoice doesn’t require specific text input and offers instant voice generation with emotional tone adjustments. MyShell aims for accessibility and community benefit, emphasizing ‘AI for All.’ The model’s effectiveness is attributed to a simple yet powerful decoupling approach, differentiating it from competitors like Meta’s Voicebox.
AI in Healthcare and Medicine
> MIT has developed a new AI-driven approach to identify antibiotics capable of combating antibiotic-resistant superbugs like MRSA. This progress involves using deep learning to analyze compounds, leading to the discovery of potential antibiotics that are effective yet non-toxic to humans. The MIT team’s work, part of the Antibiotics-AI Project, aspires to create new antibiotic classes to fight the deadliest pathogens, marking a major advancement in the field that hasn’t seen critical innovation in over three decades.
> Konkuk University Medical Center (KUMC) teams up with AI tech firm Neurophet to revolutionize brain disease imaging, specifically targeting Alzheimer's diagnostics. This collaboration combines KUMC’s medical expertise with Neurophet’s AI-driven brain analysis tools, aiming to enhance Alzheimer's research and treatment. Their joint efforts will also expand into the digital healthcare sector and build international collaborations. KUMC’s recognition in dementia research, including innovative diagnostic markers, complements this strategic partnership.
> Logan Kilpatrick from OpenAI characterizes prompt engineering as more of a “bug” than an ideal feature in AI development. He anticipates a significant reduction in the need for elaborate prompts as AI systems evolve, a stark contrast to the present, where tech giants like Google and Microsoft rely heavily on complex prompts for benchmark achievements. Kilpatrick envisions a future where advanced AI models naturally comprehend and respond to simpler prompts, making elaborate prompting strategies less critical. This progression signifies a shift towards more intuitive AI interactions, especially in high-stakes applications like medical diagnostics where accuracy is paramount.
> IEEE Spectrum features a focus on humanoid robots making significant strides in the workforce. Companies like Agility Robotics, among seven others, are deploying robots in commercial settings to determine their readiness for practical tasks. For instance, Agility’s robot, Digit, designed for logistics and tasks like tote handling, offers a glimpse into the future of robots performing repetitive human-like jobs. Digit, capable of working efficiently in human-designed environments, is being tested in Amazon’s warehouses.
> New research from DeepMind demonstrates that subtle alterations designed to deceive AI image recognition systems can also subtly influence human perception. Adversarial images, intentionally modified to mislead AI models, show that while human vision isn’t as susceptible as AI to these changes, human decision-making can still be biased by them. This finding underscores a need for comprehensive research into the impact of such technologies on both AI systems and human perception, potentially guiding future AI safety and security research. This study, published in Nature Communications, suggests the importance of aligning AI vision models more closely with human visual processing to enhance the robustness and safety of AI.
In partnership with Mojju Custom GPTs
Mojju offers unique and powerful custom GPTs for OpenAI. Their portfolio includes a diverse range of GPTs including productivity tools, various assistants & guides, business & finance tools, and a lot more! All GPTs are free to use!
Crafted by a skilled AI team, Mojju offers a range of proven solutions, including GPTs integrated with Zapier, MailChimp and Stable Diffusion. Continuous support and updates are provided by Mojju’s team!
Unlike other products in the market that tend to aggregate all available GPTs, often leading to clutter and confusion, Mojju takes a different approach. Their library consists of reliable and tested GPTs developed by their in-house team. The Mojju team’s goal is to maximize the benefits of the emerging trend, ensuring users receive the utmost value from their efforts.
Thank you for supporting our sponsors!
6 new AI-powered tools from around the web
Polar Habits 🐻❄️ offers a unique, guilt-free approach to habit tracking, focusing on momentum rather than streaks. Its supportive system encourages resilience and long-term habit formation, eliminating the discouragement of missed days.
Sikey.io offers a no-code solution for entrepreneurs and developers to validate product ideas quickly with captivating landing page templates, easy sign-up monitoring, and audience engagement tools.
crewAI is an advanced AI framework that enables sophisticated multi-agent interactions. It’s a free, open-source platform designed to foster collaborative intelligence among AI agents, enhancing their ability to tackle complex tasks.
OpenCV University offers a 100-Day AI Career Challenge designed to jumpstart your AI career. It includes a 30% discount on courses and a chance to earn rewards. Complete courses in 100 days, receive $100 back per course, and gain lifetime access to materials.
Leo AI, an AI-powered engineering design co-pilot, transforms words, sketches, and specs into DFMA-optimized assemblies. It streamlines ideation, optimizes CAD models, and boosts productivity using company-specific design guidelines.
WizyChat, a GPT-4 powered chatbot platform, enables creation of intelligent AI chatbots with over 5,000 integrations and support for 95 languages. Features include a smart site scanner and AI knowledge bases, and no coding is required.
Latest AI Research Papers
arXiv is a free online library where researchers share pre-publication papers.
Researchers from Microsoft introduce a new method for generating high-quality text embeddings using synthetic data and fewer than 1k training steps. This approach leverages proprietary LLMs to generate a diverse range of tasks in 93 languages, circumventing the complex training pipelines of earlier methods and the limited task diversity and language coverage of manually collected datasets. The synthetic data is used to fine-tune open-source decoder-only LLMs using standard contrastive loss. Their experiments show that this method achieves competitive performance on benchmarks like BEIR and MTEB without any labeled data; when fine-tuned on a mix of synthetic and labeled data, it sets new state-of-the-art results.
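For readers unfamiliar with the term, a “standard contrastive loss” for embeddings is commonly the InfoNCE loss with in-batch negatives. The sketch below is a minimal, illustrative version in plain Python, not the paper’s training code; the function names, the use of cosine similarity, and the temperature value are all assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(queries, documents, temperature=0.05):
    """Mean InfoNCE loss with in-batch negatives.

    queries[i] and documents[i] are a positive pair; every other
    document in the batch acts as a negative for query i.
    The temperature default is an illustrative assumption.
    """
    losses = []
    for i, query in enumerate(queries):
        logits = [cosine(query, doc) / temperature for doc in documents]
        # Numerically stable log-sum-exp over all candidates.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        # Negative log-probability of picking the matching document.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)
```

The loss is small when each query is closest to its own document and large when it is closer to another document in the batch, which is what pushes matching pairs together during fine-tuning.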
Researchers from Shanghai Jiao Tong University and Microsoft Corporation investigate methods to enhance Large Language Models (LLMs) like LLaMA/OPT with speech synthesis capabilities, using the text-to-speech model VALL-E. They explore three integration methods: fine-tuning LLMs directly, superposing LLMs and VALL-E layers, and coupling LLMs with VALL-E using LLMs as text encoders. Experiments reveal that direct fine-tuning of LLMs is less effective for speech synthesis. However, superposing LLMs and VALL-E improves speech quality and speaker similarity and reduces word error rate (WER). The coupled method, which uses LLMs as text encoders, outperforms the other approaches with better speaker similarity and a significant reduction in WER.
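Word error rate, the metric cited above, is conventionally the word-level edit distance between a reference transcript and a recognized transcript, divided by the number of reference words. The following is a minimal sketch of that standard definition; the function name and whitespace tokenization are assumptions for illustration, and the paper’s own evaluation pipeline may normalize text differently.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # Row-by-row Levenshtein distance over words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref_words)
```

For example, if the reference is “the cat sat on the mat” and the transcript of the synthesized speech drops one “the”, the edit distance is 1 over 6 reference words. Note that WER can exceed 1.0 when the hypothesis contains many inserted words.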
DocLLM, developed by JPMorgan AI Research, is a generative language model tailored for multimodal document understanding. It processes enterprise documents like forms and invoices by integrating textual and spatial modalities. Unlike traditional models, DocLLM uses bounding box data instead of complex image encoders, focusing on text semantics and spatial layout. This approach, which relies on disentangled attention matrices and a text-infilling pre-training objective, improves the handling of irregular layouts and diverse content, outperforming advanced models on various tasks and datasets.
“Boundary Attention,” a differentiable model for detecting faint boundaries in images, was developed by researchers from Google Research and Harvard University. The model excels at identifying contours, corners, and junctions, even in noisy environments. It refines a field of variables around each pixel using a boundary-aware local attention operation. Its adaptability allows it to handle larger images and to describe geometric features in different parts of an image with sub-pixel precision. Despite being trained on simple synthetic data, it generalizes well to real images, demonstrating its potential in computer vision, especially for images with high levels of noise.
En3D, developed by researchers from Alibaba Group’s Institute for Intelligent Computing and Peking University’s WangXuan Institute of Computer Technology, is a groundbreaking generative scheme for creating 3D human avatars from 2D synthetic data. This innovative approach overcomes the dependency on scarce 3D datasets or limited 2D collections, enabling the generation of visually realistic, geometrically accurate, and diverse 3D humans. En3D integrates a 3D generator for appearance modeling, a geometry sculptor for refining shapes, and a texturing module for detailed texture mapping. The avatars produced are highly adaptable for animation and editing, demonstrating advanced capabilities in 3D human modeling and potential for broad application in various 3D vision tasks.
ChatGPT + DALLE 3 Attempt Comics
Daily Comic written and drawn by AI
Thank you for reading today’s edition.
Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.