Open-Source, Uncensored AI Model “Mistral” Makes Waves

AI Breakfast
December 13, 2023

Good morning. It’s Wednesday, December 13th.

Did you know: Apple went public on December 12, 1980, 43 years ago yesterday. The initial stock price was just $22 per share, or about $0.10 when adjusted for five stock splits.
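For the curious, here is a quick sketch of the split-adjustment arithmetic. The individual split ratios are not stated above; they are taken from Apple's published split history (2-for-1 in 1987, 2000, and 2005, 7-for-1 in 2014, 4-for-1 in 2020), so treat them as an outside assumption.

```python
from math import prod

# Assumption: split ratios come from Apple's published split history,
# not from this newsletter (1987, 2000, 2005, 2014, 2020).
SPLIT_RATIOS = [2, 2, 2, 7, 4]
IPO_PRICE = 22.00  # dollars per share at the 1980 IPO

# One IPO share has turned into this many shares today.
cumulative_factor = prod(SPLIT_RATIOS)

# Split-adjusted price of one of today's shares at IPO time.
adjusted_price = IPO_PRICE / cumulative_factor

print(f"{cumulative_factor}x split factor -> ${adjusted_price:.2f} per share")
```

Five splits multiply out to a factor of 224, which is how $22 shrinks to roughly ten cents.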

In today’s email:

Open-Source AI Model “Mistral” Makes Waves
AI Policy & Business
New AI Innovations
6 New AI Tools
Latest AI Research Papers
AI Makes Comics

You read. We listen. Let us know what you think by replying to this email.

Interested in reaching 46,725 smart readers like you? To become an AI Breakfast sponsor, apply here.

Today’s trending AI news stories

Open-Source AI Model “Mistral” Makes Waves

> Mistral, a Paris-based AI startup, has made a significant impact with its Mixtral 8x7B model, which uses a sparse ‘mixture of experts’ technique: each layer holds eight expert sub-networks, and a router sends every token to just two of them. Remarkably, it matches or surpasses the performance of OpenAI’s GPT-3.5 and Meta’s Llama 2, with a smaller footprint that allows it to run locally on devices without dedicated GPUs. Available for commercial use under the Apache 2.0 license, Mixtral 8x7B is notable for its lack of safety guardrails, offering more freedom but also posing potential regulatory challenges. Mistral is also hinting at more advanced models in development, following a substantial $415 million funding round led by Andreessen Horowitz and Lightspeed Venture Partners.

Valued at $2 billion, the company, staffed by 22 ex-DeepMind and Meta researchers, specializes in open-source models for chatbots and generative AI tools. Its new open-source Mixtral MoE model supports multiple languages and posts benchmark results that match or beat GPT-3.5 and the 70B-parameter Llama 2. The company has also launched beta API services, offering various text chat and embedding endpoints.
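To make the ‘mixture of experts’ idea above a little more concrete, here is a minimal toy sketch of top-2 routing in plain Python. It is an illustration under loud assumptions, not Mistral's implementation: the "experts" are simple scaling functions, the router scores are hard-coded, and the names `top_k_route` and `moe_layer` are made up for this example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_route(router_logits, k=2):
    """Pick the k highest-scoring experts and normalize their weights."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))

def moe_layer(x, experts, router_logits, k=2):
    """Weighted sum of the outputs of only the selected experts."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))

# Hypothetical setup: eight toy experts, each multiplying its input by 1..8.
experts = [lambda x, scale=scale: scale * x for scale in range(1, 9)]

# Hypothetical router scores for one token (a real router computes these).
router_logits = [0.1, 2.0, -1.0, 0.3, 1.5, 0.0, -0.5, 1.0]

routes = top_k_route(router_logits)
output = moe_layer(3.0, experts, router_logits)
print("selected experts:", [i for i, _ in routes])
print("layer output:", round(output, 3))
```

The point of the design is that only two of the eight experts run for any given token, so the compute per token is much lower than the total parameter count suggests.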

Follow Mistral on X for the latest updates from the company.

AI Policy & Business

> The EU has provisionally agreed on stringent rules for AI biometric surveillance by police and security agencies. Real-time biometric data usage will require judicial authorization, except in urgent situations like terrorist threats, where approval is needed within 24 hours. The regulations, which aim to prevent predictive policing and racial profiling, mandate immediate data deletion if approval is denied. These rules, applicable in public and private areas, cover 16 serious crimes. Additionally, the EU bans AI systems that manipulate behavior, enable social scoring, or use emotion recognition in workplaces.

> OpenAI, valued at $86 billion, reported a modest revenue of $44,485 in 2022 for its non-profit entity, mainly from investment income. This figure, revealed in an IRS filing, contrasts sharply with the valuation private investors have assigned the company on the strength of ChatGPT’s popularity. The nonprofit status of OpenAI, which controls a massively valued company, has sparked criticism and calls for more transparency. OpenAI’s complex structure includes a capped-profit division for commercial ventures like ChatGPT, while the nonprofit arm focuses on broader AI advancements.

> NeurIPS 2023, a premier AI conference happening in New Orleans this week, announced its top research papers. Awards included the Test of Time Award and two Outstanding Paper Awards in each of three categories. Key highlights include "Privacy Auditing with One (1) Training Run" for efficient privacy checks in AI models, and "Are Emergent Abilities of Large Language Models a Mirage?" challenging the notion of emergent abilities in large language models. Other notable papers covered language model scaling with limited data, a new method for model fine-tuning, the ClimSim climate simulation dataset, and trustworthiness assessment of GPT models. The Test of Time Award was given to the influential "word2vec" paper.

> ZeniMax Studios, part of Microsoft and developer of ‘The Elder Scrolls Online,’ has reached an agreement with its union on AI use in the workplace. Under this deal, a first in the video game industry, Microsoft will inform the union about AI’s impact on work and bargain over those impacts upon request. Six guiding principles for AI use have been agreed upon: fairness, reliability and safety, privacy and security, inclusiveness, transparency, and accountability.

> Microsoft has signed a groundbreaking power purchase agreement with Helion Energy, a nuclear fusion startup backed by Sam Altman, to buy electricity in 2028. This deal marks a significant vote of confidence in fusion energy, a potential source of limitless and clean power. Helion, founded in 2013, has received substantial investment from Altman, who also co-founded OpenAI. The agreement aligns with Microsoft’s climate goals and is a strategic move to support innovative clean energy solutions.

New AI Innovations

> Spore.Bio, a French startup, has developed a spectral device that uses generative AI to detect harmful microbes in food factories in real time. This optical, light-based pathogen detection system compares microbes on surfaces with training data from typical food processing environments, offering a rapid alternative to traditional petri-dish microbiological monitoring and lab testing. With €8 million in pre-seed funding, Spore.Bio aims to revolutionize cleanliness in the food industry by replacing the usual 5-20 day testing period with faster, more efficient microbial monitoring.

> Snapchat Plus subscribers can now utilize AI to create and share images generated from text prompts. This feature, accessible via an “AI” button, offers a range of pre-made prompt options for image generation. Additionally, users can extend photos using AI to enhance the background, making the subject appear further away. Snapchat has also integrated “Dreams,” an AI selfie feature, allowing theme-based photo transformations.

> Runway AI introduces General World Models (GWM), an ambitious research initiative focusing on AI systems that comprehend and simulate the visual world and its dynamics. GWM aims to create AI models capable of representing and simulating a wide array of real-world situations and interactions. These models will go beyond the capabilities of existing video generative systems like Gen-2, addressing challenges in generating consistent environmental maps and realistically modeling human behavior.

> Indiana researchers developed a biocomputing system using human brain cells that can perform basic speech recognition. This system, named “Brainoware,” involves growing brain organoids over several months, which are then placed on a microelectrode array for interaction. The organoids, consisting of up to 100 million nerve cells, successfully recognized an individual’s voice from 240 audio clips of Japanese vowel sounds. This proof-of-concept demonstration suggests a potential for more energy-efficient AI tasks compared to traditional silicon chips, although significant development is still required.

> YouTuber "Greg Technology" successfully replicated Google's staged Gemini AI demo in real-time using OpenAI's GPT-4 Vision, showcasing voice and vision interactions. Unlike Google's demo, which faced criticism for its staged presentation with post-recorded voice interactions, Greg's demonstration was executed live, featuring discussions about drawings, emoticon queries, and game identification. Although less refined than Google's version, it proved the real-time capabilities of GPT-4V. Greg has also shared his demo code on GitHub.

> Microsoft Research has released Phi-2, a compact yet powerful small language model with 2.7 billion parameters. Despite its smaller size, it outperforms larger models like Meta’s Llama 2 7B and Mistral 7B, and even exceeds Google’s Gemini Nano 2 in performance, while exhibiting reduced toxicity and bias. Although Phi-2 is suitable for running on laptops and mobile devices, its use is currently limited to non-commercial research purposes under Microsoft’s specific licensing terms.

> Meta is introducing advanced AI features for its Ray-Ban smart glasses, starting with an early access test. These tests, showcased by Mark Zuckerberg, allow users to interact with a virtual assistant that can see and hear, providing information and suggestions based on visual and audio inputs. The AI can describe objects, offer fashion advice, translate text, and caption images. Initially, this test phase will be limited to a select group of US users opting in. Meta’s move integrates multimodal AI capabilities into everyday wearables, potentially transforming how users interact with their environment.

> Alter3, a humanoid robot powered by OpenAI’s GPT-4, marks a significant advancement in AI-driven robotics. Developed by the University of Tokyo, Alter3 can spontaneously adopt various poses and perform actions that previously required explicit programming. This innovation bridges the gap between AI language models and physical movement, allowing the robot to interpret voice commands and convert them into physical actions, including complex motions like taking selfies.

> Microsoft’s new Medprompt+ prompting strategy has achieved a new top score on the MMLU benchmark, surpassing Google's Gemini Ultra. Medprompt was initially designed for medical applications; Medprompt+ combines that original method with a simpler prompting strategy, effectively enhancing GPT-4’s performance.

In partnership with CODEMAKER AI

Attention Developers:

CodeMaker AI allows you to process and generate code for entire source call hierarchies. Watch the demo here:

As a result, the solution was able to generate the source code of a multi-layered code base and generate code for an entire application.

This capability works with an existing code base, and the only prerequisite is to have files that define stubs of classes, structures, methods, or functions.

CodeMaker AI can detect files that are already implemented, skipping them during processing and leaving them untouched. Each source file will be used automatically in the generation of any dependent files.

This is a development version of CodeMaker AI, with more features and improvements becoming available in the future.

Best of all…

Thank you for supporting our sponsors!

6 new AI-powered tools from around the web

Hexus AI, inspired by Spotify Wrapped, lets you recap your ChatGPT usage for the year. Download your chat history zip and upload it to get your ChatGPT usage stats.

Editby 2.0 is an AI-driven SEO tool offering intuitive content optimization, AI analytics, and link architecture design. Aimed at a wide range of users, it promises improved Google rankings along with valuable insights and content ideas.

BoldDesk offers advanced, affordable customer service software with AI, robust ticketing, workflow automation, and analytics, ideal for all team sizes at half the cost of major competitors like Zendesk and Freshdesk.

Ideaflow, an app for instant thought capture with voice and AI, enhances productivity through features like automatic organization and compatibility with desktop, iOS, and offline use, catering to diverse user needs.

Modyfi, the next-gen motion design tool, revolutionizes design and animation. Easily turn designs into dynamic loops, create impressive motion effects quickly, and edit designs in real time.

DeepSwapper AI offers a free, unlimited face-swapping tool for quick, high-quality, and realistic results in image editing. It doesn’t store uploaded images and supports various file formats.

Latest AI Research Papers

arXiv is a free online library where researchers share pre-publication papers.

📄 LLM360: Towards Fully Transparent Open-Source LLMs

LLM360 aims to revolutionize Large Language Models by advocating for full transparency in open-source LLMs. Unlike most LLMs which release limited artifacts, LLM360 stresses the release of all training code, data, model checkpoints, and intermediate results. This initiative not only enhances open and collaborative AI research but also makes the entire training process of LLMs like AMBER and CRYSTALCODER transparent and reproducible. These models, pre-trained with 7B parameters, mark LLM360’s commitment to continually advancing LLMs through open-source efforts.

📄 Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

The study explores self-training for problem-solving with language models using ReST^EM, an expectation-maximization-based method. It involves generating model samples, filtering with binary feedback, and iterative fine-tuning. Tested on MATH reasoning and APPS coding benchmarks with PaLM-2 models, it significantly outperforms training solely on human data. This approach suggests a reduced reliance on human-generated data for training language models, although it requires a well-designed training set and an effective reward function.
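The generate, filter, fine-tune loop described above can be sketched with a toy stand-in. This is a heavily simplified illustration, not the paper's method: the "model" is just a probability table over candidate answers to one question, "fine-tuning" is a simple blend toward the filtered samples, and every name here (`rest_em_toy`, `blend`, the sample question) is hypothetical.

```python
import random
from collections import Counter

def rest_em_toy(policy, reward, iterations=3, samples_per_iter=200,
                blend=0.5, seed=0):
    """Toy generate/filter/fine-tune loop over a categorical policy.

    policy: dict mapping candidate answer -> probability.
    reward: function returning 1 for a correct answer, 0 otherwise.
    """
    rng = random.Random(seed)
    answers = list(policy)
    for _ in range(iterations):
        # Generate: sample candidate solutions from the current policy.
        weights = [policy[a] for a in answers]
        samples = rng.choices(answers, weights=weights, k=samples_per_iter)

        # Filter: keep only samples that earn the binary reward.
        kept = [s for s in samples if reward(s) == 1]
        if not kept:
            break  # nothing to learn from in this round

        # "Fine-tune" (toy stand-in): move the policy toward the
        # empirical distribution of the filtered samples.
        counts = Counter(kept)
        policy = {a: (1 - blend) * policy[a] + blend * counts[a] / len(kept)
                  for a in answers}
    return policy

# Hypothetical question "2 + 2 = ?" with a policy that is mostly wrong.
initial_policy = {"4": 0.2, "5": 0.5, "22": 0.3}
final_policy = rest_em_toy(initial_policy, reward=lambda a: int(a == "4"))
print({answer: round(p, 4) for answer, p in final_policy.items()})
```

Even in this toy, the two caveats from the summary show up: the loop only helps if the starting policy occasionally produces a correct answer, and it is only as good as the reward function doing the filtering.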

📄 Sherpa3D: Boosting High-Fidelity Text-to-3D Generation via Coarse 3D Prior

Sherpa3D, a novel text-to-3D framework, bridges 3D and 2D diffusion models to generate high-fidelity, diverse 3D content with multi-view consistency from text prompts. It leverages a coarse 3D prior from a 3D diffusion model, enhancing it with structural and semantic guidance for 2D lifting optimization. This approach maintains geometric fidelity and 3D coherence while exploiting the capabilities of 2D models for rich detailing and creativity. Extensive experiments validate Sherpa3D’s superiority in quality and consistency over existing methods. Its efficiency and generalizability promise advancements in user-friendly 3D content creation.

📄 FreeInit: Bridging Initialization Gap in Video Diffusion Models

FreeInit, a method for video diffusion models, significantly enhances temporal consistency in video generation without extra training or parameters. It addresses the training-inference gap by refining spatial-temporal low-frequency components of initial noise iteratively during inference. This approach improves subject appearance and video coherence, as confirmed by extensive experiments across various text-to-video models and prompts, thus effectively bridging the initialization gap in diffusion-based video generation.
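The core idea of refining the low-frequency part of the initial noise can be sketched in one dimension. This is a rough illustration under stated assumptions, not the authors' code: it works on a single value per frame rather than a full spatio-temporal latent, uses a hard frequency cutoff as the low-pass filter, and skips the diffusion model entirely. The function name `reinitialize_noise` and the `cutoff` parameter are made up for this example.

```python
import cmath
import random

def dft(signal):
    """Naive discrete Fourier transform of a real-valued list."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]

def inverse_dft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def reinitialize_noise(noised_latent, fresh_noise, cutoff):
    """Keep low frequencies of noised_latent, high ones of fresh_noise."""
    n = len(noised_latent)
    low = dft(noised_latent)
    high = dft(fresh_noise)
    # min(k, n - k) is the frequency magnitude of bin k, so the mask is
    # symmetric and the mixed signal stays real-valued.
    mixed = [low[k] if min(k, n - k) <= cutoff else high[k]
             for k in range(n)]
    return inverse_dft(mixed)

rng = random.Random(0)
num_frames = 8
# Stand-in for a latent that was denoised once and then re-noised.
noised_latent = [rng.gauss(0, 1) for _ in range(num_frames)]
# Freshly sampled Gaussian noise supplying the high-frequency detail.
fresh_noise = [rng.gauss(0, 1) for _ in range(num_frames)]

refined = reinitialize_noise(noised_latent, fresh_noise, cutoff=1)
print([round(value, 3) for value in refined])
```

In the full method this mixing step would be repeated across several sampling passes at inference time, with the refined noise from each pass feeding the next.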

📄 COLMAP-Free 3D Gaussian Splatting

COLMAP-Free 3D Gaussian Splatting (CF3DGS) is an innovative approach for novel view synthesis without pre-computed camera poses, leveraging explicit point cloud representation and video continuity. It progressively grows 3D Gaussians for each frame, optimizing camera poses and scene reconstruction simultaneously. Outperforming state-of-the-art methods, especially in large motion scenarios, CF3DGS offers rapid training and robust pose estimation, demonstrating its effectiveness on challenging 360-degree video scenes.

ChatGPT + DALLE 3 Attempts Comics

Thank you for reading today’s edition.

Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.
