GPT Store Delay, Self-Operating Computer, and Crypto + AI Investments
Good morning. It’s Monday, December 4th.
Did you know: As BTC crosses $40k, many are wondering if there are intersecting AI projects with cryptocurrency. Here’s a breakdown from Forbes of the leading AI crypto projects according to their market caps.
In today’s email:
AI Hardware and Industry Developments
AI Model Advancements
Innovative AI Applications
5 New AI Tools
Latest AI Research Papers
You read. We listen. Let us know what you think by replying to this email.
Interested in reaching 45,896 smart readers like you? To become an AI Breakfast sponsor, apply here.
Today’s trending AI news stories
AI Hardware and Industry Developments
> OpenAI has agreed to buy $51 million in AI chips from Rain, a startup Sam Altman has personally invested in. Rain’s neuromorphic processing unit aims to mimic the human brain’s functionality. This deal, however, mingles Altman’s personal investments with his professional role, perhaps contributing to his temporary dismissal. The transaction highlights OpenAI’s investment in advanced AI hardware amid global shortages of high-capacity GPUs.
Rain’s progress faces potential delays due to US national security reviews and investor reshuffles. Rain’s innovative chips, based on RISC-V architecture, target edge devices and promise significant computational power and energy efficiency.
The U.S. government has compelled Saudi Aramco's venture capital arm, Prosperity7, to divest its stake in Rain Neuromorphics. The Committee on Foreign Investment in the United States (CFIUS) mandated the sale due to national security considerations. Rain had previously raised $25 million with Prosperity7 as the lead investor.
> OpenAI has postponed the debut of its custom GPT store to early 2024. This delay follows the company’s recent internal turmoil, including CEO Sam Altman’s brief ouster and return. Originally announced at its developer conference in November, the store will enable users to create and monetize their own GPTs. The postponement, revealed in a leaked memo, aligns with ongoing enhancements to GPTs based on user feedback.
> Google has postponed the release of its GPT-4 competitor, “Gemini,” to next year. This delay, confirmed by The Information, was due to Gemini’s subpar performance in non-English language responses. While some aspects of Gemini match GPT-4, its multilingual capabilities need further refinement. The delay impacts Google’s products, particularly the Bard chatbot, which won’t receive Gemini upgrades until next year.
> CoreWeave, an AI-focused cloud computing provider, has raised its valuation to $7 billion after a minority stake sale led by Fidelity Management & Research Co. The deal, which also involved Investment Management Corp. of Ontario, Jane Street, and others, confirms CoreWeave's status among the most promising AI startups. CoreWeave specializes in Nvidia-based data center solutions for AI computing, illustrating the AI industry's growth and investment appeal.
AI Model Advancements
> OthersideAI, led by developer Josh Bickett, is developing a "self-operating computer framework" using AI. This vision-based framework allows AI to control a computer like a human, taking screenshots as input and executing mouse clicks and keyboard commands as output. Co-founder Matt Shumer likens it to a self-driving car for computers, with potential applications beyond traditional API-based approaches. The framework is open source, which invites outside collaboration on diverse applications. Imbue, an AI research company, is developing AI models with enhanced reasoning capabilities to further advance this technology. Watch a demo here on X.
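The screenshot-in, click-and-type-out loop described above can be sketched in a few lines. This is only an illustration under stated assumptions, not OthersideAI's code: `Action`, `run_agent`, `FakeDesktop`, and `scripted_policy` are hypothetical names, the "screenshot" is a byte string from a simulated desktop, and a hard-coded policy stands in for the vision model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One step the agent wants to perform (hypothetical schema)."""
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


def run_agent(objective: str,
              capture_screen: Callable[[], bytes],
              choose_action: Callable[[str, bytes, List[Action]], Action],
              execute: Callable[[Action], None],
              max_steps: int = 10) -> List[Action]:
    """Observe the screen, pick an action, execute it, and repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = capture_screen()                  # observe
        action = choose_action(objective, screenshot, history)  # decide
        if action.kind == "done":
            break
        execute(action)                                # act
        history.append(action)
    return history


class FakeDesktop:
    """Simulated desktop with one text box, so the sketch runs anywhere."""

    def __init__(self) -> None:
        self.focused = False
        self.typed = ""

    def capture(self) -> bytes:
        # Stand-in for a real screenshot: the UI state encoded as bytes.
        return f"focused={self.focused};typed={self.typed}".encode()

    def execute(self, action: Action) -> None:
        if action.kind == "click":
            self.focused = True
        elif action.kind == "type" and self.focused:
            self.typed += action.text


def scripted_policy(objective: str, screenshot: bytes,
                    history: List[Action]) -> Action:
    """Placeholder for the vision model: reads the fake screenshot."""
    state = screenshot.decode()
    if "focused=False" in state:
        return Action("click", x=100, y=200)   # focus the text box first
    if not state.endswith(f"typed={objective}"):
        return Action("type", text=objective)  # then type the objective
    return Action("done")


desktop = FakeDesktop()
steps = run_agent("hello", desktop.capture, scripted_policy, desktop.execute)
print([step.kind for step in steps], desktop.typed)
```

In a real setup, `capture` would grab the actual screen, `choose_action` would call a multimodal model, and `execute` would drive the mouse and keyboard; the loop itself would stay the same.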
> Nvidia has launched new cloud-based APIs to accelerate AI development in medical imaging. Announced at RSNA 2023, these APIs extend Nvidia's Monai framework, facilitating easier integration of AI into medical imaging tools. The key component, Nvidia’s VISTA-3D model, was trained on diverse CT scans, enhancing 3D segmentation for image analysis and model fine-tuning. This advancement is expected to streamline AI tool development for medical imaging, enabling more efficient and targeted clinical applications.
> Google has launched a new AI experiment, Instrument Playground, enabling the creation of music inspired by over 100 global instruments. Users provide a prompt with an instrument's name, possibly with an adjective, and receive a 20-second audio clip. However, the output is abstract and may not directly mimic the chosen instrument. The experiment offers customization options like “Ambient,” “Beat,” and “Pitch,” with an advanced mode for layering up to four tracks. Users can download their compositions as .wav files, though some adjectives are inexplicably rejected.
> DiffusionAvatars, developed by researchers from Munich, is a method for creating high-quality 3D avatars with realistic facial expressions. It merges 2D diffusion models with 3D neural networks, enabling avatar animation from input videos or generated expressions. While promising for VR/AR, videoconferencing, and entertainment, it is too computationally intensive for real-time use and offers no control over lighting in the avatars.
> Wine Not? Scientists have developed an AI tool capable of detecting wine fraud by tracing wines back to their origins through chemical analysis. This machine learning algorithm distinguishes wines based on subtle variances in compound concentrations, allowing identification of the specific vine-growing region and even the exact estate. The tool, achieving 99% accuracy in identifying the correct chateaux, uses gas chromatography data from various Bordeaux estates. While highly effective in estate identification, it's less accurate (50%) in differentiating vintages.
In partnership with
Transform Your Business Idea into Success with AI-Powered Idea Validation
Fast-track your entrepreneurial journey with CodeMode’s efficient, AI-driven platform. Validate your business idea in minutes (not weeks) at a fraction of the cost.
Key Benefits:
Quick Validation: Get actionable insights swiftly.
Easy to Use: User-friendly interface, no jargon.
Comprehensive Insights: From industry trends to competitor analysis, CodeMode has you covered.
Download a personalized report tailored to your industry and audience today
Save 30% with code AIB30OFF
Thank you for supporting our sponsors!
5 new AI-powered tools from around the web
Magnific AI, an image upscaler and enhancer, boosts resolution and detail in images, featuring a “Creativity” slider for added depth. Ideal for photographers and designers.
iMean Shopping, an AI-powered shopping assistant, automates deal finding, price comparison, and data extraction for efficient, AI-driven online shopping.
Flowlie, an AI-driven fundraising hub, aids early-stage founders in planning, investor discovery, and outreach, offering a free consultation and targeting diverse, digital-native entrepreneurs.
Metawork is an AI-driven virtual office designed to enhance remote work with real-time status updates, natural communication flow, and personalized productivity insights, offered for free.
Ideaverse Pro offers a comprehensive knowledge management system in Obsidian, enhancing learning, memory, and creativity with linked notes, designed for lifelong use.
arXiv is a free online library where researchers share pre-publication papers.
“Mamba” is an advanced AI model that outperforms traditional Transformers in sequence modeling. It introduces selective state space models (SSMs) to efficiently handle long sequences. Mamba excels in content-based reasoning, selectively retaining or discarding information based on input. Its hardware-aware parallel algorithm enhances computational efficiency. Excelling in diverse fields like language, audio, and genomics, Mamba achieves linear scaling in sequence length, offers 5× faster inference than Transformers, and matches or surpasses their performance.
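To make "selective" and "linear scaling" concrete, here is a toy scalar recurrence in the spirit of a selective state space model. It is a sketch under simplifying assumptions, not Mamba's implementation: the state is a single number, and `A`, `w_delta`, `w_B`, and `w_C` are hypothetical scalar parameters standing in for learned projections. The point is that the step size, input gate, and output gate all depend on the current input, and that one pass over the sequence costs time proportional to its length.

```python
import math


def softplus(z: float) -> float:
    """Smooth positive function, used here to keep the step size above zero."""
    return math.log1p(math.exp(z))


def selective_scan(xs, A=-1.0, w_delta=1.0, w_B=1.0, w_C=1.0):
    """Toy selective SSM: one hidden scalar updated once per input."""
    h = 0.0
    ys = []
    for x in xs:                        # single pass: O(len(xs)) work
        delta = softplus(w_delta * x)   # input-dependent step size (> 0)
        a_bar = math.exp(delta * A)     # decay factor in (0, 1) when A < 0
        b_bar = delta * (w_B * x)       # input-dependent input gate
        h = a_bar * h + b_bar * x       # keep part of the old state, add new input
        ys.append((w_C * x) * h)        # input-dependent readout
    return ys


outputs = selective_scan([1.0, 0.0, 2.0, -1.0])
print(len(outputs))
```

A larger `delta` shrinks `a_bar`, so the model forgets more of its past state on that step; a small `delta` mostly carries the old state forward. That per-token choice is what the summary means by selectively retaining or discarding information.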
MoMask, a new framework for text-driven 3D human motion generation, utilizes a hierarchical quantization scheme to convert human motion into multi-layer discrete motion tokens. Incorporating a Masked Transformer for base-layer tokens and a Residual Transformer for subsequent layers, MoMask efficiently generates high-fidelity motions. It significantly outperforms existing methods in text-to-motion tasks, demonstrating versatility and improved performance without the need for model fine-tuning in related applications.
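The idea of multi-layer discrete tokens can be illustrated with a scalar residual quantizer: a base layer picks the closest codebook entry, and each later layer quantizes whatever error the previous layers left behind. This is a minimal sketch that assumes a residual quantization scheme, as the base/residual split in the summary suggests; the function names and the tiny hand-written codebooks are hypothetical, and real motion tokens would be vectors rather than single numbers.

```python
def nearest(codebook, value):
    """Index of the codebook entry closest to value."""
    return min(range(len(codebook)), key=lambda i: abs(codebook[i] - value))


def residual_quantize(value, codebooks):
    """Return one token per layer; each layer encodes the remaining error."""
    tokens = []
    residual = value
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        tokens.append(idx)
        residual -= codebook[idx]   # pass the leftover error to the next layer
    return tokens


def reconstruct(tokens, codebooks):
    """Sum the selected entries from every layer."""
    return sum(codebook[i] for codebook, i in zip(codebooks, tokens))


# Hypothetical codebooks: coarse base layer, then finer residual layers.
codebooks = [
    [-1.0, 0.0, 1.0],
    [-0.25, 0.0, 0.25],
    [-0.0625, 0.0, 0.0625],
]

tokens = residual_quantize(0.8, codebooks)
print(tokens)                                   # [2, 0, 2]
print(reconstruct(tokens[:1], codebooks[:1]))   # 1.0 (base layer only)
print(reconstruct(tokens, codebooks))           # 0.8125 (all layers)
```

Using only the base token gives a coarse approximation (1.0 for an input of 0.8), while adding the residual tokens tightens it to 0.8125, which mirrors the split between a model for base-layer tokens and a second model that fills in the finer layers.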
X-Dreamer, a pioneering framework, effectively bridges the gap between text-to-2D and text-to-3D synthesis, addressing limitations in using 2D diffusion models for 3D content creation. It introduces Camera-Guided Low-Rank Adaptation (CG-LoRA) and Attention-Mask Alignment (AMA) Loss. CG-LoRA dynamically incorporates camera information, enhancing alignment with the camera’s perspective. AMA Loss focuses on foreground object generation, guiding the diffusion model’s attention map. X-Dreamer improves 3D asset quality from textual descriptions, outperforming existing text-to-3D methods.
ExploreLLM introduces a new interaction pattern with LLMs, transforming complex tasks into structured sub-tasks and enhancing user control and personalization. It differs from linear, text-heavy chatbots by offering a schema-like GUI, enabling users to easily navigate and personalize tasks. A user study highlighted its efficiency in structuring tasks and personalization ease, with suggestions for more proactive preference elicitation and richer content integration.
Scaffold-GS introduces a dual-layered hierarchy of 3D Gaussians for photo-realistic 3D scene rendering, addressing redundancy and robustness issues in existing methods. It employs anchor points to distribute local 3D Gaussians, dynamically adapting attributes based on viewing angles and distances. Demonstrating superior performance in complex scenes and challenging viewing conditions, Scaffold-GS maintains rendering quality and speed with a more compact model, offering potential for diverse applications in large-scale scene modeling and interpretation.
Thank you for reading today’s edition.
Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.