OpenAI Employees Can Now Liquidate Shares

Good morning. It’s Monday, February 19th.

Did you know? On this day in 2008, Toshiba announced it would discontinue HD DVD development and manufacturing, ending the format war with Blu-ray.

In today’s email:

  • AI Investments and Ventures

  • AI Applications and Innovations

  • AI in Business and Industry

  • AI Regulation and Governance

  • 5 New AI Tools

  • Latest AI Research Papers

You read. We listen. Let us know what you think by replying to this email.

AI brews beer and your big ideas

What’s your biggest business challenge? Don’t worry about wording it perfectly or describing it just right. Brain dump your description into AE Studio’s new tool and AI will help you solve that work puzzle.

Describe your challenge in three quick questions. Then AI churns out solutions customized to you.

AE Studio exists to solve business problems. They build great products and create custom software, AI and BCI solutions. And they once brewed beer by training AI to instruct a brewmeister and then to market the result. The beer sold out – true story.

Beyond beer, AE Studio’s data scientists, designers and developers have done even more impressive things working 1:1 with founders and executives. They’re a great match for leaders wanting to incorporate AI and just generally deliver outstanding products built with the latest tools and tech.

If you’re done guessing how to solve work problems or have a crazy idea in your back pocket to test out, ask AI Ideas by AE Studio for free solutions, right now.

Today’s trending AI news stories

AI Investments and Ventures

> OpenAI has finalized a funding deal, pushing its valuation to an impressive $80 billion, nearly tripling its value in less than 10 months. Spearheaded by Thrive Capital, this tender offer enables employees to liquidate their shares, deviating from the conventional fundraising route. This investment surge underscores sustained interest in generative AI enterprises, fueled by the triumph of ChatGPT. OpenAI's latest deal comes despite a turbulent period of leadership crisis last year.

> SoftBank CEO Masayoshi Son seeks to raise $100 billion to establish a semiconductor venture named Izanagi, directly challenging Nvidia's dominance in the AI market. SoftBank plans a $30 billion investment, with an additional $70 billion potentially sourced from Middle Eastern institutions. This project will leverage SoftBank's significant stake in Arm, whose chip designs complement Nvidia's in AI data centers. The move positions SoftBank to capitalize on the AI sector's growth and reflects a possible shift back to the firm's traditional high-risk, high-reward tech investment strategy following recent portfolio success.

SoftBank shares rose by as much as 3.2% after Bloomberg News reported the 66-year-old billionaire's efforts to obtain funding for an AI chip venture.

> LangChain secures $25 million in Series A funding led by Sequoia Capital and introduces LangSmith, a subscription-based LLMOps solution. LangSmith streamlines the lifecycle of Large Language Model (LLM) projects, from development to deployment and monitoring. Initially launched in limited beta, LangSmith has attracted over 70,000 signups and is used by 5,000 monthly users, including Rakuten, Elastic, and Moody’s. The platform aids in debugging, testing, and monitoring LLM applications, offering real-time insights and facilitating collaboration.

AI Applications and Innovations

> OpenAI's Sora goes far beyond typical text-to-video generation. The model creates interactive 3D worlds similar to video game environments, complete with dynamic camera movements and basic physics simulations. Harnessing synthetic data likely sourced from game engines, Sora skillfully navigates copyright concerns while achieving impressive 3D rendering. Though currently limited in handling complex physics and long-term consistency, Sora's scalability foreshadows its potential as a groundbreaking general-purpose world simulator within the AI field.

> University of Pennsylvania engineers have developed a revolutionary chip that uses light waves, not electricity, for complex mathematical computations crucial in AI training. This silicon-photonic (SiPh) chip, a product of collaborative research, merges nanoscale material manipulation with silicon technology, potentially transforming computer processing speed and energy efficiency. Published in Nature Photonics, the chip's design enables vector-matrix multiplication, a fundamental operation in neural networks. Its unique architecture, manipulating light propagation through silicon variations, promises unparalleled processing speeds. With potential applications in graphics processing units (GPUs) and improved privacy features, this chip represents a significant leap forward in AI computing.
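
For readers unfamiliar with the operation, vector-matrix multiplication can be sketched in a few lines of plain Python. This only illustrates the arithmetic that the chip is reported to carry out with light; the function name and the toy weights are assumptions made for the example, not anything taken from the paper.

```python
def vector_matrix_multiply(vector, matrix):
    """Multiply a length-n row vector by an n x m matrix, returning a length-m list."""
    if len(vector) != len(matrix):
        raise ValueError("vector length must equal the number of matrix rows")
    n_cols = len(matrix[0])
    return [
        sum(vector[i] * matrix[i][j] for i in range(len(vector)))
        for j in range(n_cols)
    ]


# Hypothetical toy layer: 3 inputs feeding 2 output units.
inputs = [1.0, 2.0, 3.0]
weights = [
    [0.5, -1.0],
    [0.0, 2.0],
    [1.0, 1.0],
]
outputs = vector_matrix_multiply(inputs, weights)
print(outputs)  # [3.5, 6.0]
```

A neural network layer repeats this multiply-and-sum pattern across very large matrices, which is why hardware that speeds it up matters for AI training.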

> Google has open-sourced Magika, an AI-based tool enhancing file type identification accuracy for improved security in digital environments. Utilizing a custom deep-learning model, Magika showcases superior performance in recognizing various file types, including VBA, JavaScript, and PowerShell, boosting accuracy by 30%. Employing the Open Neural Network Exchange (ONNX), the software ensures swift file identification, enhancing user safety in platforms like Gmail and Drive.

AI in Business and Industry

> Reddit has reportedly finalized a content licensing deal with an unnamed AI company. This $60 million annualized agreement gives the AI firm access to Reddit's extensive user-generated content for model training purposes. The deal was signed in advance of Reddit's anticipated IPO launch in March, underscoring the platform's strategy to capitalize on the rapidly expanding AI industry. While the terms of the agreement may change as IPO discussions progress, it establishes a framework for potential monetization of social media data within the AI space.

> Sierra, a fledgling AI firm led by Bret Taylor and Clay Bavor, seeks to transform customer interactions through empathetic chatbots. Using cutting-edge AI, Sierra's bots mimic human empathy, improving experiences for clients like WeightWatchers and Sonos. Employing a multi-model strategy, Sierra ensures accuracy and guards against misinformation. Their ambition goes beyond automation, aiming to elevate AI agents to the same level of importance as company websites. 

AI Regulation and Governance

> The US Patent and Trademark Office rejected OpenAI's trademark application for "GPT," deeming it merely descriptive of the product's characteristics. OpenAI aimed to protect its GPT models, but the USPTO argued consumers associate "GPT" with specific technologies. This marks the second denial by the USPTO. OpenAI can pursue further review or appeal. The decision underscores the importance of allowing descriptive language for marketing. Despite OpenAI's efforts to safeguard its GPT products, the denial indicates broader recognition of "GPT" within the industry, potentially impacting its exclusive use.

> California Senator Scott Wiener has introduced a bill calling for the creation of a new unit within the California Department of Technology. This unit, the Frontier Model Division, would focus on enforcing AI regulations. Key elements include mandatory testing of large AI models before they're used, built-in emergency shut-off systems, and protections against hacking. Wiener emphasizes the need for proactive measures to address the safety and security risks posed by AI. While the bill aims to shape AI governance, it lacks details on how it would work alongside existing policy initiatives or collaborate with technology officials.

5 new AI-powered tools from around the web

Aya is a powerful open-source AI model supporting 101 languages, enabling advanced language understanding, translation and summarization for researchers and developers.

Syncly centralizes feedback and leverages AI to surface pain points, prioritize issues, and streamline insights for proactive customer support.

Data Analyst AI connects Google Analytics with ChatGPT, delivering AI-powered eCommerce insights and automated weekly reports.

Apple MGIE is an open-source AI image editing tool developed by Apple and UCSB that allows users to make precise image edits using descriptive text commands.

MagicAds.ai automates video ad creation with AI, turning product URLs into engaging ads that outperform human-made content.

Latest AI Research Papers

arXiv is a free online library where researchers share pre-publication papers.

PaLM2-VAdapter revolutionizes vision-language alignment for Large Vision-Language Models (LVLMs). Instead of intricate adapter architectures or pretraining, it employs a progressively trained, lightweight language model as the adapter. This innovative approach yields remarkable improvements in efficiency, convergence speed, and overall performance. PaLM2-VAdapter establishes state-of-the-art results across various image/video captioning and Visual Question Answering (VQA) tasks while using significantly fewer parameters (30-80% reduction) than comparable models. The key lies in its two-stage progressive alignment: initially, a tiny PaLM-2 language model is finetuned as the decoder, followed by the addition of a perceiver resampler to bridge the vision encoder with a larger PaLM-2 decoder. This strategy makes PaLM2-VAdapter a simpler, more powerful, and computationally efficient solution for multi-modal integration.

Linear Transformers offer computational advantages over traditional Transformers but often struggle with long text sequences. The ReBased model addresses this by introducing a refined linear attention mechanism with two key enhancements: a quadratic kernel function with learnable parameters (scale and shift) and Layer Normalization applied directly within the kernel computation. This design allows ReBased to flexibly attend to relevant information, including assigning zero attention to irrelevant tokens. ReBased outperforms the earlier Based model and other linear architectures on the Multi-Query Associative Recall (MQAR) task, showcasing improved In-Context Learning capabilities for long sequences. The results also demonstrate ReBased's superior overall language modeling performance on the Pile dataset.
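
The idea of a quadratic kernel with a learnable scale and shift applied after Layer Normalization can be sketched in plain Python. This is a loose illustration based only on the summary above, not the authors' implementation: it assumes the similarity between a query and a key is the squared dot product of their normalized, scaled, and shifted vectors, and every function name here is hypothetical.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and (approximately) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def affine_norm(x, gamma, beta):
    """Layer-normalize x, then apply the learnable scale (gamma) and shift (beta)."""
    return [g * v + b for g, v, b in zip(gamma, layer_norm(x), beta)]


def quadratic_features(x):
    """Flattened outer product, so features(q) . features(k) == (q . k) ** 2."""
    return [a * b for a in x for b in x]


def causal_linear_attention(queries, keys, values, gamma, beta, eps=1e-9):
    """Causal attention with the assumed quadratic kernel, computed in one pass."""
    n_feat = len(gamma) ** 2
    d_val = len(values[0])
    state = [[0.0] * d_val for _ in range(n_feat)]  # running sum of feature(k) * v
    norm = [0.0] * n_feat                           # running sum of feature(k)
    outputs = []
    for q, k, v in zip(queries, keys, values):
        # Fold the current key/value pair into the running state.
        fk = quadratic_features(affine_norm(k, gamma, beta))
        for i, f in enumerate(fk):
            norm[i] += f
            for j in range(d_val):
                state[i][j] += f * v[j]
        # Read out with the current query; cost does not grow with position.
        fq = quadratic_features(affine_norm(q, gamma, beta))
        denom = sum(a * b for a, b in zip(fq, norm)) + eps
        outputs.append([
            sum(fq[i] * state[i][j] for i in range(n_feat)) / denom
            for j in range(d_val)
        ])
    return outputs
```

Because the similarity is a square, it is exactly zero whenever the transformed query and key are orthogonal, which is one way to read the claim that the model can assign zero attention to irrelevant tokens. The running `state` and `norm` are what make the attention linear in sequence length: each new token updates them once instead of revisiting all earlier tokens.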

SPAR tackles the challenge of extracting relevant information from extremely long user histories in content-based recommendation systems. It introduces session-based encoding with sparse attention for enhanced efficiency when handling extensive sequences. For comprehensive user interest modeling, SPAR uses poly-attention to merge local interests from sessions into global user embeddings. An LLM is employed to summarize user interests, adding rich context to the representation. To improve recommendation accuracy, post-fusion interaction is implemented with poly-attention layers, maintaining independence of user and item embeddings for deployment within large-scale systems. Experiments demonstrate SPAR's superiority over existing state-of-the-art techniques on benchmark datasets.

This paper explores the limitations of current transformer-based approaches (such as GPT-4 and retrieval-augmented generation, or RAG) in processing extremely long documents. The authors introduce BABILong, a benchmark that hides essential reasoning facts within irrelevant text, making tasks progressively harder as document length increases. Results show that popular language models struggle to identify relevant information beyond certain input lengths. However, augmenting a smaller transformer (GPT-2) with recurrent memory and fine-tuning significantly improves performance. The study indicates that recurrent memory architectures hold great potential for handling very long sequences and highlights the need for models that can better extract and reason about information within extended contexts.
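
The general recipe of hiding a few reasoning facts inside irrelevant text can be sketched as follows. This is a toy illustration of the idea only, not the benchmark's code or data: the function name, the example facts, and the filler sentences are all made up for the example.

```python
import random


def build_long_context_sample(facts, filler_sentences, total_sentences, seed=0):
    """Scatter `facts` (kept in their original order) among random filler sentences."""
    if total_sentences < len(facts):
        raise ValueError("total_sentences must be at least the number of facts")
    rng = random.Random(seed)
    fact_positions = set(rng.sample(range(total_sentences), len(facts)))
    remaining = iter(facts)
    sentences = [
        next(remaining) if i in fact_positions else rng.choice(filler_sentences)
        for i in range(total_sentences)
    ]
    return " ".join(sentences)


# Hypothetical facts needed to answer "Where is the apple?" (answer: kitchen).
facts = ["Mary went to the kitchen.", "Mary dropped the apple there."]
# Hypothetical distractor text with no bearing on the question.
filler = [
    "The harbor was quiet that morning.",
    "Rain fell steadily on the old stone bridge.",
    "A clock ticked somewhere in the hallway.",
]

short_context = build_long_context_sample(facts, filler, total_sentences=10)
long_context = build_long_context_sample(facts, filler, total_sentences=1000)
print(len(short_context), len(long_context))
```

Raising `total_sentences` keeps the question and answer fixed while burying the facts deeper, which is how a benchmark of this kind can make the same task harder as the document grows.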

LAVE, developed by researchers at Meta Reality Labs, introduces a new paradigm for video editing where a conversational AI agent, powered by large language models (LLMs), assists users throughout the process. Videos are automatically given titles and summaries so the LLM can understand them. Users can issue natural language commands to the agent for tasks like idea generation, clip retrieval, and sequencing. LAVE offers both agent-driven and manual editing modes for flexibility and to preserve user control. A user study validated its effectiveness and revealed insights into how users perceive this AI-assisted approach, paving the way for future developments by Meta and others in the multimedia editing space.

ChatGPT Creates Comics

Thank you for reading today’s edition.

Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.

Interested in reaching smart readers like you? To become an AI Breakfast sponsor, apply here.