Amazon Develops Olympus AI to Reduce Dependence on Anthropic

Amazon is reportedly gearing up to showcase its Olympus AI model at AWS re:Invent. Designed as a multimodal large language model (LLM), Olympus can parse images, video, and text, enabling users to pinpoint, say, a pivotal basketball play through a simple prompt.

With its foray into generative AI, Amazon seems intent on lessening reliance on Anthropic’s Claude, following its substantial backing of the startup. This move signals Amazon’s recalibration in the AI arms race, where it’s often framed as playing catch-up to Google and Microsoft.

A key player in this effort is AWS's Annapurna Labs in Austin, a hub for developing custom chips such as the Trainium AI accelerator and the Graviton CPU. By closely integrating hardware and software teams, the lab speeds up development and prototyping. Its work ranges from creating energy-efficient chips to refining full-stack server systems.

Google’s Latest AI Experiment Turns Chess into a Creative Playground

Google Labs, Google's experimental arm, has launched GenChess, a web-based game that uses Gemini Imagen 3 for AI-driven image generation. Players customize their chess pieces by entering a text prompt and choosing between a traditional and an abstract design.

Once the set is generated, users can fine-tune individual pieces to their preference. After crafting their ideal set, players can compete against a bot across three difficulty levels. This project highlights the synergy between AI, design, and gaming.

Additionally, Google’s collaboration with FIDE introduces coding challenges for AI chess engines. An upcoming Chess Gem feature will let users play against a Gemini language model, though access will be limited to Gemini Advanced subscribers.

Chinese researchers unveil LLaVA-o1 to challenge OpenAI's o1 model

LLaVA-o1, developed by Chinese researchers, introduces a structured approach to vision-language models (VLMs) for improved multimodal reasoning, inspired by OpenAI's o1 model. It uses a four-stage reasoning process (Summary, Caption, Reasoning, and Conclusion) and handles each stage independently to keep the reasoning coherent.

LLaVA-o1: Let Vision Language Models Reason Step-by-Step

A Blog post by Mike Young on Hugging Face

huggingface.co/blog/mikelabs/llava-o1-let-vision-language-models-reason

This method ensures that the model maintains control over its logical flow, sidestepping the common errors of earlier VLMs. LLaVA-o1 also debuts a "stage-level beam search," refining inference-time scaling by generating multiple output candidates at each stage and selecting the best fit.
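The stage-by-stage selection described above can be illustrated with a minimal sketch. It is not the authors' implementation: `generate` and `score` are hypothetical stand-ins for a real VLM sampler and a real candidate judge, and the toy versions below exist only so the example runs on its own.

```python
from typing import Callable, List

# Stage names as listed in the article.
STAGES = ["Summary", "Caption", "Reasoning", "Conclusion"]


def stage_level_beam_search(
    question: str,
    generate: Callable[[str, str, int], List[str]],  # hypothetical sampler
    score: Callable[[str, str, str], float],          # hypothetical judge
    n_candidates: int = 3,
) -> List[str]:
    """Pick one output per stage, feeding each winner into the next stage."""
    context = question
    chosen: List[str] = []
    for stage in STAGES:
        # Sample several candidate outputs for this stage only.
        candidates = generate(context, stage, n_candidates)
        # Keep the candidate the judge rates highest.
        best = max(candidates, key=lambda c: score(context, stage, c))
        chosen.append(best)
        # Later stages are conditioned on the winners so far.
        context = context + "\n" + best
    return chosen


# Toy stand-ins (assumptions for illustration, not a real model).
def toy_generate(context: str, stage: str, n: int) -> List[str]:
    return [f"{stage} draft {i}" for i in range(n)]


def toy_score(context: str, stage: str, candidate: str) -> float:
    # Prefer the draft with the highest index.
    return float(candidate.rsplit(" ", 1)[1])


result = stage_level_beam_search("What is in the image?", toy_generate, toy_score)
print(result)
```

Selecting at the stage level, rather than once over whole answers, means a weak early stage is discarded before later stages build on it.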

Trained on a curated dataset of 100,000 image-question pairs annotated by GPT-4o, it reportedly outperforms other open-source models and some closed-source ones, showing a 6.9% increase in benchmark scores. The result suggests that structured reasoning could become a standard approach for VLMs.

ElevenLabs Launches GenFM to Convert Text into AI-Generated Audio

ElevenLabs has upgraded its ElevenReader app, now integrating GenFM to generate personalized podcasts from a variety of text sources, including PDFs, articles, and ebooks. This feature, available on iOS, employs AI co-hosts in 32 languages to produce dynamic, contextually relevant podcasts.

ElevenLabs — GenFM podcasts on ElevenReader | ElevenLabs

Tune in as AI co-hosts generate smart podcasts from any of your PDFs, articles, ebooks and more. Now available in the ElevenReader App.

elevenlabs.io/genfm

Utilizing ElevenLabs' advanced AI audio models, GenFM curates detailed summaries, insightful book reviews, and study material explanations, offering users the ability to consume information while multitasking—ideal for commutes or workouts.

The app’s enhanced capabilities transform static text into engaging audio, supporting diverse learning and productivity needs. Android support for GenFM is forthcoming, further extending the app's reach.

Tesla Optimus Gets a New Hand with 22 Degrees of Freedom

Tesla has upgraded its Optimus humanoid robot with a redesigned hand, now featuring 22 degrees of freedom and an additional three in the forearm. The hand is coated with a soft, protective layer that preserves its tactile sensing abilities while enabling it to handle delicate objects with precision. All actuators are now embedded within the forearm, streamlining its design.

Tesla aims to complete the integration of tactile sensors, implement tendon-based fine control, and reduce the forearm's weight by year-end. This enhanced hand design will be standard across all future Optimus robots.

5 new AI-powered tools from around the web

llms.txt Generator

Generate an llms.txt file for your website to help LLMs use your website at inference time.

sitespeak.ai/tools/llms-txt-generator

Boost.space

Connect more than 2,000 apps into pre-made cloud data modules, consolidate your data, create a single source of truth, and then synchronize your data back, without coding and in a few clicks.

boost.space

Canvas by MindPal - An infinite canvas to run AI agents & multi-agent systems

Break free from linear chats. An infinite canvas to run AI agents & multi-agent systems side by side or in sequence—all in ONE space.

mindpal.space/canvas

Vratix

Easy-to-use open-source modules that implement common API logic and can be used in your Node.js backend services.

vratix.com

TwinMind

TwinMind is your AI sidekick that knows all the context. Ask anything about your browser tabs, PDFs, YouTube videos. Get real-time suggestions during meetings on what to say next. Write in any text field without typing. Search the web for personalized answers.

twinmind.com