Chinese AI Model Eclipses GPT-4o

Good morning. It’s Wednesday, July 10th.

Did you know: On this day in 1985, Coca-Cola Co. announced it will resume selling “old formula Coke,” following a public outcry and falling sales of its “New Coke.”

In today’s email:

  • China’s SenseNova-5o

  • Fastest LLM Yet?

  • Robotic Tele-Operation

  • 5 New AI Tools

  • Latest AI Research Papers

You read. We listen. Let us know what you think by replying to this email.

Today’s trending AI news stories

SenseTime unveils SenseNova 5o, China's first real-time multimodal AI model to rival GPT-4o

SenseTime has introduced SenseNova 5o, touted as China's first real-time multimodal AI model, directly competing with OpenAI's GPT-4o. Introduced at the World Artificial Intelligence Conference, SenseNova 5o processes audio, text, image, and video data for interactive user interactions, resembling natural conversation. Unlike OpenAI's broader multimodal capabilities, SenseTime's demo focused on real-time object recognition via smartphone cameras.

Simultaneously, SenseTime upgraded its SenseNova language model to version 5.5, boasting a 30% performance increase with enhancements in mathematical reasoning, English comprehension, and prompt following. The company targets over 3,000 government and corporate clients across tech, healthcare, finance, and programming sectors with its large model deployments.

SenseTime is also currently investing in edge-based language models like SenseChat Lite-5.5, optimizing inference times and supporting applications such as the Vimi AI avatar video generator, which crafts customizable video clips from single photos. Read more.

Groq unveils lightning-fast LLM engine; developer base rockets past 280K in 4 months

Groq has released a new LLM engine accessible through their website. This engine allows developers to perform queries and other tasks directly on the platform with impressive speed, exceeding capabilities of traditional GPUs. It can process up to 1256.54 tokens per second, up from 800 tokens in April.

The engine utilizes Meta's Llama3-8b-8192 model by default, but offers support for even larger models like Llama3-70b, Gemma (Google), and Mistral. Groq plans to integrate even more options in the future.

This development showcases the potential of LLMs for developers and non-developers alike, thanks to their speed and flexibility. Groq's CEO, Jonathan Ross, expects this user-friendly and fast engine to increase LLM adoption. Developers can easily switch their applications from OpenAI to Groq, promoting compatibility and a smooth user experience.

The efficiency of Groq's approach is another highlight. The engine consumes significantly less power compared to GPUs used for similar tasks. With over 282,000 developers joining the platform within just four months, Groq aims to become a major player in the global inference computing market by next year. Read more.

Open-TeleVision: Why human intelligence could be the key to next-gen robotic automation

Researchers at MIT and UCSD have introduced "Open-TeleVision," a groundbreaking teleoperation system for robots unveiled last week. This innovative system allows operators to experience immersive remote control, perceiving the robot’s surroundings in real-time while mirroring their hand and arm movements.

Unlike traditional autonomous robots, Open-TeleVision bridges human intelligence with robotic capabilities, emphasizing the strengths of human adaptability, intuition, and problem-solving skills in complex environments. Operated via a VR headset, the system streams operator movements to control the robot's actions, enhancing interaction precision and responsiveness.

With applications ranging from disaster response to telesurgery and industrial maintenance, Open-TeleVision leverages human expertise to navigate intricate tasks and hazardous environments. The system operates at 60 Hz, ensuring real-time feedback and remote operation capabilities over long distances, exemplified by demonstrations from MIT to UCSD. Challenges remain in latency and bandwidth for seamless long-distance control. Read more.

Etcetera: Stories you may have missed

5 new AI-powered tools from around the web

Octolens AI scans social platforms, identifies relevant posts using AI, and alerts B2B SaaS teams to engage in conversations, optimizing lead generation.

Upscayl is a free, open-source AI image upscaler for Linux, MacOS, and Windows, enhancing low-resolution images into high-quality visuals.

Snipcast generates AI-based summaries of podcasts from platforms like Spotify and YouTube, delivering concise content summaries to your email inbox.

Demu uses AI to deliver live, personalized product demos 24/7, eliminating the need for human sales reps and scheduling delays.

Auto Flowchart uses AI to automatically generate flowcharts with Mermaid.js syntax, serving as a Mermaid Online Live Editor for efficiency and customization.

arXiv is a free online library where researchers share pre-publication papers.

Thank you for reading today’s edition.

Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.

Interested in reaching smart readers like you? To become an AI Breakfast sponsor, apply here.