AI Breakfast
Posts
Chinese AI Model Eclipses GPT-4o

Chinese AI Model Eclipses GPT-4o

AI Breakfast
July 10, 2024

Good morning. It’s Wednesday, July 10th.

Did you know: On this day in 1985, Coca-Cola Co. announced it will resume selling “old formula Coke,” following a public outcry and falling sales of its “New Coke.”

In today’s email:

China’s SenseNova-5o
Fastest LLM Yet?
Robotic Tele-Operation
5 New AI Tools
Latest AI Research Papers

You read. We listen. Let us know what you think by replying to this email.

Today’s trending AI news stories

SenseTime unveils SenseNova 5o, China's first real-time multimodal AI model to rival GPT-4o

SenseTime has introduced SenseNova 5o, touted as China's first real-time multimodal AI model, directly competing with OpenAI's GPT-4o. Introduced at the World Artificial Intelligence Conference, SenseNova 5o processes audio, text, image, and video data for interactive user interactions, resembling natural conversation. Unlike OpenAI's broader multimodal capabilities, SenseTime's demo focused on real-time object recognition via smartphone cameras.

Simultaneously, SenseTime upgraded its SenseNova language model to version 5.5, boasting a 30% performance increase with enhancements in mathematical reasoning, English comprehension, and prompt following. The company targets over 3,000 government and corporate clients across tech, healthcare, finance, and programming sectors with its large model deployments.

SenseTime is also currently investing in edge-based language models like SenseChat Lite-5.5, optimizing inference times and supporting applications such as the Vimi AI avatar video generator, which crafts customizable video clips from single photos. Read more.

Groq unveils lightning-fast LLM engine; developer base rockets past 280K in 4 months

Groq has released a new LLM engine accessible through their website. This engine allows developers to perform queries and other tasks directly on the platform with impressive speed, exceeding capabilities of traditional GPUs. It can process up to 1256.54 tokens per second, up from 800 tokens in April.

The engine utilizes Meta's Llama3-8b-8192 model by default, but offers support for even larger models like Llama3-70b, Gemma (Google), and Mistral. Groq plans to integrate even more options in the future.

This development showcases the potential of LLMs for developers and non-developers alike, thanks to their speed and flexibility. Groq's CEO, Jonathan Ross, expects this user-friendly and fast engine to increase LLM adoption. Developers can easily switch their applications from OpenAI to Groq, promoting compatibility and a smooth user experience.

The efficiency of Groq's approach is another highlight. The engine consumes significantly less power compared to GPUs used for similar tasks. With over 282,000 developers joining the platform within just four months, Groq aims to become a major player in the global inference computing market by next year. Read more.

Open-TeleVision: Why human intelligence could be the key to next-gen robotic automation

Researchers at MIT and UCSD have introduced "Open-TeleVision," a groundbreaking teleoperation system for robots unveiled last week. This innovative system allows operators to experience immersive remote control, perceiving the robot’s surroundings in real-time while mirroring their hand and arm movements.

Unlike traditional autonomous robots, Open-TeleVision bridges human intelligence with robotic capabilities, emphasizing the strengths of human adaptability, intuition, and problem-solving skills in complex environments. Operated via a VR headset, the system streams operator movements to control the robot's actions, enhancing interaction precision and responsiveness.

With applications ranging from disaster response to telesurgery and industrial maintenance, Open-TeleVision leverages human expertise to navigate intricate tasks and hazardous environments. The system operates at 60 Hz, ensuring real-time feedback and remote operation capabilities over long distances, exemplified by demonstrations from MIT to UCSD. Challenges remain in latency and bandwidth for seamless long-distance control. Read more.

Etcetera: Stories you may have missed

Meta AI develops compact language model for mobile devices

Apple reportedly preparing new Apple Watch with chip, display upgrades

YouTube will use AI to snip copyrighted music and not silence your whole video

Musk's xAI, Oracle end talks on $10B server deal, the Information reports

OpenAI board will not have Microsoft and Apple as observers

Anthropic's Claude adds a prompt playground to quickly improve your AI apps

Court ruling suggests AI systems may be in the clear as long as they don't make exact copies

Galaxy AI may soon power real-time translation in WhatsApp, and maybe Google Meet too

AI chatbots can pass certified ethical hacking exams, study finds

Writer drops mind-blowing AI update: RAG on steroids, 10M word capacity, and AI 'thought process' revealed

Stability AI Releases Stable Assistant Features

Ex-Googler joins filmmaker to launch DreamFlare, a studio for AI-generated video

China leads the world in adoption of generative AI, survey shows

With funding from Jeff Bezos and others, Skild AI company raises $300 million to build robot brains

5 new AI-powered tools from around the web

Octolens AI scans social platforms, identifies relevant posts using AI, and alerts B2B SaaS teams to engage in conversations, optimizing lead generation.

Upscayl is a free, open-source AI image upscaler for Linux, MacOS, and Windows, enhancing low-resolution images into high-quality visuals.

Snipcast generates AI-based summaries of podcasts from platforms like Spotify and YouTube, delivering concise content summaries to your email inbox.

Demu uses AI to deliver live, personalized product demos 24/7, eliminating the need for human sales reps and scheduling delays.

Auto Flowchart uses AI to automatically generate flowcharts with Mermaid.js syntax, serving as a Mermaid Online Live Editor for efficiency and customization.

arXiv is a free online library where researchers share pre-publication papers.

📄 LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages

📄 AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents

📄 ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild

📄 Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images

📄 DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning

Thank you for reading today’s edition.

Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.

Interested in reaching smart readers like you? To become an AI Breakfast sponsor, apply here.