Grok 4 Smashes Benchmarks

Good morning. It’s Friday, July 11th.

On this day in tech history: In 2010, IBM was tuning Watson for its big Jeopardy! showdown, which aired in February 2011. Watson used natural language processing, knowledge tools, and machine learning to tackle tricky questions and pull answers from huge datasets. This work improved AI’s ability to understand context and reason, paving the way for today’s conversational and knowledge-based systems.

In today’s email:

  • Grok 4 Details

  • Perplexity Browser

  • Google Preps DeepThink

  • 5 New AI Tools

  • Latest AI Research Papers

You read. We listen. Let us know what you think by replying to this email.

In Partnership With Columbia Business Executive Education

Be the Person on Your Team Who Knows How to Actually Use AI

The AI for Business & Finance Certificate Program from Columbia Business School Exec Ed & Wall Street Prep delivers hands-on experience directly from AI leaders and Columbia’s world-class faculty and guest speakers.

Gain the skills that will turn you into an “AI first” leader in your role.  

  • Skip the theory and go straight to practice by tackling real use cases currently being implemented at the world’s most AI-forward financial institutions and corporations

  • Learn from Columbia Business School faculty and directly from those building AI tools at BlackRock, Morgan Stanley, Ramp, Perplexity, and OpenAI

  • Join a rigorous program with no coding or tech background needed—we'll guide you step by step from fundamentals to real-world application

Earn a certificate from Columbia Business School Exec Ed in just 8 weeks—and gain lifetime access to course materials and a global professional network.

Save $300 using code AIBREAKFAST when you enroll today, plus an extra $200 for early enrollment.

Thank you for supporting our sponsors!

Today’s trending AI news stories

Grok 4 Sets AI Records, More Advanced Features Drop This Weekend 

xAI has launched Grok 4, its most advanced AI model to date, with over 300 billion parameters and built on the in-house Colossus supercomputer. Alongside it comes Grok 4 Heavy, a parallel multi-agent variant, and a $300/month SuperGrok Heavy subscription for early access and premium features.

Grok 4 uses a smart “mixture-of-experts” design that routes data efficiently, making it faster and cheaper to run. New versions include Grok 4 Code for live coding help and debugging, and Grok 4 Voice for natural-sounding speech. Upcoming upgrades will let it handle video too. DeepSearch keeps its answers current by pulling live data from the web, and special tuning helps it understand memes and internet slang.
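For readers curious how mixture-of-experts routing saves compute, here is a minimal Python sketch. It is a toy illustration of generic top-k gating, not xAI's implementation: the function names, the gate weights, and the choice of k=2 are all assumptions made for the example. The idea is that a small gate scores every expert, but only the best-scoring few actually run.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so the chosen experts' weights sum to 1.
    """
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, gate_weights, experts, k=2):
    """Run only the k selected experts and blend their outputs.

    gate_weights: one vector per expert (hypothetical values in this sketch).
    experts: list of callables mapping a vector to a vector of the same length.
    """
    # Gate score per expert: dot product of the input with that expert's gate vector.
    gate_scores = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    chosen = route_top_k(gate_scores, k)
    out = [0.0] * len(x)
    for idx, weight in chosen:
        y = experts[idx](x)  # experts that were not chosen are never called
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out, [idx for idx, _ in chosen]
```

With, say, eight experts and k=2, only a quarter of the expert layers do any work for a given input, which is where the speed and cost savings come from.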

Performance-wise, Grok 4 has shaken up the AI leaderboard. On the challenging ARC-AGI-2 benchmark, it reached 15.9% accuracy, almost double that of its closest commercial competitor. It also scored 66.7% on ARC-AGI-1, matching recent research bests. On the Artificial Analysis Intelligence Index, Grok 4 edged ahead of OpenAI’s o3 and Google’s Gemini 2.5 Pro with a score of 73 (vs. their 70). Grok 4 performed strongly on knowledge, math, and science tests too: 87% on MMLU-Pro, 94% on AIME 2024, and a record-breaking 88% on GPQA Diamond. It also handled the notoriously tough Humanity’s Last Exam.

The rollout, however, is shadowed by controversy over offensive output from Grok. Musk blamed “user prompts” and acknowledged the model was “too compliant,” and xAI is now adding filters applied before posting to limit hate speech. Musk also confirmed Grok will roll out to Tesla vehicles as early as next week. Read more.

Perplexity launches Comet, a web browser powered by AI

Perplexity has launched Comet, a Chromium-based browser built from the ground up around AI rather than bolted on as an afterthought. Its standout feature, Comet Assistant, can read and reason over any webpage in real time, summarizing YouTube videos, parsing documents, or comparing products without endless tab hopping. A hybrid AI design keeps basic tasks local for speed and privacy, while complex queries run in the cloud.

Comet ships first to $200/month Perplexity Max subscribers on Windows and macOS, with seamless Chrome extension and bookmark import to lower switching friction. The timing is calculated: OpenAI is reportedly prepping its own AI browser, while Perplexity sets its sights on challenging Chrome’s huge market lead. The company already handles over 780 million monthly queries and is growing at double-digit rates. Read more.

Google preps Deep Think and Agent Mode for next Gemini upgrade

Google is close to rolling out Gemini 2.5 Pro Deep Think, internally codenamed kingfall, possibly next week. Backend toggles show it’s already active behind the curtain: output quality beats earlier models, though response times stretch to about five minutes for 10 prompts.

Alongside Deep Think, fresh code reveals Agent Mode, flagged as “Autonomous Exploration, Planning and Execution.” This points to a shift toward multi-step, unsupervised task handling, likely via Google’s A2A agent stack. Gemini’s toolbox is also adding Bespoke, hinting at deeply personalized outputs, and Learning Mode, which seems built for students and study workflows.

In creative AI, Google launched an image-to-video generator for Veo 3 in Gemini, enabling Pro and Ultra users to turn single photos into short clips with text prompts, pushing Gemini toward a unified, multimodal platform.

On the dev side, Google added three Gemini-powered AI modes to Firebase Studio: Ask, Agent (needs approval), and Agent Auto-run, which autonomously writes or updates code. Paired with Model Context Protocol and a Gemini CLI, it edges Firebase toward a true AI-first IDE. Internally, Google says AI now writes about 50% of its code.

Google Cloud also launched Vertex AI Memory Bank to fix forgetful bots. Rather than stuffing entire chat logs into the model, it smartly extracts and recalls key facts on demand, cutting latency and cost. Read more.
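To make the "extract and recall" idea concrete, here is a toy Python sketch. It does not use the Vertex AI Memory Bank API, and it is not how Google implements the feature: the MemoryStore class, the first-person extraction rule, and the word-overlap scoring are all hypothetical stand-ins chosen to keep the example short. The point is only that the prompt receives a few relevant facts rather than the whole chat log.

```python
import re

class MemoryStore:
    """Toy memory: keeps short facts instead of full chat transcripts."""

    def __init__(self):
        self.facts = []

    def extract(self, message):
        """Keep only sentences where the user states something about themselves.

        Assumption for this sketch: a sentence starting with "I" or "My" is a fact.
        """
        for sentence in re.split(r"(?<=[.!?])\s+", message.strip()):
            if re.match(r"(?i)^(i|my)\b", sentence):
                self.facts.append(sentence)

    def recall(self, query, limit=2):
        """Return up to `limit` stored facts that share words with the query."""
        query_words = set(re.findall(r"\w+", query.lower()))
        scored = []
        for fact in self.facts:
            overlap = len(query_words & set(re.findall(r"\w+", fact.lower())))
            if overlap:
                scored.append((overlap, fact))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [fact for _, fact in scored[:limit]]
```

A real system would likely use a model to decide what counts as a fact and embeddings to find relevant ones, but the shape is the same: store less, fetch only what the current question needs.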

5 new AI-powered tools from around the web

Latest AI research papers

arXiv is a free online library where researchers share pre-publication papers.

Thank you for reading today’s edition.

Your feedback is valuable. Respond to this email and tell us how you think we could add more value to this newsletter.

Interested in reaching smart readers like you? To become an AI Breakfast sponsor, reply to this email or DM us on 𝕏!