AI Breakfast
Posts
Google Releases Gemini Pro, +1M Tokens

Google Releases Gemini Pro, +1M Tokens

AI Breakfast
April 10, 2024

In partnership with

Good morning. It’s Wednesday, April 10th.

Did you know: On this day in 1959, NASA publicly announced the names of the first group of American astronauts. They were known as the Mercury Seven.

In today’s email:

Google Gemini Pro, 1M+ Tokens
Google’s Cloud Next 24 Releases
10 New AI Tools
Latest AI Research Papers
AI Creates Comics

You read. We listen. Let us know what you think by replying to this email.

_{In partnership with}

Have an AI Idea and need help building it?

When you know AI should be part of your business but aren’t sure how to implement your concept, talk to AE Studio.

Elite software creators collaborate with you to turn any AI/ML idea into a reality–from NLP and custom chatbots to automated reports and beyond.

AE Studio has worked with early stage startups and Fortune 500 companies, and we’re ready to partner with your team. Computer vision, blockchain, e2e product development, you name it, we want to hear about it.

Book time to tell us all about your idea

Today’s trending AI news stories

Gemini Pro, Google’s 1M+ Token Context LLM

Google has launched Gemini 1.5 Pro, a powerful AI tool with advanced capabilities, including audio understanding, almost unlimited file handling, and an expanded 1 million context window.

Notable among these updates are native audio understanding and a new File API, which simplifies file management. The update also introduces system instructions and JSON mode, allowing for more precise control over model outputs, and a new text embedding model that delivers superior performance compared to existing models.

Gemini 1.5 Pro now supports audio and video inputs, enabling applications like converting lecture recordings into quizzes with answer keys (as seen in the example below)

Upload a recording of a lecture, like this 117,000+ token lecture from Jeff Dean, and Gemini 1.5 Pro can turn it into a quiz with an answer key.

The update also addresses top developer requests by including system instructions for guiding model responses, JSON mode for structured data extraction, and enhanced function calling modes for improved output reliability.

Developers can access the new text embedding model, text-embedding-004, which outperforms comparable models on the MTEB benchmarks, offering stronger retrieval performance. These enhancements are part of Google's ongoing efforts to make Google AI Studio and the Gemini API the best tools for building with Gemini. For more information, developers are encouraged to visit Google AI Studio, explore the Gemini API Cookbook, and join the community discussion on Discord.

Check it out here

Google Unleashes Powerful AI Tools and Services at Cloud Next '24

At Google Cloud Next '24, the tech giant revealed a host of powerful AI-driven updates. This includes integrating Gemini into Databases, Looker, and BigQuery to enhance data management and analysis. Additionally, Vertex AI Agent Builder simplifies AI agent creation for developers. Google is also building a comprehensive platform for generative AI development, expanding its Vertex AI Model Garden and supporting AI startups.

The updates extend to Google Workspace as well. Google Vids, a generative AI-powered video service, streamlines content creation. AI-driven features for messaging, meetings, and security further bolster productivity and collaboration. Google also announced AI-powered cybersecurity advancements, including Gemini-assisted threat detection and response.

Showcasing real-world applications, Google highlighted how customers like Bayer, Best Buy, and Discover Financial are innovating with Google's generative AI services. The company also expanded its Gemma family of models, introducing CodeGemma for coding assistance and RecurrentGemma for research workflows.

Alongside software advancements, Google showcased hardware innovations, including the TPU v5p chip and the Arm-based Axion data center processor. These AI-focused chips promise substantial performance gains, underscoring Google's commitment to providing accessible AI hardware and meeting diverse workload requirements.

🖇️ Etcetera

> Did Claude enslave 3 Gemini agents? Will we see “rogue hiveminds” of agents jailbreaking other agents

> MIT Engineers Aim to Advance Household Robots With AI Integration

> Elon Musk predicts AI will overtake human intelligence next year

> Next-generation Grok 3 model will require 100,000 Nvidia H100 GPUs to train

> Deepmind co-founder Demis Hassabis reportedly 'deeply frustrated' by Google AI Deepmind merger

> Google's Mixture-of-Depths uses computing power more efficiently by prioritizing key tokens

> Stability AI brings 12B parameters to Stable LM 2 model update

> Meta confirms that its Llama 3 open source LLM is coming in the next month

> We now have a better look at what’s inside the Humane AI pin

10 new AI-powered tools from around the web

Map Story is an AI-powered tool for interactive map stories like travel blogging. Create maps easily with step-by-step guide or AI text input.

UI Bakery AI App Generator swiftly constructs CRUD apps, admin panels atop SQL databases via AI prompts. Publish securely.

WiseMap.ai merges mind mapping with AI, empowering users to automate idea generation, project planning, and concept visualization.

AI Photo Filter by Stylar offers unparalleled image-to-image style transfer, delivering effortless transformation of photos into artistic masterpieces.

Gobble Bot is a free all-to-one scraper for GPTs. Easily convert websites, PDFs, or YouTube videos into plain text files for training ChatGPT chatbots.

Node AI facilitates decentralized AI via GPU nodes, offering global resource access, pay-as-you-go pricing, and task distribution. Users earn rewards lending GPUs.

Cliplama automates video creation for platforms like TikTok and YouTube without on-camera talent. Users can describe ideas; Cliplama generates complete videos with images, music, and captions.

Odaptos offers AI-powered customer research through a SaaS platform, conducting user tests via videoconferencing to understand emotions, behaviors, and provide actionable insights.

Aqua Voice is a voice-native text editor allowing users to create and edit documents using natural language voice commands, ideal for hands-free operation.

TypeflowAI is an AI-powered form creation tool utilizing GPT to generate intelligent prompts, aiding businesses in crafting dynamic content and automating tasks for enhanced engagement.