Welcome to Google Cloud Next ‘24

Welcome to Google Cloud Next. We last came together just eight months ago at Next 2023, but since then, we have made well over a year’s progress innovating and transforming with our customers and partners. We have introduced over a thousand product advances across Google Cloud and Workspace. We have expanded our planet-scale infrastructure to 40 regions and announced new subsea cable investments to connect the world to our Cloud with predictable low latency. We have introduced new, state-of-the-art models — including our Gemini models — and brought them to developers and enterprises. And the industry is taking notice — we have been recognized as a Leader in 20 of the top industry analyst evaluations.

Last year, the world was just beginning to imagine how generative AI technology could transform businesses — and today, that transformation is well underway. More than 60% of funded gen AI startups and nearly 90% of gen AI unicorns are Google Cloud customers, including companies like Anthropic, AI21 Labs, Contextual AI, Essential AI, and Mistral AI, which are using our infrastructure. Leading enterprises like Deutsche Bank, Estée Lauder, Mayo Clinic, McDonald’s, and WPP are building new gen AI applications on Google Cloud. And today, we are announcing new or expanded partnerships with Bayer, Cintas, Discover Financial, IHG Hotels & Resorts, Mercedes-Benz, Palo Alto Networks, Verizon, WPP, and many more. In fact, this week at Next, more than 300 customers and partners will share their gen AI successes working with Google Cloud.

Central to the opportunities of gen AI are the connected AI agents that bring them to life. Agents help users achieve specific goals — like helping a shopper find the perfect dress for a wedding or helping nursing staff expedite patient hand-offs when shifts change. They can understand multi-modal information — processing video, audio, and text together, connecting and reasoning across different inputs. They can learn over time and facilitate transactions and business processes. Today, our customers, including Best Buy, Etsy, The Home Depot, ING Bank, and many more, are seeing the benefits of the powerful, accurate, and innovative agents that make gen AI so revolutionary. Customers follow this path to agents by building on our AI-optimized infrastructure, models, and platform, or by using our own agents in Gemini for Google Cloud and Gemini for Google Workspace.

Today, at Next ‘24, we are making significant announcements to drive customer success and momentum, including: custom silicon advancements, like the general availability of TPU v5p and Google Axion, our first custom Arm-based CPU designed for the data center; Gemini 1.5 Pro, which includes a breakthrough in long-context understanding, going into public preview; new grounding capabilities in Vertex AI; Gemini Code Assist for developers; expanded cybersecurity capabilities with Gemini in Threat Intelligence; new enhancements for Gemini in Google Workspace; and much more. These innovations span every aspect of Google Cloud, including:

  • Our AI Hypercomputer, a supercomputing architecture that employs an integrated system of performance-optimized hardware, open software, leading ML frameworks, and flexible consumption models;

  • Our foundation models, including Gemini models, which process multi-modal information and have advanced reasoning skills;

  • Our Vertex AI platform, which helps organizations and partners access, tune, augment, and deploy custom models and connect them with enterprise data, systems, and processes to roll out generative AI agents;

  • Gemini for Google Cloud, which provides AI assistance to help users work and code more efficiently, manage their applications, gain deeper data insights, identify and resolve security threats, and more;

  • Gemini for Workspace, which is the agent built right into Gmail, Docs, Sheets, and more, with enterprise-grade security and privacy; and

  • A number of announcements across analytics, databases, cybersecurity, compute, networking, Google Workspace, our growing AI ecosystem, and more.

Scale with AI-optimized infrastructure

The potential for gen AI to drive rapid transformation for every business, government, and user is only as powerful as the infrastructure that underpins it. Google Cloud offers our AI Hypercomputer, an architecture that combines our powerful TPUs, GPUs, AI software, and more to provide an efficient and cost-effective way to train and serve models. Leading AI companies globally, like Bending Spoons and Kakao Brain, are building their models on our platform.

Today, we are strengthening our leadership with key advancements to support customers across every layer of the stack:

  • A3 Mega: Developed with NVIDIA using H100 Tensor Core GPUs, this new GPU-based instance will be generally available next month and delivers twice the bandwidth per GPU compared to A3 instances, to support the most demanding workloads. We are also announcing Confidential A3, which enables customers to better protect the confidentiality and integrity of sensitive data and AI workloads during training and inferencing. 

  • NVIDIA HGX B200 and NVIDIA GB200 NVL72: The latest NVIDIA Blackwell platform will be coming to Google Cloud in early 2025 in two variations: the HGX B200 and the GB200 NVL72. The HGX B200 is designed for the most demanding AI, data analytics, and HPC workloads, while the GB200 NVL72 powers real-time large language model inference and massive-scale training performance for trillion-parameter-scale models.

  • TPU v5p: We’re announcing the general availability of TPU v5p, our most powerful, scalable, and flexible AI accelerator for training and inference, with 4X the compute power per pod compared to our previous generation. We’re also announcing availability of Google Kubernetes Engine (GKE) support for TPU v5p; a brief sketch of requesting TPU v5p capacity from GKE appears after this list. Over the past year, the use of GPUs and TPUs on GKE has grown more than 900%.

  • AI-optimized storage options: We’re accelerating training speed with new caching features in Cloud Storage FUSE and Parallelstore, which keep data closer to a customer’s TPU or GPU. We’re also introducing Hyperdisk ML (in preview), our next generation block storage service that accelerates model load times up to 3.7X compared to common alternatives.

  • New options for Dynamic Workload Scheduler: Calendar mode, for start-time assurance, and flex start, for optimized economics, help customers manage resources efficiently when distributing complex training and inference jobs.
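
For teams using GKE, a TPU v5p request is expressed through node selectors and a TPU resource limit on the workload. The Python sketch below shows the general shape using the Kubernetes client; the node-selector keys and google.com/tpu resource follow GKE's documented TPU conventions, while the project, image, job name, and topology value are illustrative assumptions rather than a definitive recipe.

    from kubernetes import client, config

    # Minimal sketch: submit a GKE Job that lands on a TPU v5p slice.
    # Names, image, and topology are illustrative assumptions.
    config.load_kube_config()

    job = {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": "tpu-v5p-train"},
        "spec": {
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "nodeSelector": {
                        "cloud.google.com/gke-tpu-accelerator": "tpu-v5p-slice",
                        "cloud.google.com/gke-tpu-topology": "2x2x1",
                    },
                    "containers": [{
                        "name": "trainer",
                        "image": "us-docker.pkg.dev/my-project/training/trainer:latest",
                        "command": ["python", "train.py"],
                        "resources": {"limits": {"google.com/tpu": "4"}},
                    }],
                }
            }
        },
    }

    client.BatchV1Api().create_namespaced_job(namespace="default", body=job)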

We are also bringing AI closer to where the data is being generated and consumed — to the edge, to air-gapped environments, to Google Sovereign Clouds, and cross-cloud. We are enabling AI anywhere through Google Distributed Cloud (GDC), allowing you to choose the environment, configuration, and controls that best suit your organization’s specific needs. For example, leading mobile provider Orange, which operates in 26 countries where data must be kept locally, leverages AI on GDC to improve network performance and enhance customer experiences.

Today we are announcing a number of new capabilities in GDC, including:

  • NVIDIA GPUs to GDC: We are bringing NVIDIA GPUs to GDC for both connected and air-gapped configurations. Each of these will support new GPU-based instances to run AI models efficiently.

  • GKE on GDC: The same GKE technology that leading AI companies are using on Google Cloud will be available in GDC. 

  • AI model support: We are validating a variety of open models, including Gemma, Llama 2, and more, to run on GDC in both air-gapped and connected edge environments.

  • AlloyDB Omni for Vector Search: We are also bringing AlloyDB Omni with vector search to GDC, enabling extremely low-latency search and information retrieval over your private and sensitive data.

  • Sovereign Cloud: For the most stringent regulatory requirements, we deliver GDC in a fully air-gapped configuration with local operations, full survivability, managed by Google or through a partner of your choice. You have complete control, and when regulations change, we have the flexibility to help you respond quickly. 

While not every workload is an AI workload, every workload you run in the cloud needs optimization — from web servers to containerized microservices. Each application has unique technical needs, which is why we’re pleased to introduce new, general-purpose compute options that help customers maximize performance, enable interoperability between applications, and meet sustainability goals, all while lowering costs: 

  • Google Axion: Our first custom Arm-based CPU designed for the data center delivers up to 50% better performance and up to 60% better energy efficiency than comparable current-generation x86-based instances.

We are also announcing N4 and C4, two new machine series in our general purpose VM portfolio; native bare-metal machine shapes in the C3 machine family; the general availability of Hyperdisk Advanced Storage Pools; and much more.

Create agents with Vertex AI

Vertex AI, our enterprise AI platform, sits on top of our world-class infrastructure. It is the only unified platform that lets customers discover, customize, augment, deploy, and manage gen AI models. We offer more than 130 models, including the latest versions of Gemini, partner models like Claude 3, and popular open models including Gemma, Llama 2, and Mistral. Today, we’re excited to deliver expanded access to a variety of models, giving customers the most choice when it comes to model selection:

  • Gemini 1.5 Pro: Gemini 1.5 Pro is now available in public preview and offers two context window sizes: 128K tokens and 1 million tokens. In addition, we are announcing the ability to process audio, including videos with audio tracks. Customers can process vast amounts of information in a single stream, including 1 hour of video, 11 hours of audio, codebases with over 30,000 lines of code, or over 700,000 words. A minimal sketch of calling Gemini 1.5 Pro on Vertex AI appears after this list.

  • Claude 3: Claude 3 Sonnet and Claude 3 Haiku, Anthropic’s state-of-the-art models, are generally available on Vertex AI thanks to our close partnership, with Claude 3 Opus to be available in the coming weeks.

  • CodeGemma: Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. A new fine-tuned version of Gemma designed for code generation and code assistance, CodeGemma, is now available on Vertex AI.

  • Imagen 2.0: Our most advanced text-to-image technology offers a variety of image generation features to help businesses create images that match their specific brand requirements. A new text-to-live image capability allows marketing and creative teams to generate animated images, such as GIFs, equipped with safety filters and digital watermarks. In addition, we are announcing the general availability of advanced photo editing features, including inpainting and outpainting, and much more.

  • Digital watermarking: Powered by Google DeepMind’s SynthID, digital watermarking is generally available today for AI-generated images produced by Imagen 2.0.
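
As a rough illustration of how a long-context, multimodal prompt reaches Gemini 1.5 Pro on Vertex AI, here is a minimal sketch using the Vertex AI Python SDK. The project ID, bucket path, and model version string are assumptions for illustration; check the current SDK documentation for exact model names.

    import vertexai
    from vertexai.generative_models import GenerativeModel, Part

    # Minimal sketch: send a long video plus a text instruction in one request.
    # Project, bucket, and model version are illustrative assumptions.
    vertexai.init(project="my-project", location="us-central1")
    model = GenerativeModel("gemini-1.5-pro-preview-0409")

    response = model.generate_content([
        Part.from_uri("gs://my-bucket/quarterly-review.mp4", mime_type="video/mp4"),
        "Summarize the key decisions and open questions from this recording.",
    ])
    print(response.text)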

Vertex AI allows you to tune the foundation model you have chosen with your data. We provide a variety of techniques, including fine-tuning, Reinforcement Learning from Human Feedback (RLHF), distillation, and supervised, adapter-based tuning techniques such as Low-Rank Adaptation (LoRA). Today we are announcing support for supervised, adapter-based tuning to customize Gemini models in an efficient, lower-cost way. A conceptual sketch of the low-rank idea behind LoRA follows.
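
To make the adapter-based idea concrete, here is a conceptual NumPy sketch of the low-rank update at the heart of LoRA. It illustrates the technique itself, not the Vertex AI tuning API; the dimensions and rank are arbitrary.

    import numpy as np

    # Conceptual LoRA sketch: keep the pretrained weight matrix W frozen and
    # learn a low-rank update B @ A with rank r much smaller than d_in, d_out.
    d_in, d_out, r = 1024, 1024, 8
    rng = np.random.default_rng(0)

    W = rng.normal(size=(d_out, d_in))      # frozen pretrained weights
    A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
    B = np.zeros((d_out, r))                # trainable up-projection, starts at zero

    def adapted_forward(x):
        # Effective weight is W + B @ A, but only A and B are trained,
        # shrinking trainable parameters from d_out * d_in to r * (d_in + d_out).
        return W @ x + B @ (A @ x)

    x = rng.normal(size=(d_in,))
    print(adapted_forward(x).shape)  # (1024,)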

Customers get far more from their models when they augment and ground them with enterprise data. Vertex AI helps you do this with managed tooling for extensions, function calling, and grounding. In addition, Retrieval-Augmented Generation (RAG) connects your model to enterprise systems to retrieve information and take action, allowing you to get up-to-the-second billing and product data, update customers’ contact info or subscriptions, or even complete transactions. Today, we are expanding Vertex AI grounding capabilities in two new ways, with a brief API sketch after the list:

  • Google Search: Grounding models in Google Search combines the power of Google’s latest foundation models along with access to fresh, high-quality information to significantly improve the completeness and accuracy of responses.

  • Your data: Give your agents Enterprise Truth by grounding models with your data from enterprise applications like Workday or Salesforce, or information stored in Google Cloud database offerings like AlloyDB and BigQuery.
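
As a rough sketch of what grounding looks like to a developer, the snippet below attaches a Google Search grounding tool to a model call with the Vertex AI Python SDK. The module path, project, model name, and prompt are assumptions for illustration; grounding on your own data stores follows the same tool-based pattern.

    import vertexai
    from vertexai.preview.generative_models import GenerativeModel, Tool, grounding

    # Minimal sketch: ground a Gemini response in Google Search results.
    # Project and model version are illustrative assumptions.
    vertexai.init(project="my-project", location="us-central1")

    search_tool = Tool.from_google_search_retrieval(grounding.GoogleSearchRetrieval())
    model = GenerativeModel("gemini-1.0-pro")

    response = model.generate_content(
        "What changed for our industry in the last quarter?",
        tools=[search_tool],
    )
    print(response.text)  # Response informed by fresh search results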

Once you have chosen the right model, tuned and grounded it, Vertex AI can help you deploy, manage, and monitor the models. Today, we are announcing additional MLOps capabilities:

  • Prompt management tools: These tools let you collaborate on prompts with built-in notes and status, track changes over time, and compare the quality of responses from different prompts.

  • Automatic side-by-side (AutoSxS): Now generally available, AutoSxS provides explanations of why one response outperforms another, along with certainty scores, which help users understand the accuracy of the evaluation.

  • Rapid Evaluation feature: Now in preview, this helps customers quickly evaluate models on smaller data sets when iterating on prompt design.

Finally, Vertex AI Agent Builder brings together foundation models, Google Search, and other developer tools, to make it easy for you to build and deploy agents. It provides the convenience of a no-code agent builder console alongside powerful grounding, orchestration and augmentation capabilities. With Vertex AI Agent Builder, you can now quickly create a range of gen AI agents, grounded with Google Search and your organization’s data. 

Accelerate development

Gemini Code Assist is our enterprise-focused AI code-assistance solution. We deployed it to a group of developers inside Google and found they completed common dev tasks more than 40% faster and spent roughly 55% less time writing new code. We’re also seeing success with customers like Quantiphi, which recorded developer productivity gains of over 30%.

We are proud to share that Gemini Code Assist works with your private codebase wherever it lives — on premises, in GitHub, GitLab, Bitbucket, or even across multiple locations. Today, we’re making key announcements to extend our industry leadership for developers:

  • Gemini 1.5 Pro in Gemini Code Assist: This upgrade, now in private preview, brings a massive 1 million token context window, revolutionizing coding for even the largest projects. Gemini Code Assist now delivers even more accurate code suggestions, deeper insights, and streamlined workflows. 

  • Gemini Cloud Assist: This provides AI assistance across your application lifecycle, making it easier to design, secure, operate, troubleshoot, and optimize the performance and costs of your application.  

Unlock the potential of AI with data

Google Cloud lets you combine the best of AI with your grounded enterprise data, while keeping your data private and secure. This year, we’ve made many advancements in our AI-ready Data Cloud, such as LLM integration and vector matching capabilities across all of our databases, including AlloyDB and BigQuery. Now, data teams can use Gemini models for multimodal and advanced reasoning over their existing data. This can help improve patient care for healthcare providers, streamline supply chains, and increase customer engagement across industries like telco, retail, and financial services. For example, customers like Bayer, Mayo Clinic, Mercado Libre, News Corp, and Vodafone are already seeing benefits. And Walmart is building data agents to modernize their shopping experiences:

“Using Gemini, we’ve enriched our data, helping us improve millions of product listings across our site and ultimately, enabling customers to make better decisions when they shop with Walmart.” – Suresh Kumar, EVP, Global Chief Technology Officer and Chief Development Officer, Walmart, Inc.

Today, we’re announcing new enhancements to help organizations build great data agents:  

  • Gemini in BigQuery: Gemini in BigQuery uses AI to help your data teams with data preparation, discovery, analysis, and governance. Alongside this, you can build and execute data pipelines with our new BigQuery data canvas, which provides a notebook-like experience with natural language and embedded visualizations; both are available in preview.

  • Gemini in Databases: This makes it easy for you to migrate data safely and securely from legacy systems, for example, converting your database to a modern cloud database like AlloyDB. 

  • Vector indexing: New query capabilities using vector indexing directly in BigQuery and AlloyDB allow you to apply AI over your data where it is stored, enabling real-time and accurate responses; a brief BigQuery sketch appears after this list.

  • Gemini in Looker: We’re introducing new capabilities, currently in preview, that allow your data agents to integrate easily with your workflows. We have also added new gen AI capabilities that enable you to chat with your business data, integrated with Google Workspace.
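
As a rough sketch of the vector capabilities in BigQuery, the snippet below creates a vector index over an embedding column and queries it with VECTOR_SEARCH from the Python client. The project, dataset, table, and column names are assumptions for illustration, not a definitive schema.

    from google.cloud import bigquery

    # Minimal sketch: index an embedding column, then run a vector search.
    # Project, dataset, table, and column names are illustrative assumptions.
    client = bigquery.Client(project="my-project")

    client.query("""
        CREATE VECTOR INDEX product_embedding_idx
        ON demo.products(embedding)
        OPTIONS (index_type = 'IVF', distance_type = 'COSINE');
    """).result()

    rows = client.query("""
        SELECT base.product_id, distance
        FROM VECTOR_SEARCH(
          TABLE demo.products, 'embedding',
          (SELECT embedding FROM demo.query_embeddings LIMIT 1),
          top_k => 5, distance_type => 'COSINE');
    """).result()

    for row in rows:
        print(row.product_id, row.distance)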

Improve your cybersecurity posture with AI-driven capabilities

The number and sophistication of cybersecurity attacks continue to increase, and gen AI has the potential to tip the balance in favor of defenders, with security agents providing help across every stage of the security lifecycle: prevention, detection, and response.

Today at Google Cloud Next, we are announcing new AI-driven innovations across our security portfolio that are designed to deliver stronger security outcomes and enable every organization to make Google a part of their security team:

  • Gemini in Threat Intelligence: Uses natural language to deliver deep insight into threat actor behavior. With Gemini, we are able to analyze much larger samples of potentially malicious code, and Gemini’s larger context window allows for analysis of the interactions between modules, providing new insight into the code’s true intent.

  • Gemini in Security Operations: A new assisted investigation feature converts natural language to detections, summarizes event data, recommends actions to take, and navigates users through the platform via conversational chat.

Leading global brands are already seeing the benefits in their security programs. At Pfizer, data sources that used to take days to aggregate can now be analyzed in seconds. 3M is using Gemini in Security Operations to help their team cut through security noise, while engineers in Fiserv’s Security Operations Center are able to create detections and playbooks with significantly less effort, and analysts get answers more quickly. 

Supercharge productivity with Google Workspace

Google Workspace is the world’s most popular productivity suite, with more than 3 billion users and over 10 million paying customers, from individuals to enterprises. Over the last year, we have released hundreds of features and enhancements to Google Workspace and Gemini for Workspace, the AI-powered agent that’s built right into Gmail, Docs, Sheets, and more. And customers are experiencing significant benefits. In fact, 70% of enterprise users who use “Help me write” in Docs or Gmail end up using Gemini’s suggestions, and more than 75% of users who create images in Slides are inserting them into their presentations.

Google Workspace is already helping employees at leading brands like Uber, Verizon and Australian retailer Woolworths, and today we are announcing the next wave of innovations and enhancements to Gemini for Google Workspace, including:

  • Google Vids: This new AI-powered video creation app for work is your video, writing, production, and editing assistant, all in one. It can generate a storyboard you can easily edit, and after choosing a style, it pieces together your first draft with suggested scenes from stock videos, images, and background music. It can also help you land your message with the right voiceover — either choosing one of our preset voiceovers or using your own. Vids will sit alongside other Workspace apps like Docs, Sheets, and Slides. It includes a simple, easy-to-use interface and the ability to collaborate and share projects securely from your browser. Vids is being released to Workspace Labs in June.

  • AI Meetings and Messaging add-on: With “take notes for me”, chat summarization, and real-time translation in 69 languages (equal to 4,600 language pairs), this collaboration tool will only cost $10 per user, per month. 

  • New AI Security add-on: Workspace admins can now automatically classify and protect sensitive files and data using privacy-preserving AI models and Data Loss Prevention controls trained for their organization. The AI Security add-on is available for $10 per user, per month and can be added to most Workspace plans.

Benefit from AI agents

With our entire AI portfolio — infrastructure, Gemini, models, Vertex AI, and Workspace — many customers and partners are building increasingly sophisticated AI agents. We are excited to see organizations building AI agents that serve customers, support employees, and help them create content, in addition to the coding agents, data agents, and security agents mentioned earlier.

Great customer agents can understand what you want, know the products and facts, engage conveniently, and ultimately help your customers interact with your business more seamlessly. The most impactful customer agents work across channels — web, mobile, call center, and point of sale — and in multiple modalities, like text, voice, and more. The opportunity for customer agents is tremendous for every organization, and our customers are just getting started:  

  • Discover Financial’s 10,000 contact center reps can search and synthesize information across detailed policies and procedures during calls.

  • IHG Hotels & Resorts will launch a generative AI-powered travel planning capability that can help guests easily plan their next vacation.

  • Minnesota’s Department of Public Safety helps foreign-language speakers get licenses and other services with two-way, real-time translation.

  • Target is optimizing offers and curbside pickup on the Target app and Target.com. 

Employee agents help all your employees be more productive and work better together. Employee agents can streamline processes, manage repetitive tasks, answer employee questions, and edit and translate critical communications. We’re seeing the impact of this every day with our customers, including Bristol Myers Squibb, HCA Healthcare, Sutherland, a leading services firm, and more:

  • Dasa, the largest medical diagnostics company in Brazil, is helping physicians detect relevant findings in test results more quickly.

  • Etsy uses Vertex AI to improve search, provide more personalized recommendations to buyers, optimize their ads models, and increase the accuracy of delivery dates.

  • Pennymac, a leading US-based national mortgage lender, is using Gemini across several teams including HR, where Gemini in Docs, Sheets, Slides and Gmail is helping them accelerate recruiting, hiring, and new employee onboarding.

  • Pepperdine University benefits from Gemini in Google Meet, which enables real-time translated captioning and notes for students and faculty who speak a wide range of languages.

Creative agents can serve as the best designer and production team — working across images and slides, and exploring concepts with you. We provide the most powerful platform and stack to build creative agents, and many customers are building agents for their marketing teams, audio and video production teams, and all the creative people who can use a hand. For example:

  • Canva is using Vertex AI to power its Magic Design for Video, helping users skip tedious editing steps.

  • Carrefour is pioneering new ways to use gen AI for marketing. Using Vertex AI, they were able to create dynamic campaigns across various social networks in weeks, not months.

  • Procter & Gamble is using Imagen to accelerate the development of photo-realistic images and creative assets, giving teams more time back to focus on high-level plans.

A leading AI ecosystem

To adopt gen AI broadly, customers need an enterprise AI platform that provides the broadest set of end-to-end capabilities, highly optimized for cost and performance — an open platform that offers choice, is easy to integrate with existing systems, and is supported by a broad ecosystem. At Google, we are engineering our AI platform to address these challenges. Google Cloud is the only major cloud provider offering both first-party and extensible, partner-enabled solutions at every layer of the AI stack. Through Google Cloud’s own innovations and those of our partners, we’re able to provide choice across infrastructure, chips, models, data solutions, and AI tooling to help customers build new gen AI applications and create business value with this exciting technology.

At Next ‘24, we’re highlighting important news and innovations with our partners across every layer of the stack, including:

  • Broadcom will migrate its VMware workloads to our trusted infrastructure and use Vertex AI to enhance customer experiences. 

  • Palo Alto Networks is making Google Cloud its AI provider of choice, helping to improve cybersecurity outcomes for global businesses.

  • Accenture, Capgemini, Cognizant, Deloitte, HCLTech, KPMG, McKinsey, and PwC have all announced expanded gen AI implementation services for enterprises, and our ecosystem of services partners has now taken more than half a million gen AI courses to help deliver the cloud of connected agents.

To date, we have helped more than a million developers get started with gen AI and our gen AI trainings have been taken millions of times. Looking back at this past year, it’s truly remarkable to see how quickly our customers have moved from enthusiasm and experimentation to implementing AI tools and launching early-stage products. 

Today, realizing the full potential of the cloud goes beyond infrastructure, network, and storage. It demands a new way of thinking. It’s embracing possibilities to solve problems in boldly creative ways, and reimagining solutions to achieve the previously impossible. We’re both inspired and amazed to see this mindset quickly materialize in our customers’ work as they pave new paths forward in the AI era — whether automating day-to-day tasks or tackling complex challenges.

The world is changing, but at Google, our north star is the same: to make AI helpful for everyone, to improve the lives of as many people as possible. 

Thank you, customers, developers, and partners for entrusting us to join you on this journey. We can’t wait to see what you do next.