Google named a Leader in The Forrester Wave™: AI Infrastructure Solutions, Q4 2025

For most organizations, the question is no longer if they will use AI, but how to scale it from a promising prototype into a production-grade service that drives business outcomes. In this age of inference, competitive advantage is defined by your ability to serve useful information to users around the world at the lowest possible cost. As you move from demos to production deployments at scale, you need to simplify infrastructure operations with integrated systems that provide the latest AI software and accelerator hardware platforms, while keeping costs and architectural complexity low. 

Yesterday, Forrester released The Forrester Wave™: AI Infrastructure Solutions, Q4 2025 report, evaluating 13 vendors, and we believe its findings validate our commitment to solving these core challenges. Google received the highest score of all vendors in the Current Offering category, along with the highest possible score in 16 of 19 evaluation criteria, including Vision, Architecture, Training, Inferencing, Efficiency, and Security.

Access the full report: The Forrester Wave™: AI Infrastructure Solutions, Q4 2025

Accelerating time-to-value with an integrated system

Enterprises don’t run AI in a vacuum. They need to integrate it with a diverse range of applications and databases while adhering to stringent security protocols. Forrester recognized Google Cloud’s strategy of co-design by giving us the highest possible score in the Efficiency and Scalability criteria:

“Google pursues a strategy of silicon-infrastructure co-design. It develops TPUs to improve inference efficiency and NVIDIA GPUs for access to broader ecosystem compatibility. Google designs TPUs to integrate tightly with its networking fabric, giving customers high bandwidth and low latency for inference at scale.”

For over two decades, we have operated some of the world’s largest services, from Google Search and YouTube to Maps, whose unprecedented scale required us to solve problems that no one else had faced. We couldn’t simply buy the platform and infrastructure we needed; we had to invent it. This led to a decade-long journey of deep, system-level co-design, building everything from our custom network fabric and specialized accelerators to frontier models, all under one roof.

The result was an integrated supercomputing system, AI Hypercomputer, which has paid significant dividends for our customers. It supports a wide range of AI-optimized hardware, allowing you to optimize for granular, workload-level objectives — whether that’s higher throughput, lower latency, faster time-to-results, or lower TCO. That means you can use our custom Tensor Processing Units (TPUs), the latest NVIDIA GPUs, or both, backed by a system that tightly integrates accelerators with networking and storage for exceptional performance and efficiency. It’s also why today, leading generative AI companies such as Anthropic, Lightricks, and LG AI Research trust Google Cloud to power their most demanding AI workloads.1

This system-level integration lays the foundation for speed, but operational complexity could still slow you down. To accelerate your time-to-market, we provide multiple ways to deploy and manage AI infrastructure, abstracting away the heavy lifting regardless of your preferred workflow. Google Kubernetes Engine (GKE) Autopilot automates management for containerized applications, helping customers like LiveX.AI reduce operational costs by 66%. Similarly, Cluster Director simplifies deployment for Slurm-based environments, enabling customers like LG AI Research to slash setup time from 10 days to under one day. 

Managing AI cost and complexity

Forrester gave Google Cloud the highest score possible in the Pricing Flexibility and Transparency criterion. The price of compute is only one part of the AI infrastructure cost equation. A complete view should also account for development costs, downtime, and inefficient resource utilization. We offer optionality at every layer of the stack to provide the flexibility businesses demand.

  • Flexible consumption: Dynamic Workload Scheduler allows you to secure compute at up to 50% savings by ensuring you pay only for the capacity you need, when you need it.

  • Load balancing: GKE Inference Gateway improves throughput by using AI-aware routing to balance requests across model replicas, preventing bottlenecks and ensuring servers aren’t sitting idle (see the conceptual sketch after this list).

  • Eliminating data bottlenecks: Anywhere Cache co-locates data with compute, reducing read latency by up to 96% and eliminating the “integration tax” of moving data. By using Anywhere Cache together with our unified data platform BigQuery, you can avoid latency and egress fees while keeping your accelerators fed with data. 
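
To make the AI-aware routing idea in the load-balancing point above concrete, here is a minimal conceptual sketch in Python. It is not GKE Inference Gateway’s implementation; the `ReplicaStats` fields and `pick_replica` heuristic are hypothetical stand-ins for the accelerator-level signals, such as queue depth and KV-cache utilization, that an AI-aware router can weigh instead of simple round-robin.

```python
from dataclasses import dataclass

@dataclass
class ReplicaStats:
    """Hypothetical load signals for one model-server replica."""
    name: str
    queue_depth: int             # requests waiting on this replica
    kv_cache_utilization: float  # fraction of KV cache in use, 0.0 to 1.0

def pick_replica(replicas: list[ReplicaStats]) -> ReplicaStats:
    """Toy AI-aware routing: send the request to the least-loaded replica.

    A replica with a deep queue or a nearly full KV cache would stall a new
    request, so it scores poorly; plain round-robin ignores both signals.
    """
    return min(replicas, key=lambda r: r.queue_depth + 10 * r.kv_cache_utilization)

fleet = [
    ReplicaStats("replica-a", queue_depth=4, kv_cache_utilization=0.9),
    ReplicaStats("replica-b", queue_depth=6, kv_cache_utilization=0.2),
    ReplicaStats("replica-c", queue_depth=1, kv_cache_utilization=0.5),
]
print(pick_replica(fleet).name)  # replica-c: short queue and cache headroom
```

The design point is that for LLM serving, a replica’s readiness depends on accelerator state, not just open connections, which is why balancing on request counts alone leaves throughput on the table.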

Mitigating strategic risk through flexibility and choice

We are also committed to enabling customer choice across accelerators, frameworks, and multicloud environments. This isn’t new for us. Our deep experience with Kubernetes, which we developed and then open-sourced, taught us that open ecosystems are the fastest path to innovation and provide our customers with the most flexibility. We are bringing that same ethos to the AI era by actively contributing to the tools you already use.

  • Open source frameworks and hardware portability: We continue to support open frameworks such as PyTorch, JAX, and Keras. We’ve also directly addressed concerns about workload portability on custom silicon by investing in TPU support for vLLM, allowing developers to easily switch between TPUs and GPUs (or use both) with only minimal configuration changes, as sketched after this list.

  • Hybrid and multicloud flexibility: Our commitment to choice extends to where you run your applications. Google Distributed Cloud brings our services to on-premises, edge, and cloud locations, while Cross-Cloud Network securely connects applications and users with high-speed connectivity between your environments and other clouds. This combination means you aren’t locked into a specific environment: you can migrate workloads and apply uniform management practices across locations, streamlining operations and mitigating the risk of lock-in.
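
As a hedged illustration of the vLLM portability point above, the sketch below uses vLLM’s Python API, which stays the same whether the underlying build targets TPUs or GPUs. The model name is only an example, and we assume accelerator selection is handled by the vLLM installation rather than by application code.

```python
# Minimal vLLM serving sketch. The same Python code runs on TPUs or GPUs;
# which accelerator is used depends on the vLLM build installed in the
# environment, not on this application code. Model choice is illustrative.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")  # example model

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Summarize the benefits of workload portability."], params)

for out in outputs:
    print(out.outputs[0].text)
```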

Systems you can rely on

When your entire business model depends on the availability of AI services, infrastructure uptime is critical. Google Cloud’s global infrastructure is engineered for enterprise-grade reliability, an approach rooted in our history as the birthplace of Site Reliability Engineering (SRE).

We operate one of the world’s largest private software-defined networks, handling approximately 25% of global internet egress traffic. Unlike providers that rely on the public internet, we keep your traffic on Google’s own fiber to improve speed and reliability and reduce latency. This global backbone is powered by our Jupiter data center fabric, which scales to 13 Petabits/sec of bandwidth and delivers 50x greater reliability than previous generations, to say nothing of other providers. Finally, to improve cluster-level fault tolerance, we employ capabilities like elastic training and multi-tier checkpointing, which allow jobs to continue uninterrupted by dynamically resizing the cluster around failed nodes while minimizing time to recovery.
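
To sketch the multi-tier checkpointing idea, here is a minimal, hypothetical Python example. The tier paths, interval, and helper names are illustrative assumptions, not Google Cloud’s implementation: frequent, cheap saves go to node-local disk, and periodic copies go to durable storage, so a failed node costs at most a few steps of work.

```python
import pickle
import shutil
import time
from pathlib import Path

# Hypothetical two-tier checkpointing sketch (not a production implementation).
# Fast tier: node-local disk, written every step, lost if the node fails.
# Durable tier: remote storage, written periodically, survives node failures.
LOCAL_TIER = Path("/tmp/ckpt-local")
DURABLE_TIER = Path("/mnt/ckpt-durable")
DURABLE_EVERY_N_STEPS = 10  # amortize the cost of the slow tier

def save_checkpoint(step: int, state: dict) -> None:
    LOCAL_TIER.mkdir(parents=True, exist_ok=True)
    local_path = LOCAL_TIER / f"step-{step}.pkl"
    with local_path.open("wb") as f:
        pickle.dump({"step": step, "state": state, "ts": time.time()}, f)
    if step % DURABLE_EVERY_N_STEPS == 0:
        DURABLE_TIER.mkdir(parents=True, exist_ok=True)
        shutil.copy2(local_path, DURABLE_TIER / local_path.name)

def _step_of(path: Path) -> int:
    return int(path.stem.split("-")[1])  # "step-42" -> 42

def latest_checkpoint() -> Path | None:
    """Prefer the fresher local tier; fall back to durable storage after a failure."""
    for tier in (LOCAL_TIER, DURABLE_TIER):
        ckpts = sorted(tier.glob("step-*.pkl"), key=_step_of) if tier.exists() else []
        if ckpts:
            return ckpts[-1]
    return None
```

Elastic training composes with this: when a node fails, the job reloads the freshest surviving checkpoint and continues on a resized cluster rather than waiting for a replacement node.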

Building on a secure foundation

Our approach is to secure AI from the ground up. In fact, Google Cloud maintains a leading track record for cloud security: independent analysis from cloudvulndb.org (2024-2025) shows that our platform has up to 70% fewer critical and high vulnerabilities than the other two leading cloud providers. We were also the first in the industry to publish an AI/ML Privacy Commitment, which guarantees that we do not use your data to train our models. Beyond those safeguards, security is integrated into the foundation of Google Cloud, based on the zero-trust principles that protect Google’s own services:

  • A hardware root of trust: Our custom Titan chips, as part of our Titanium architecture, create a verifiable hardware root of trust. We recently extended this with Titanium Intelligence Enclaves for Private AI Compute, allowing you to process sensitive data in a hardened, isolated, and encrypted environment.

  • Built-in AI security: Security Command Center (SCC) natively integrates with our infrastructure, providing AI Protection by automatically discovering assets, preventing security issues, detecting active threats with frontline Google Threat Intelligence, and discovering known and unknown risks before attackers can exploit them.  

  • Sovereign solutions: We enable you to meet stringent data residency, operational control, and software sovereignty requirements through solutions like Data Boundary. This is complemented by flexible options like partner-operated sovereign controls and Google Distributed Cloud for air-gapped needs.

  • Platform controls for AI and agent governance: Vertex AI provides the essential governance layer for enterprise builders deploying models and agents at scale. This trust is anchored in Google Cloud’s secure-by-default infrastructure, using platform controls like VPC Service Controls (VPC-SC) and Customer-Managed Encryption Keys (CMEK) to sandbox environments and protect sensitive data, and Agent Identity for granular IAM permissions (see the CMEK sketch after this list). At the platform level, Vertex AI and Agent Builder integrate Model Armor to provide runtime protection against emergent agentic threats, such as prompt injection and data exfiltration.
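
As a small, hedged example of the CMEK control above, this snippet uses the google-cloud-storage Python client to set a customer-managed Cloud KMS key as a bucket’s default encryption key. The project, bucket, and key names are placeholders, not real resources.

```python
# Hedged sketch: point a Cloud Storage bucket at a customer-managed
# Cloud KMS key (CMEK) so that new objects are encrypted with a key
# you control. All resource names below are placeholders.
from google.cloud import storage  # pip install google-cloud-storage

client = storage.Client(project="example-project")      # placeholder
bucket = client.bucket("example-training-data-bucket")  # placeholder

# Fully qualified KMS key resource name (placeholder values).
bucket.default_kms_key_name = (
    "projects/example-project/locations/us-central1/"
    "keyRings/example-ring/cryptoKeys/example-key"
)
bucket.patch()  # apply the change; new writes now use the CMEK key

blob = bucket.blob("datasets/sample.jsonl")
blob.upload_from_string('{"text": "hello"}\n')  # encrypted with the key above
```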

Delivering continuous AI innovation

We are honored to be recognized as a Leader in The Forrester Wave™ report, which we believe validates decades of R&D and our approach to building ultra-scale AI infrastructure. Look to us to continue on this path of system-level innovation as we help you convert the promise of AI into a reality.

Access the full report: The Forrester Wave™: AI Infrastructure Solutions, Q4 2025


1. IDC Business Value Snapshot, Sponsored by Google Cloud, The Business Value of Google Cloud AI Hypercomputer, US53855425, October 2025