Generative AI is changing the way we create, innovate, and interact with the world. From generating realistic images and videos to composing music and writing code, gen AI models are pushing the boundaries of what’s possible. But delivering on AI’s promise hinges on a scalable storage foundation.
At Google Cloud, we’re committed to providing the infrastructure for businesses to harness the possibilities of gen AI. At Google Cloud Next ’24, we’re excited to announce a series of advancements in our storage portfolio.
Accelerating AI training and inference with Google Cloud Storage
Gen AI models train on datasets in a computationally intensive and time-consuming process, gradually refining their ability to generate new content that resembles the training data. Similarly, AI inference (serving) in production requires low-latency access to models. At Next ’24, we introduced new storage solutions that address the challenge of decreasing model load, training, and inference times while maximizing accelerator utilization.
Cloud Storage FUSE with file caching: Faster training and inference through local data access
Cloud Storage FUSE lets you mount Cloud Storage buckets as file systems, a game-changer for AI/ML workloads built on frameworks that often require file-based data access. Training and inference can use filesystem APIs while keeping the benefits of Cloud Storage, including lower cost. And with the addition of file caching, Cloud Storage FUSE can increase training throughput by 2.9X. By keeping frequently accessed data closer to your compute instances, Cloud Storage FUSE file caching delivers faster training compared to native ML framework data loaders, so you can iterate rapidly and bring your gen AI models to market more quickly.
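To illustrate what file-based access through a mounted bucket can look like, here is a minimal Python sketch. It assumes a bucket has already been mounted with Cloud Storage FUSE at a directory; the path `/mnt/gcs/training-data`, the `.bin` suffix, and the helper names are hypothetical and chosen only for illustration. The code uses nothing but standard file APIs, so it runs the same way against any local directory.

```python
from pathlib import Path
from typing import Iterator, Tuple

# Hypothetical mount point: wherever the bucket was mounted with Cloud Storage FUSE.
MOUNT_DIR = "/mnt/gcs/training-data"


def iter_samples(mount_dir: str, suffix: str = ".bin") -> Iterator[Tuple[str, bytes]]:
    """Yield (relative path, contents) for each matching file under mount_dir.

    Only ordinary filesystem calls are used; the code does not know or care
    that the directory may be backed by a Cloud Storage bucket.
    """
    root = Path(mount_dir)
    for path in sorted(root.rglob("*" + suffix)):
        if path.is_file():
            yield path.relative_to(root).as_posix(), path.read_bytes()


def total_bytes(mount_dir: str, suffix: str = ".bin") -> int:
    """Read every matching file once and return the number of bytes read."""
    return sum(len(data) for _, data in iter_samples(mount_dir, suffix))
```

A training job that makes several passes over the same files would call `iter_samples(MOUNT_DIR)` once per epoch. With file caching enabled on the mount, the expectation is that repeated reads of the same objects can be served from storage closer to the compute instance, without any change to code like this.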
Parallelstore: Ultra-low latency and caching for demanding training workloads
Parallelstore, Google Cloud’s parallel file system for high-performance computing and AI/ML workloads, now also includes caching in preview. It delivers high performance, making it ideal for training complex gen AI models. With caching, it enables up to 3.9X faster training times and up to 3.7X higher training throughput compared to native ML framework data loaders. Parallelstore also features optimized data import and export from Cloud Storage to further accelerate training.
Hyperdisk ML: Purpose-built for high-performance training and inference
Training and serving inference in production require fast and reliable access to data. Hyperdisk ML is a new block storage offering that’s purpose-built for AI workloads. Currently in preview, it delivers exceptional performance, not only accelerating training times, but also accelerating model load times by up to 11.9X compared to common alternatives. Hyperdisk ML allows you to attach up to 2,500 instances to the same volume, so a single volume can serve over 150X more compute instances than competitive block storage volumes, ensuring that storage access scales with your accelerator needs.
Manage storage at scale with Generate insights with Gemini
Google Cloud is innovating to use large language models (LLMs) to help you manage cloud storage at scale. Generate insights with Gemini is built upon Insights Datasets, a Google-managed, BigQuery-based storage metadata warehouse. Using simple, natural language, you can easily and quickly analyze your storage footprint, optimize costs, and enhance security — even when managing billions of objects.
Drawing on Google Cloud’s history of thoughtfully designed user experiences, we’ve tailored Generate insights with Gemini to meet the demanding requirements of modern organizations, with capabilities including:
- Fully validated responses for top customer questions: Verified data responses for pre-canned prompts, ensuring rapid, precise answers to your team’s most critical questions.
- Accelerated understanding with visuals: Translate complex data into clear, visual representations, making it easy to understand, analyze, and share key findings across teams.
- Dive deeper with multi-turn chat: Need more context or have follow-up questions? Generate insights with Gemini’s multi-turn chat feature allows you to engage in interactive analysis, and gain a granular understanding of your environment.
Generate insights with Gemini is available now through the Google Cloud console as an allowlist experimental release.
Other notable storage announcements
Beyond AI/ML, we also unveiled a range of storage innovations at Next ’24 that benefit a wide variety of use cases:
- Google Cloud NetApp Volumes: NetApp Volumes is a fully managed SMB and NFS storage service that provides advanced data management capabilities and highly scalable performance, for enhanced cost efficiency and performance for Windows and Linux workloads. And now, NetApp Volumes dynamically migrates files by policy to lower-cost storage based on access frequency (in preview Q2’24). In addition, NetApp Volumes Premium and Extreme service levels will support volumes of up to 1PB in size, and are increasing throughput performance up to 2X and 3X, respectively (preview Q2’24). We are also introducing a new Flex service level enabling volumes as small as 1GiB, and expanding to 15 new Google Cloud regions in Q2’24 (GA).
- Filestore: Google Cloud’s fully managed file storage service now supports single-share backup for Filestore Persistent Volumes and Google Kubernetes Engine (GKE) (generally available) and NFS v4.1 (preview), plus expanded Filestore Enterprise capacity up to 100TiB.
- Hyperdisk Storage Pools: With Hyperdisk Advanced Capacity (generally available) and Advanced Performance (preview), you can purchase and manage block storage capacity in a pool that’s shared across workloads. Individual volumes are thinly provisioned from these pools; they only consume capacity as data is actually written to disk, and they benefit from data reduction such as deduplication and compression. This lets you substantially increase storage utilization and can reduce storage TCO by over 50% in typical scenarios, compared to leading cloud providers. Google is the first and only cloud hyperscaler to offer storage capacity pooling.
- Anywhere Cache: Working with multi-region buckets, Cloud Storage Anywhere Cache now uses zonal SSD read cache across multiple regions within a continent to speed up cacheable workloads such as analytics, and AI/ML training and inference (allowlist GA).
- Soft delete: With this feature, Cloud Storage protects against accidental or malicious deletion of data by preserving deleted items for a configurable period of time (generally available).
- Managed Folders: This new Cloud Storage resource type allows granular IAM permissions to be applied to groups of objects (generally available).
- Tag-based at scale backup: With this feature, users can leverage Google Cloud tags to manage data protection for Compute Engine VMs (generally available).
- High-performance backup for SAP HANA: A new option for backups of SAP HANA databases running in Compute Engine VMs leverages persistent disk (PD) snapshot capabilities for database-aware backups (generally available).
- Backup and DR Service Report Manager: Customers can now customize reports with data from Google Cloud Backup and DR using Cloud Monitoring, Cloud Logging, and BigQuery (generally available).
Accelerate your journey with Google Cloud Storage
At Google Cloud, we’re committed to empowering businesses to unlock the full potential of AI/ML, enterprise, and cloud-first workloads. Whether you’re training massive gen AI models, serving inference at scale, or running Windows or GKE workloads, Google Cloud storage provides the versatility and power you need to succeed. Get in touch with your account team to learn how we can help you unleash the potential of generative AI with Google Cloud storage. You can also attend the following sessions live at Next ’24 or watch them afterwards:
- ARC 232 Next Generation Storage: Designing storage for the future
- ARC 306 How to define a storage infrastructure for AI/ML workloads
- ARC 307 A Masterclass in Managing Billions of Google Cloud Storage Objects and Beyond
- ARC 204 How to optimize block storage for any workload with the latest from Hyperdisk