Unleash your AI workloads with Google Cloud’s latest storage solutions

As businesses continue to grapple with the unique storage and access demands of data-intensive AI workloads, they’re looking to the cloud to deliver highly capable, cost-effective, and easily manageable storage solutions. 

To deliver the right cloud storage for the right application, today we’re launching three new solutions: 

  • Parallelstore, a parallel file system for demanding AI and HPC applications that use GPUs and TPUs

  • Cloud Storage FUSE, for AI applications that require file system semantics

  • Google Cloud NetApp Volumes, for enterprise applications running in the cloud 

Parallelstore

Google Cloud customers training AI models often turn to GPUs and TPUs to get the performance they need. But let’s face it: those accelerators are in limited supply, and they’re infrastructure assets you want to fully utilize. Google Cloud Parallelstore, now in private Preview, helps you stop wasting precious GPU resources while you wait for storage I/O by providing a high-performance parallel file storage solution for AI/ML and HPC workloads. 

By keeping your GPUs saturated with the data you need to optimize the AI/ML training phase, Parallelstore can help you significantly reduce — or even eliminate — costs associated with idle GPUs. 

Parallelstore is based on the next-generation Intel DAOS architecture, in which all compute nodes have equal access to storage, so VMs can get immediate access to their data. With up to 6.3x the read throughput of competitive Lustre scratch offerings, Parallelstore is well suited for cloud-based applications that require extremely high performance (IOPS and throughput) and ultra-low latency. 

As companies move their data across pre-processing, model development, training, and checkpointing stages, Parallelstore is a differentiated high-performance solution for when they need to push the limits of I/O patterns, file sizes, latency, and throughput. For high-performance AI/ML workloads, Parallelstore can be configured to avoid paying for unnecessary storage, so you’re not caught flat-footed with a solution that can’t handle your workload requirements. 
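To make the GPU-saturation point concrete, here’s a minimal sketch of a training data pipeline reading samples through ordinary POSIX file I/O. It assumes a Parallelstore file system is already mounted at a placeholder path (/mnt/parallelstore); the dataset layout, batch size, and worker counts are illustrative and not Parallelstore-specific APIs.

    # Minimal sketch: keep accelerators fed by reading training samples in
    # parallel from a mounted parallel file system.
    # Assumption: Parallelstore (or any POSIX file system) is mounted at
    # /mnt/parallelstore and holds one serialized tensor per *.pt file.
    import glob

    import torch
    from torch.utils.data import DataLoader, Dataset

    class MountedSampleDataset(Dataset):
        """Reads individual sample files with ordinary file I/O."""

        def __init__(self, root):
            self.paths = sorted(glob.glob(f"{root}/*.pt"))

        def __len__(self):
            return len(self.paths)

        def __getitem__(self, idx):
            # Plain POSIX reads; a parallel file system can serve many of
            # these concurrently across worker processes.
            return torch.load(self.paths[idx])

    loader = DataLoader(
        MountedSampleDataset("/mnt/parallelstore/train"),  # placeholder mount path
        batch_size=64,
        num_workers=8,       # parallel readers help keep GPUs/TPUs saturated
        pin_memory=True,
        prefetch_factor=4,   # queue batches ahead of the training step
    )

    for batch in loader:
        ...  # training step (forward pass, backward pass, optimizer update)

The same pattern applies to checkpoint reads and writes; the point is simply that many concurrent, ordinary file operations are exactly what a parallel file system is built to absorb.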

For more details, check out our Parallelstore web page.

Cloud Storage FUSE

Cloud Storage FUSE lets you mount and access Cloud Storage buckets as local file systems. With Cloud Storage FUSE you get a smooth experience for AI applications that need file system semantics to store and access training data, models, and checkpoints, while preserving the scale, affordability, performance, and simplicity of Cloud Storage. 
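As a rough illustration of those file system semantics, the sketch below saves and restores a model checkpoint with ordinary file I/O against a bucket that has already been mounted, for example with the gcsfuse command line (gcsfuse my-training-bucket /mnt/gcs). The bucket name and mount path are placeholders.

    # Sketch: treating a mounted Cloud Storage bucket like a local directory.
    # Assumption: the bucket is already mounted at /mnt/gcs (placeholder path),
    # e.g. with: gcsfuse my-training-bucket /mnt/gcs
    import os

    import torch
    import torch.nn as nn

    CHECKPOINT_DIR = "/mnt/gcs/checkpoints"   # objects appear as ordinary files
    os.makedirs(CHECKPOINT_DIR, exist_ok=True)

    model = nn.Linear(128, 10)                # stand-in for a real model

    # Write a checkpoint with plain file I/O; no cloud-specific API calls.
    torch.save(model.state_dict(), os.path.join(CHECKPOINT_DIR, "step_001.pt"))

    # Later, or from another machine that mounts the same bucket, read it back.
    state = torch.load(os.path.join(CHECKPOINT_DIR, "step_001.pt"))
    model.load_state_dict(state)

The same approach works for reading training data, since objects under the mount point behave like files and directories.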

Now generally available as a first-party Google Cloud offering, Cloud Storage FUSE is focused on delivering four key benefits:

  • Compatibility: Because Cloud Storage FUSE lets objects in Cloud Storage buckets be accessed as files in a locally mounted file system, it removes the need to refactor applications to call cloud-specific APIs.

  • Reliability: As a first-party offering, Cloud Storage FUSE is integrated with the official Go Cloud Storage client library, and has been specifically validated for PyTorch and TensorFlow at high scale and long duration using ViT DINO and ResNet ML models. 

  • Performance: Because Cloud Storage FUSE lets developers treat a Cloud Storage bucket as a local file system, there’s no separate step to stage data on local disk before it reaches GPUs and TPUs, which eliminates much of the resource idle time that read-heavy machine learning workloads incur. OpenX, a global adtech company, previously relied on a homegrown solution to fetch data files from Cloud Storage at pod startup; by using Cloud Storage FUSE with Google Kubernetes Engine (GKE), it reduced pod startup time by 40%.

  • Portability: You can deploy Cloud Storage FUSE in your own environment as a Linux package, using pre-built Google ML images, as part of the Vertex AI platform, or as part of a turnkey integration with GKE through the Cloud Storage FUSE CSI driver.

For more details, check out today’s blog post on our new first-party Cloud Storage FUSE offering, or the Cloud Storage FUSE product page for full documentation. 

Google Cloud NetApp Volumes

Many enterprise customers that have architected their applications on top of NetApp storage arrays now want to migrate those workloads to the cloud. NetApp Volumes simplifies the process by providing a fully Google-managed, high-performance file storage service designed specifically for demanding enterprise applications and workloads. 
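For a sense of what that lift-and-shift looks like at the application level, here’s a minimal sketch of file-based code that runs unchanged against a share. It assumes a NetApp Volumes NFS export is already mounted at a placeholder path (/mnt/netapp) on a Linux VM; the path and file names are illustrative.

    # Sketch: existing file-based application code running unchanged against a
    # mounted share. Assumption: a NetApp Volumes NFS export is mounted at
    # /mnt/netapp (placeholder path) on the VM.
    import csv
    from pathlib import Path

    SHARE = Path("/mnt/netapp/reports")       # ordinary directory on the share
    SHARE.mkdir(parents=True, exist_ok=True)

    # Write a report exactly as the application would against an on-prem array.
    with open(SHARE / "daily_totals.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["region", "total"])
        writer.writerow(["emea", 1250])

    # Read it back with the same standard-library calls.
    with open(SHARE / "daily_totals.csv", newline="") as f:
        for row in csv.reader(f):
            print(row)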

NetApp Volumes helps customers unlock the full potential of today’s most demanding workloads thanks to:

  • The capability to grow volumes from 100 GiB up to 100 TiB for maximum scalability 

  • The ability to implement ONTAP data management for hybrid workloads, providing a familiar management interface for longtime NetApp customers 

  • The power and flexibility to run either Windows or Linux applications as virtual machines without refactoring

For more information, check out our blog post announcing NetApp Volumes or the NetApp Volumes product page.

The right storage options

AI has become instrumental in automating data management. As you adapt to these workloads, we want to make the process as seamless as possible, with options tailored to your model training needs. With the right storage solution, you can simplify operations, unlock innovation, reduce costs, and position your business to meet the changing needs of your workloads and applications.

Start using Cloud Storage FUSE or NetApp Volumes today by visiting the Google Cloud console. To begin using Parallelstore, contact your Google account manager.