Meta’s Llama 3.2 is now available on Google Cloud

September 25, 2024

By

Google Cloud Blog

In July, we announced the addition of Meta’s Llama 3.1 open models to Vertex AI Model Garden. Since then, developers and enterprises have shown tremendous enthusiasm for building with the Llama models. Today, we’re announcing that Llama 3.2, Meta’s new generation of multimodal models, is available on Vertex AI Model Garden.

Llama 3.2 is a new generation of vision and lightweight models that fit on edge devices, tailored for use cases that require more private and personalized AI experiences. With the new models:

Llama goes multimodal: With Llama 3.2 new 11B and 90B vision LLMs, you will now be able to reason on high-resolution images such as charts, graphs, or image captioning. This unlocks new possibilities such as image-based search and content generation, interactive educational tools and beyond.
Llama goes small: Llama 3.2 new 1B and 3B lightweight models are designed for seamless integration on mobile and edge devices. With these models, you can build private, personalized AI experiences with minimal latency and resource overhead. Imagine on-device multilingual summarization, information retrieval, and local AI agents — all while preserving user privacy.

The new Llama models prioritize accessibility, efficiency, and privacy, with a focus on responsible innovation and system-level safety.

Vertex AI provides you with a unified AI platform for experimenting with, customizing, and deploying models like Llama 3.2 with ease. The addition of Llama 3.2 to Vertex AI expands our curated collection of over 160 enterprise-ready first-party, open-source, and third-party models in Model Garden — enabling you to select the best models for your needs through our open and flexible AI ecosystem.

You can easily access the new 90B model in preview here through our Model-as-a-Service (MaaS) offering. With MaaS, you can access the model instantly, tailor it with robust development tools, and deploy with fully managed infrastructure and pay-as-you-go billing. 90B will become generally available in the coming weeks. The 11B vision model will also be available as MaaS in the coming weeks.
All four Llama 3.2 models are ready for self-service deployment through Vertex AI Model Garden, starting today.

Customers building with Llama on Google Cloud

Shopify is optimizing its data generation processes with Llama on Vertex AI to bring data-driven insights to support businesses worldwide:

“Using Llama 3.1 on Google Cloud Vertex AI has made high-quality data generation easier and more efficient for Shopify. The convenience of Vertex AI’s infrastructure allows us to consistently deliver dependable outputs for critical applications, streamlining the process for our teams,” said Mike Tamir, Distinguished ML Engineer at Shopify. “With Vertex AI we’re able to deliver precise, data-driven insights to support millions of businesses worldwide, helping shape the future of commerce.”

TransCrypts leverages Llama on Google Cloud to bring its AI-powered financial guide, Castello, to thousands of customers:

“Using Llama 3.1 on Google Cloud has been a game-changer for TransCrypts. The performance and cost-efficiency of TPUs allow us to deploy these advanced models at lightning speed, handling complex workloads that would otherwise be out of reach,” said Zain Zaidi, Co-Founder & CEO of TransCrypts. “Google Cloud’s ease in enabling us to effortlessly scale has allowed us to offer our solution, Castello, to tens of thousands of customers in a matter of days.”

BMC has integrated Llama 3.1 into its flagship BMC Helix platform to accelerate IT service and operations management through conversational AI, recommendations, and analysis:

“We’re thrilled to partner with Google Cloud, bringing the power of Vertex AI and Llama 3.1 to our BMC Helix platform,” said Margaret Lee, GM and SVP of the Digital Service and Operations Management Business Unit at BMC. “This integration significantly boosts accuracy for conversational AI and AIOps recommendations, giving our customers access to cutting-edge AI solutions tailored to their needs.”

Using Llama 3.2 on Google Cloud

By using Llama 3.2 on Vertex AI, you can:

Experiment with confidence: Explore Llama 3.2 capabilities through simple API calls and our comprehensive generative AI evaluation service within Vertex AI’s intuitive environment, without worrying about complex deployment processes.
Tailor Llama 3.2 to your exact needs: Fine-tune the model using your own data to build bespoke solutions tailored to your unique needs.
Ground your AI in truth: Make sure your AI outputs are reliable, relevant, and trustworthy with Vertex AI’s multiple options for grounding and RAG. For example, you can connect your models to enterprise systems, use Vertex AI Search for enterprise information retrieval, leverage Llama for generation, and more.
Craft intelligent agents: Create and orchestrate agents powered by Llama 3.2, using Vertex AI’s comprehensive set of tools, including LangChain on Vertex AI. Integrate Llama 3.2 into your AI experiences with Genkit’s Vertex AI plugin.
Deploy without overheads: Simplify deployment and scaling Llama 3.2 applications with flexible auto-scaling, pay-as-you-go pricing, and world-class infrastructure designed for AI.
Operate within your enterprise guardrails: Deploy with confidence with not only support for Meta’s Llama Guard for the models, but also Google Cloud’s built-in security, privacy, and compliance measures. Moreover, enterprise controls, such as Vertex AI Model Garden’s new organization policy, provide the right access controls to make sure only approved models are accessed by users.

Get started with Llama 3.2 on Google Cloud

With every new innovation in AI models, enterprise AI ecosystems become more diverse. Our partnership with Meta testifies to both organizations’ commitment to providing world-class innovation supported by an open and accessible AI ecosystem. We’ll continue to work closely with Meta and other partners to keep our customers at the forefront of AI capabilities.

Visit Model Garden and check out our documentation to start building with Llama 3.2.