BigQuery and Anthropic’s Claude: A powerful combination for data-driven insights

BigQuery and Anthropic’s Claude: A powerful combination for data-driven insights

The world’s most productive and innovative organizations rely on their trusted business data to inform their decision-making, operational efficiency, insights, and growth. Now, gen AI enters the equation, opening up possibilities to transform this wealth of information into an unprecedented competitive edge. 

Google Cloud has been at the forefront of integrating advanced gen AI capabilities directly within BigQuery, our gen AI-ready data platform. Organizations are already harnessing gen AI models like Gemini 1.5 Pro on Vertex AI within the BigQuery platform. And today, we’re extending Google Cloud’s open platform with the preview of BigQuery’s new integration with Anthropic’s Claude models on Vertex AI that connects your data in BigQuery with the powerful intelligence capabilities of Claude models.

Organizations can now access the power of Anthropic’s Claude models that offer advanced gen AI capabilities through BigQuery ML (BQML). BQML simplifies the application of machine learning to data within BigQuery, making it accessible to analysts and SQL users. This integration enables tasks such as text generation, summarization, translation, and more, to be performed directly on your data. 

Powerful use cases

BigQuery’s integration with Anthropic’s Claude models allows organizations to reimagine data-driven decision making and boost productivity across a variety of tasks including:

  1. Analyzing log data for enhanced security: Security teams can efficiently analyze log data in BigQuery, converting complex technical information into clear, readable form and generating appropriate response strategies.
  2. Marketing optimization: Marketing teams can now harness user and product data stored in BigQuery to generate targeted, data-driven campaigns at scale — helping to boost engagement and ROI.
  3. Document summarization: Organizations can streamline knowledge management by automatically summarizing internal documents stored in Google Cloud Storage, saving time and resources.
  4. Content localization: Global organizations can quickly translate text content stored in BigQuery, facilitating communication across language barriers.

Let’s further explore a couple of examples showcasing the possibilities of using Claude models in BigQuery.

Log summarization and recommendations

Organizations commonly store error log data in BigQuery for its ease of use, scalability, and advanced features such as search and vector indexes, which aid in log analytics. Combining your BigQuery data with the Claude models on Vertex AI can supercharge this use case. For example, organizations can efficiently summarize log entries and generate suggested fixes to streamline issue identification and resolution processes. 

Let’s see how:

code_block
<ListValue: [StructValue([('code', "CREATE OR REPLACE MODELrn`PROJECT_ID.DATASET_ID.MODEL_NAME`rnREMOTE WITH CONNECTION `PROJECT_ID.REGION.CONNECTION_ID`rnOPTIONS (ENDPOINT = 'claude-3-5-sonnet');"), ('language', ''), ('caption', <wagtail.rich_text.RichText object at 0x3e9970fe8670>)])]>
code_block
<ListValue: [StructValue([('code', 'SELECTrn logName,rn Log_payload,rn ml_generate_text_result.content[0].text AS log_summary_and_suggestionrn FROMrn ML.GENERATE_TEXT(rn MODEL `PROJECT_ID.DATASET_ID.MODEL_NAME`,rn (rn SELECTrn logName,rn TO_JSON_STRING(protopayload_auditlog) as Log_payload,rn concat("summarize this log payload in one sentence and recommend solutions, please remove sensitive information: ",TO_JSON_STRING(protopayload_auditlog)) AS promptrn from `PROJECT_ID.DATASET_ID.Sample_log_table`rn )rn )'), ('language', ''), ('caption', <wagtail.rich_text.RichText object at 0x3e9970fe8df0>)])]>

Figure X-Summarizing log entries and recommend suggestions.png

Summarizing log entries and recommending fixes

And there you have it! We’ve generated a concise log summary and recommended solutions using only SQL and the power of Claude’s AI capabilities.

Translating museum art descriptions

Let’s explore another use case: translating the titles of Korean art pieces stored in a BigQuery table into English. Claude can efficiently handle this task for you.

code_block
<ListValue: [StructValue([('code', 'SELECTrn object_id,rn title,rn ml_generate_text_result.content[0].text AS translationsrn FROMrn ML.GENERATE_TEXT(rn MODEL `PROJECT_ID.DATASET_ID.MODEL_NAME`,rn (rn SELECTrn object_id,rnttitle,rn concat("translate this into English and only return the translated result:" ,title) AS promptrn from `PROJECT_ID.DATASET_ID.sample_Museum_objects_table`rn )rn )'), ('language', ''), ('caption', <wagtail.rich_text.RichText object at 0x3e9970fe8f40>)])]>

Figure X2-translation

Translating Korean museum art descriptions into English

Get started with Claude in BigQuery 

To get started with Claude in BigQuery, you can follow our documentation or import our sample notebook directly into BigQuery Studio for a hands-on walkthrough.

For users seeking more advanced Python support and configuration flexibility, we also offer two additional integration methods:

  1. Python with BigQuery Studio (generally available): Data scientists and Python developers can utilize notebooks in the BigQuery UI to directly connect BigQuery data to the Claude models using Python. For a quick start guide and example code, refer to our sample notebook that uses BigQuery DataFrames.

  2. BigQuery remote functions (generally available): This method is ideal for code-heavy users, offering high flexibility and access to all Claude models. Get started by exploring our sample GitHub repository. You can also use this sample notebook, which leverages BigQuery DataFrames to automatically create remote functions and perform inference with Claude.

Anthropic’s Claude integration with BigQuery marks a significant step forward in democratizing gen AI and enabling businesses of all sizes to harness the full potential of their data. We encourage you to explore this integration and discover how it can transform your data analytics workflows.