Data Analytics

Frontier and Center: Who evaluates the evaluations?

Fri, 10 Jul 2026 16:00:00 +0000

Editor’s note: Some of the most interesting questions in AI are being asked by information theoreticians, around how to provide context to an emerging class of AI agents. A few weeks ago, we waded into those waters with a blog about the Open Knowledge Format, a specification that formalizes the LLM-wiki pattern into a portable, interoperable format to represent the metadata, context, and curated knowledge that modern AI systems need to operate. That blog generated a ton of interest, so we’ve decided to bring you more of the same, as part of our new “Frontier and Center” series. Today, we hear from two members of Google Data Cloud’s frontier AI team on the recurring challenge of how to systematically evaluate whether or not an agent is able to answer questions effectively based on its context. Read on for more, and watch this space for more blogs from this team.

A passing grade is the least interesting thing an exam can tell you. It says the student cleared the bar, leaving you entirely in the dark about how narrow their failures were, how effortless their passes were, or what to teach next. Yet this is exactly how we evaluate AI agents. We run a fixed benchmark, calculate a score, and declare progress. In doing so, we are handing our agents a pass/fail exam when what we actually need is a map of the agent’s capabilities: a picture of the terrain that shows exactly where capability falls off, and by how much.

For data agents, this map matters a lot for data discovery in search and retrieval — the unglamorous first step where an agent, handed a vague human question and a warehouse or data lake of thousands of tables and files, has to find the right datasets before it can reason over anything. Discovery is a "needle in a haystack" problem. Real users phrase their questions imperfectly, and inferring what datasets to retrieve presents a real challenge to agents. So the interesting question in evaluations is never "can the agent pass?" It is "how vague can the question get before the agent breaks?" An exam cannot easily answer that, but a map can.

Today, we share an approach rooted in information theory that we’ve been leveraging to add detail and nuance, i.e., fidelity, to benchmarks, so we can better understand agents’ performance as a part of their evaluations. Along the way, the added fidelity exposed some deeper issues with the quality of emergent evaluation cases themselves.

Difficulty, measured

When it comes to retrieval, evaluation cases are often stratified into tiers of difficulty. This can happen organically, e.g., pervasive and enduring failure scenarios are deemed difficult. Or it can be from labels applied by humans or machines categorizing some questions as "easy" or "hard" for an agent to answer correctly, e.g., based on the context provided in the query. While this kind of sentiment-based labeling is not the only way to label test cases, it’s frequently used despite its imperfections, such as being challenging to reproduce.

Despite being an industry staple, the approach of assessing every evaluation case by hand is unrealistic at scale. What we need is a rigorous approach that can modulate the difficulty of evaluation cases. We’re iterating on a meta-benchmark we call Discovery Bench: a framework that modulates an evaluation case by generating “easy” and “hard” variations of every case. This allows us to audit how close or how far an agent is from succeeding in those cases.

The lever for modulating the difficulty of an input query comes via a tried-and-trusted concept that’s present across information theory and machine learning: surprisal, a measure of how unlikely an outcome is given a set of inputs (formally, the negative logarithm of its probability). In our case, a query’s surprisal represents the uncertainty that remains about the correct dataset given the query.

The thinking behind our approach is simple: A term or a phrase in an evaluation query has high informative power when it sharply distinguishes the target from everything else in the corpus. Therefore, we can adjust the difficulty of evaluation cases by adding or removing terms with varying levels of informative power.

Let’s work through a real example from KramaBench, a publicly available benchmark. One of KramaBench’s datasets has information about orbiting satellites, and the example query from the suite includes the following text: "…the total count of satellite major altitude changes for satellite 48445 during 2024 using TLE history."

The token "TLE" is sharply distinguishing; it points almost uniquely at the TLE_____48445 table from the dataset. Strip it, and the query degrades to "the count of satellite altitudes for satellite 48445," whose vague phrasing now matches density tables, precise-orbit files, and decay logs alike. Surprisal makes this quantitative: rare, pointed terms carry more bits than common ones.
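To make "rare, pointed terms carry more bits" concrete, here is a minimal sketch that scores terms by document frequency over a toy catalog. The table names and term sets are hypothetical, and the frequency-based estimate is an assumption chosen for illustration; it is not Discovery Bench's actual scoring.

```python
import math

# Toy catalog: table name -> descriptive terms (all names are hypothetical).
CORPUS = {
    "tle_history": {"satellite", "tle", "orbit", "altitude"},
    "density_grid": {"satellite", "density", "altitude"},
    "precise_orbit": {"satellite", "orbit", "altitude", "ephemeris"},
    "decay_log": {"satellite", "decay", "altitude"},
}


def term_surprisal_bits(term, corpus):
    """Bits carried by a term: log2(total tables / tables containing the term)."""
    matches = sum(1 for terms in corpus.values() if term in terms)
    if matches == 0:
        raise ValueError(f"term {term!r} does not occur in the corpus")
    return math.log2(len(corpus) / matches)


# A term found in one table out of four is worth 2 bits; a term found
# in every table is worth nothing for discovery.
bits = {t: term_surprisal_bits(t, CORPUS) for t in ("tle", "orbit", "satellite")}
```

In this toy catalog, "tle" appears in one table of four, "orbit" in two, and "satellite" in all four, so removing "tle" from a query discards the most information.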

The remaining surprisal of a query is how much uncertainty is left about its answer. As surprisal approaches zero, the query has become specific enough to pinpoint exactly one dataset.

The heart of Discovery Bench is a refinement loop that we call iterative surprisal-based query refinement, or iSQR. It generates cases with higher or lower informative power to test where an agent starts answering the query successfully:

Figure 2: The iSQR refinement loop.

The crux is being able to control the challenge embedded into the evaluation case by making adjustments: Instead of one fixed phrasing per question, we generate the same question at three levels of calibrated ambiguity [high, medium, low], with each grounded in bits (not subjective opinion). We can even justify, term by term, why a word was added or removed. Difficulty stops being a property that is attributed by sentiment or classification, and becomes one we engineer.
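One way to picture calibrated ambiguity is the sketch below, which ranks a query's terms by informative power and cuts the ranked list at three points. This is an illustration under stated assumptions, not the iSQR implementation: the catalog, the `make_variants` cut points, and the uniform-candidate estimate of remaining uncertainty are all hypothetical.

```python
import math

# Toy catalog: table name -> descriptive terms (all names are hypothetical).
CORPUS = {
    "tle_history": {"satellite", "tle", "orbit", "altitude"},
    "density_grid": {"satellite", "density", "altitude"},
    "precise_orbit": {"satellite", "orbit", "altitude", "ephemeris"},
    "decay_log": {"satellite", "decay", "altitude"},
}


def informativeness(term, corpus):
    """Bits carried by a term, estimated from how many tables contain it."""
    matches = sum(term in terms for terms in corpus.values())
    return math.log2(len(corpus) / matches) if matches else 0.0


def remaining_bits(query_terms, corpus):
    """Uncertainty left about the target, assuming matching tables are equally likely."""
    n = sum(1 for terms in corpus.values() if query_terms <= terms)
    return math.log2(n) if n else float("inf")


def make_variants(query_terms, corpus):
    """Rank terms from least to most informative, then cut at three points."""
    ranked = sorted(query_terms, key=lambda t: (informativeness(t, corpus), t))
    n = len(ranked)
    return {
        "high": set(ranked[: max(1, n // 3)]),        # vaguest: weakest terms only
        "medium": set(ranked[: max(1, 2 * n // 3)]),
        "low": set(ranked),                           # most specific: every term
    }


variants = make_variants({"satellite", "orbit", "tle"}, CORPUS)
report = {level: remaining_bits(terms, CORPUS) for level, terms in variants.items()}
```

Here the high-ambiguity variant keeps only "satellite" and leaves 2 bits of uncertainty (four candidate tables), the medium variant leaves 1 bit, and the low-ambiguity variant pins down a single table.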

The cliff you couldn't see

Here is what Discovery Bench’s difficulty dial reveals — and what a single-phrasing benchmark structurally cannot.

We have an agent that's built for recall (on Gemini 3.1 Pro), and we score it with F1. Running it against KramaBench across the full sweep of ambiguity levels traces a curve: 0.34 at high ambiguity, 0.76 at neutral, 0.81 at medium, 0.78 at low.

Figure 3: F1 swept across ambiguity — the dot versus the curve.

Two findings fall out immediately (and neither was visible to a conventional eval).

First, the cliffs. One query scores a perfect F1 = 1.00 at neutral phrasing and 0.00 at high ambiguity. It is the satellite-48445 case from above: drop the distinguishing token "TLE" and the agent loses the table entirely. Same query, same agent, same ground truth; one notch vaguer and it falls off a cliff. A static benchmark tests the neutral phrasing, stamps "solved," and reports flat ground where there is a precipice. Pass/fail was particularly misleading: it did not just miss the cliff, it told us the terrain was level.
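A cliff of this kind can be found mechanically once each query has a score at every ambiguity level. The sketch below computes set-based F1 for a retrieval result and flags any query whose score falls sharply between adjacent levels. The level ordering, the `drop` threshold, and the "steady-query" scores are assumptions for illustration; only the satellite-48445 scores come from the text above.

```python
def f1(retrieved, relevant):
    """Set-based F1 between retrieved tables and ground-truth tables."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    if hits == 0:
        return 0.0
    precision = hits / len(retrieved)
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)


# Assumed ordering, from least to most ambiguous.
LEVELS = ["low", "medium", "neutral", "high"]


def find_cliffs(scores_by_query, drop=0.5):
    """List (query, from_level, to_level, loss) where F1 falls by at least `drop`."""
    cliffs = []
    for query, scores in scores_by_query.items():
        for a, b in zip(LEVELS, LEVELS[1:]):
            if a in scores and b in scores and scores[a] - scores[b] >= drop:
                cliffs.append((query, a, b, scores[a] - scores[b]))
    return cliffs


scores = {
    "satellite-48445": {"neutral": 1.0, "high": 0.0},  # scores quoted in the text
    "steady-query": {"low": 0.8, "medium": 0.8, "neutral": 0.7, "high": 0.6},  # made up
}
cliffs = find_cliffs(scores)
```

A single-phrasing benchmark would only ever see the `neutral` entry for each query, which is why the drop stays invisible to it.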

Second, the sweet spot. For Discovery Agent, medium ambiguity beat neutral, and low ambiguity sometimes underperformed it. More specificity is not monotonically better for the system being evaluated; there is an optimal amount of steering. That is a graded, actionable signal. This is the "how close, how hard" texture we were missing from a scalar. It tells you where to hill-climb, or improve, the agent: in our case, straight at concrete failure modes like time-sharded tables (precision collapsing to ~8% as the agent over-retrieves 21 near-identical shards for a two-table answer) and context blow-up (F1 dropping from 0.75 to 0.32 once a query triggers long search chains). The map did not just say that the agent failed, but it said where, and why. Note that our hypothesis that less ambiguity and more context (via steering terms) should improve retrieval generally holds true, but for the specific Discovery Agent being exercised, the idiosyncratic “sweet spot” meaningfully highlighted trade-offs in its implementation.

We're not alone

The field is converging on meta-benchmarking and exerting greater control of how we challenge and evaluate our agents. A growing body of work uses item response theory, the latent-ability model behind standardized testing, to treat difficulty as a measured quantity rather than a label: tinyBenchmarks and metabench show that a handful of informative items reproduce a model's full score, and PSN-IRT turns the same lens on benchmark quality itself. Others audit the ground truth directly: MMLU-Redux found that 6.49% of Massive Multitask Language Understanding (MMLU) questions are mislabeled, and Platinum Benchmarks re-cleaned ten datasets to minimize both label errors and ambiguity — the same two axes we sweep for. And ambiguity is increasingly treated as intrinsic rather than noise: AmbigQA showed that a large fraction of real questions admit multiple readings, and later work finds that apparent hallucinations often stem from query ambiguity rather than model failure. What we have not seen elsewhere is the combination: information-theoretic ambiguity sweeping applied as a meta-benchmark over live enterprise data.

A benchmark we trusted turned out to be broken

We built our first evaluation on kramabench-astronomy, a benchmark established in the field, and one which other teams had already leaned on for their own evals. Teams derived benchmarks from this dataset, and we hypothesized that subtle issues may have been introduced over time. When we actually read the benchmarks used by teams, with Gemini's help, we found they were wrong in meaningful ways: ground-truth tables that did not answer their query, a question whose 124 sharded tables exceeded what some teams’ retrieval APIs could even return, and months specified where exact dates were required. Quietly broken ground truth means quietly wrong conclusions, not just for us but for every prior analysis built on it.

This is the generalized crux of the matter: an evaluation is itself an artifact that can be defective, and almost nobody evaluates it. We instrument the agent and trust the ruler, but where do we validate that the measuring stick makes sense?

When two maps disagree

Now the recursive turn: If difficulty is something we generate, then we need to evaluate the generator itself; we should not trust it blindly either.

So we built the same ambiguity sweep two ways: steering terms from a pure-LLM guess, versus terms grounded in TF-IDF surprisal. The two disagreed violently. At high ambiguity, the LLM-built sweep scored the agent at F1 ≈ 0.34; the grounded sweep, ≈ 0.85. One of these maps is badly distorted. The grounded one, predictably, is the more robust: surprisal gives it a footing the free-running LLM lacks.

This is "evaluate your evals," made concrete. The information-theoretic lens does not only grade the agent along a continuous axis; it grades the benchmark's own construction, and adjudicates between the two.

Evaluate your evals

We have spent years optimizing agents against rulers we never measured. The bitter irony is that better models make this worse: as agents clear coarse benchmarks, the score saturates near the top and the exam loses its ability to highlight where the agent can be improved.

So the call to action is uncomfortable and overdue: evaluate your evals. Read your ground truth. Treat difficulty as a measured quantity, not a label: sweep it, plot it, find the bit-width where your system breaks. Ask not just "did it pass?" but "how close was the miss, how hard was the pass, and would a slightly vaguer question have sent it off a cliff?" Build evaluations that produce signals, not just verdicts.

There is a genuine tension to sit with here. Difficulty-as-entropy is only as reliable as the model that estimates the entropy. There's a risk that if we push too hard on a measurable proxy, we optimize the ruler instead of the agent. That is not a reason to retreat to pass/fail; it is a reason to keep the evaluator under the same scrutiny as what it is evaluating. The moment we stop asking who evaluates the evaluators is the moment our maps stop being useful again.

1. Maia Polo, F. et al. tinyBenchmarks: Evaluating LLMs with Fewer Examples. ICML 2024. arxiv.org/abs/2402.14992
2. Kipnis, A. et al. metabench: A Sparse Benchmark of Reasoning and Knowledge in Large Language Models. ICLR 2025. arxiv.org/abs/2407.12844
3. Lost in Benchmarks? Rethinking Large Language Model Benchmarking with Item Response Theory (PSN-IRT). AAAI 2026. arxiv.org/abs/2505.15055
4. Gema, A. P. et al. Are We Done with MMLU? (MMLU-Redux). 2024. arxiv.org/abs/2406.04127
5. Vendrow, J. et al. Do Large Language Model Benchmarks Test Reliability? (Platinum Benchmarks). 2025. arxiv.org/abs/2502.03461
6. White, C., Dooley, S. et al. LiveBench: A Challenging, Contamination-Limited LLM Benchmark. 2024. arxiv.org/abs/2406.19314
7. Min, S. et al. AmbigQA: Answering Ambiguous Open-domain Questions. EMNLP 2020. aclanthology.org/2020.emnlp-main.466
8. Lai, E., Vitagliano, G. et al. KramaBench: A Benchmark for AI Systems on Data-to-Insight Pipelines over Data Lakes. 2025. arxiv.org/abs/2506.06541

Shift into high gear with agents: Securing the software-defined vehicle

Mon, 06 Jul 2026 16:00:00 +0000

The automotive industry is at a pivotal crossroads as it hits the gas on adopting new technology. The era of the traditional connected vehicle has shifted into the age of the software-defined vehicle (SDV), notable for rapid innovation with many new capabilities delivered over the air.

By integrating AI and agents, the next generation of SDVs will be capable of turning raw telemetry into actionable insights in real time, allowing for a fundamental rethink of how vehicles interact with their environment and their users. To better support and secure SDVs, Google Cloud and Valtech have partnered to develop Nexus SDV, a highly scalable, AI-enabled connected vehicle platform built on Google Cloud. This modular, developer-friendly and open-source solution is designed to manage up to 100 million devices, and features deep integration with Android Automotive OS (AAOS) to streamline data flows and in-vehicle experiences.

We are proud to announce the first release of the Nexus SDV open-source core that showcases how it can reduce total cost of ownership through Arm-based compute and Bigtable, while providing an AI-native environment for building the next era of automotive intelligence.

AI-driven experiences with Nexus SDV

Nexus AI serves as the platform’s intelligent engine, transforming the vehicle from a passive data source into a proactive, agentic partner. Using Gemini models and Gemini Enterprise Agent Platform, Nexus AI can analyze complex telemetry in real-time to provide information for autonomous decision-making and hyper-personalized driver assistance, effectively acting as an intelligent agent that anticipates user needs.

Crucially, this advanced intelligence is paired with a focus on significant total cost of ownership (TCO) reduction. By using high-efficiency Arm-based compute and Bigtable-optimized data storage, the platform lowers the operational costs associated with processing massive data volumes. This modular, AI-native architecture ensures that manufacturers can scale their fleet intelligence rapidly without the prohibitive cloud and development expenses traditionally associated with next-generation vehicle software.

Cloud-native under the hood

The architecture of Nexus SDV is built on a modular, cloud-native foundation designed to bridge the gap between the vehicle edge and the data center. Deep compatibility with AAOS is the keystone of the close integration between the cloud and the vehicle, and will help ensure that high-fidelity telemetry is ingested and synchronized in real-time. This robust data loop allows Nexus AI to quickly push intelligent updates and services back to the vehicle.

By providing this developer-friendly, open framework, Nexus SDV enables manufacturers to manage the entire lifecycle of an SDV with the scalability and reliability of the Google Cloud ecosystem.

Architecture for Nexus SDV.

Defense in depth with Google Cloud Security controls

By building on Google's secure foundations, including secure-by-design and Zero Trust architecture, Nexus SDV supports the heavy lifting of compliance and threat protection. To achieve this, the Nexus SDV architecture implements a comprehensive, defense-in-depth security model across six key elements:

Mutual TLS (mTLS) and public key infrastructure (PKI)
Nexus SDV relies on cryptographic trust chains to authenticate vehicles before any data exchange can occur. The infrastructure uses Google Cloud Certificate Authority Service (CAS) to manage distinct CA pools (server, factory, and registration CAs), ensuring a highly available and secure root of trust.

Specifically, the registration server requires clients to present a valid "factory-issued" certificate during the initial TLS handshake, then extracts and parses the certificate directly from the connection stream to prove the vehicle's identity. During registration, the server validates the Certificate Signing Request (CSR) sent by the vehicle before issuing a new operational certificate.
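The requirement that a client present a certificate during the handshake can be sketched with Python's standard `ssl` module. This is a generic illustration of mutual TLS, not Nexus SDV's registration server; the function name and the file-path parameters are hypothetical.

```python
import ssl


def build_registration_tls_context(server_cert=None, server_key=None, factory_ca=None):
    """Server-side TLS context that rejects clients without a trusted certificate.

    All three arguments are hypothetical file paths; they are optional here
    only so the sketch can be exercised without real key material.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED makes the handshake fail unless the client presents a
    # certificate that chains to a CA loaded below.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if factory_ca:
        ctx.load_verify_locations(cafile=factory_ca)
    if server_cert:
        ctx.load_cert_chain(certfile=server_cert, keyfile=server_key)
    return ctx


ctx = build_registration_tls_context()
# After a successful handshake on a socket wrapped with this context,
# SSLSocket.getpeercert() returns the client's certificate fields.
```

CSR validation and certificate issuance are separate steps that this sketch does not cover.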

Identity and access management
The system uses identity brokering where Keycloak is deployed as the central OpenID Connect (OIDC) identity provider. Vehicles authenticate against Keycloak using their operational certificate via mTLS to receive a short-lived JSON Web Token (JWT).

For fine-grained access control, a custom NATS Auth Callout service provides dynamic subject permissions: It intercepts all messaging broker connection attempts, validates the Keycloak JWT using public JWK keys, and programmatically maps the vehicle's roles to specific NATS subjects.
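The role-to-subject mapping could look like the sketch below. It assumes the JWT has already been verified and decoded into a claims dictionary. The role names, the subject layout, and the use of the `sub` claim as the vehicle ID are all hypothetical; only the NATS wildcard `>` (match all remaining tokens) is standard syntax.

```python
# Hypothetical role catalog: role -> subject templates per action.
ROLE_SUBJECTS = {
    "telemetry-publisher": {
        "publish": ["vehicles.{vehicle_id}.telemetry.>"],
        "subscribe": [],
    },
    "command-receiver": {
        "publish": [],
        "subscribe": ["vehicles.{vehicle_id}.commands.>"],
    },
}


def permissions_for(claims):
    """Map verified JWT claims to per-vehicle NATS publish/subscribe subjects."""
    vehicle_id = claims["sub"]
    # Reject IDs that could widen the subject: dots split tokens, * and > are wildcards.
    if not vehicle_id or any(ch in vehicle_id for ch in ".*> "):
        raise ValueError("vehicle id must be a single NATS subject token")
    perms = {"publish": [], "subscribe": []}
    for role in claims.get("roles", []):
        template = ROLE_SUBJECTS.get(role)
        if template is None:
            continue  # unknown roles grant nothing
        for action in perms:
            perms[action].extend(s.format(vehicle_id=vehicle_id) for s in template[action])
    return perms


perms = permissions_for({"sub": "vin123", "roles": ["telemetry-publisher", "unknown"]})
```

Scoping every subject to the vehicle's own ID means a token issued to one vehicle cannot be used to publish or subscribe on another vehicle's subjects.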

For secure service-to-service communication, it uses Workload Identity Federation so pipelines exchange GitHub OIDC tokens for temporary Google Cloud access, removing static credentials, while GKE Workload Identity allows Kubernetes Pods to access backend services like Bigtable by binding Kubernetes service accounts to Google service accounts.

Security is reinforced through restricted IAM scopes, ensuring dedicated service accounts are provisioned with minimal permissions, such as the data API being restricted only to reading from Bigtable. Using VPC-SC, Organization policy constraints and Private Service Connect (PSC) in your deployment context also helps you achieve secure foundations.

Secret management
Nexus SDV relies on centralized secret management to protect sensitive information. All sensitive configurations, database passwords, and cryptographic signing keys are generated dynamically during Terraform infrastructure provisioning and locked inside Google Cloud Secret Manager.

Secret fetching during deployment avoids baking secrets into application code and container images. Instead, services pull signing keys and credentials directly into memory only at runtime, minimizing exposure both at rest and in transit.

Network isolation
To enforce network isolation, the underlying compute infrastructure is heavily shielded. Nexus SDV runs on private GKE clusters where worker nodes have no public IP addresses, preventing direct internet exposure. Additionally, the Keycloak PostgreSQL database uses Cloud SQL IAM Authentication, which allows the Cloud SQL Proxy to connect securely using IAM roles rather than relying on static database passwords or managing IP allowlists.

Secure AI Framework
Google Cloud secures these advanced AI capabilities through a comprehensive, enterprise-grade framework that prioritizes data privacy, model governance, and safe execution, based on guidance from the Secure AI Framework (SAIF). With Gemini Enterprise Agent Platform, security and governance are natively embedded into the machine-learning lifecycle through capabilities such as dedicated Explainability and Safety controls, continuous Evaluation and Monitoring, and secure model registries.

You can learn more about how we secure AI here.

Data API
Instead of allowing downstream applications and external clients direct access to data stores like Bigtable, Nexus SDV routes data retrieval through a custom Data API. This microservice acts as a secure abstraction layer that translates strictly defined requests, such as queries for specific vehicle IDs, sensor data types, and predefined time windows, into heavily constrained Bigtable row-range scans and column filters.

By doing so, it serves as a secure gateway that enforces structured data access patterns.
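The translation step can be illustrated with a small validator that turns a request into a bounded row-key range. The row-key layout (`vehicle#sensor#timestamp`), the sensor allowlist, and the 24-hour cap are assumptions made for this sketch, not the actual Nexus SDV schema.

```python
from datetime import datetime, timezone

ALLOWED_SENSORS = {"speed", "battery", "gps"}  # hypothetical allowlist
MAX_WINDOW_SECONDS = 24 * 3600                 # hypothetical cap on scan width


def row_range(vehicle_id, sensor, start, end):
    """Validate a request and return (start_key, end_key) for a bounded scan."""
    if not vehicle_id.isalnum():
        raise ValueError("vehicle id must be alphanumeric")
    if sensor not in ALLOWED_SENSORS:
        raise ValueError(f"unsupported sensor type: {sensor!r}")
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    span = (end - start).total_seconds()
    if not 0 < span <= MAX_WINDOW_SECONDS:
        raise ValueError("time window must be positive and at most 24 hours")
    prefix = f"{vehicle_id}#{sensor}#"
    # Zero-padded epoch seconds keep lexicographic key order chronological.
    return (
        f"{prefix}{int(start.timestamp()):010d}",
        f"{prefix}{int(end.timestamp()):010d}",
    )


start = datetime.fromtimestamp(0, tz=timezone.utc)
end = datetime.fromtimestamp(3600, tz=timezone.utc)
keys = row_range("VIN123", "speed", start, end)
```

Because callers never supply a raw key or filter, every scan stays confined to one vehicle, one sensor type, and a capped time window.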

Start your journey with Nexus SDV

Nexus SDV represents a new era of automotive intelligence, delivering an agentic, secure, and cost-efficient platform that empowers manufacturers to harness the full power of AI in an open-source framework. You can learn more about how we are redefining the software-defined vehicle here.

Conversational analytics in BigQuery brings trusted agentic reasoning to everyone

Tue, 30 Jun 2026 18:00:00 +0000

Businesses run on fast decisions, but the teams who hold the answers are often buried under a backlog of routine requests, leaving users waiting in line for insights they need now. Today, we are bringing Conversational Analytics in BigQuery to general availability, so both business and technical teams can query data, run multi-step analyses, and generate visual reports using natural language, right where the data lives. With this release, Conversational Analytics in BigQuery now delivers an agent that behaves like an analyst who knows your business, thinks before it answers, and stands behind its work. Built on Google’s latest Gemini models and BigQuery’s secure, governed foundation, it brings that trusted analyst to everyone in your organization.

Fig 1. Conversational Analytics in BigQuery

Conversational analytics for enterprise data

BigQuery’s conversational capabilities are built-in and available for use instantly, with no setup required.

For deeper, more consistent insights, data professionals can author specialized agents grounded in the exact sources that matter, from projects, datasets, and tables to views, graphs, and user-defined functions. And because your data rarely lives in one place, Conversational Analytics reaches beyond native BigQuery tables to Lakehouse-managed Apache Iceberg tables and cross-cloud Lakehouse sources like Databricks Unity, AWS Glue, SAP and Salesforce, so you can break down data silos and analyze data across clouds from a single conversation.

As a data practitioner, you work with Conversational Analytics right inside BigQuery Studio and Data Canvas, and publish the agents you build to Gemini Enterprise, Data Studio, or your own application through the Conversational Analytics API, putting them in the hands of business users wherever they work.

“At MoneySuperMarket, BigQuery Conversational Analytics has changed how our teams get to insight. Analysis that used to take weeks can now be done in minutes, saving our financial analysts around half a day each week. By making analysis more self-serve, we’re helping teams create faster insight to support better product and commercial decision-making.” - Suzie Millar, Head of Data, Mony Group

Engineered trust and explainability

Accuracy in Conversational Analytics is by design, not aspirational: every agent is grounded in your business context, not a model's assumptions. That context comes from the Knowledge Catalog (glossaries, profile scans, and context bundles), BigQuery Graph for multi-hop queries, and your own verified queries and custom agent instructions. With the new Open Knowledge Format, the wiki your team already maintains can feed straight into Knowledge Catalog. At query time, Conversational Analytics leverages existing embeddings of your column values, generated by AI.GENERATE_EMBEDDINGS, to match your question to the right data, so asking about "Texas" finds rows stored as "TX."

Grounding only earns trust if the user can see it. So every answer is inspectable, providing:

Visible thinking steps: Review the agent's step-by-step reasoning and the exact SQL it generates before it returns an answer.
Context citations: See the precise sources behind every response, including tables, schema definitions, verified queries, and glossary terms used to calculate it.
Proactive disambiguation: When a prompt is vague, the agent asks targeted clarifying questions instead of guessing.
Long-term memory: The agent remembers what your terms and questions mean, so you don't have to disambiguate the same thing twice.

Fig 2. Generating answers that you can trust

Security and governance by design

One common barrier to scaling AI is governance. Reaching tens of thousands of users requires rigorous security, governance, and transparent cost controls. Conversational Analytics inherits BigQuery's governance model, so users only query data they are authorized to see and every query is logged for auditing within the BigQuery compliance framework. On top of that baseline, it supports Access Transparency (AxT), Customer-Managed Encryption Keys (CMEK), Private IP, and VPC Service Controls, and now guarantees data residency for data at rest and for ML processing within EU and US multi-region endpoints.

For your most engaged users, we also deliver the operational controls that scale demands: Configure Google Cloud-native cost controls so no user or project exceeds its allotment, cap an agent's maximum query size in bytes, and track usage through BigQuery labels on jobs.

Fig 3. Agent Observability and Monitoring

The power of BigQuery AI, in plain language

The agent doesn't just retrieve rows; it calls BigQuery's AI functions for you, turning advanced analysis into a question you can ask in plain language.

Find the "why," not just the "what": Ask what drove a change and the agent runs root-cause analysis with AI.KEY_DRIVERS, surfacing the exact segments behind the move.
See what's coming: Move past historical reporting by triggering AI.FORECAST and AI.DETECT_ANOMALIES right in the chat to project trends and flag outliers, with no model to build or manage.
Query your entire data estate: With object tables, the agent reasons over relational data and unstructured files (PDFs, images, logs, and video) together, so a single conversation spans your whole estate.

Fig 4. Conversational Analytics leverages BigQuery AI functions

From answering questions to running the investigation

Conversational Analytics agents are moving from human-scale reactive analysis to agent-scale proactive action. You're no longer limited to asking a question and waiting for the answer.

Deep-dive mode: If you ask why a metric moved, the agent builds its own analytical plan, mapping the critical questions, working through a full multi-step investigation with no manual SQL, and minimizing analytical blind spots. The result is a comprehensive report you can download and share.

Fig 5. Deep Dive mode in Conversational Analytics

Agentic workflows: Deploy autonomous agents that monitor your data, reason over events, run multi-step workflows on a schedule, and deliver insights straight to your chat. You can set up a Monday-morning business report or daily anomaly detection across key metrics, each with a custom directive so they investigate only what you care about.

Fig 6. Scheduling Conversational Analytics agent workflows

Start talking to your data today

General availability of Conversational Analytics in BigQuery marks an official exit from the static dashboard era. By embedding Gemini’s deep cognitive reasoning directly into the data warehouse, we are enabling a self-managing environment that transforms raw data into active, corporate knowledge. This delivery is a key component of the Agentic Data Cloud, providing a true system of action that moves past retrospective reporting, incorporates security and governance by design and is engineered for enterprise trust.

If you are ready to get started, learn more from our documentation, reach out to your Google Cloud account representative, or get started in BigQuery Studio today to build and deploy your first agent.

Scaling Network Analysis for Fraud Prevention with BigQuery Graph

Mon, 29 Jun 2026 16:00:00 +0000

Based in the UK, Curve are building a financial super-app, a smart wallet that consolidates all your debit and credit cards into a single app and card, simplifying how millions of users spend, send and save money.

However, operating at this scale means confronting a high-volume, ever-evolving landscape of financial crime. While traditional fraud detection models are excellent at flagging suspicious individual transactions, they often miss the "bigger picture"—the complex networks and hidden relationships that characterize organized fraud rings.

To uncover these connections, we realized we needed to move beyond traditional relational data modeling. By partnering with Google Cloud to implement BigQuery Graph, we’ve been able to conduct deep network analysis at scale, helping us identify hidden fraud networks and achieve significant transaction savings.

The Challenge: The Multi-Hop Problem

Fraudsters rarely operate in isolation. They often share a subset of attributes across multiple accounts—such as a common device, a specific funding card, or shared contact information. In a standard relational database, identifying these links requires complex "multi-hop" analysis.

Attempting to scale this using standard SQL presented two significant hurdles:

Computational complexity: Uncovering a chain of connections (e.g., User A connects to User B, who connects to User C) requires multiple, massive self-joins. At our volume of millions of users and tens of millions of connections, these queries quickly became computationally expensive and difficult to maintain.
Data scale: Our most granular signals involve billions of potential connections. Standard relational approaches struggle to process these relationships without hitting performance bottlenecks or exhausting system resources.

The Solution: Native Graph Analytics in the data platform

We transitioned our network analysis to BigQuery Graph to take advantage of its native Graph Query Language (GQL) support. The primary advantage was the ability to stop moving our data and start connecting it directly within our existing environment. We had previously explored other popular graph databases. However, being able to keep our data within our existing BigQuery data warehouse gave us significant time and cost savings compared to having to migrate to a new graph database.

By modeling our payment ecosystem as a property graph—where users are nodes and their shared identifiers are edges—we simplified our architecture significantly. Instead of writing dozens of lines of complex JOIN logic, we can now use intuitive GQL syntax to "match" patterns of suspicious behavior across our entire dataset. This approach allows us to:

Traverse billions of connections: We can now analyze massive datasets, including user-level, device-level, and card-level connections, with high performance.
Unify our data experience: Because BigQuery Graph is built into the data platform, we can combine graph traversals with standard SQL analysis, search, and machine learning workflows in a single query. We could therefore leverage our existing SQL pipelines to build the nodes and edges tables, switch to GQL for traversing the graph, and then perform final aggregations with standard SQL. This flexibility makes it accessible to more analysts, without having to upskill in a new language.
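To show what a multi-hop traversal over this kind of property graph computes, here is a small in-memory sketch in Python. It is not GQL and not how BigQuery Graph executes a query; the user and identifier values are made up, and a breadth-first search stands in for the pattern match.

```python
from collections import defaultdict, deque


def build_graph(links):
    """links: iterable of (user, identifier). Users sharing an identifier become neighbors."""
    by_identifier = defaultdict(set)
    for user, identifier in links:
        by_identifier[identifier].add(user)
    graph = defaultdict(set)
    for users in by_identifier.values():
        for user in users:
            graph[user] |= users - {user}
    return graph


def within_hops(graph, start, max_hops):
    """Breadth-first search: every user reachable from `start`, with hop distance."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        user = queue.popleft()
        if seen[user] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(user, ()):
            if neighbor not in seen:
                seen[neighbor] = seen[user] + 1
                queue.append(neighbor)
    seen.pop(start)
    return seen


# Made-up links: A and B share a device, B and C share a funding card.
links = [("A", "device1"), ("B", "device1"), ("B", "card9"), ("C", "card9"), ("D", "phone7")]
graph = build_graph(links)
ring = within_hops(graph, "A", max_hops=2)
```

User C never shares an identifier with A directly, so a one-hop check misses the link; it only appears at two hops, which in relational SQL would need an additional self-join per hop.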

Impact and Results

Since integrating BigQuery Graph into our fraud mitigation strategy, the impact on our operational efficiency and bottom line has been profound.

Financial impact: We estimate that the automated blocks triggered by these graph-based insights have saved Curve ~$12M in transaction losses in 2025 alone.
Precision and accuracy: Our graph-powered queries have achieved an accuracy of approximately 72% in identifying fraudulent users. This high precision allows our fraud mitigation agents to focus their manual reviews on high-certainty cases rather than chasing false positives.
Operational speed: Moving to GQL allowed us to streamline our graph queries and refresh our fraud rules more frequently. Previously we were limited to one-hop queries in our hourly rules, but GQL allowed us to optimize these slow-running scripts to stay one step ahead of organized crime.
From rules to ML: The faster we can traverse the network, the faster we can serve graph-based features to our machine learning models. While rebuilding and traversing the graph on a daily basis is sufficient for training models, it is simply too slow at inference-time when transactions can be authorised in less than a second. GQL is allowing us to move towards micro-batch or streaming traversals to serve fresh data to our fraud monitoring models.

Looking Ahead

Our success with BigQuery Graph has opened new doors for our data science and security teams. We are currently working on fully incorporating our highest-volume signals—including billions of IP address connections—into our real-time detection loops. We are also exploring native graph visualization to give our analysts a more intuitive way to explore and "see" fraud webs as they form.

By treating our data as a living network of relationships rather than just rows in a table, Curve is ensuring that our security remains as efficient and robust as our customer experience.

Synthesize the big picture and analyze trends with BigQuery's AI.AGG function

Mon, 29 Jun 2026 16:00:00 +0000

We recently announced the preview of the BigQuery AI.AGG() function. With AI.AGG(), you can use natural-language instructions within a single line of SQL to summarize or synthesize information over millions of rows of unstructured or even multimodal data.

While BigQuery already offers powerful AI functions that help you analyze individual rows of data, analyzing unstructured data at scale requires a different approach. AI.AGG() lets you ask questions of unstructured data such as logs and documents, for example:

What are the top three feature requests among the negative product reviews?
What kind of errors are users seeing most frequently, and how should I start investigating them?
In which specific scenarios is our automated agent consistently failing to resolve customer issues?

In this post, we'll dive deeper into the AI.AGG() function and look at a few of the use cases that it unlocks, including how it can be used in combination with BigQuery’s other managed AI functions for complex, intelligent data analysis.

Analyzing system logs with AI.AGG()

A great example of the power of AI.AGG() is analyzing system logs. Log messages, warnings, errors, and stack traces can contain extremely useful information for improving your service, but it can be time- and labor-intensive to investigate them manually, especially if you operate at scale and have thousands of them to review.

With AI.AGG(), you can easily analyze many logs at once, grouping and prioritizing them to decide which ones to dig deeper into first. In fact, our BigQuery engineering team used this exact approach while developing AI.AGG() — using the function to help identify edge cases related to input handling for the feature itself!

To demonstrate this, let’s analyze a public dataset of Apache Spark standard INFO logs available from Loghub. Often, clusters can run into issues like memory thrashing, clock drift, or broadcast bottlenecks without ever throwing a FATAL error. You can use AI.AGG() to analyze these seemingly normal logs for hidden inefficiencies. You can load the sample data file into BigQuery using any of the supported methods, such as the UI, CLI, or client libraries. The following example assumes you’ve loaded the log file into a dataset called bq_logs_demo and a table named spark_logs_structured.

Notice how we construct the prompt here. We first ask the model to describe the component's normal operation, which gives it room to report that everything is fine and reduces the risk of hallucinated errors, and then instruct it to hunt for specific anomalies:

```sql
SELECT
  Component AS spark_component,
  COUNT(*) AS log_count,
  AI.AGG(
    Content,
    'Analyze these Spark system INFO logs. Provide a 2-sentence summary: First, describe the normal operation of this component. Second, explicitly identify any hidden inefficiencies, latency spikes, repeated retries, or unusual patterns.'
  ) AS performance_analysis
FROM
  `bq_logs_demo.spark_logs_structured`
GROUP BY
  Component
ORDER BY
  log_count DESC;
```

You can see in these results that AI.AGG() successfully acknowledges the "operating normally" messages while surfacing the critical diagnostic insights:

Figure: The query results pane showing the insights generated by AI.AGG() over the logs dataset.

Extracting categories from unstructured text and image data

Now, let’s look at some more use cases that demonstrate the flexibility of AI.AGG(), using one of BigQuery’s public datasets, cymbal_pets, a fictional pet supply shop. It includes a catalog of products carried by the store, with unstructured data like product names, descriptions, and images, making it a great example of the power of AI functions for handling unstructured data.

For example, let’s say you want to categorize the products in the dataset. The first hurdle in this case isn't applying labels to your products, but discovering what categories exist across the product catalog. With AI.AGG(), you can ask the model to analyze the raw product names and descriptions to identify the overarching categories for you.

```sql
-- Identify categories of products from product name and description
SELECT
  AI.AGG(
    ('Product: ', product_name, ' - Description: ', description),
    'What are the major categories of these products?'
  ) AS category_description
FROM
  `bigquery-public-data.cymbal_pets.products`;
```

This query returns a simple plaintext list of categories:

Figure: The plaintext result of categories determined by AI.AGG() over our products dataset.

This initial query is great for discovery, but a simple plaintext string isn't enough to build a reliable, automated data pipeline. To actually tag your data, you need to instruct AI.AGG() to return a structured format, like a JSON array. Then, you can use the structured categories as a parameter within another AI function, AI.CLASSIFY(), to actually label each product with its category.

The following SQL statement completes each of these steps in one script:

```sql
-- 1. Declare a variable to hold the array of categories
DECLARE generated_labels ARRAY<STRING>;

-- 2. Create a dataset to store the results
CREATE SCHEMA IF NOT EXISTS categorized_cymbal_pets;

-- 3. Generate the JSON string with AI.AGG and extract it into the variable
SET generated_labels = (
  SELECT
    JSON_VALUE_ARRAY(
      AI.AGG(
        ('Product: ', product_name, ' - Description: ', description),
        'Identify the major product categories. Return exactly one valid JSON array of strings. Do not include markdown code blocks, backticks, or conversational text.'
      )
    )
  FROM `bigquery-public-data.cymbal_pets.products`
);

-- 4. Feed the variable directly into AI.CLASSIFY
CREATE OR REPLACE TABLE `categorized_cymbal_pets.categorized_products` AS (
  SELECT
    product_name,
    description,
    AI.CLASSIFY(
      ('Product: ', product_name, ' - Description: ', description),
      generated_labels
    ) AS assigned_category
  FROM
    `bigquery-public-data.cymbal_pets.products`
);
```

You can now view the resulting table, which includes an assigned_category:

Figure: A preview of the categorized_products table, which includes the new assigned_category column created by AI.AGG() and AI.CLASSIFY().

If you look closely at the intermediate table, you'll notice the structured categories changed slightly from the initial plaintext results. This happens for two reasons: First, LLMs are nondeterministic, meaning that they don't always give the exact same response to the same prompt. Second, the prompt was adjusted to accommodate the new output structure.

Figure: The returned product categories, structured as JSON by AI.AGG() as requested in the prompt.

With the table now labeled by category, you can group by the categories to do traditional SQL aggregation, or use AI.AGG() to consider each category separately.

For example, the following query fetches traditional metrics (like row counts) right alongside a synthesized AI summary of what those specific grouped products have in common:

```sql
-- Synthesize insights grouped by our newly assigned categories
SELECT
  assigned_category,
  COUNT(*) AS item_count,
  AI.AGG(
    ('Product: ', product_name, ' - Description: ', description),
    'Write a concise, one-sentence summary describing the common characteristics or purpose of the products in this category.'
  ) AS category_summary
FROM
  `categorized_cymbal_pets.categorized_products`
GROUP BY
  assigned_category
ORDER BY
  item_count DESC;
```

Figure: Query results showing AI.AGG() analysis alongside more traditional SQL methods.

Unstructured data isn't limited to text. Because AI.AGG() natively supports multimodal inputs, you can return aggregated insights directly from image files.

The cymbal_pets Google Cloud project also contains a Cloud Storage bucket full of product photos. By creating an external object table, you can securely pass the image URIs directly into AI.AGG() and ask the model to summarize the visual content of the entire collection.

```sql
-- Summarize content of images in the object table
SELECT
  AI.AGG(
    STRUCT(OBJ.GET_ACCESS_URL(ref, 'r')),
    'What are the major categories of these images?'
  ) AS category_description
FROM
  `bigquery-public-data.cymbal_pets.product_images`;
```

Figure: Query results showing AI.AGG() surfacing product categories by analyzing the product images located in Google Cloud Storage.

How AI.AGG() works and best practices

To use AI.AGG() effectively in your own environment, it helps to understand how it processes data behind the scenes. Here’s what you need to know about context windows, error handling, and optimizing your pipelines.

1. Context windows and multi-level aggregation
LLMs have a finite context window and can have a hard time handling massive amounts of input. AI.AGG() solves this problem by automatically dividing your input rows into batches, aggregating those batches, and then aggregating the results of those batches into a final answer. This means you don’t have to worry about manually managing the context window when passing in large numbers of rows. Note that AI.AGG() won’t split up a row of data across batches, so make sure that each individual row is smaller than the context window, to avoid the row being skipped. Many smaller rows will give AI.AGG() more flexibility in how it batches rows.

2. Token usage with multi-level aggregation
Because AI.AGG() uses a multi-level aggregation structure, the total input tokens sent to the model may be higher than the raw tokens in your starting table (depending on how many rounds of aggregation are required). As a best practice, always reduce the number of input tokens by using LIMIT or pre-filtering your data upstream before passing it to AI.AGG().
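
The batching and token overhead described in points 1 and 2 can be sketched in plain Python. This is an illustration under stated assumptions, not how AI.AGG() is implemented: the word-count tokenizer, the greedy packing, the max_levels guard, and the summarize callback (a stand-in for the model call) are all hypothetical.

```python
def count_tokens(text):
    # Crude stand-in for a tokenizer: one token per whitespace-separated word.
    return len(text.split())


def make_batches(rows, window):
    """Greedily pack whole rows into batches of at most `window` tokens.

    Rows are never split; a row larger than the window is skipped.
    Returns (batches, number_of_skipped_rows).
    """
    batches, current, used, skipped = [], [], 0, 0
    for row in rows:
        size = count_tokens(row)
        if size > window:
            skipped += 1
            continue
        if current and used + size > window:
            batches.append(current)
            current, used = [], 0
        current.append(row)
        used += size
    if current:
        batches.append(current)
    return batches, skipped


def aggregate(rows, window, summarize, max_levels=10):
    """Aggregate level by level until a single result remains.

    Returns (result, total_input_tokens_sent, skipped_items).
    """
    total_tokens, skipped = 0, 0
    level = list(rows)
    for _ in range(max_levels):
        batches, dropped = make_batches(level, window)
        skipped += dropped
        if not batches:
            return None, total_tokens, skipped
        total_tokens += sum(count_tokens(r) for b in batches for r in b)
        level = [summarize(batch) for batch in batches]
        if len(level) == 1:
            return level[0], total_tokens, skipped
    raise RuntimeError("summaries did not converge; make them shorter")


def toy_summarize(batch):
    # Stand-in for the model: keep the first two distinct words, alphabetically.
    words = sorted(set(" ".join(batch).split()))
    return " ".join(words[:2])


rows = ["disk full", "disk slow", "disk full", "net down", "net slow", "x " * 50]
result, tokens_sent, skipped = aggregate(rows, window=6, summarize=toy_summarize)
print(tokens_sent, skipped)  # 14 1
```

In this toy run the five usable rows hold 10 tokens, but 14 tokens are sent in total because the two first-level summaries are fed back in for a second round. The 50-token row exceeds the window and is skipped rather than split.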

3. Specifying your model endpoint
If you don’t specify a model endpoint, AI.AGG() will default to a recent model. However, for production pipelines, you often want explicit control:

Short-form names: You can use a short-form endpoint (e.g., gemini-2.5-flash), in which case AI.AGG() will use that model in the query execution region:

```sql
AI.AGG(
  input_data,
  instructions => 'Your instructions here.',
  endpoint => 'gemini-2.5-flash'
)
```

Fully-qualified names: If the query execution region doesn’t support your desired model, or you prefer to use a global or multiregional endpoint, provide the fully qualified model name:

```sql
AI.AGG(
  input_data,
  instructions => 'Your instructions here.',
  endpoint => 'https://aiplatform.googleapis.com/v1/projects/[YOUR_PROJECT]/locations/global/publishers/google/models/gemini-3.5-flash'
)
```

4. Input and output modalities

Inputs: AI.AGG() supports text (via strings or references to text files) and image data. It also supports arrays of these types, though you should refer to the known issues documentation for edge cases regarding arrays of images.
Outputs: The function will always return a string. While you can prompt the model in your instructions to format the output as JSON or Markdown, keep in mind that the database engine does not strictly enforce this. Multimodal output (e.g., generating an image) is not currently supported.
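
Because the output format is not enforced, client code that consumes AI.AGG() results may want to validate them before use. Below is a minimal Python sketch, assuming you read the returned string into your own application; parse_label_array is a hypothetical helper, not part of any BigQuery client library.

```python
import json

# Built programmatically so this example contains no literal fence marker.
FENCE = "`" * 3


def parse_label_array(model_output):
    """Return a list of strings parsed from model output, or None.

    Tolerates a surrounding Markdown code fence, which models sometimes
    add even when instructed not to.
    """
    if model_output is None:
        return None
    text = model_output.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        text = "\n".join(lines).strip()
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        return None
    # Accept only a flat JSON array of strings.
    if isinstance(value, list) and all(isinstance(v, str) for v in value):
        return value
    return None


print(parse_label_array('["Toys", "Food"]'))          # ['Toys', 'Food']
print(parse_label_array("Here are the categories."))  # None
```

Returning None instead of raising lets the caller decide whether to retry the query, fall back to a default label set, or fail the pipeline.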

5. Treatment of NULLs
AI.AGG() automatically skips NULL input rows without processing them. However, you must be careful when passing structured data. Like other BigQuery AI functions, AI.AGG() concatenates STRUCT fields similarly to the standard CONCAT() function. This means if even one field within your STRUCT is NULL, the entire row is treated as NULL and will be skipped.

Let's revisit our first categorization query. What if several rows of our products table are missing their description? Because of the NULL concatenation rule, those rows would be silently dropped from the analysis entirely. Here is how we can use IFNULL() to provide a fallback string, guaranteeing that every product is taken into account even if its description is blank:

```sql
-- Identify categories of products from product name and (optional) description
SELECT
  AI.AGG(
    ('Product: ', product_name, ' - Description: ', IFNULL(description, 'No description provided')),
    'What are the major categories of these products?'
  ) AS category_description
FROM
  `bigquery-public-data.cymbal_pets.products`;
```
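
The NULL rule is easy to mimic outside BigQuery. The following Python sketch models CONCAT-style NULL propagation with None; concat_fields, ifnull, and the two sample products are hypothetical and exist only to show why a fallback keeps every row in play.

```python
def concat_fields(*fields):
    """CONCAT-style joining: any None (NULL) field makes the result None."""
    if any(field is None for field in fields):
        return None
    return "".join(str(field) for field in fields)


def ifnull(value, fallback):
    """Mirror of SQL IFNULL: return fallback when value is None."""
    return fallback if value is None else value


# Hypothetical sample rows; the second one has no description.
products = [
    {"product_name": "Chew Toy", "description": "Durable rubber bone"},
    {"product_name": "Cat Tree", "description": None},
]

# Without a fallback, the row with a missing description becomes None.
raw = [
    concat_fields("Product: ", p["product_name"], " - Description: ", p["description"])
    for p in products
]
kept_raw = [row for row in raw if row is not None]

# With a fallback, every row survives.
safe = [
    concat_fields(
        "Product: ", p["product_name"],
        " - Description: ", ifnull(p["description"], "No description provided"),
    )
    for p in products
]
kept_safe = [row for row in safe if row is not None]

print(len(kept_raw), len(kept_safe))  # 1 2
```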

6. Error handling
If AI.AGG() receives invalid input, or encounters an error during LLM processing, it will attempt to provide partial results. Rows that contain invalid input or that were rejected by the model will not be considered in the final results.

You can review exactly how many rows failed to process by checking your BigQuery job statistics, exactly as you would for scalar managed AI functions like AI.IF().

Figure: Information showing an example of Gen AI function error details.

Give it a try!

These are just a few examples of the ways AI.AGG() can help analyze unstructured data. The AI.AGG() function is in preview in BigQuery now, so it’s available to all BigQuery users. Try it out on your own use cases!

You may also be interested in checking out BigQuery's other managed AI functions, AI.CLASSIFY(), AI.IF(), and AI.SCORE(), as well as general-purpose functions like AI.GENERATE(). We look forward to seeing what you build with them.

Boost BigQuery with Python: Managed Python UDFs now generally available

Mon, 22 Jun 2026 17:00:00 +0000

SQL is the industry standard for high-performance structured data analysis. However, expressing complex procedural logic, scientific computations, advanced string manipulations, or machine learning workflows in pure SQL can be highly challenging, if not impossible. That kind of work is better done with Python. Data practitioners often take on additional infrastructure management tasks — maintaining custom images and containers, and working with additional compute services — just to run simple helper functions with custom Python code and libraries.

Today, we are thrilled to announce the general availability (GA) of BigQuery Managed Python User-Defined Functions (UDFs).

This launch represents a major milestone in BigQuery’s extensibility strategy, allowing data scientists, engineers, and analysts to execute custom Python code directly and securely inside BigQuery using standard SQL queries or BigQuery DataFrames (BigFrames) in Python. With this release, Python UDFs are fully supported for production enterprise workloads and completely integrated into BigQuery's billing SKUs.

Bridging SQL and the Rich Python Ecosystem

BigQuery Managed Python UDFs run on BigQuery-managed serverless resources that automatically scale to billions of rows, without requiring you to set up infrastructure or manage containers. BigQuery automatically handles the compilation, image building, security patching, deployment, and execution of your Python code, making it simple to use Python functions in your SQL.

Core benefits

Flexibility: Access the vast Python ecosystem — including top-tier scientific and mathematical libraries like NumPy, SciPy, pandas, scikit-learn and more — directly in your SQL select statements.
Tight external API integration: Clean and enrich your BigQuery tables in real time by calling external web APIs or Google Cloud services such as Cloud Translation, Gemini Enterprise Agent Platform or custom microservices securely within your queries.
Fully managed and serverless: BigQuery handles the underlying container infrastructure and auto-scales performance dynamically.

Code example

Here is an example of a Python UDF that uses a popular Python package, beautifulsoup4, to remove HTML tags. We use this function to process Stack Overflow answer bodies that are stored in a BigQuery public table:

```sql
CREATE OR REPLACE FUNCTION `your_project.your_dataset.clean_html`(html_content STRING)
RETURNS STRING
LANGUAGE python
OPTIONS (
  runtime_version = 'python-3.11',
  entry_point = 'strip_tags',
  packages = ['beautifulsoup4>=4.12.0']
) AS r'''
from bs4 import BeautifulSoup

def strip_tags(html_content):
    if not html_content:
        return ""
    soup = BeautifulSoup(html_content, "html.parser")
    return soup.get_text(separator=" ")
''';
```

How to query it:

```sql
SELECT
  id,
  `your_project.your_dataset.clean_html`(body) AS cleaned_answer_body
FROM
  `bigquery-public-data.stackoverflow.posts_answers`
LIMIT 100
```
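
Before deploying a UDF, it can help to exercise the function's logic locally. The sketch below is a dependency-free approximation of strip_tags that swaps beautifulsoup4 for the standard library's html.parser, so it runs anywhere Python does. It is an assumption-laden stand-in, not the UDF itself: the _TextCollector class is hypothetical, and output can differ from Beautiful Soup for inputs such as script or style blocks.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect the text nodes of an HTML fragment, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_tags(html_content):
    # Same contract as the UDF entry point: empty or missing input
    # yields an empty string.
    if not html_content:
        return ""
    collector = _TextCollector()
    collector.feed(html_content)
    collector.close()  # flush any buffered trailing text
    return " ".join(collector.chunks)


cleaned = strip_tags("<p>Use <code>LIMIT</code> to sample rows.</p>")
print(cleaned.split())  # ['Use', 'LIMIT', 'to', 'sample', 'rows.']
```

Keeping the entry point name and signature identical to the deployed function makes it straightforward to run the same test cases against both versions.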

Advanced capabilities

For advanced users, Python UDFs add a set of capabilities for tuning performance and monitoring usage. Here are some examples.

Vectorized processing with Pandas PyArrow
To maximize throughput, the GA release supports direct processing of vectorized input as PyArrow RecordBatches. By processing columns of data in bulk rather than row-by-row, PyArrow eliminates Python serialization and conversion overhead, boosting performance by up to 10x for data-intensive calculations.

Configurable container resources
For heavy-duty data science and ML data preparation, you can now provision container memory (up to 16 GB) and CPU (up to 4 vCPUs) per function. This enables memory-intensive workloads (such as loading large serialized models or geospatial datasets) to run directly within the sandbox.

Customizable concurrency
Optimize your throughput and resource efficiency by configuring concurrent requests per container (up to 1,000 concurrent operations). This helps ensure that your scale-out execution is highly cost-effective and performs exceptionally well under heavy parallel loads.

Streaming logs and real-time metrics
Easily debug and monitor your production workloads. The BigQuery console now features a direct link from your query results to real-time CPU, memory, and concurrency metrics in Cloud Monitoring.

Billing

BigQuery Managed Python UDFs are billed under the BigQuery Services SKU. This SKU is fully eligible for BigQuery spend-based committed use discounts (CUDs), allowing you to maximize budget efficiency.

You can also get cost observability through INFORMATION_SCHEMA.JOBS, as well as through the billing labels MANAGED_ROUTINE_EXECUTION and MANAGED_ROUTINE_BUILD.

See more details in the Pricing section of the documentation.

Getting started

To get started with BigQuery Python UDFs, first check out the product documentation.

Then, try out the functions published in the public BigQuery dataset. For example, run the following code in a BigQuery project to tokenize country names from BigQuery public data. Under the hood, the tokenize UDF uses the o200k_base tokenizer.

```sql
SELECT
  country_code,
  country_name,
  `bigquery-public-data`.python_udfs.tokenize(country_name) AS name_tokens,
  ARRAY_LENGTH(`bigquery-public-data`.python_udfs.tokenize(country_name)) AS token_count
FROM
  `bigquery-public-data.census_bureau_international.country_names_area`
ORDER BY
  country_name
```

Or, try out this code lab to explore some advanced scenarios.

Then, to learn how to implement other advanced design patterns, we encourage you to explore our official public documentation guides:

Calling Google Cloud or online services (with connections): To connect securely to first-party Google Cloud services such as Gemini Enterprise Agent Platform or Cloud Translation, or to external API endpoints, using Cloud Resource connections, check out the Call Google Cloud or online services in Python code guide.
BigQuery DataFrames (BigFrames) Python UDFs: To learn how to write, deploy, and scale custom Python functions natively from standard Jupyter notebook or Colab environments using BigQuery DataFrames, visit the Customize Python functions for BigQuery DataFrames guide.

Bring your Python workflows out of isolation and directly into the heart of your data warehouse today!

From AI potential to agentic reality: Driving the UK’s next chapter

Wed, 17 Jun 2026 08:00:00 +0000

The United Kingdom, and London in particular, continues to be one of the great hubs for AI development in Europe and the world. We’re home to Google DeepMind, of course, as well as significant AI unicorns — and Google Cloud customers — like Ineffable Intelligence, which is today announcing an important partnership with us.

A year ago, we joined you for the London Summit to showcase the vast potential of generative AI, including a major investment in upskilling the UK civil service. Today, as we welcome our partners once again to the historic vaults of Tobacco Dock, that potential has become an industrial-scale reality. In my conversations with leaders across both Whitehall and The City, the focus has moved from chatbots and media experiments to full-production execution. This is the moment of the agentic enterprise, where we shift from systems that simply chat with us to systems that can reason, plan, and execute multi-step workflows.

This transition is the cornerstone of the UK’s projected £400 billion economic boost from AI by 2030. At Google Cloud, we are the only provider offering the full integrated stack — custom silicon, frontier models, and planet-scale infrastructure — required to turn the Agentic Enterprise into a reality.

The new frontier of British enterprise and research

The banking sector is a key proving ground for this shift. And HSBC, one of the largest and most important financial institutions in the world, is showing the way. Today, we’re announcing a multi-year transformational partnership with HSBC to accelerate AI adoption across HSBC’s products and services globally. This new collaboration will further accelerate the shift towards AI-enabled ways of working across HSBC’s global operations.

HSBC will work with Google Cloud and Google DeepMind engineering teams to collaborate on new AI-powered tools and programmes, with access to Google’s latest agentic AI capabilities, including Gemini models and the Gemini Enterprise Agent Platform. The initial delivery will focus on three areas: hyper-personalised wealth management support, stronger financial crime risk management, and AI tools to enhance frontline and relationship manager client service.

UK startups also continue to break new ground with technology, and AI in particular, as demonstrated by the work of frontier labs like Ineffable Intelligence. The company, which launched earlier this year, has chosen Google Cloud as its preferred cloud partner, utilizing Google’s full stack of AI-optimized hardware and tools to build and train Ineffable’s first generation of foundational models.

Led by David Silver, a former Google DeepMind researcher who was instrumental in the AlphaGo project, Ineffable Intelligence is taking a unique approach to AI development. The team are building systems that learn primarily from their own experience through reinforcement learning, instead of relying on the large-scale human-generated datasets behind language models. The ambition is to create a “superlearner” that develops knowledge through trial and error. This year, Ineffable Intelligence set a record for a European seed funding round at $1.1 billion, and it will now support its training work by deploying one of the largest clusters of A5X, powered by the NVIDIA Vera Rubin NVL72 platform on Google Cloud, delivering massive computational scale.

To move from experimentation to true industrial production, businesses need more than just models; they need a roadmap. To help show them the way, we’re expanding our partnership with Deloitte, which will open a new AI Studio at its London campus. Developed in collaboration with Google Cloud, the studio will help British organisations move beyond AI experimentation to deploy autonomous, action-oriented AI systems at scale.

Deloitte is also committing to upskill 1,000 members of its UK AI and data workforce on Gemini Enterprise. This certification program will ensure that Deloitte’s AI and data engineers are equipped with the technical expertise to implement Google’s most advanced agentic architecture, providing UK clients with one of the largest pools of certified AI talent in the region.

Building a future-ready public sector

The blueprint for a modern digital government requires moving away from rigid legacy contracts toward agile, AI-driven public services. In collaboration with the Ministry of Housing, Communities and Local Government (MHCLG), the i.AI incubator, Google DeepMind, and Faculty, we are delivering tangible public sector reform and tools for reinvention that directly support the national goal to "get Britain building."

Agencies like MHCLG are already using a tool called Extract which was built using Google technology to help transform planning processes by reducing document processing times from two hours to just two minutes. Simultaneously, we are supporting trials of an AI planning tool — co-created with local planning authorities in Barnet, Dorset, and Camden — which aims to cut decision times for everyday applications by 50%. Furthermore, the Department for Transport (DfT) is utilizing Gemini to streamline public consultation analysis, a move projected to save £4 million annually.

Innovation on this scale also requires a secure, sovereign foundation. That is why Google Cloud is working to strengthen our UK data residency commitments, including measures like making Gemini 3.5 Flash, which features in-country AI processing, available by late June 2026 for sensitive sovereign use cases. We are giving British organizations the confidence to innovate within strict compliance boundaries.

To help keep businesses safe from the challenges posed by bad actors using AI and other digital threats, we also recently announced a comprehensive AI-powered cybersecurity platform — Google AI Threat Defense — which combines Wiz, Mandiant, Gemini & CodeMender to find, fix, and protect our customers from vulnerabilities.

Proven impact from the high street to public service

Autonomous agents are no longer a future prospect; they are delivering value across the UK economy today. Our work with THG Ingenuity, an ecommerce solutions provider, has delivered an 8x higher conversion rate via its AI Shopping Assistant. Starling is similarly empowering customers with "spending intelligence" tools for instant habit analysis around purchases and expenses. And Rightmove has launched a beta version of an AI-powered conversational property search, built with Google’s Gemini models, enabling users to search for homes in their own words.

The breadth of this impact is visible across every sector: Kingfisher is pioneering retail-specific agentic applications; Openreach is driving field service optimization in telecommunications; and Unilever is using AI at scale across the entire value chain to drive growth and build desirable brands in the new era of consumer goods.

Meanwhile, VMO2 is streamlining complex data operations; Vodafone is executing a $1 billion partnership to redefine network performance; and WPP is integrating Gemini across creative workflows, whether that's generating high-fidelity campaign assets at speed and scale, powering AI agents, or training robotic camera operators.

Empowering the engine of growth for small to medium businesses and startups

The true measure of Britain’s AI success lies in its small and medium enterprises and startup ecosystem. Our AI Works research highlights a pivotal moment: AI has the potential to boost productivity for small and medium enterprises by 20% and unlock £198 billion in output for the UK economy. With 56% of smaller firms already seeking guidance, we have launched the AI Works for Britain upskilling initiative to ensure no business is left behind.

We also continue to foster the next generation of British unicorn startups through our ongoing partnership with Tech Nation at the London AI Hub. This sustained commitment ensures founders have the resources and community needed to scale, and this September, we will further this mission by hosting the Gemini Startup Forum: Cybersecurity in London to help startups build secure-by-design AI applications.

The Model Garden at Platform 37

Our belief in the UK’s potential is reflected in our physical footprint, too. We are continuing to invest in the UK's digital infrastructure to support growing demand: Our state-of-the-art data center in Waltham Cross launched in September 2025, a key part of our two-year, £5 billion investment to help power the UK's AI economy. And earlier this year, we opened Platform 37, our new office in King's Cross, London, along with plans for The AI Exchange, a new public space dedicated to deepening understanding of AI.

Building on this momentum, we are excited to introduce The Model Garden at Platform 37, launching in the fourth quarter of 2026. This London-based hub is far more than a physical space; it serves as a strategic investment designed to fundamentally elevate how we engage with our most important customers. Blending the timeless aesthetics of a classic English garden with immersive, high-tech innovation — from living digital walls to a three-story atrium — The Model Garden acts as a physical marketplace for our best ideas.

The blueprint for the agentic enterprise

For UK businesses, civic leaders, and organizations to continue to lead in the AI moment, they must rethink not only the technology they use but also fundamental aspects of how they work. As we support thousands of organizations and millions of teams here and around the globe, we see three core strategies helping achieve success with AI:

Culture: We must reimagine our organizations for the future. True transformation means getting teams excited, enabled, and equipped to work with AI agents in completely new ways. It is about human-AI collaboration, not just automation.
Responsibility: We must build with safety and security in mind from day one. Protecting your users, your customers, and your brand is paramount. Our frontier models are built on a foundation of rigorous AI principles and secure-by-design infrastructure.
Sustainability: In an era of rising compute demands, we must scale in a way that is both financially viable and positive for our planet. At Google, we are committed to carbon-free energy 24/7, ensuring that the UK’s AI growth does not come at the cost of our climate goals.

Architecting the future together

Google Cloud is the primary partner for the UK’s agentic transition. We are moving beyond the hype of experimentation into the rigor of production. From the research labs of King's Cross to the diverse enterprises powering the high street, we are architecting a resilient, sovereign, and prosperous future for the United Kingdom.

Thank you to everyone who’s joining us in London — yesterday, today, and into the future. This year we’ve packaged up an exclusive on-demand experience, allowing you to stream the defining London Summit moments, available anywhere, anytime.

How Siemens "slices the elephant," advancing agentic workflows for industrial software development

Tue, 16 Jun 2026 07:00:00 +0000

For technology companies like Siemens, software is the nervous system of factories, energy grids, and transportation networks worldwide.

As a global leader in industrial AI, industrial software, and industrial automation, Siemens brings decades of domain expertise across factory and process automation, energy infrastructure, and intelligent transportation — expertise that no off-the-shelf AI solution can replicate. But innovation carries a heavy anchor: legacy code.

With codebases spanning hundreds of millions of lines, developed over more than a decade, Siemens faced a challenge that standard AI tools couldn't solve: understanding and modernizing this code and the applications that run on it. The scale and depth of industrial-grade software demand a fundamentally different approach. Existing coding assistants lacked the contextual depth required to navigate complex, multi-layered industrial codebases, a gap Siemens set out to close.

To solve this, Siemens and Google Cloud created Knowledge Fabric, an AI system for automating the software development lifecycle. It was built using knowledge graphs on Spanner Graph, the Google Agent Development Kit, Gemini API, Gemini Enterprise Agent Platform, Gemini CLI, and Anthropic Claude Code. In a pilot migrating existing front ends to web-based interfaces, Knowledge Fabric reduced implementation effort, freeing engineers to focus on customer innovations while maintaining full system compatibility.

“By ingesting the entire software ecosystem into an intelligent agentic system equipped with custom knowledge graphs, we aren’t just helping developers optimize their development time; we are enabling autonomous agents to reason across the past to build the future,” said Franz Menzl, senior vice president, product creation excellence at Siemens. “This is about freeing engineers from repetitive work so they can focus on higher-value problem solving.”

The challenge: the complexity of industrial software

Modernizing large-scale industrial-grade software systems is often compared to rebuilding a jet while flying it. For Siemens, the challenge had four dimensions:

Scale: The repositories are massive — far exceeding the context windows of standard large language models.
Fragmentation: Critical knowledge was scattered across code, Jira tickets, Confluence pages, and scanned PDF manuals from the early 2000s.
Complexity: Tracing the link between a specific line of code and a functional requirement document from 10 years ago presented a challenge that no manual or conventional tooling approach could address efficiently. It’s a reality shared across the industry.
Responsibility: Systems must adhere to strict quality, compliance, and lifecycle requirements, often over 15 to 20 years of operation. AI‑generated outputs must therefore be explainable, traceable, and verifiable. Hallucinated or unvalidated changes are not merely inefficient but operationally unacceptable.

"We realized that standard RAG (retrieval-augmented generation) wasn't enough," said Agata Gołębiowska, technical lead, Google Cloud. "Code isn't just text; it has inherent structure. A class belongs to a file, which belongs to a module. Flattening that into a vector database meant losing the representation of relationships between elements of the codebase."

The solution: A domain-aware Knowledge Fabric

To make this sprawling software environment navigable for AI-driven workflows, the teams built the Knowledge Fabric agent. This agent goes beyond keyword matching to “understand” the relationships between assets.

We use Spanner Graph to model the inherent structure of the codebase, applying the same rigor to documentation across formats. By mapping connections between these domains, we can link specific code snippets directly to requirements in a design document. Agents then traverse this graph, using tools to query the structure via Graph Query Language (GQL).

But GQL is only one piece. To enable semantic understanding, we generate embeddings for every node, using Spanner's Approximate Nearest Neighbors (ANN) algorithm to perform efficient vector search across the full codebase. Finally, we give agents full-text search capabilities, which can be combined with GQL to pinpoint nodes and edges with precision.

Combining these three methods lets an LLM agent answer complex queries, such as: "Which functions need to be updated if I change the logic in the Axis Control Panel?" The system traverses the graph — weighing keyword and semantic similarity — to identify dependencies, retrieve relevant documentation, and present a precise impact analysis.
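To make the combination concrete, here is a minimal sketch in plain Python. It does not use Spanner Graph, GQL, or Spanner's ANN search; the toy graph, the two-dimensional "embeddings", the `used_by` edge name, and the equal score weights are all assumptions made up for illustration. It only shows the shape of the idea: traverse dependencies first, then rank what you found by keyword and semantic similarity.

```python
from collections import deque
from math import sqrt

# Hypothetical toy graph: text, a tiny stand-in embedding, and reverse-dependency edges.
NODES = {
    "AxisControlPanel": {"text": "axis control panel logic", "vec": [1.0, 0.0],
                         "used_by": ["move_axis", "render_panel"]},
    "move_axis": {"text": "move axis to target position", "vec": [0.9, 0.1],
                  "used_by": ["run_program"]},
    "render_panel": {"text": "render control panel widgets", "vec": [0.6, 0.8],
                     "used_by": []},
    "run_program": {"text": "execute machining program", "vec": [0.2, 0.9],
                    "used_by": []},
}


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query, text):
    """Fraction of query words that also appear in the text."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def impacted(start, nodes):
    """Graph step: breadth-first walk over 'used_by' edges from the changed node."""
    seen, queue = [], deque(nodes[start]["used_by"])
    while queue:
        name = queue.popleft()
        if name not in seen:
            seen.append(name)
            queue.extend(nodes[name]["used_by"])
    return seen


def rank(query, query_vec, candidates, nodes):
    """Blend keyword and semantic scores; the 50/50 weighting is arbitrary."""
    scored = [
        (0.5 * keyword_overlap(query, nodes[n]["text"])
         + 0.5 * cosine(query_vec, nodes[n]["vec"]), n)
        for n in candidates
    ]
    return [n for _, n in sorted(scored, reverse=True)]


deps = impacted("AxisControlPanel", NODES)
ranking = rank("axis control panel", [1.0, 0.0], deps, NODES)
```

In this sketch the traversal decides which nodes are candidates at all, and the similarity scores only order them, so a node that merely looks similar but has no dependency path is never reported as impacted.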

This precise context is what lets a coding agent produce a valid, usable, and maintainable implementation.

"Slicing the elephant:" the agentic workflow

A key insight from the project was that AI agents struggle with massive, ambiguous tasks. To succeed, the team adopted a design pattern dubbed "slicing the elephant."

The system breaks a sweeping request like “refactor this module” into smaller, more manageable tasks, each handled by a specialized agent built with the Google Agent Development Kit (ADK):

Search agent: Acts as a deep-research specialist. It uses tools to explore the code graph and cross-reference findings with documentation in Agent Search.
User story agent: Interviews the product owner to gather requirements, then drafts detailed user stories with acceptance criteria linked to existing system contexts.
Architecture impact agent: Analyzes proposed changes against the graph to predict side effects before a single line of code is written.
Task breakdown agent: Consumes the analysis from the architecture impact agent and breaks the work into small, manageable tasks, each carrying all the context relevant to a specific change.
Coding agent: Implements the change described in a specific task. Reaching this step without context and prior analysis produces unusable code.

The system keeps a human in the loop at every step, which ensures reliable, production‑grade outcomes and keeps engineers focused on meaningful work rather than routine implementation.
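As a rough illustration of the pattern, the sketch below chains five stub stages with an approval gate after each one. It does not use the Agent Development Kit; the stage functions, the `WorkItem` class, and the `approve` callback are hypothetical stand-ins, and a real system would call agents and a review UI where these stubs append notes.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class WorkItem:
    """State handed from one stage to the next."""
    request: str
    notes: List[str] = field(default_factory=list)


# Hypothetical stand-ins for the five specialized agents; each just records a note.
def search(item: WorkItem) -> WorkItem:
    item.notes.append("search: located related code and documentation")
    return item


def user_story(item: WorkItem) -> WorkItem:
    item.notes.append("user_story: drafted stories with acceptance criteria")
    return item


def architecture_impact(item: WorkItem) -> WorkItem:
    item.notes.append("architecture_impact: listed predicted side effects")
    return item


def task_breakdown(item: WorkItem) -> WorkItem:
    item.notes.append("task_breakdown: split work into small tasks")
    return item


def coding(item: WorkItem) -> WorkItem:
    item.notes.append("coding: implemented one task")
    return item


STAGES = [search, user_story, architecture_impact, task_breakdown, coding]


def run_pipeline(request: str,
                 approve: Callable[[str, WorkItem], bool],
                 stages=STAGES) -> WorkItem:
    """Run stages in order, pausing for human approval after each one."""
    item = WorkItem(request)
    for stage in stages:
        item = stage(item)
        # Human-in-the-loop gate: stop as soon as a reviewer rejects a stage's output.
        if not approve(stage.__name__, item):
            item.notes.append(f"halted after {stage.__name__}")
            break
    return item


approved = run_pipeline("refactor this module", approve=lambda name, item: True)
rejected = run_pipeline(
    "refactor this module",
    approve=lambda name, item: name != "architecture_impact",
)
```

Because the gate sits between stages, a rejected impact analysis stops the run before the task breakdown or coding stages see it, which is the point of keeping a human in the loop at every step.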

"By slicing the elephant — breaking complex refactoring jobs into smaller, agent-led tasks — we observed a significant productivity increase," said Alexander Lomakin, project lead at Siemens. "We essentially gave the AI the roadmap it needed to navigate the complexity."

Pilot results: Faster, more efficient engineering

Developers saw results almost immediately.

Analyzing dependencies for a new feature once required senior engineers to spend several days navigating codebases and legacy documentation. With the Knowledge Fabric, the same work now takes far less time.

In a recent production pilot migrating legacy control panels to modern web‑based interfaces, the Knowledge Fabric reduced overall coding effort while preserving system integrity and industrial quality standards.

Engineers now spend more time creating customer value and less on repetitive work.

Get started

The Knowledge Fabric shows that generative AI can do more than write boilerplate code: it can also help teams modernize the legacy systems their businesses depend on most.

To learn more about building graph-based agents for your own legacy modernization:

Read about Spanner Graph.
Explore Agent Platform and find pre-built production-grade agents on Agent Garden.
Check out the Agent Development Kit.
Read more on how Siemens is advancing industrial AI.

What’s new in data agents: Supercharging your AI workflows

Mon, 15 Jun 2026 17:00:00 +0000

The rise of AI agents is fundamentally disrupting applications and analytical systems. Generic AI platforms don't usually have access to the context stored within enterprise databases. This is because traditional data architectures often lack context for agents across the data estate, which can lead to agents being inaccurate. These architectures are also prone to security gaps due to a lack of granular access controls.

Google’s Agentic Data Cloud is an AI-native system of action that includes both operational and analytical systems. By infusing AI across the entire stack — from custom silicon to frontier Gemini models — we provide a deterministic, template-driven developer framework that allows agents to ground their reasoning in real-time enterprise data with near-100% accuracy, as well as unified governance.

Today, we’re making it easier to develop agents, with a whole host of new data agents and tools: for business analysts within Conversational Analytics; for data scientists, engineers, and database admins with a series of Google-built Data Agents that provide greater automation and intelligence; and finally, for developers, with Data Agent tools that help you better integrate with today’s open agentic ecosystem.

1. Conversational Analytics

To support developers building agents using natural language, we’re announcing expanded support for Conversational Analytics across Data Cloud.

Conversational Analytics in BigQuery integrates a sophisticated AI reasoning engine directly into BigQuery Studio, helping data and business teams go beyond writing manual SQL, leveraging business context to ground answers using multimodal synthesis and deep-dive research. Agentic workflows, in preview for select customers, automate root-cause analysis and schedule actions, turning enterprise data into proactive, actionable intelligence.

Create agents for faster data insights with Conversational Analytics in BigQuery

Conversational Analytics in Lakehouse, now in preview, extends the Lakehouse unified infrastructure, so users can query distributed data lakes across AWS, Azure, and Google Cloud using natural language. This makes it possible to combine insights across cloud platforms without moving a single byte of data.

Conversational Analytics in AlloyDB, Spanner, and Cloud SQL, now in preview, supports out-of-the-box conversational AI, making data accessible for everyone. AlloyDB, Spanner, and Cloud SQL users can start natural-language conversations with their databases to gain visibility into their real-time operational data and capture analytical insights.

Use Conversational Analytics to get answers from your operational data

Looker Embedded Conversational Analytics, now generally available, allows you to embed agents directly into your custom applications and internal workflows via a low-code iframe implementation, making it easier to ship production-ready, conversational AI within any application. Additionally, with the Conversational Analytics API in Looker, you can create multi-turn conversational workflows that offer AI-powered recommendations, while also verifying and explaining the underlying SQL query. We are also significantly upgrading Looker’s core Conversational Analytics agent, which is already GA, with superior reasoning and semantic grounding, helping to eliminate ambiguity.

Embed agents directly into your applications for conversational AI

2. New data agents

To help data professionals move from reactive data management to proactive intelligence, and business analysts better interact with their dashboards, we’re announcing a new set of data agents that bring automation, intelligence, and natural language capabilities into their daily workflows.

Data Engineering Agent, now generally available, automates the heavy lifting of building and maintaining data pipelines. It transforms natural language requirements into optimized SQL or Python code for BigQuery and Dataflow, while proactively identifying and fixing pipeline breaks. By suggesting schema improvements and partitioning strategies, it ensures your data foundation is scalable, reliable, and performance-tuned without manual trial and error.

Data Science Agent accelerates the path from raw data to production-ready models. It assists data scientists by suggesting relevant features, generating boilerplate notebook code, and automating the technical documentation process.

Database Observability Agent, in preview with select Cloud SQL, AlloyDB, Spanner, and Bigtable customers, proactively monitors database performance and continuously identifies potential issues before they escalate. It then delivers intelligent recommendations and multi-turn remediation workflows for fast, comprehensive troubleshooting and optimization. It provides performance analytics for the entire database fleet, helping you quickly identify performance optimization opportunities across databases.
Database Onboarding Agent, in preview with select customers, takes the guesswork out of database selection and deployment. By evaluating your stated requirements — from simple use case descriptions, to complex enterprise needs — it recommends the best Google Cloud database and guides you through provisioning.
Looker Dashboard Agent, now in preview, enables conversational interaction with data within dashboards. Users can ask natural language questions and receive context-aware answers within the dashboard. This feature also provides AI-generated summaries that highlight key takeaways and insights from the dashboard.
Conversational Analytics in Gemini Enterprise, now in preview for Looker, BigQuery, and Lakehouse, brings governed intelligence built by data practitioners directly to business leaders. It serves as a "front door" to the Google Data Cloud, allowing business users to consume agents built in BigQuery, Looker, or Lakehouse without needing to access technical consoles. By publishing these agents from Google Data to Gemini Enterprise, organizations provide a single, grounded interface for precision data exploration and immediate answers to the business users.

Deep Research Agent, now in preview, uses the Knowledge Catalog to solve high-stakes, multi-layered business problems. It moves beyond simple search to build comprehensive research plans that synthesize intelligence from internal documents, BigQuery tables, and the public web. The result is a detailed report with dynamic visualizations and verifiable citations that respects enterprise privacy and user permissions throughout.

3. Tools for data agents

Open-source standards for agentic development provide developers building AI applications and custom agents with a unified framework to access data and tools consistently and securely. Today, we are announcing the following tools to help ground your agentic development initiatives:

Data Agent Kit, now in preview, provides a standardized suite of skills and tools directly within preferred developer environments (IDE/CLI), empowering data practitioners to discover, transform, and act on data at scale using the prescriptive guidance from the Agentic Data Cloud capabilities.
Managed MCP Servers for Databases, now generally available for AlloyDB, Spanner, Cloud SQL, Bigtable, and Firestore, fully manages the infrastructure required to connect AI models securely to your data, so you don’t have to host, secure, or scale MCP servers yourself. Developers can now give their agents current context from across our database portfolio, so that their AI models can reason and act on the most up-to-date enterprise data.
Managed MCP Server for Looker, now in preview, allows any MCP client or agent platform to query Looker's semantic models, extending governed BI insights across third-party applications.

Access Looker semantic models through Managed MCP Server

MCP Toolbox for Databases 1.0, now generally available, has achieved a major stability milestone, giving you the confidence to build production applications. We also overhauled the documentation, making the platform significantly more approachable for both human developers and autonomous agents.
QueryData for Cloud SQL, AlloyDB, and Spanner, now in preview, turns natural language questions into database queries. It’s built natively into these databases, and provides near-100% accuracy for natural language to SQL conversions through metadata, query examples, and evals.
Universal Commerce Protocol (UCP) Analytics powered by BigQuery, now in preview, enables merchants and developers to stream real-time events from UCP directly into BigQuery (see sample). This integration provides out-of-the-box observability for agentic commerce, allowing teams to monitor conversion funnels, track automated checkout performance, and identify system errors. By standardizing these metrics within BigQuery, businesses can bridge the gap between AI-driven transactions and existing business intelligence workflows.

Details on how to access the new agents and tools can be found from each of the documentation links on this page. Data agents are also available through Gemini Enterprise and the Google Cloud console.

Introducing the Open Knowledge Format

Fri, 12 Jun 2026 13:00:00 +0000

As foundation models continue to improve, the lack of relevant context often limits what they can do, especially as they are used to build agentic systems. While these models can help you write code, summarize documents, or analyze a dataset, they still need the right information to produce accurate and actionable results.

That’s why today, we’re introducing the Open Knowledge Format (OKF), an open specification that formalizes the LLM-wiki pattern into a portable, interoperable format. This is a vendor-neutral, agent- and human-friendly standard for representing the metadata, context, and curated knowledge that modern AI systems need.

As published, OKF v0.1 represents knowledge as a directory of markdown files with YAML frontmatter, with a small set of agreed-upon conventions that let wikis written by different producers be consumed by different agents without translation.

That's it. No complex compression scheme, no new runtime, no required SDK. A bundle of OKF documents is:

Just markdown — readable in any editor, renderable on GitHub, indexable by any search tool
Just files — shippable as a tarball, hostable in any git repo, mountable on any filesystem
Just YAML frontmatter — for the small set of structured fields that need to be queryable: type, title, description, resource, tags, and timestamp

If you've used Obsidian, Notion, Hugo, or any of the LLM wiki patterns that have emerged over the past year, the shape will feel familiar. OKF formalizes the small set of conventions needed to make these patterns interoperable.

Let’s take a look at the problem that OKF can solve for your organization, how it works, how to get started with it, and what’s next.

A fragmented context landscape

In most organizations, the information that foundation models use is overwhelmingly internal knowledge: the schema of a table, your business’ meaning of a metric, the runbook for an incident, the join paths between two systems, the deprecation notice for an old API, etc.

Today, these atoms of knowledge live in a variety of highly fragmented systems:

Metadata catalogs with their own APIs
Wikis, third-party systems, or shared drives
Code comments, docstrings, or notebook cells
The heads of a few senior engineers

When an AI agent needs to answer "How do I compute weekly active users from our event stream?" it has to assemble the answer from these scattered, mutually incompatible surfaces. Every vendor offers its own catalog, its own SDK, its own knowledge-graph schema, and none of the knowledge is easily portable across products or organizations.

The result: Every agent builder is solving the same context-assembly problem from scratch, every catalog vendor is reinventing the same data models, and the knowledge itself is locked behind whichever surface created it.

Knowledge as a living wiki

Developer teams are changing how they build AI agents. Instead of using models to search the same documents for the same facts over and over, you can give your agents a shared markdown library that grows more useful over time. This lets your agents take on the drudgery of reading and updating their own files, while your team curates the content and manages it like code.

Andrej Karpathy, the prominent AI researcher and educator, articulates this idea most crisply in his LLM Wiki gist. "LLMs don't get bored, don't forget to update a cross-reference, and can touch 15 files in one pass," he writes. The bookkeeping that causes humans to abandon personal wikis is exactly what LLMs are good at.

A similar knowledge-as-wiki pattern keeps reappearing under different names: Obsidian vaults wired to coding agents, the AGENTS.md / CLAUDE.md family of convention files, repos full of index.md and log.md artifacts that agents consult before doing real work, and "metadata as code" repositories inside data teams.

The pattern is compelling and powerful, but each instance is bespoke. Karpathy's wiki and your team's wiki and a vendor's catalog export may all look alike (markdown, frontmatter, cross-links), but none of them are intentionally designed to cooperate. There is no agreed-upon answer to what fields every document should carry, or what filenames mean what. As a result, the knowledge encoded in wikis remains siloed within the original teams, leading to redundant effort whenever a new agent is built.

What's missing is a format, not another service

The answer to this problem isn’t another knowledge service. You need a format, a way to represent knowledge that:

Anyone can produce, without an SDK
Anyone can consume, without an integration
Survives moving between systems, organizations, and tools
Lives in version control alongside the code it describes
Is readable by humans and parseable by agents: the same file, no translation layer

By design, OKF is that format.

How OKF works: The design in one screen

An OKF bundle is a directory of markdown files representing concepts: anything you want to capture, including tables, datasets, metrics, playbooks, runbooks, and APIs. Each concept is one file. The file path is the concept's identity:

```
sales/
├── index.md
├── datasets/
│   ├── index.md
│   └── orders_db.md
├── tables/
│   ├── index.md
│   ├── orders.md
│   └── customers.md
└── metrics/
    ├── index.md
    └── weekly_active_users.md
```
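One way a consumer might turn such a layout into concept identities is sketched below. The leading-slash, bundle-relative form of the identifier and the choice to skip `index.md` and `log.md` are assumptions drawn from the examples in this post, not rules quoted from the spec; `concept_ids` is a hypothetical helper name.

```python
from pathlib import PurePosixPath

# Assumption: these two filenames are bundle housekeeping, not concepts.
RESERVED = {"index.md", "log.md"}


def concept_ids(paths):
    """Return bundle-relative identities for every concept file in the listing."""
    ids = []
    for p in map(PurePosixPath, paths):
        if p.suffix == ".md" and p.name not in RESERVED:
            ids.append("/" + str(p))
    return sorted(ids)


# Paths relative to the bundle root (the sales/ directory in the tree above).
bundle = [
    "index.md",
    "datasets/index.md",
    "datasets/orders_db.md",
    "tables/index.md",
    "tables/orders.md",
    "tables/customers.md",
    "metrics/index.md",
    "metrics/weekly_active_users.md",
]
ids = concept_ids(bundle)
```

Because identity comes from the path alone, renaming or moving a file changes the concept's identity, which is worth keeping in mind when other documents link to it.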

Each concept document has a small block of YAML front matter for structured fields and a markdown body for everything else:

```
---
type: BigQuery Table
title: Orders
description: One row per completed customer order.
resource: https://console.cloud.google.com/bigquery?p=acme&d=sales&t=orders
tags: [sales, revenue]
timestamp: 2026-05-28T14:30:00Z
---

# Schema

| Column        | Type   | Description                              |
|---------------|--------|------------------------------------------|
| `order_id`    | STRING | Globally unique order identifier.        |
| `customer_id` | STRING | FK to [customers](/tables/customers.md). |

# Joins

Joined with [customers](/tables/customers.md) on `customer_id`.
```
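A consumer needs very little machinery to read a document like this. The sketch below splits the front matter from the body using only the Python standard library. It is a deliberately naive parser that handles flat `key: value` pairs and inline lists such as `[sales, revenue]`; a real consumer would more likely use a full YAML library. The function name `parse_concept` is a hypothetical choice.

```python
def parse_concept(text):
    """Split a concept document into (front matter dict, markdown body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing front matter")
    try:
        end = lines.index("---", 1)  # closing delimiter of the front matter
    except ValueError:
        raise ValueError("unterminated front matter") from None

    meta = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        # Split on the first colon only, so URLs and timestamps stay intact.
        key, _, value = line.partition(":")
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            value = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        meta[key.strip()] = value

    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


DOC = """---
type: BigQuery Table
title: Orders
tags: [sales, revenue]
timestamp: 2026-05-28T14:30:00Z
---

# Schema
"""

meta, body = parse_concept(DOC)
```

Here `meta["tags"]` comes back as a Python list and every other field as a string, leaving interpretation of values like the timestamp to the consumer.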

Concepts link to each other with normal markdown links, turning the directory into a graph of relationships that is richer than the parent/child links implied by the file system. Bundles can optionally include index.md files (for progressive disclosure as agents navigate the hierarchy) and log.md files (for chronological history of changes).
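The link graph can be recovered with a regular expression, as in this rough sketch. It assumes links use the bundle-rooted form shown in the example (`/tables/customers.md`) and makes no attempt to resolve relative paths or follow whatever cross-linking rules the spec defines; `build_graph` is a hypothetical helper.

```python
import re

# Matches [label](target.md); the target must end in .md and contain no spaces.
LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+\.md)\)")


def build_graph(docs):
    """Map each concept path to the sorted list of concept paths it links to."""
    graph = {}
    for path, body in docs.items():
        targets = {target for _, target in LINK.findall(body)}
        graph[path] = sorted(targets - {path})  # ignore self-links
    return graph


docs = {
    "/tables/orders.md": (
        "FK to [customers](/tables/customers.md). "
        "Joined with [customers](/tables/customers.md) on `customer_id`."
    ),
    "/tables/customers.md": (
        "See [orders](/tables/orders.md) and "
        "[WAU](/metrics/weekly_active_users.md)."
    ),
}
graph = build_graph(docs)
```

Note that the customers document reaches across directories to a metric, an edge that the folder hierarchy alone would never express.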

The full v0.1 specification (including conformance criteria, cross-linking rules, and the small number of reserved filenames) fits on a single page.

Three principles behind the design

1. Minimally opinionated. OKF requires exactly one thing of every concept: a type field. Everything else (e.g., what types exist, what other fields to include, what sections the body has) is left to the producer. The spec defines the interoperability surface, not the content model.
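In code, that single requirement amounts to a very small check. The sketch below covers only the `type` rule stated above, not the spec's full conformance criteria, and treating `type` as a non-empty string is an assumption on our part; `check_concept` is a hypothetical name.

```python
def check_concept(meta):
    """Return a list of problems; an empty list means the one stated rule is met."""
    problems = []
    value = meta.get("type")
    # Assumption: 'type' should be a non-empty string.
    if not isinstance(value, str) or not value.strip():
        problems.append("missing or empty 'type' field")
    return problems


ok = check_concept({"type": "BigQuery Table", "title": "Orders"})
bad = check_concept({"title": "Orders"})
```

Everything else in the dictionary is ignored on purpose, mirroring the principle that the spec defines the interoperability surface rather than the content model.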

2. Producer/consumer independence. OKF cleanly separates who writes the knowledge from who consumes it. A bundle hand-authored by a human can be consumed by an AI agent. A bundle generated by a metadata export pipeline can be browsed in a visualizer. A bundle synthesized by one LLM can be queried by another. The format is the contract; the tooling at each end is independently swappable.

3. Format, not platform. OKF is not tied to any specific cloud, database, model provider, or agent framework. It will never require a proprietary account or SDK to read, write, or serve. We're publishing it as an open standard because the value of a knowledge format comes from how many parties speak it, not from who owns it.

What we're shipping with the spec

To make the format concrete, we're publishing reference implementations at both the producer and consumer ends:

An enrichment agent that walks a BigQuery dataset, drafts an OKF concept document for every table and view, then runs a second LLM pass that crawls authoritative documentation and enriches each concept with citations, schemas, and join paths.
A static HTML visualizer that turns any OKF bundle into an interactive graph view in a single self-contained file; no backend, no install on the viewing side, no data leaves the page.
Three ready-to-browse sample bundles: GA4 e-commerce, Stack Overflow, and Bitcoin public datasets, produced by the reference agent and committed to the repo as living examples of conformant OKF.

These are proofs of concept, deliberately. The agent demonstrates one way to produce OKF; nothing about the format requires a specific agent framework or LLM. The visualizer demonstrates one way to consume it; nothing about the format requires HTML or a graph view. We expect (and want!) the ecosystem of producers and consumers to grow far beyond what we've shipped.

Where we go from here

OKF v0.1 is a starting point, not a finished standard. The format will evolve as more producers and consumers emerge and as we collectively learn what knowledge representations agents actually need in practice.

We're publishing in the open from day one because that's the only way a knowledge format earns its name, whether you're building a knowledge catalog, an enrichment pipeline, a wiki tailored to AI agents, or anything in the AI knowledge domain.

From here, we encourage you to:

Read the spec (it's short!)
Write a producer for your source system, your database, your documentation site
Write a consumer: a viewer, a search index, an agent that reasons over bundles
Try the reference implementation against your own data
File issues, send PRs, or propose extensions: The spec is versioned and explicitly designed for backward-compatible growth

The repo, the spec, and the sample bundles are available on GitHub. We have also updated Google Cloud’s Knowledge Catalog to be able to ingest Open Knowledge Format and serve it to our agents. You can find the relevant code and examples here.

The format itself is the contribution. The tools we've shipped exist to make it real, and to lower the cost of trying it out. Whatever shape your knowledge takes today, OKF is designed to be the lingua franca it can be exchanged for tomorrow.

Published by the Google Cloud Data Cloud team. Open Knowledge Format is an open specification; contributions, alternative implementations, and adoption beyond Google products are all explicitly welcomed.

In addition to the authors, this work came together thanks to key ideas from many others at Google, and we thank them for their contributions.

Transform dashboards into interactive data experiences with Looker agents

Thu, 11 Jun 2026 16:00:00 +0000

Dashboards have long served as a primary way for organizations to extract insights from data, but they can fall short in agile environments: Dashboards aren’t interactive and don’t allow you to ask follow-up questions. This forces users to step outside their workflows or turn to data analysts to get the answers they need. Today, we are introducing Looker dashboard agents in preview, embedding intelligent, conversational data agents directly within dashboards and empowering users to explore their business intelligence (BI) data using natural language.

Start a conversation with a Looker dashboard agent

Interactive agent-led investigations

Traditionally, dashboards have presented a static view of data. With dashboard agents in Looker, users can explore their data directly within the dashboard interface. Users can start a conversation by clicking the Gemini icon and asking natural-language questions to receive contextual insights.

The accuracy of a data agent depends on the business context it is provided, and its ability to map appropriate metrics and dimensions to users’ inquiries. The Looker dashboard agent has direct context about the user’s applied filters, cross-filters, and pre-curated tiles, helping it to generate highly relevant and accurate answers to complex business questions.

Should a query require more data, the agent can access underlying Explores to uncover additional information. These insights are paired with relevant charts and natural language explanations to simplify data exploration.

Explore data beyond dashboard to uncover deeper insights

Tailor the agent to your business

Data analysts curate dashboards to provide business users with precise perspectives on organizational data. To maintain this kind of consistent and reliable analytical environment, the Looker dashboard agent is highly configurable. Analysts can add context on top of the Looker semantic layer by providing natural-language instructions directly to the agent. This way, they can define exactly how the agent interprets unique business logic and tailors responses for the target audience. By enabling self-serve data analysis, dashboard agents help analyst teams scale to meet the increasing data demands of the business.

Configure Looker dashboard agents

Inherited trust and transparency

For users to adopt an AI-based system, they must trust the information it provides them. When generating an insight, the Looker dashboard agent explicitly shows its work by displaying intermediate reasoning, referenced dashboard tiles, and applied filters. Administrators, in turn, need to trust that users can access only the data and insights they are authorized to see. The dashboard agent is backed by Looker’s governance model, managed through standard permissions.

We are actively working on additional capabilities for the Looker dashboard agent, including support for iframe embedding, allowing organizations to bring dashboard agents alongside Looker dashboards into any essential portal or application.

Enable dashboard agents today

With Looker version 26.08.11 and later, administrators can activate the dashboard agent capability by toggling "Enable Chat with Dashboard" within the Gemini in Looker settings. Once enabled, authorized users will see the Gemini icon and can begin chatting with their dashboard data immediately. Please explore our support documentation for more detailed information.

Deep dive: How Lightning Engine delivers 4.9x faster Apache Spark performance

Wed, 10 Jun 2026 17:00:00 +0000

From foundational ETL and analytics to the frontier of generative AI, Apache Spark serves as the architectural backbone for global data processing. However, as data volumes scale, the trade-off between performance and infrastructure costs can be a limiting factor for growth. In the agentic era, where autonomous agents can trigger thousands of concurrent, multi-hop queries, this performance bottleneck directly dictates your unit economics.

We are excited to announce the general availability of Lightning Engine for Managed Service for Apache Spark, available across both our serverless and managed clusters deployment modes. Designed to address these scaling challenges directly, it is fully compatible with modern Spark workloads and requires zero changes to your existing data pipelines.

Whether you choose the zero-ops simplicity of our serverless deployment mode or the fine-grained infrastructure control of our managed clusters deployment mode, Lightning Engine serves as the unified performance engine to supercharge your job execution. By validating Lightning Engine across more than one million real-world workloads, we have fine-tuned it for industrial-grade stability as well as reliable performance gains.

With this general availability release, Lightning Engine delivers:

Up to 4.9x faster performance than standard open-source Spark
2x the price-performance over the leading high-speed Spark alternative

Let’s take a closer look at how Managed Service for Apache Spark achieves these results.

Under the hood: Vectorized native execution

Traditional Spark execution is often bottlenecked by JVM execution overhead and garbage collection pauses. Lightning Engine bypasses these limitations by compiling Spark physical query plans into native C++ instructions optimized for Single Instruction, Multiple Data (SIMD) vectorization.

Built on the open-source Gluten and Velox runtimes with specialized Google-engineered enhancements, this native execution layer accelerates your most demanding data processing tasks with:

Vectorized sort: Accelerates sorting operations by processing data columnarly in native memory, significantly reducing CPU cycle overhead.
Accelerated window functions: Speeds up calculations performed across sets of rows (such as moving averages, aggregations, and deduplication) by executing them directly within the native C++ layer.
Smart fallback: If a query contains an operator or custom Java UDF that is not natively supported, the engine's intelligent push-down layer automatically and gracefully transitions that specific sub-tree back to the JVM, avoiding unnecessary data format conversions and preserving overall execution stability.
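
The smart fallback behavior described above can be pictured as a walk over the physical plan. The following is a toy illustration in plain Python, not Lightning Engine's implementation: `PlanNode`, `NATIVE_OPS`, and `assign_backends` are hypothetical names, and the list of supported operators is an assumption made for the example. It also assumes that "sub-tree" means the unsupported operator plus everything beneath it.

```python
from dataclasses import dataclass, field

# Assumed set of natively supported operators (illustrative only).
NATIVE_OPS = {"scan", "filter", "project", "sort", "window",
              "hash_aggregate", "hash_join"}


@dataclass
class PlanNode:
    """A node in a toy physical plan tree (hypothetical structure)."""
    op: str
    children: list = field(default_factory=list)


def assign_backends(node, parent_on_jvm=False):
    """Label each node 'native' or 'jvm', in pre-order.

    An unsupported operator sends itself and its whole sub-tree to the JVM;
    operators above it in the plan stay native.
    """
    on_jvm = parent_on_jvm or node.op not in NATIVE_OPS
    labels = [(node.op, "jvm" if on_jvm else "native")]
    for child in node.children:
        labels.extend(assign_backends(child, on_jvm))
    return labels


# A join whose right-hand input passes through a custom Java UDF.
plan = PlanNode("hash_join", [
    PlanNode("hash_aggregate", [PlanNode("scan")]),
    PlanNode("java_udf", [PlanNode("filter", [PlanNode("scan")])]),
])

for op, backend in assign_backends(plan):
    print(f"{op:15s} -> {backend}")
```

In this sketch only the branch rooted at the UDF leaves the native layer, so data crosses the native/JVM boundary once rather than at every operator.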

Optimized Cloud Storage and BigQuery connectors

High-performance compute is useless if the engine is starved for data. With Lightning Engine, we’ve optimized our storage connectors to ensure that reading data from Cloud Storage and BigQuery isn’t the bottleneck. Optimizations include:

Direct path connection: Bypasses multiple node hops and uses bi-directional streaming with Cloud Storage. This allows seek operations and vectorized readV APIs to run without reopening streams, accelerating scan times for complex, deeply nested Parquet or ORC files.
Metadata call reduction: Managing large-scale partitioned tables often comes with a hidden performance tax: the time spent simply listing files. Lightning Engine utilizes lexicographic listing in the driver to collect metadata and transmit it directly to executors, eliminating redundant Cloud Storage API calls and dramatically reducing Cloud Storage metadata costs.
Native BigQuery connector: Directly consumes BigQuery data in Arrow format. By avoiding the expensive conversion from Arrow to JVM UnsafeRow, the engine eliminates serialization overhead to accelerate scan times.

Broadcast joins and advanced query optimization

Lightning Engine incorporates an advanced, cost-based query optimizer inspired by Google's F1 and Spanner query engines, and introduces several custom optimization rules. Examples include:

Single HashTable caching: In standard broadcast joins, Spark builds join hash tables repeatedly across tasks. Lightning Engine builds the hash table once per executor and caches it, eliminating redundant CPU cycles and reducing the executor's memory footprint.
Aggregation pushdown: Automatically pushes partial aggregations below join shuffles. This minimizes the volume of data that must be transferred across the network, drastically reducing expensive shuffle stages.
Auto shuffle partitioning: Dynamically and adaptively determines the optimal number of shuffle partitions for each individual query stage based on runtime statistics, preventing out-of-memory (OOM) spills without over-partitioning.
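
To make the aggregation pushdown idea concrete, here is a minimal sketch in plain Python rather than Spark. The tables, the function names, and the use of a row count as a stand-in for shuffle volume are all assumptions for illustration; the optimizer's actual rules are not shown in this post.

```python
from collections import defaultdict

# Toy inputs: (customer_id, amount) order rows and a customer -> region lookup.
orders = [(1, 10), (1, 5), (2, 7), (2, 3), (2, 1), (3, 4)]
customers = {1: "EMEA", 2: "AMER", 3: "EMEA"}


def join_then_aggregate(orders, customers):
    """Baseline: every order row crosses the shuffle before the join."""
    shuffled = list(orders)
    totals = defaultdict(int)
    for customer_id, amount in shuffled:
        totals[customers[customer_id]] += amount
    return dict(totals), len(shuffled)


def aggregate_then_join(orders, customers):
    """Pushdown: pre-sum per join key, then shuffle one row per customer."""
    partial = defaultdict(int)
    for customer_id, amount in orders:
        partial[customer_id] += amount
    shuffled = list(partial.items())
    totals = defaultdict(int)
    for customer_id, subtotal in shuffled:
        totals[customers[customer_id]] += subtotal
    return dict(totals), len(shuffled)


baseline_totals, baseline_rows = join_then_aggregate(orders, customers)
pushdown_totals, pushdown_rows = aggregate_then_join(orders, customers)
print(baseline_totals == pushdown_totals, baseline_rows, pushdown_rows)
```

Both plans produce the same totals per region, but the pushdown variant moves one row per customer instead of one row per order. The benefit grows with the number of rows that share a join key.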

Learn more technical details and hear about Lowe’s experience with Lightning Engine at Google Cloud Next ’26.

Getting started

These updates are live and ready to use today! You can enable Lightning Engine directly through the Google Cloud console or via the gcloud CLI.

To submit a serverless batch job with Lightning Engine enabled, specify the premium tier in your Spark properties:

    gcloud dataproc batches submit pyspark my_script.py \
      --region=us-central1 \
      --properties=dataproc:dataproc.tier=premium \
      --properties=spark:spark.dataproc.lightningEngine.runtime=native

To spin up a new managed cluster with Lightning Engine and Native Query Execution (NQE) enabled, run the following command in your terminal:

    gcloud dataproc clusters create my-optimized-cluster \
      --region=us-central1 \
      --image-version=2.3 \
      --engine=lightning \
      --enable-component-gateway \
      --properties=spark:spark.dataproc.lightningEngine.runtime=native

Alternatively, navigate to the Managed Service for Apache Spark page in the Google Cloud console, click Create Cluster, select Cluster on Compute Engine, and choose Lightning Engine under the cluster configuration settings to automatically activate query acceleration for your workloads.

What’s new with Google Data Cloud

Wed, 10 Jun 2026 16:00:00 +0000

July 6 - July 10

New Lakehouse managed tables now in preview
Lakehouse tables for Apache Iceberg are now in preview and available in the console. By using Google-managed Apache Iceberg tables in Lakehouse, you can eliminate the overhead of maintaining duplicate data pipelines and complex synchronization logic between BigQuery and open-source engines. This unified table format delivers native, multi-engine read and write interoperability, allowing you to run concurrent DML/DDL operations across diverse analytics tools on a single, shared storage layer. Built-in automated table management handles painful background optimization tasks like compaction and partition tuning, freeing up your team to focus on building rather than managing storage maintenance.

June 1 - June 5

Beyond the Query: Powering AI Agents with Bigtable, Firestore & Memorystore
Discover the latest advancements in Google Cloud's NoSQL Database portfolio, including Bigtable, Firestore, and Memorystore. This series is designed for a broad audience, whether you are exploring these databases for the first time or are an existing user looking to leverage the new capabilities announced at Next '26.

Register here to secure your spot!

Cloud Engineer's AI Toolkit Workshops: Solve data-driven challenges with BigQuery, AlloyDB, Gemini and more. Hosted by Google Cloud Labs, this highly technical event is built specifically for Platform Engineers, SREs, and cloud infrastructure teams ready to bridge the gap between AI prototypes and production-grade deployments. Look out for more locations coming soon.

Toronto - June 25 (Data Cloud) | RSVP Here
Chicago - June 30 (Data Cloud) | RSVP Here
Start a 10-day Bigtable free trial with a 1-node SSD cluster and up to 500 GB of storage capacity. With no credit card required to start, you can easily ingest and manage workloads that require low-latency, high-throughput, and predictable access. Plus, new Google Cloud customers get $300 in free credits on signup.

May 11 - May 15

Managed Service for Apache Airflow has launched a wave of new features, including the general availability of Airflow 3.1, AI-powered agentic troubleshooting, a new managed Airflow MCP Server for custom agent integration, and declarative YAML-based orchestration pipelines—discover all the details in the full blog post.

April 20 - April 24

Google-built ODBC Driver for BigQuery is now available in Preview
We are excited to announce the launch of the new, Google-built ODBC driver for BigQuery. This new open-source driver provides a direct, high-performance connection for applications to BigQuery and is developed entirely in-house by Google. Download the new driver and connect your application to BigQuery.

April 13 - April 17

We announced we are reintroducing Data Studio to play a significant role in the AI era, expanding from data visualizations and reports to host BigQuery conversational agents and data apps built in Colab notebooks.
We announced BigQuery Graph is now available in preview, offering an easy-to-use, highly scalable graph analytics solution, empowering data professionals to model, analyze and visualize massive-scale relationships in an entirely new way.

April 6 - April 10

We introduced Conversational Analytics for Looker Embedded environments, enabling users to add natural language experiences to their own custom data-driven applications, powered by Gemini.
We expanded Looker’s capabilities for faster ad-hoc analysis, with the introduction of self-service Explores, enabling you to bring your own data to Looker’s semantic layer and gain instant access to insights in a governed data environment.

March 23 - March 27

We showed you how you can scale your reads with Cloud SQL autoscaling read pools. This feature allows you to provision multiple read replicas that are accessible via a single read endpoint and to dynamically adjust your read capability based on real-time application needs.
Our customers are leveraging the full power of Conversational Analytics and Looker to drive major business and technical breakthroughs in the AI era. Companies like Telenor, Pet Circle, Fluent Commerce, Lighthouse Intelligence, Wego, and ROLLER are turning data into insights and actions, grounded by Looker’s semantic layer.

March 16 - March 20

We introduced an enhanced Gemini assistant in BigQuery Studio, transforming the agent from a code assistant into a fully context-aware analytics partner.

February 23 - February 27

We introduced managed and remote MCP support for Google Cloud databases, including AlloyDB, Spanner, Cloud SQL, Bigtable and Firestore, to power the next generation of agents. This announcement extends the ability for AI models to plan, build, and solve complex problems, connecting to the database tools our customers leverage daily as the backbone of their work environment.
We outlined how you can build a conversational agent in BigQuery using the Conversational Analytics API to help you build context-aware agents that can understand natural language, query your BigQuery data, and deliver answers in text, tables, and visual charts.

February 16 - February 20

Our customers are leveraging the full power of Looker to drive major business and technical breakthroughs. Companies like Arrive, Audika, Carousell, Framebridge, GumGum, Intel, Overdose Digital, Ocean Network Express, Subskribe and Promevo are leveraging Looker’s newest AI-driven capabilities, including Conversational Analytics, to transform data to insights and actions, and empower their entire organization with a single source of truth, powered by Looker’s semantic layer.

February 2 - February 6

Join us on March 4 for our webinar, Win Your AI Strategy with Cloud SQL Enterprise Plus, to learn how to power your generative AI workloads with 3x higher performance and 99.99% availability. Register today to discover how to build a scalable, enterprise-grade foundation for your most demanding AI applications.

January 26 - January 30

We introduced Conversational Analytics in BigQuery, which allows users to analyze data using natural language. Conversational Analytics in BigQuery is an intelligent agent that generates, executes and visualizes answers grounded in your business context directly in BigQuery Studio, making data insights for data professionals more conversational.
We outlined how data products have become the foundation for AI agents, providing the context needed to make autonomous agents reliable and trusted for real business use, backed by organized business logic and semantic understanding.
We highlighted how you can supercharge data analytics workflows, and outlined Google Cloud’s AI agent offerings for data engineering, data science, and development tools, so you can integrate agentic workflows in your applications, empower your teams and speed discovery.

January 19 - January 23

We have fundamentally reimagined Firestore with pipeline operations for Enterprise edition. Experience a powerful new engine featuring over a hundred new query features, index-less queries, new index types, and observability tooling to improve query performance. Seamlessly migrate using built-in tools and leverage Firestore’s existing differentiated serverless foundation, virtually unlimited scale, and industry-leading SLA. Join a community of 600K developers to craft expressive applications that maximize the benefits of rich queryability, real-time listen queries, robust offline caching, and cutting-edge AI-assistive coding integrations.
Introducing Google Cloud SQL on MSSQLTips: We are highlighting a new technical guide published on MSSQLTips titled "Introducing Google Cloud SQL." This article serves as an essential resource for SQL Server administrators and developers exploring Google Cloud's fully managed database service. It provides a detailed overview of Cloud SQL capabilities, including high availability, security integration, and the seamless transition of on-premises SQL Server workloads to the cloud, making it an ideal resource for those planning their migration strategy.
We are excited to announce the Public Preview of Microsoft Entra ID (formerly Azure Active Directory) integration with Cloud SQL for SQL Server. Designed to tackle the challenge of identity sprawl in multi-cloud environments, this integration allows organizations to govern database access using their existing Microsoft identity infrastructure. Key benefits include centralized identity management, enhanced security features like Multi-Factor Authentication (MFA), and simplified user administration through direct group mapping. This feature is available for SQL Server 2022 and supports both public and private IP configurations.

January 12 - January 16

Google-built JDBC Driver for BigQuery is now available in Preview
We are excited to announce the launch of the new, Google-built JDBC driver for BigQuery. This new open-source driver provides a direct, high-performance connection for Java applications to BigQuery and is developed entirely in-house by Google. Download the new driver and connect your Java application to BigQuery.
Troubleshoot Airflow tasks instantly with Gemini Cloud Assist investigations: Cloud Composer just got smarter. We are excited to announce that Gemini Cloud Assist investigations are now available directly within Cloud Composer 3. Instead of manually sifting through raw logs, you can now simply click "Investigate" on a failed Airflow task. Gemini analyzes logs and task metadata to identify failure patterns—such as resource exhaustion or timeouts—and provides actionable recommendations driven by Gemini Cloud Assist to resolve the issue. This integration shifts the debugging experience from manual toil to automated root cause analysis, significantly reducing the time required to restore your pipelines. Learn more about AI-assisted troubleshooting.

Modernizing Healthcare: How Alcidion achieved greater stability and performance with AlloyDB

Mon, 08 Jun 2026 16:00:00 +0000

In clinical informatics, every second counts. For Alcidion, a global leader in smart health solutions, the mission is simple but critical: use technology to reduce cognitive load for clinicians and present the right information at the right time to save lives.

Whether it’s managing patient flow in an emergency department or ensuring a patient is in the correct ward to avoid adverse outcomes, Alcidion’s flagship platform, Miya Precision, serves as a dynamic intelligent care platform for modern hospitals. To power this mission, the platform recently underwent a major architectural transformation, migrating from a legacy Microsoft SQL Server environment to Google Cloud’s AlloyDB for PostgreSQL.

The challenge: overcoming performance bottlenecks

Operating in an industry where data integrity and uptime are non-negotiable, Alcidion faced several technical and operational hurdles with its previous setup:

Operational overhead: Managing persistent backends for SQL Server required significant manual effort. The team had to manually balance database loads between elastic pools to maintain performance while trying to optimize costs. They also had to constantly manage the gap between allocated and used space to prevent shared pools from being consumed by excessive slack space.
Performance latency: Complex JSON data processing, critical for modern health informatics, was taking up to 30 minutes for certain jobs.
Stability concerns: The team sought a more stable Kubernetes environment and a persistent backend that could scale without constant administrative intervention.

The solution: a smooth migration to AlloyDB

Alcidion used the Database Migration Service (DMS) to move from SQL Server to AlloyDB, achieving a remarkably efficient cutover. The total learning and migration process took under one month, with the core database move completed in only one and a half weeks.

By creating custom synchronization tools and using Google Cloud’s managed services, the team reduced the final transition window to just 15 minutes. Alcidion achieved this by spinning up a new Google Cloud instance synchronized to the active one, with both accessible via unique fully qualified domain names. The new environment remained in read-only mode for customer validation.

During the final cutover, the old instance was set to read-only, synchronization was halted, and external integration links were toggled to the new environment. This streamlined process allowed users to log into the new instance and resume work within minutes, with the primary delay being DNS record updates.

Alcidion chose a fully managed AlloyDB service to eliminate control plane tasks and administrative overhead. This shift allows their engineering team to focus on clinical innovation and product development rather than "managing the container" or the underlying database infrastructure.

Being able to cut over to AlloyDB in about 15 minutes had our users back to work almost immediately. For a system clinicians rely on around the clock, that kind of smooth transition gave Alcidion real confidence.

The results: impact by the numbers

The shift to AlloyDB and Google’s Agentic Data Cloud has delivered immediate, quantifiable improvements for Alcidion and its healthcare customers:

Faster data processing: Data processing that previously relied on SQL Server stored procedures — a process that became increasingly time-consuming as data volumes grew — has been transformed. By migrating to AlloyDB and using BigQuery and Dataflow for processing, Alcidion has seen jobs that once took 30 minutes now complete in just 5 to 60 seconds.
Enhanced stability: The migration has delivered a step-change in reliability. In the previous environment, the team faced monthly disruptions, ranging from failed scheduled maintenance to connectivity issues that required manual intervention. In contrast, AlloyDB and Google Cloud’s compute services have proven exceptionally stable, allowing the team to move away from the "firefighting" mode associated with frequent infrastructure crashes.
Reduced cognitive load: By simplifying their backend and clinical dashboards, Alcidion’s SREs have significantly reduced their administrative burden. This shift has freed the team to focus on high-value innovation, such as refining predictive analytics and generative AI that empower clinicians to make informed clinical decisions faster.

Future vision: AI and beyond

Alcidion isn't stopping at database modernization. The move to AlloyDB is a foundational step for their next phase of growth:

AlloyDB columnar engine: The team is exploring the columnar engine for a second round of query optimization and real-time analytics.
Generative AI apps: Alcidion is actively working with Google to use AlloyDB’s Gemini Enterprise Agent Platform integration to perform concept analysis and pick out critical clinical insights from vast datasets.

By moving to AlloyDB, Alcidion has improved its stability and performance and built a strong foundation to keep delivering smarter, safer care to hospitals worldwide.

Ready to modernize your database? Learn more about how AlloyDB can transform your operational workloads.

What's new for Managed Service for Apache Spark clusters

Thu, 04 Jun 2026 16:00:00 +0000

At Google Cloud, our goal is to let you run large-scale analytical and data science workloads with maximum efficiency so you can process big data pipelines, machine learning, and ETL tasks.

We recently announced that the Dataproc service is now Managed Service for Apache Spark, reflecting our deep integration with the Agentic Data Cloud.

To support the diverse architectural needs of today’s modern data teams, we offer the service in two distinct deployment modes: serverless and managed clusters. The serverless deployment mode completely abstracts infrastructure management for ephemeral or ad-hoc jobs, while the managed clusters deployment mode is designed for teams that require fine-grained infrastructure customization, persistent environments, long-running stateful processing, or native integration with custom Compute Engine hardware configurations.

When it comes to managed cluster deployments, we’ve re-imagined the experience from the ground up, focusing on three core pillars: making Spark faster by supercharging execution speeds, easier to run by maximizing resource obtainability and reducing operational overhead, and smarter by embedding AI directly into the development and operational lifecycle.

This blog post focuses specifically on what we announced at Google Cloud Next ’26 for the Managed Spark clusters deployment mode: enhanced flexibility to fine-tune performance and cost through a native execution engine, smarter scaling policies, and Gemini-powered extensions. For the latest on the serverless deployment mode, check out this blog.

Faster, with the Lightning Engine native execution engine

Arguably the biggest update for Managed Spark clusters is Lightning Engine, which introduces massive performance gains for Spark DataFrame/Dataset APIs and heavy Spark SQL queries. Powered by a native, C++ vectorized execution engine built on Velox and Gluten, with specialized internal enhancements, Lightning Engine bypasses JVM execution bottlenecks by compiling query plans into native instructions optimized for SIMD (Single Instruction, Multiple Data) vectorization.

This native execution engine delivers:

Up to 4.9x faster performance than standard open-source Spark
Up to 2x the price-performance over the leading high-speed Spark alternative

Crucially, taking advantage of these performance gains doesn’t require any code changes to your existing Spark applications. Because your jobs complete faster, you directly reduce your aggregate Compute Engine runtime hours and overall spend.
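
As a rough back-of-the-envelope illustration of that relationship, runtime hours scale inversely with the speedup. The sketch below assumes a hypothetical baseline of 100 compute hours and treats the headline "up to 4.9x" figure as a best case applied to the whole job. It ignores any pricing differences between tiers, so treat the result as an upper bound on the reduction, not a forecast. The function name `accelerated_hours` is made up for this example.

```python
def accelerated_hours(baseline_hours, speedup):
    """Estimated runtime hours after a uniform speedup (simplifying assumption)."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_hours / speedup


baseline = 100.0  # hypothetical monthly Compute Engine runtime hours
best_case = accelerated_hours(baseline, 4.9)
print(f"{baseline:.0f} h -> {best_case:.1f} h in the best case")
```

Real workloads will land somewhere between the baseline and this best case, depending on how much of each job the native engine can accelerate.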

To enable Lightning Engine on your managed clusters, simply specify the Lightning Engine option when you’re creating a cluster.

Learn technical details and hear Lowe’s experience with Lightning Engine

Easier: Maximize resource obtainability via Flexible VMs

Temporary localized shortages of a specific machine type can stall cluster creation or interrupt autoscaling. To dramatically improve cluster resilience against capacity constraints, Flexible VMs for Managed Spark clusters are now generally available.

Flexible VMs allow you to define up to ten ranked machine types for your master, primary, and secondary worker nodes. Managed Service for Apache Spark pairs this preference with automated regional zone placement, dynamically scanning the entire region to fulfill your capacity requests using the best available hardware layout. This helps ensure your pipelines spin up predictably, drastically reducing resource availability errors, and maximizing your ability to capture cost-effective Spot VM capacity during periods of peak demand.
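
One way to picture the ranked-preference idea is as a search over machine types in priority order across the zones of a region. This is a simplified sketch under stated assumptions, not the service's placement algorithm: `pick_placement` and the capacity map are hypothetical, and the real service may weigh other factors.

```python
MAX_RANKED_TYPES = 10  # the post states up to ten ranked machine types


def pick_placement(ranked_types, capacity_by_zone, needed):
    """Return (machine_type, zone) for the highest-ranked type with capacity.

    capacity_by_zone maps zone -> {machine_type: available VM count}.
    Returns None when no zone can satisfy the request with any listed type.
    """
    if len(ranked_types) > MAX_RANKED_TYPES:
        raise ValueError("at most ten ranked machine types are allowed")
    for machine_type in ranked_types:
        for zone in sorted(capacity_by_zone):
            if capacity_by_zone[zone].get(machine_type, 0) >= needed:
                return machine_type, zone
    return None


# Hypothetical snapshot: the first choice is short everywhere in the region.
ranked = ["n2-standard-8", "n2d-standard-8", "e2-standard-8"]
capacity = {
    "us-central1-a": {"n2-standard-8": 2, "e2-standard-8": 50},
    "us-central1-b": {"n2-standard-8": 0, "n2d-standard-8": 20},
    "us-central1-c": {"e2-standard-8": 40},
}
print(pick_placement(ranked, capacity, needed=10))
```

Here the request falls through to the second-ranked type in a different zone instead of failing outright, which is the resilience benefit the feature is aiming for.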

Easier: Zero-scale clusters and scheduled stops

To give you better fiscal control over persistent and developmental environments, we recently announced the general availability of two highly requested FinOps features: zero-scale clusters and cluster scheduled stops.

Zero-scale clusters: You can now provision environments that use exclusively secondary workers (Spot VMs), enabling the cluster to automatically scale down to absolutely zero worker nodes when no processing is active, leaving only the master node online to preserve metadata.
Cluster scheduled stops: This feature lets you configure automated cluster shutdown policies based on specific idle-time limits or a precise future timestamp.

Because these features are natively integrated, you avoid the operational friction of deleting and reconstructing your environment, and you stop paying for idle compute during nights and weekends.
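
The two scheduled-stop triggers, an idle-time limit and a fixed future timestamp, reduce to a simple check. The sketch below only illustrates that logic; `should_stop` and its parameters are hypothetical names, not part of the service's API.

```python
from datetime import datetime, timedelta, timezone


def should_stop(now, last_active, max_idle=None, stop_at=None):
    """Return True if either configured stop condition has been met.

    max_idle: stop once the cluster has been idle for at least this long.
    stop_at:  stop once this absolute timestamp has passed.
    """
    if stop_at is not None and now >= stop_at:
        return True
    if max_idle is not None and now - last_active >= max_idle:
        return True
    return False


now = datetime(2026, 6, 5, 18, 0, tzinfo=timezone.utc)
last_job_finished = now - timedelta(minutes=45)

# Idle for 45 minutes against a 30-minute limit: the cluster should stop.
print(should_stop(now, last_job_finished, max_idle=timedelta(minutes=30)))
```

A cluster with neither condition configured never stops on its own in this sketch, which mirrors the idea that scheduled stops are an opt-in policy.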

Smarter: Managed Service for Apache Spark MCP Server

To bridge the gap between generative AI and data engineering, we launched the Model Context Protocol (MCP) server for Managed Service for Apache Spark. This open-standard integration allows LLMs and AI assistants to securely and dynamically interact with your Managed Spark clusters using natural language.

By utilizing the MCP server, your AI agents can securely connect to your data platform under existing IAM permissions. This allows agents to perform cluster-based operations, such as creating a cluster, submitting a job, or adjusting an autoscaling policy, directly from your AI application.

Smarter: Accelerating AI with the Data Agent Kit

The Google Cloud Data Agent Kit extension allows data scientists, engineers, and developers to manage their entire data workload lifecycle directly within their preferred development environment. We rolled out native support for this extension on Managed Spark clusters, enabling teams to seamlessly build and deploy specialized Data Agents for code generation and data wrangling.

Developers can choose to use Antigravity 2.0, Google's standalone agentic development platform, or bring these agentic capabilities into their preferred IDE, including VS Code, Claude Code, or Codex, via the Data Agent Kit extensions and plugins. By pairing this streamlined workflow with the raw processing power of managed clusters, these intelligent agents can securely execute complex workflows directly over petabyte-scale data lakes. Specifically, the Data Agent Kit enables developers to:

Build and orchestrate pipelines: Author multi-node data pipelines and generate comprehensive code documentation using natural language.
Perform real-time debugging: Leverage Gemini Cloud Assist to sift through executor logs, pinpoint root causes of job failures, and recommend actionable fixes.
Easily connect to Spark resources: Instantly attach to serverless Spark runtimes or managed clusters without manual network configuration or local Spark installations.
Streamline Git and CI/CD management: Commit, merge, and deploy code directly from your IDE of choice, triggering automated testing and deployment pipelines without friction.

Smarter: Next-generation Lakehouse

We recently launched Lakehouse, which delivers read/write interoperability between engines like Managed Service for Apache Spark and BigQuery. By leveraging the Lakehouse runtime catalog as a unified, serverless metadata layer, it removes data silos and the need for complex translation layers. This agentic-first approach allows organizations to process open formats directly from Google Cloud Storage, or even query remote AWS datasets using the newly introduced cross-cloud Lakehouse, all while maintaining a single source of truth for security and governance.

For customers utilizing Managed Spark clusters, this integration unlocks several powerful new capabilities. Data teams can now accelerate their most demanding ETL and data science workloads by up to 4.9x using the optimized Lightning Engine.

Next-gen runtimes: Cluster Image 3.0 with Spark 4.1

Keeping pace with the open-source ecosystem, we rolled out Cluster Image 3.0 in preview, built with Apache Spark 4.1 and featuring an upgraded default Java runtime, Java 21. Spark 4.1 introduces a set of core open-source capabilities, including real-time mode for structured streaming. This enables your Spark environment to support real-time streaming with continuous, sub-second latency processing.

Get started today

These updates are live and ready to use today in Managed Spark clusters! You can enable these new features directly through the Google Cloud console or via the gcloud CLI.

To spin up a new managed cluster with Lightning Engine enabled, run the following command in your terminal:

    gcloud dataproc clusters create my-optimized-cluster \
      --region=us-central1 \
      --image-version=2.3 \
      --engine=lightning

Alternatively, navigate to the Managed Service for Apache Spark page in the console, click Create cluster, and select ‘Enable Lightning Engine’ under the cluster configuration settings to automatically activate Lightning Engine for your Spark jobs.

We look forward to hearing about the environments you build and run as Managed Service for Apache Spark clusters!

What’s new in serverless Managed Service for Apache Spark

Wed, 03 Jun 2026 16:00:00 +0000

Whether you use it for data preparation, real-time interactive queries, AI model training, or something entirely different, running Apache Spark at scale is demanding — you shouldn’t have to manage the underlying infrastructure too.

Late last year, we announced the general availability (GA) of our serverless Managed Service for Apache Spark runtime version 3.0, prioritizing speed, simplicity, and reliability. Since then, customer use of Managed Service for Apache Spark for data science has nearly doubled year over year. This is a testament to our belief that Google Cloud is the easier, smarter, and faster place to run your Apache Spark workloads.

In this blog, let’s dive into a few key features that make our serverless Apache Spark offering a great fit for a wide range of workflows, including feature engineering, GPU-accelerated model training and tuning, semantic search, RAG, building AI agents and applications, and more.

Zero-setup onboarding

The most significant barrier to entry for a cloud service is often the "time to magic moment" — the interval between creating a project and running your first workload. Previously, with serverless Spark, you still needed to manually configure IAM roles, VPC networking, and firewall rules before submitting a single job.

In the serverless Spark 3.0 runtime version, zero-setup onboarding significantly reduces the time to launch your first workload on serverless Spark. It does so by automating the following steps:

Permissions: Necessary IAM roles and permissions are automatically provisioned to the appropriate service accounts.
Networking: Private Google Access is auto-enabled on subnets, and system firewall policies are configured automatically.
API management: Enabling APIs is now more efficient; you can just enable the Managed Service for Apache Spark API instead of manually having to enable several different APIs, as you did previously.

Fast startup for SLA-sensitive workloads

Latency matters, especially for interactive data science and SLA-sensitive batch pipelines. Historically, serverless Spark startup could take several minutes. With the 3.0 runtime, we’ve cut startup times by 75% across both standard and premium tiers, delivered automatically without any code or configuration changes and at no additional cost.

This massive improvement qualifies serverless Spark for a much broader range of SLA-sensitive workloads, and we’re always looking to optimize startup times even further.

"Serverless Spark allowed us to quickly reap benefits by removing the need for fine-grain machine management. This drove faster model development and significantly reduced our data processing costs." - César Narnajo, Principal Engineer, Moloco

Better GPU obtainability

Support for Dynamic Workload Scheduler (DWS) Flex Start Mode in the serverless 3.0 runtime version allows serverless Spark to queue customer requests for a configurable duration when GPUs are unavailable. This feature addresses the obtainability challenges for high-demand accelerators like NVIDIA A100 and L4, which are subject to frequent regional shortages. By pausing workloads until the necessary GPU capacity becomes accessible with DWS, you can dramatically increase obtainability and reliability for your latency-sensitive AI/ML workloads.

First-class support for Apache Spark 4.x

The serverless Spark 3.0 runtime version supports current and upcoming Apache Spark 4.x innovations, including Spark Connect, which supports a decoupled client-server architecture that enables remote connectivity from any client.

Enhanced multi-zonal support

To protect global enterprise workloads from zonal outages or hardware stockouts, the serverless Spark 3.0 runtime introduces enhanced multi-zonal support by default. The service can now automatically allocate execution nodes across multiple zones within a single region to help ensure obtainability.

Crucially, we do not charge for cross-zonal network traffic between nodes in a region, providing high availability without the traditional multi-zone tax. This is another benefit that you can realize by bringing your global Apache Spark workloads to Google Cloud.

Looking ahead

In addition to the above, we’re continuing to innovate and push the boundaries of ease of use in areas such as history-based autotuning and goal-based autoscaling.

Get started today

You can take advantage of these features today by specifying runtime_version: 3.0 in your batch workloads or interactive sessions. To run your first workload on serverless Spark, perform the following simple steps:

Enable the Managed Service for Apache Spark API.
If you aren’t the project owner, ask your project admin for the serverless Managed Service for Apache Spark Editor (roles/dataproc.serverlessEditor) role on the project.

Now you’re ready to start running your workloads on the Serverless 3.0 runtime version. For more details, visit our updated documentation and access serverless Managed Service for Apache Spark in the Google Cloud console.
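As a rough illustration of pinning a batch workload to the 3.0 runtime, the sketch below builds a request body as a plain Python dictionary. The field names (`pysparkBatch`, `mainPythonFileUri`, `runtimeConfig.version`) are an assumption based on the existing Dataproc Serverless batches REST resource, and the bucket path is hypothetical; check the current documentation for the exact schema before relying on it.

```python
import json


def build_batch_request(main_python_file_uri: str, runtime_version: str = "3.0") -> dict:
    """Build a batch request body that pins the serverless runtime version.

    Field names are assumed from the Dataproc Serverless batches REST
    resource; they are not confirmed by this post.
    """
    return {
        "pysparkBatch": {"mainPythonFileUri": main_python_file_uri},
        "runtimeConfig": {"version": runtime_version},
    }


# Hypothetical script location, used only for illustration.
request_body = build_batch_request("gs://my-bucket/jobs/etl.py")
print(json.dumps(request_body, indent=2))
```

Keeping the version in one place like this makes it easy to move a workload between runtime versions when testing.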

Accelerating data lakes: Optimizing Apache Iceberg and Spark with gcs-analytics-core

Tue, 02 Jun 2026 16:00:00 +0000

Many data engineers spend significant time managing compatibility and getting the best performance across multiple analytics engines. To help solve this pain point, we are excited to announce gcs-analytics-core, a new open-source Java library designed to centralize and accelerate analytics optimizations for Google Cloud Storage (GCS).

With this library, you get the flexibility to select your preferred analytics engine while achieving high performance on GCS. The gcs-analytics-core library provides optimizations for analytics engines that you use today on GCS, starting with the Iceberg Spark engine, and we plan to expand to other analytics engines by the end of this year.

Built to be shared across major data processing frameworks like Apache Spark, this library consolidates and improves performance for analytics workloads on GCS. Available natively in the Apache Iceberg Java runtime starting from version 1.11.0, this library improves read operations for columnar formats like Parquet.

What is the gcs-analytics-core library?

The gcs-analytics-core library is a centralized optimization layer that sits between your analytics engines — such as Apache Spark, Trino, and Apache Hive — and the underlying GCS Java SDK. It intercepts read calls and injects performance enhancements, providing a consistent experience without requiring framework-specific tuning.

For Apache Iceberg users, it integrates into the GCSFileIO implementation, replacing traditional sequential reads with parallelized strategies to minimize latency and maximize throughput.

Key technical optimizations

The library introduces specific optimizations designed to reduce time spent on I/O and end-to-end execution time:

Vectored I/O (threaded): This feature improves read performance by fetching multiple data ranges in parallel within a single operation, reducing the overhead of GCS calls. Without this feature, the system needs to issue a separate call for each data range, increasing both the number of operations and open file latency for each request.
Smart Parquet prefetching: When reading Parquet data, analytics engines typically perform an initial read of the file’s footer, which contains the data structure and information about where specific data ranges are located. The library automatically prefetches this footer data in a single chunk (typically 50KB–100KB), avoiding the multiple network calls that often occur when engines repeatedly seek backward to fetch metadata.
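To make the two optimizations above concrete, here is a minimal Python sketch of the ideas, not the library's Java implementation. It uses a hypothetical in-memory `CountingStore` in place of GCS and a simplified file layout that mimics the Parquet tail (footer bytes, then a 4-byte little-endian footer length, then the `PAR1` magic). All function names are illustrative assumptions.

```python
import struct
import threading
from concurrent.futures import ThreadPoolExecutor


class CountingStore:
    """In-memory stand-in for an object store that counts range requests."""

    def __init__(self, data: bytes):
        self.data = data
        self.calls = 0
        self._lock = threading.Lock()

    def read_range(self, offset: int, length: int) -> bytes:
        with self._lock:
            self.calls += 1
        return self.data[offset:offset + length]


def make_fake_parquet(body: bytes, footer: bytes) -> bytes:
    # Simplified layout: magic, data, footer, 4-byte LE footer length, magic.
    return b"PAR1" + body + footer + struct.pack("<I", len(footer)) + b"PAR1"


def read_footer_naive(store: CountingStore) -> bytes:
    size = len(store.data)
    tail = store.read_range(size - 8, 8)  # first request: length + magic
    (footer_len,) = struct.unpack("<I", tail[:4])
    # Second request: seek backward to fetch the footer itself.
    return store.read_range(size - 8 - footer_len, footer_len)


def read_footer_prefetch(store: CountingStore, prefetch: int = 64 * 1024) -> bytes:
    size = len(store.data)
    start = max(0, size - prefetch)
    tail = store.read_range(start, size - start)  # one request for the whole tail
    (footer_len,) = struct.unpack("<I", tail[-8:-4])
    if footer_len + 8 > len(tail):
        # Footer is larger than the prefetch window: fall back to a second read.
        return store.read_range(size - 8 - footer_len, footer_len)
    return tail[-8 - footer_len:-8]


def read_ranges_vectored(store: CountingStore, ranges, workers: int = 4):
    # Fetch several (offset, length) ranges concurrently, preserving input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: store.read_range(*r), ranges))


naive = CountingStore(make_fake_parquet(b"x" * 1000, b"schema-and-offsets"))
prefetched = CountingStore(naive.data)
read_footer_naive(naive)
read_footer_prefetch(prefetched)
print(naive.calls, prefetched.calls)
```

In this toy setup the prefetching reader needs one request where the naive reader needs two; against a real object store, each avoided round trip saves network latency.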

Spotlight: Apache Iceberg integration

We delivered the first major integration of this library into Apache Iceberg. With Iceberg 1.11.0 or later, analytics engines utilizing Iceberg’s GCSFileIO can leverage these performance enhancements. To adopt the library in your environment, verify your Iceberg catalog is configured to use the native GCS FileIO:

    # Spark configuration example
    spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.gcp.gcs.GCSFileIO

Because the core optimizations are embedded within the updated Iceberg runtime and the GCS connector architecture, you automatically benefit from Parquet footer prefetching and multi-threaded vectored reads — with no complex custom tuning required.

You can follow the specific integration details in Apache Iceberg Issue #14326.

Catalog compatibility

The gcs-analytics-core library is compatible with all Iceberg catalogs including the REST catalog, Hive, and other metadata management systems. By decoupling the performance optimizations from the catalog management layer, the library provides consistent read improvements without requiring adjustments to your existing infrastructure setup so you can scale across diverse data lake architectures.

TPC-DS Performance Benchmarks using Spark

To validate these improvements, end-to-end benchmarking was performed using an open source Apache Spark cluster with an Iceberg catalog configured to use GCSFileIO along with the gcs-analytics-core library.

The benchmark leveraged the industry-standard TPC-DS schema across varying dataset sizes (from 1GB up to 10TB), specifically comparing the new library's optimizations against the default GCSFileIO implementation, which uses sequential vectored reads.

By alleviating the I/O bottleneck at the storage layer, compute engines spend less time waiting for network responses (scan time) and more time processing data (execution time).

Here are the end-to-end TPC-DS benchmark results showcasing the percentage improvement when enabling gcs-analytics-core:

TPC-DS schema size	Scan time improvement	Execution time improvement
1 GB	71.51%	32.61%
10 GB	48.48%	18.94%
100 GB	40.98%	10.95%
1 TB	35.86%	3.38%
10 TB	18.40%	1.58%

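As the data shows, scan time improves at every dataset size, from 71.51% at 1 GB to 18.40% at 10 TB. The relative gain in end-to-end execution time narrows as datasets grow, from 32.61% at 1 GB to 1.58% at 10 TB. The library is effective for the complex query patterns in TPC-DS, delivering scan time reductions that lower overall query execution time.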
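One way to read the percentages in the table is as a reduction relative to the baseline time. That definition is an assumption here, since the post does not spell out the formula. Under it, a short Python sketch converts each improvement into the fraction of baseline time remaining and the equivalent speedup factor:

```python
# (scan time improvement %, execution time improvement %) from the table above.
RESULTS = {
    "1 GB": (71.51, 32.61),
    "10 GB": (48.48, 18.94),
    "100 GB": (40.98, 10.95),
    "1 TB": (35.86, 3.38),
    "10 TB": (18.40, 1.58),
}


def remaining_fraction(improvement_pct: float) -> float:
    """Fraction of the baseline time left, assuming improvement = (old - new) / old."""
    return 1 - improvement_pct / 100


def speedup(improvement_pct: float) -> float:
    """Equivalent speedup factor (old time / new time) under the same assumption."""
    return 1 / remaining_fraction(improvement_pct)


for size, (scan_pct, exec_pct) in RESULTS.items():
    print(f"{size:>6}: scan {speedup(scan_pct):.2f}x, execution {speedup(exec_pct):.2f}x")
```

Viewed this way, the gap between the scan and execution columns shows how much of each query's runtime is spent outside storage reads.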

Get started

Before running your Spark workloads, confirm that the following requirements and configurations are met:

Use Apache Iceberg Spark runtime 1.11.0+ and the iceberg-gcp-bundle 1.11.0+.
Configure your catalog to use GCSFileIO.
Enable the gcs-analytics-core optimization flag (spark.sql.catalog.$CATALOG_NAME.gcs.analytics-core.enabled=true).
Enable vectorized I/O (spark.sql.iceberg.vectorization.enabled=true) to achieve the best read performance.

    spark-submit \
      --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.11.0,org.apache.iceberg:iceberg-gcp-bundle:1.11.0 \
      --conf spark.sql.catalog.$CATALOG_NAME=org.apache.iceberg.spark.SparkCatalog \
      --conf spark.sql.catalog.$CATALOG_NAME.io-impl=org.apache.iceberg.gcp.gcs.GCSFileIO \
      --conf spark.sql.catalog.$CATALOG_NAME.gcs.analytics-core.enabled=true \
      --conf spark.sql.iceberg.vectorization.enabled=true \
      <your-application-jar-or-script>
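If you launch jobs from a script, the same settings can be generated per catalog rather than typed by hand. The helper below is a small sketch that only reuses the property names listed above; the function names and the example catalog name `lake` are hypothetical.

```python
def iceberg_gcs_conf(catalog_name: str) -> dict:
    """Return the Spark properties from this post for one Iceberg catalog."""
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.io-impl": "org.apache.iceberg.gcp.gcs.GCSFileIO",
        f"{prefix}.gcs.analytics-core.enabled": "true",
        "spark.sql.iceberg.vectorization.enabled": "true",
    }


def to_spark_submit_args(conf: dict) -> list:
    """Flatten the properties into '--conf key=value' arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


# "lake" is a placeholder catalog name.
print(" ".join(to_spark_submit_args(iceberg_gcs_conf("lake"))))
```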

The gcs-analytics-core library is open source, so you can explore the source code and contribute to the project. Our implementation and micro-benchmark configurations are part of the repository and can be referenced for your contributions or validations.

GitHub repository: GoogleCloudPlatform/gcs-analytics-core
Documentation: Review the design document for deep architectural details.

We want to hear about your experience. If you test this on your own datasets, please feel free to open an issue on GitHub or share your results with the community. We look forward to seeing how you utilize these optimizations in your data lakes.

The fully-managed Remote MCP Server for AlloyDB is now Generally Available

Mon, 01 Jun 2026 16:00:00 +0000

AI agents possess incredible reasoning capabilities and can perform increasingly complex actions. But the reliability of agentic outcomes depends entirely on the quality of the context they can access — context that is frequently locked away in operational databases.

To bridge this gap, we are excited to announce that the Remote Model Context Protocol (MCP) Server for AlloyDB is now generally available.

The Model Context Protocol (MCP) is an open-source standard that gives LLMs a secure, consistent way to connect to external data sources. As part of Google Cloud’s recent rollout of 50+ Google-managed MCP servers, this new integration makes it easier than ever for both interactive and autonomous agents to securely harness the full power of your enterprise data. For example, you can now ask an AI agent for an up-to-the-millisecond view of your delivery fleet by connecting it to your real-time logistics data in AlloyDB, avoiding inaccuracies due to stale data and reducing the need for manual reporting.

Why AlloyDB is a strong foundation for agentic apps

By connecting MCP to AlloyDB, your agents get access to the premier database built for enterprise-grade AI. AlloyDB delivers the scale, speed, and intelligence required for the most demanding agentic workloads:

Supercharged vector performance: Scale to over 10 billion vectors at up to 6x the speed of standard PostgreSQL for vector queries (and up to 10x faster for filtered queries) with the ScaNN index.
Advanced search and reranking: Power multimodal applications with hybrid search via RUM (in Preview) and intelligent reranking through Reciprocal Rank Fusion (RRF) or Gemini Enterprise Platform models.
Real-time intelligence: Efficiently generate millions of embeddings using built-in AI Functions to facilitate low-latency, real-time agentic experiences.
Unified data access: Give agents a single PostgreSQL interface to seamlessly join operational data in AlloyDB with analytical data in BigQuery or archived data in Iceberg tables via Lakehouse Federation.
Enterprise-grade scale: Rest easy with a 99.99% SLA, autopilot database optimizations, and auto-scaling read pools with up to 20 nodes.

Why Remote MCP matters for AlloyDB

Local MCP servers are great for local development, but communicating over standard input/output (stdio) streams becomes difficult when you scale to production workloads. It is both architecturally complex and administratively burdensome to provision and manage all of the infrastructure and security guardrails you need to run agents for high-value use cases that interact with sensitive operational data.

The Remote MCP Server for AlloyDB runs on fully-managed Google Cloud infrastructure and exposes an HTTP endpoint that connects your AI applications to your data. This solves key challenges for teams building agents on PostgreSQL:

Centralized discovery: Find, secure, and manage your database's MCP server using Agent Registry.
Fully-managed HTTP endpoints: No need to deploy or maintain the infrastructure required for connectivity. Configure your agent to use the endpoint to get started.
Fine-grained authorization: Instead of using shared database passwords or API keys, you use Identity and Access Management (IAM) to restrict agents to specific tables, schemas, or views. With the read-only execute SQL tool, you can prevent your agent from making accidental changes and deletions from your database.
Operational instance management: The AlloyDB toolset gives agents the ability to do more than run queries. Agents can update instances, export and import data, create backups, and restore clusters.
Model Armor protection: Model Armor provides optional prompt and response security to screen and filter data, defending against prompt injections or accidental data exfiltration.
Audit logging: Every query, action, and tool call goes to Cloud Audit Logs, giving security teams a full audit trail.

Let's see it in action: A quick demo

Getting started with the AlloyDB Remote MCP server is a straightforward process. To see it in action in your own environment, you can follow our new Codelab, which guides you through these essential steps:

API & environment prep: Enable the AlloyDB, Compute Engine, and Gemini Enterprise APIs in your Google Cloud project.
Provision your database: Deploy your AlloyDB cluster, create your database, and import your sample data.
Enable the Data Access API: Turn on the Data Access API for your AlloyDB instance.
Connect the agent: Configure your MCP client by providing the remote endpoint (https://alloydb.googleapis.com/mcp). Pass your Google Cloud IAM credentials using an OAuth 2.0 bearer token in the HTTP Authorization header.
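The last step above can be sketched in Python using only the standard library. This builds, but does not send, an HTTP request to the endpoint named in the post, with the bearer token in the Authorization header. The JSON-RPC `tools/list` method and the `Accept` header follow the public MCP specification; the token value is a placeholder, and in practice an MCP client library would perform the `initialize` handshake and session handling for you.

```python
import json
import urllib.request

MCP_ENDPOINT = "https://alloydb.googleapis.com/mcp"


def build_mcp_request(access_token: str, method: str = "tools/list",
                      request_id: int = 1) -> urllib.request.Request:
    """Build a JSON-RPC 2.0 request for a remote MCP server (not sent here)."""
    payload = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": {}}
    return urllib.request.Request(
        MCP_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # OAuth 2.0 bearer token carrying the caller's IAM identity.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


# "YOUR_ACCESS_TOKEN" is a placeholder; obtain a real token from your
# Google Cloud credentials before sending anything.
request = build_mcp_request("YOUR_ACCESS_TOKEN")
print(request.get_method(), request.full_url)
```

Because access is tied to the token's IAM identity, rotating or revoking that identity changes what the agent can reach without touching database passwords.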

Once the connection is established, your agent can provide reliable, grounded answers to complex business questions using your real-time operational data. By performing introspection queries, the agent automatically understands your database schema – including tables and columns – enabling it to construct sophisticated joins and queries to fulfill user requests accurately.

Once your agent has access to the AlloyDB toolset, it can execute queries, analyze operational trends, and dynamically rank text data using AlloyDB AI functions like AI.RANK().

Security remains paramount: the Remote MCP Server for AlloyDB integrates seamlessly with Model Armor. This provides protection against sensitive data leaks, even if the agent’s service account possesses broad access permissions within the database.

Watch the full demo below!

What's next

By enabling agents to interact securely with transactional data, we are embracing an architecture where AI agents can reliably access and act upon your enterprise’s single source of truth.

Ready to build? Discover AlloyDB with a 30-day free trial, and dive into the Remote MCP for AlloyDB Codelab to start powering your enterprise agentic applications today.

Modeling a digital twin of a food supply chain using BigQuery Graph

Mon, 01 Jun 2026 16:00:00 +0000

The example of a growing restaurant

Imagine you are running a restaurant chain. As the chain grows, you can no longer physically see and touch everything to know how your business is operating. You need tools, and a digital replica of your business, to sense its health for you.

The friction of growth

Growth creates a unique kind of friction that spreadsheets simply weren't built to solve:

The bullwhip effect: Small downstream demand shifts swell into upstream inventory tidal waves.
SOP drift: Tiny departures from standard prep work eventually erode the entire brand vibe.
The food safety blast radius: One contaminated ingredient creates a messy, complex map of risk across the network.
Maverick spend: The "million-dollar leak" caused by local managers purchasing ingredients off-contract.

The digital twin

Digital models empower us to ask more insightful questions about the world, but they also force a critical choice in how we structure data. While traditional relational tables have been the standard, we must ask: are they still the right tool for everything? Given that our world is inherently interconnected, perhaps shifting to graph-based models is the natural evolution for capturing reality.

When managing thousands of assets, complex supply chains, or global logistics networks, traditional relational databases require massive, resource-intensive SQL joins to trace dependencies. This architecture creates a latency gap between physical events and operational awareness.

Modeling with BigQuery Graph

BigQuery Graph allows you to build a digital twin of your entire supply chain within your existing data platform. By turning your physical world—items, recipes, and locations—into a searchable map of nodes and edges, you gain a new level of clarity.

1. Defining the Semantic Layer

Instead of moving data to a new database, you create a Graph View over your existing tables. This tells BigQuery exactly how your tables relate to one another.

Query Language:

    # Build the Graph Nodes & Edges
    CREATE OR REPLACE PROPERTY GRAPH `restaurant.bombod`
    NODE TABLES (
      `restaurant.item` LABEL item PROPERTIES ALL COLUMNS,
      `restaurant.location` LABEL location PROPERTIES ALL COLUMNS,
      `restaurant.itemlocation` LABEL itemlocation PROPERTIES ALL COLUMNS
    )
    EDGE TABLES (
      `restaurant.bom`
        KEY (bomKey)
        SOURCE KEY (childItemLocation) REFERENCES `restaurant.itemlocation`(itemLocationKey)
        DESTINATION KEY (parentItemLocation) REFERENCES `restaurant.itemlocation`(itemLocationKey)
        LABEL consists_of PROPERTIES ALL COLUMNS
    );

Image of a fictitious restaurant supply chain modeled using BigQuery Graph

Precision in practice

How does this change daily operations? It moves the business from panic to precision.

Surgical recalls: If a supplier reports a Listeria outbreak, you walk the graph forward to find exactly which menu items in which specific restaurants are affected.
Weather risk analysis: When a hurricane threatens a distribution center, you don't see a list of stores; you see the blast radius. You identify the locations critically dependent on that hub and reroute supplies.
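Walking the graph forward, as in the recall scenario above, is a breadth-first traversal over bill-of-materials edges. The Python sketch below illustrates the idea on a tiny, made-up set of child-to-parent edges; the item and location names are hypothetical, and in practice the traversal would run as a graph query in BigQuery rather than in application code.

```python
from collections import defaultdict, deque


def affected_downstream(edges, start):
    """Return every node reachable from `start` by following child -> parent edges."""
    graph = defaultdict(list)
    for child, parent in edges:
        graph[child].append(parent)

    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)


# Hypothetical bill-of-materials edges: (ingredient at location, item it goes into).
BOM_EDGES = [
    ("Chicken@DC1", "Grilled Chicken@Store12"),
    ("Grilled Chicken@Store12", "Chicken Wrap@Store12"),
    ("Chicken@DC1", "Chicken Soup@Store7"),
    ("Lettuce@DC1", "Chicken Wrap@Store12"),
]

print(affected_downstream(BOM_EDGES, "Chicken@DC1"))
```

The same traversal, started from a distribution center instead of an ingredient, yields the set of dependent locations in the weather scenario.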

2. Executing the search

Graph queries give modelers and data scientists a new way to query their data. They simplify complex, multi-domain data concepts and let an analysis follow the shape of the problem itself. To investigate a specific complaint or risk, you search the model using graph query language. For example, to find every location that handles chicken, you could run a graph query like the one shown below:

Graph Query Language

    # Navigate to the source of a specific ingredient issue
    GRAPH restaurant.bombod
    MATCH (a:itemlocation)-[c:consists_of]->(b:itemlocation)
    WHERE b.itemKey LIKE '%Chicken%'
    RETURN TO_JSON([TO_JSON(a), TO_JSON(c), TO_JSON(b)]) AS result

Source of a foul odor - modeled as a graph

Building for the future

To get the most out of your digital twin, follow these guiding principles:

Focus on structure: Use graphs for relationships and dependencies; keep daily sales totals in relational tables.
Clean your keys: Spend time on data engineering; a graph is only as strong as its connections.
Capture edge properties: Store metadata like lead times or shipping costs directly on the edges to increase the model's utility.

Conclusion

The restaurant industry has outgrown the relational way of treating business data only as a list. By building inter-domain relationships as a digital twin with BigQuery Graph, you move from reactive problem solving to proactive modeling. It’s time to stop managing your network with a list and start seeing the connections in seconds.

Get started today

Check out the tutorial here
Visit the BigQuery documentation: find the overview and quickstart guide.
Share your feedback: join our community, and get your questions answered via bq-graph-preview-support@google.com.
Related blog: Introducing BigQuery Graph

From petabytes to predictions: Easy BigQuery insights in Google Sheets

Fri, 29 May 2026 16:00:00 +0000

Many organizations’ single source of truth is data that resides in BigQuery, Google’s governed, secure and petabyte-scale data platform. However, the "last mile" of ad-hoc analysis, modeling, and reporting often happens where business users are most comfortable: Google Sheets.

Bridging this gap usually involves exporting data as CSVs. But this is inefficient, creating data silos, version control problems, and security and governance risks. Connected Sheets helps to eliminate this trade-off, turning the familiar Google Sheets interface into a direct, live window into your BigQuery data platform, letting you analyze petabytes of data quickly, securely, and easily.

In this post, we’ll do a quick overview of Connected Sheets, walk through real-world use cases, and show you how to perform enterprise-grade data analysis using BigQuery directly in Google Sheets.

A live window into the single source of truth

Business users often wait days or weeks for simple reports. Connected Sheets solves this by letting you analyze your critical data via a secure, direct connection to billions of rows of live data, with no SQL required.

For data admins, this architecture is appealing because it maintains a strong security and governance posture. They can provision access to specific tables or views, confident that the underlying data cannot be altered from a Connected Sheet. Admins can also take advantage of Google Workspace’s enterprise data protections to control reading, sharing, and copying data throughout its lifecycle.

For end users, the benefit is immediate agility and ease of use. They can use familiar tools like pivot tables, charts, calculated columns, and formulas to analyze billions of rows of live data as if it were a local file, balancing centralized control with the business's demand for speed. End users don’t have to learn technical concepts like databases, schemas, tables, and query languages like SQL to access, analyze, and visualize the data.

Key use cases and core journeys

We consistently hear about three primary use cases for Connected Sheets from customers across industries.

1. Self-service exploratory analysis: Data teams provide access to curated tables and datasets in BigQuery. Business Analysts in sales, operations, finance, or marketing can then build their own pivot tables or charts that run over the entire live data source directly from Sheets, then filter data to answer day-to-day questions, freeing the data team from a constant backlog of ad-hoc requests.

Example: Deep-dive investigation

Scenario: A sales manager analyzes millions of global transactions to review quarterly performance.
Action: Using Connected Sheets, they quickly create a pivot table to summarize revenue by region and product line. When they spot an anomaly, such as an unexpected revenue spike in EMEA, they simply double-click the summarized value to drill down and learn exactly what led to that value.
Outcome: Connected Sheets instantly queries and retrieves the precise, granular transaction rows behind that summary value, making it easy and fast to find the root cause.

2. Operational reporting: Business users can create live, refreshable, and easy-to-understand dashboard-like views of their data that their partner teams can rely on and share with executives and leads.

Example: Automated executive summary

Scenario: An operations lead provides weekly updates on sales invoices to their leadership, based on a BigQuery dataset with millions of rows.
Action: The operations lead creates their Connected Sheet and builds a series of charts to visualize invoice trends over time. They then configure the sheet to automatically refresh on a schedule every Monday morning, so it’s always ready ahead of their executive review.
Outcome: The manual routine of exporting data and pasting it into workbooks is completely eliminated. Leadership gets a reliable report and analysis powered by the latest warehouse data.

3. Hybrid data modeling: Data practitioners often need to blend governed warehouse data with real-time manual inputs and annotations. For example, a finance team might pull revenue data from BigQuery and combine it with manual procurement entries from their ERP system in a separate tab, using VLOOKUP to create a consolidated view for month-end reporting.

Example: Custom business metrics

Scenario: A financial analyst calculates custom commission payouts based on live sales data from their CRM system. The commission tier logic changes frequently and isn't modeled in the central data warehouse.
Action: Instead of requesting a new data pipeline from their data team, the analyst can add a calculated column directly within the Connected Sheet. They use standard spreadsheet formulas (like IF or IFS) to apply custom business logic directly against the BigQuery data.
Outcome: The analyst retains the flexibility to model scenarios and calculate metrics quickly, while maintaining governed BigQuery data as their single source of truth.
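The tier logic in this example is the kind of rule an IFS formula expresses in a calculated column. As a language-neutral illustration, here is the same shape of logic in Python; the thresholds and rates are invented for the example and do not come from any real commission plan.

```python
# Hypothetical tiers: (upper bound on sales, commission rate in percent).
# A bound of None means "no upper limit".
COMMISSION_TIERS = [
    (10_000, 2),
    (50_000, 5),
    (None, 8),
]


def commission(sales: int) -> float:
    """Apply the first tier whose upper bound exceeds the sales amount."""
    for upper_bound, rate_pct in COMMISSION_TIERS:
        if upper_bound is None or sales < upper_bound:
            return sales * rate_pct / 100
    raise ValueError("no tier matched")


for amount in (5_000, 20_000, 80_000):
    print(amount, commission(amount))
```

Keeping the tiers in a small table, whether a Python list or a range of spreadsheet cells, means a change to the plan only touches the table and not the formula.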

Getting started

Connecting Google Sheets to BigQuery is straightforward and requires only a Google Workspace account and a billing-enabled Google Cloud project. There are two primary ways to establish a connection and create a Connected Sheet.

Path 1: Starting from Sheets
This is the typical workflow for users who work primarily within spreadsheets.

Open a new Google Sheet.
Navigate to Data > Data Connectors > Connect to BigQuery.
Select your billing-enabled Google Cloud project.
Browse available datasets, select a Saved Query to connect right away, or input a custom SQL query.
Click Connect.

Path 2: Starting from BigQuery
This workflow is common for data analysts starting from the Google Cloud console.

Navigate to the BigQuery UI in the console.
In the Explorer pane, locate the table or query result you wish to analyze.
Click the Export menu (or the three-dot action menu) next to the asset.
Select Open in > Connected Sheets.

From petabytes to predictions with Connected Sheets

We designed Connected Sheets to help you bridge the gap between the scalability of the cloud and the flexibility of the spreadsheet. With Connected Sheets, we’re making it easier than ever for organizations to put data into the hands of the people who need it.

To explore these features, connect your BigQuery data to Google Sheets today. For more technical details, visit the Connected Sheets documentation.

Analyzing system logs with `AI.AGG()`