<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:media="http://search.yahoo.com/mrss/"><channel><title>Developers &amp; Practitioners</title><link>https://cloud.google.com/blog/topics/developers-practitioners/</link><description>Developers &amp; Practitioners</description><atom:link href="https://cloudblog.withgoogle.com/blog/topics/developers-practitioners/rss/" rel="self"></atom:link><language>en</language><lastBuildDate>Thu, 04 Jun 2026 15:29:16 +0000</lastBuildDate><image><url>https://cloud.google.com/blog/topics/developers-practitioners/static/blog/images/google.a51985becaa6.png</url><title>Developers &amp; Practitioners</title><link>https://cloud.google.com/blog/topics/developers-practitioners/</link></image><item><title>Scaling AI Agents: A Step-by-Step Guide to Deploying ADK on GKE Autopilot</title><link>https://cloud.google.com/blog/topics/developers-practitioners/scaling-ai-agents-a-step-by-step-guide-to-deploying-adk-on-gke-autopilot/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While building AI agents locally using Google’s Agent Development Kit (ADK) is an excellent way to prototype, production-ready agents require a robust, scalable infrastructure. For developers looking to move beyond simple instances and into the world of managed container orchestration, Google Kubernetes Engine (GKE) Autopilot offers the perfect balance of flexibility and ease of use.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this tutorial, I will walk you through building a technical agent with ADK and deploying it to GKE Autopilot. We will use Gemini on Vertex AI as the core model and secure access to it by using Workload Identity for permission management.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Understanding the GKE ADK Architecture&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Deploying an ADK agent on GKE Autopilot involves more than just running a container. We leverage GKE's native capabilities to handle scaling and security. Our architecture consists of an ADK-based Python application packaged as a Docker image and stored in Artifact Registry. This container runs as a Deployment on GKE Autopilot, where it communicates securely with Vertex AI using Workload Identity—mapping a Kubernetes Service Account to a Google Cloud IAM Service Account.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;To expose the agent to the world, we use the Kubernetes Gateway API, the modern successor to Ingress, which provides a cleaner separation of concerns and native support for Google Cloud Load Balancing.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Prerequisites&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Before we begin, ensure you have the following tools and accounts ready:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Python 3.10 or higher.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;code style="vertical-align: baseline;"&gt;uv&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for package management.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Google Cloud SDK (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) installed and configured.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;A Google Cloud project with billing enabled.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;code style="vertical-align: baseline;"&gt;kubectl&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; command-line tool.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;code style="vertical-align: baseline;"&gt;jq&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for parsing JSON responses.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;The following APIs enabled: Kubernetes Engine, Artifact Registry, and Vertex AI.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Step 0: Configuring Google Cloud and Authentication&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Before interacting with Google Cloud services, you must authenticate your environment and set the active project. This ensures that both the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; CLI and your local Python environment can access Vertex AI.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;Login to Google Cloud SDK&lt;/strong&gt;:&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud auth login&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Set your active project&lt;/strong&gt;:&lt;span style="vertical-align: baseline;"&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud config set project [PROJECT_ID]&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Setup Application Default Credentials (ADC)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: This is crucial for the ADK library to authenticate with Vertex AI during local testing.&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud auth application-default login&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Define Environment Variables&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: To ensure we can easily reuse our configuration in subsequent steps, let's export our project, region, and cluster name as environment variables. &lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;export PROJECT_ID=$(gcloud config get-value project)
export REGION=us-central1
export CLUSTER_NAME=adk-cluster&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;Step 1: Provisioning GKE Autopilot&lt;/h3&gt;
&lt;p&gt;GKE Autopilot is the recommended way to run Kubernetes without managing nodes. It allows you to focus on your agent deployment while Google manages the infrastructure. Starting the cluster creation now allows it to provision in the background while we build the agent.&lt;br/&gt;&lt;br/&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud container clusters create-auto $CLUSTER_NAME --region $REGION&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While the cluster is provisioning, we can move on to building our agent.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Step 2: Building the Agent with ADK&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;First, let's create our agent. Start by creating a folder for the agent code:&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;mkdir adk-agent
cd adk-agent&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Initialize a new Python project with uv:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;uv init&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Add the ADK dependency:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;uv add google-adk&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a new agent using the ADK CLI:&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;uv run adk create weather_agent&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;The CLI first asks you to choose a model for the root agent. Choose &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gemini-2.5-flash&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; (Number 1). It then asks you to choose a backend. Choose &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;Vertex AI&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; (Number 2). Finally, enter your Google Cloud project ID and the region you want to use, for example &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;us-central1&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;The previous command scaffolded a new directory &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;weather_agent&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; with the following structure:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-plain"&gt;&lt;code&gt;weather_agent/
├── .env
├── __init__.py
└── agent.py&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;ADK expects the agent code to be in the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;agent.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; file. Let's edit the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;agent.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; file to add a simple tool for the agent.&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;from google.adk import Agent

# Define a simple tool for the agent
def get_weather(city: str) -&amp;gt; str:
    """Returns the current weather in a city."""
    return f"The weather in {city} is 90 degrees Fahrenheit and sunny."

# Initialize the agent with the Gemini model selected during scaffolding
root_agent = Agent(
    name="weather_agent",
    model="gemini-2.5-flash",
    tools=[get_weather]
)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;span style="vertical-align: baseline;"&gt;&lt;code style="vertical-align: baseline;"&gt;agent.py&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; file is the entry point for the agent. It is used to define the agent and its tools. The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;get_weather&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; function is a simple tool that returns the current weather in a city. For the purpose of this tutorial, we are using a hardcoded value for the weather. In a real-world scenario, you would use an API to get the current weather.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
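&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a rough sketch of what that could look like, the tool below fetches JSON from a weather endpoint using only the Python standard library. The URL &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;weather.example.com&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;temp_f&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;condition&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; fields, and the helpers &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;fetch_json&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;format_weather&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; are placeholders assumed for illustration. They are not part of ADK or of any particular weather service.&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;import json
from urllib.parse import quote
from urllib.request import urlopen

# Hypothetical endpoint, for illustration only; replace it with a real weather API.
WEATHER_API_URL = "https://weather.example.com/v1/current?city={city}"


def fetch_json(url):
    """Fetch a URL and decode its JSON body."""
    with urlopen(url, timeout=5) as response:
        return json.load(response)


def format_weather(city, payload):
    """Build the reply sentence from the assumed payload fields."""
    return (
        f"The weather in {city} is {payload['temp_f']} degrees Fahrenheit "
        f"and {payload['condition']}."
    )


def get_weather(city: str):
    """Returns the current weather in a city."""
    try:
        payload = fetch_json(WEATHER_API_URL.format(city=quote(city)))
        return format_weather(city, payload)
    except (OSError, ValueError, KeyError):
        # Network failure, invalid JSON, or a missing field: report it in plain text
        return f"Sorry, I could not retrieve the weather for {city}."&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Returning a readable message on failure, rather than raising, is one possible design choice: it gives the model something it can relay to the user when the lookup does not succeed.&lt;/span&gt;&lt;/p&gt;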
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Step 3: Testing the Agent Locally&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Before deploying the agent to GKE Autopilot, we need to test it locally to ensure it works as expected. Run the following command to start the agent in debug mode with the web UI:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;uv run adk web&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Open &lt;/span&gt;&lt;a href="http://localhost:8000" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;http://localhost:8000&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in your browser and you should see the ADK web UI. You can then interact with your agent by typing messages in the chat interface.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;If the agent replies with a message like "The weather in [CITY] is 90 degrees Fahrenheit and sunny.", congratulations: your ADK agent is working, and you can proceed to the next step.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Step 4: Preparing for GKE Autopilot&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The ADK CLI has a built-in command to deploy the agent to GKE Autopilot. However, its default settings are not suitable for a production environment. For example, they do not use Workload Identity to authenticate with Vertex AI, and they expose the web UI through a load balancer on port 80.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We will instead manage the lifecycle of the container ourselves. First we need to containerize the agent.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;.dockerignore&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; file in the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;adk-agent&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; directory to prevent your local virtual environment from being copied into the image:&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-plain"&gt;&lt;code&gt;.venv
.adk
__pycache__
*.pyc
.env&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;Dockerfile&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for your agent in the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;adk-agent&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; directory. We will use a multi-stage build to keep the final production image lightweight and secure:&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-plain"&gt;&lt;code&gt;# Stage 1: Build the virtual environment
FROM python:3.10-slim AS builder

# Install uv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

# Set working directory
WORKDIR /app

# Force uv to use the system Python and use copy instead of symlinks
ENV UV_PYTHON_PREFERENCE=only-system
ENV UV_LINK_MODE=copy
ENV UV_COMPILE_BYTECODE=1
ENV UV_PYTHON=/usr/local/bin/python3

# Install dependencies
# We copy only files needed for installation to maximize cache
COPY pyproject.toml uv.lock ./
# Note: We don't use --frozen yet as the host lock file might be slightly out of sync
# but sync will update it in the builder stage.
RUN uv sync --no-install-project --no-dev --no-cache

# Copy the agent code
COPY . .
# Sync the project itself
RUN uv sync --no-dev --no-cache

# Stage 2: Runtime image
FROM python:3.10-slim

WORKDIR /app

# Copy the pre-built environment from the builder
COPY --from=builder /app/.venv /app/.venv
# Copy the application code (including weather_agent folder)
COPY . .

# Add the environment to the PATH
ENV PATH="/app/.venv/bin:$PATH"
ENV PYTHONUNBUFFERED=1

# Run the ADK API server
# "." is the agents directory, which contains the weather_agent folder
CMD ["adk", "api_server", ".", "--host", "0.0.0.0", "--port", "8080"]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Build and push the image to Artifact Registry:&lt;br/&gt;&lt;br/&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;# Create repository
gcloud artifacts repositories create adk-repo --repository-format=docker --location=$REGION

# Build and push
gcloud builds submit --tag $REGION-docker.pkg.dev/$PROJECT_ID/adk-repo/gke-agent:latest&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Step 5: Implementing Workload Identity for Security&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Security is paramount. Instead of hardcoding API keys, we use Workload Identity to grant the GKE pod permission to access Vertex AI.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;1. Create an IAM Service Account&lt;/strong&gt;:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud iam service-accounts create adk-gke-sa&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;2. Grant Vertex AI permissions&lt;/strong&gt;:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:adk-gke-sa@$PROJECT_ID.iam.gserviceaccount.com" \
    --role="roles/aiplatform.user"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;3. Allow the Kubernetes Service Account to impersonate the IAM SA&lt;/strong&gt;:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud iam service-accounts add-iam-policy-binding adk-gke-sa@$PROJECT_ID.iam.gserviceaccount.com \
    --role="roles/iam.workloadIdentityUser" \
    --member="serviceAccount:$PROJECT_ID.svc.id.goog[default/adk-ksa]"&lt;/code&gt;&lt;/pre&gt;
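&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;--member&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; value in the last command follows a fixed pattern: the project's workload identity pool (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;PROJECT_ID.svc.id.goog&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;), followed by the Kubernetes namespace and Kubernetes Service Account name in square brackets. The short Python sketch below spells out that pattern, which may help when adapting the commands to another namespace or account. The helper names and the project ID &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;my-project&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; are made up for this example.&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;def workload_identity_member(project_id, namespace, ksa_name):
    """Build the IAM member string for a Kubernetes Service Account."""
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"


def service_account_email(project_id, gsa_name):
    """Build the email address of an IAM Service Account in the project."""
    return f"{gsa_name}@{project_id}.iam.gserviceaccount.com"


if __name__ == "__main__":
    # "my-project" is a placeholder project ID
    print(workload_identity_member("my-project", "default", "adk-ksa"))
    print(service_account_email("my-project", "adk-gke-sa"))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The namespace and account name in the member string must match the Kubernetes Service Account that the pods actually run as, which in this tutorial is &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;adk-ksa&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; in the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;default&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; namespace.&lt;/span&gt;&lt;/p&gt;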
&lt;h3&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Step 6: Deploying the Agent to GKE&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Now, we define the Kubernetes resources. Create a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;deployment.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; that includes the Service Account annotation for Workload Identity. Replace &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;$PROJECT_ID&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;$REGION&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; with your actual project ID and region.&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-plain"&gt;&lt;code&gt;apiVersion: v1
kind: ServiceAccount
metadata:
  name: adk-ksa
  annotations:
    iam.gke.io/gcp-service-account: adk-gke-sa@$PROJECT_ID.iam.gserviceaccount.com
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: adk-agent
spec:
  replicas: 2
  selector:
    matchLabels:
      app: adk-agent
  template:
    metadata:
      labels:
        app: adk-agent
    spec:
      serviceAccountName: adk-ksa
      containers:
      - name: adk-agent
        image: $REGION-docker.pkg.dev/$PROJECT_ID/adk-repo/gke-agent:latest
        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
          limits: 
            cpu: "1"
            memory: "1Gi"
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: adk-service
spec:
  selector:
    app: adk-agent
  ports:
  - port: 80
    targetPort: 8080&lt;/code&gt;&lt;/pre&gt;
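&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you would rather not edit the placeholders by hand, one option is to render the manifest with a small script. The sketch below uses Python's &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;string.Template&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, whose &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;$NAME&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; syntax happens to match the placeholders in the manifest. The function names and the output file name &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;deployment.rendered.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; are arbitrary choices for this example, and it assumes the environment variables from Step 0 are still exported.&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;import os
from pathlib import Path
from string import Template


def render_manifest(text, project_id, region):
    """Replace $PROJECT_ID and $REGION in a manifest string."""
    # substitute() raises KeyError if the text uses a placeholder we did not supply
    return Template(text).substitute(PROJECT_ID=project_id, REGION=region)


def render_file(source="deployment.yaml", target="deployment.rendered.yaml"):
    """Render the source manifest using PROJECT_ID and REGION from the environment."""
    rendered = render_manifest(
        Path(source).read_text(),
        os.environ["PROJECT_ID"],
        os.environ["REGION"],
    )
    Path(target).write_text(rendered)
    return target&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Calling &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;render_file()&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; from the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;adk-agent&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; directory writes the rendered copy next to the original. If you take this route, pass the rendered file to &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;kubectl apply&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; in place of &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;deployment.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;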
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Apply the configuration:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;kubectl apply -f deployment.yaml&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Check the status of the deployment:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;kubectl get pods -w&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Once the pods are running, you can use kubectl port-forward to access the agent locally:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;kubectl port-forward svc/adk-service 8080:80&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because we deployed the agent without the Web UI, opening &lt;/span&gt;&lt;a href="http://localhost:8080" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;http://localhost:8080&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in a browser will not show an interface. However, we can still interact with the agent through its API using &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;curl&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In a new terminal, run the following commands:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;# Create a new session
curl -X POST http://localhost:8080/apps/weather_agent/users/u_123/sessions/s_123

# Run a message
curl -s -X POST http://localhost:8080/run \
-H "Content-Type: application/json" \
-d '{
"appName": "weather_agent",
"userId": "u_123",
"sessionId": "s_123",
"newMessage": {
    "role": "user",
    "parts": [{
    "text": "Hey whats the weather in new york today"
    }]
}
}' | jq .&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;curl&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; command returns the response as JSON, and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;jq&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; formats it for readability. You should see a response like:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-plain"&gt;&lt;code&gt;{
    "sessionId": "s_123",
    "messages": [
        {
            "role": "assistant",
            "parts": [
                {
                    "text": "The weather in New York today is sunny with a high of 90 degrees Fahrenheit."
                }
            ]
        }
    ]
}&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;(Optional) Step 7: Exposing via Gateway API and HTTPS load balancer&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Finally, we expose the agent using the GKE Gateway API with a Google-managed TLS certificate. This is the recommended, production-grade approach — Google will automatically provision and renew the certificate for your domain.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;NB: GKE supports other options to provision certificates. You can use Let's Encrypt with cert-manager, pre-shared certificates, or any other certificate authority. You can check the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/secure-gateway#secure-using-ssl-certificate"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GKE documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for more details.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;First, reserve a static IP address for your load balancer:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud compute addresses create adk-agent-ip --global
export AGENT_IP=$(gcloud compute addresses describe adk-agent-ip --global --format="value(address)")
echo "Your IP: $AGENT_IP"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Point your domain's DNS &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;A&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; record at &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;$AGENT_IP&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. This tutorial uses &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;api.yourdomain.com&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; as the example hostname.&lt;/span&gt;&lt;/p&gt;
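&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Google-managed certificates are only issued once the domain resolves to the load balancer, so it can be worth confirming the DNS record before moving on. The helper below is a minimal sketch that assumes &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dig&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; is installed; the function name is hypothetical.&lt;/span&gt;&lt;/p&gt;

```shell
# Succeeds when the hostname's A record resolves to the expected IP address.
# Only the last line of dig's output is compared, which skips CNAME hops.
dns_points_to() {
  local host="$1" expected_ip="$2" resolved
  resolved="$(dig +short "${host}" A | tail -n 1)"
  if [ "${resolved}" = "${expected_ip}" ]; then
    echo "${host} resolves to ${expected_ip}"
  else
    echo "${host} resolves to '${resolved:-nothing}', expected ${expected_ip}"
    return 1
  fi
}
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You could call it as &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;dns_points_to api.yourdomain.com "$AGENT_IP"&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. DNS changes can take a while to propagate, so a failure right after updating the record is not necessarily a misconfiguration.&lt;/span&gt;&lt;/p&gt;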
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Create a Google-managed certificate. Replace &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;api.yourdomain.com&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; with your actual domain, which must match the hostname in the HTTPRoute below:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud compute ssl-certificates create adk-cert --domains api.yourdomain.com --global&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gateway.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; file with the following content:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-plain"&gt;&lt;code&gt;# Gateway: HTTPS load balancer with the managed certificate and static IP
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: adk-gateway
spec:
  gatewayClassName: gke-l7-global-external-managed
  listeners:
  - name: https
    protocol: HTTPS
    port: 443
    tls:
      mode: Terminate
      options:
        networking.gke.io/pre-shared-certs: adk-cert
  addresses:
  - type: NamedAddress
    value: adk-agent-ip
---
# HTTPRoute: forward traffic to the ADK service
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: adk-route
spec:
  parentRefs:
  - name: adk-gateway
  hostnames:
  - "api.yourdomain.com"
  rules:
  - backendRefs:
    - name: adk-service
      port: 80
---
apiVersion: networking.gke.io/v1
kind: HealthCheckPolicy
metadata:
  name: adk-health
  namespace: default
spec:
  default:
    checkIntervalSec: 15
    timeoutSec: 5
    healthyThreshold: 1
    unhealthyThreshold: 2
    logConfig:
      enabled: false
    config:
      type: HTTP
      httpHealthCheck:
        port: 8080
        requestPath: /health
  targetRef:
    group: ""
    kind: Service
    name: adk-service&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Apply the configuration:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;kubectl apply -f gateway.yaml&lt;/code&gt;&lt;/pre&gt;
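&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Before waiting on the certificate, you may want to confirm that the Gateway picked up the reserved address. The sketch below assumes your cluster reports the standard Gateway API &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;Programmed&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; condition; the function name and the 10-minute timeout are illustrative choices.&lt;/span&gt;&lt;/p&gt;

```shell
# Wait for the Gateway to report the Programmed condition, then print the
# address from its status so it can be compared with the reserved static IP.
gateway_address() {
  local gateway="${1:-adk-gateway}" wait_output
  if ! wait_output="$(kubectl wait --for=condition=Programmed \
      "gateway/${gateway}" --timeout=600s)"; then
    echo "gateway ${gateway} was not programmed in time: ${wait_output}"
    return 1
  fi
  kubectl get gateway "${gateway}" -o jsonpath='{.status.addresses[0].value}'
}
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If everything is wired up correctly, the value printed by &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gateway_address adk-gateway&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; should equal &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;$AGENT_IP&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;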
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Certificate provisioning can take up to 20 minutes. Monitor the status with:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;gcloud compute ssl-certificates describe adk-cert --global&lt;/code&gt;&lt;/pre&gt;
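&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Rather than re-running the command by hand, you could poll the certificate's managed status until it becomes &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;ACTIVE&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. The following is a sketch, not part of the original setup: the function name and the defaults (40 attempts, 30 seconds apart, roughly 20 minutes in total) are assumptions you can adjust.&lt;/span&gt;&lt;/p&gt;

```shell
# Poll a Google-managed certificate until it reports ACTIVE.
# Arguments: certificate name, max attempts, seconds between attempts.
wait_for_cert() {
  local cert="${1:-adk-cert}" attempts="${2:-40}" delay="${3:-30}"
  local i=1 status
  while [ "${i}" -le "${attempts}" ]; do
    status="$(gcloud compute ssl-certificates describe "${cert}" \
      --global --format='value(managed.status)')"
    echo "attempt ${i}: ${status:-unknown}"
    if [ "${status}" = "ACTIVE" ]; then
      return 0
    fi
    sleep "${delay}"
    i=$((i + 1))
  done
  echo "certificate ${cert} is not ACTIVE after ${attempts} attempts"
  return 1
}
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Running &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;wait_for_cert adk-cert&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; prints one line per attempt and exits with a non-zero code if the certificate never becomes active, which usually points to a DNS or domain mismatch.&lt;/span&gt;&lt;/p&gt;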
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Once the status shows &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;ACTIVE&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, your agent is live at &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;https://api.yourdomain.com&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. You can test it with:&lt;/span&gt;&lt;/p&gt;
&lt;pre class="language-bash"&gt;&lt;code&gt;# Create a new session
curl -X POST https://api.yourdomain.com/apps/weather_agent/users/u_124/sessions/s_124

# Run a message
curl -s -X POST https://api.yourdomain.com/run \
-H "Content-Type: application/json" \
-d '{
"appName": "weather_agent",
"userId": "u_124",
"sessionId": "s_124",
"newMessage": {
    "role": "user",
    "parts": [{
    "text": "Hey whats the weather in new york today"
    }]
}
}' | jq .&lt;/code&gt;&lt;/pre&gt;
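&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For a quick repeatable check, for instance after a redeploy, a small smoke test that only looks at the HTTP status code can be handy. This sketch assumes the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;/health&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; path used by the HealthCheckPolicy is also reachable through the Gateway; both function names are hypothetical.&lt;/span&gt;&lt;/p&gt;

```shell
# Print the HTTP status code returned by a URL (curl prints 000 when the
# request itself fails, for example on a TLS or DNS error).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Succeeds only when the endpoint answers with HTTP 200.
check_endpoint() {
  local url="$1" code
  code="$(http_status "${url}")"
  if [ "${code}" = "200" ]; then
    echo "OK ${url}"
  else
    echo "FAIL ${url} (HTTP ${code})"
    return 1
  fi
}
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For example: &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;check_endpoint https://api.yourdomain.com/health&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;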
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Conclusion &amp;amp; Looking Ahead&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By following these steps, you have successfully deployed a production-ready AI agent built with ADK onto GKE Autopilot that invokes Gemini on Vertex AI with Workload Identity for authentication. This setup ensures that your agent can scale horizontally to meet demand while maintaining a high security posture.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As you look ahead, consider integrating more complex tools or leveraging GKE's multi-cluster capabilities for even greater resilience. For more details on the technologies used here, explore the official &lt;/span&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GKE documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and the &lt;/span&gt;&lt;a href="https://github.com/google/adk" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;ADK repository&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
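&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To avoid ongoing charges, remember to delete the resources created in this tutorial when finished: the Gateway and Deployment, the static IP address, the certificate, the GKE cluster, and the Artifact Registry repository.&lt;/span&gt;&lt;/p&gt;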
&lt;pre class="language-bash"&gt;&lt;code&gt;kubectl delete -f gateway.yaml
kubectl delete -f deployment.yaml
gcloud compute addresses delete adk-agent-ip --global
gcloud compute ssl-certificates delete adk-cert --global
gcloud container clusters delete $CLUSTER_NAME --region $REGION
gcloud artifacts repositories delete adk-repo --location $REGION&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;</description><pubDate>Thu, 04 Jun 2026 07:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/scaling-ai-agents-a-step-by-step-guide-to-deploying-adk-on-gke-autopilot/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/Blog_Hero_Image_Resizing.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Scaling AI Agents: A Step-by-Step Guide to Deploying ADK on GKE Autopilot</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/Blog_Hero_Image_Resizing.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/scaling-ai-agents-a-step-by-step-guide-to-deploying-adk-on-gke-autopilot/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Abdel Sghiouar</name><title>Senior Cloud Developer Advocate</title><department></department><company></company></author></item><item><title>Connecting AI agents with unstructured data using Google Cloud Storage MCP Servers</title><link>https://cloud.google.com/blog/topics/developers-practitioners/build-ai-agents-faster-with-gcs-google-cloud-storage-mcp-server/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Google Cloud Storage (GCS) is a foundational component of the modern agentic tech stack and the preferred home for unstructured data at scale. As enterprises deploy agents in production, the critical focus has shifted to turning data into context and building secure, standardized integrations to access context. This is the core of smart storage: making unstructured data inherently agent-ready by turning passive objects into rich context for reasoning. 
Whether it’s automating complex financial workflows or diagnosing system failures in seconds, AI success now depends on how seamlessly agents can leverage this intelligence to make smart, high-stakes decisions.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this blog, we will share &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;three&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; examples of agents built by customers using GCS, and then share how you can securely and reliably connect your agents to GCS using &lt;/span&gt;&lt;a href="https://modelcontextprotocol.io/docs/getting-started/intro" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Model Context Protocol&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (MCP). Combined with smart storage features like auto annotations and object contexts, GCS MCP server makes the whole agent deployment process easy and simple.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Real-world agent success on Google Cloud Storage&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We are seeing incredible innovation from customers leveraging MCP and Google’s agentic tech stack to solve complex business problems:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Palo Alto Networks&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; built the Strata Co-Pilot agent, a screen-aware AI assistant that guides network security administrators through complex configuration flows—either by highlighting steps or executing them directly. The agent is powered by the Gemini Live API, with GCS serving as its “historical memory” connected via the GCS MCP server.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Airwallex &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;developed an AI Assistant that understands user context, answers questions, and executes workflows on their behalf. For example, it can smartly analyze expense policy documents and generate detailed approval workflows - a task that would normally take hours to do manually. GCS and GCS metadata are used by the agent to store documents and the extracted information, respectively.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=F0Kw_eD5Y04"
      data-glue-modal-trigger="uni-modal-F0Kw_eD5Y04-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault_zruL8XX.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Introducing Airwallex AI Assistant: Your concierge for effortless global finance&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-F0Kw_eD5Y04-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="F0Kw_eD5Y04"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=F0Kw_eD5Y04"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Snap's &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Job Optimization Agent analyzes Flink and Spark job specs, metadata, and historical metrics stored on GCS across thousands of jobs to find optimization opportunities, generate cost estimates, and tune configurations. Using this agent, Snap is already seeing investigation time reduced from 30 minutes to 30 seconds!&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In all three agents, the GCS MCP server handles data operations and enforces standard RBAC and access policies.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Connecting agents to GCS using MCP &lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;MCP has rapidly emerged as the &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;universal &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;standard for connecting agents to data sources, but building custom servers from scratch is often a slow, distracting process that diverts focus from innovation. This path introduces significant development overhead and risk, as it forces you to manage everything from authentication and error handling to keeping pace with GCS’s evolving capabilities. To solve this, GCS offers two powerful MCP server options — Remote and Local — allowing you to offload the foundational plumbing and focus on creating value.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;1. Remote MCP server: Fully-managed &lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Connecting your agents to the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/storage/docs/use-cloud-storage-mcp"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Storage MCP server&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; requires zero infrastructure deployment. By simply pointing your agent configuration to the managed endpoint, you gain immediate access to your unstructured data on GCS, allowing you to scale your agentic workloads effortlessly without the burden of operational overhead. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because the Cloud Storage MCP server follows the open MCP standard, it works seamlessly with major agentic frameworks like ADK and is compatible with MCP clients. You can easily connect clients like Google Antigravity and Anthropic’s Claude by adding a Custom Connector in the settings. Simply point it to your Cloud Storage MCP endpoint, and you are ready to start building — no complex configuration files required.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/image1_9FCB2cO.gif"
        
          alt="image1"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Connecting an agent to storage requires robust security and governance. GCS MCP server is built on Google Cloud's standard identity, observability, and security frameworks:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Identity-first security&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Authentication is handled entirely through Identity and Access Management (IAM) rather than shared keys. This ensures agents can only access data (buckets and objects) explicitly authorized by the user.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Full observability&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: To track agent activity, every request and action taken via these MCP servers is logged in Cloud Audit Logs. This provides security teams with a record of every interaction, maintaining visibility alongside ease of access.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;MCP security - content scanning&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: You can optionally configure the MCP endpoint with Google’s content security service, Google Cloud Model Armor. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;This allows you to implement security controls against common MCP attack vectors—such as direct and indirect prompt injection attacks, MCP Tool poisoning attacks, and malicious URL/SQL injections—as well as prevent the leakage of sensitive data.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
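&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because access is governed by IAM, scoping what an agent can read comes down to an ordinary role binding. As a minimal sketch, assuming your agent runs as a service account and only needs read access to one bucket, the binding could look like this (the function name, bucket, and service account are placeholders):&lt;/span&gt;&lt;/p&gt;

```shell
# Grant a service account read-only access to the objects in a single bucket.
# Both arguments are placeholders for your own bucket and service account.
grant_bucket_read() {
  local bucket="$1" service_account="$2"
  gcloud storage buckets add-iam-policy-binding "gs://${bucket}" \
    --member="serviceAccount:${service_account}" \
    --role="roles/storage.objectViewer"
}
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Granting the role on a single bucket rather than on the whole project keeps the agent's reach limited to the data you explicitly authorized.&lt;/span&gt;&lt;/p&gt;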
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud Storage MCP servers are perfect for most production use cases; however, as with all remote servers, you lose the capability to fully customize your MCP tools.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;2. Local MCP Server: Self-managed for controlled customization&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;While the Remote server handles standard data access, Local MCP is the right choice when you need to build &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;custom tools&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; specific to your business logic. For example, if your agent needs to perform specialized data transformations (such as redacting PII or adding context from another internal system) whenever it reads a file from GCS, a Local MCP server allows you to define those unique capabilities.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The GCS Local MCP server is an open-source &lt;/span&gt;&lt;a href="https://github.com/googleapis/gcloud-mcp/tree/main/packages/storage-mcp" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GitHub repository&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; of Google-maintained tools that provides you with a reliable bridge to your data. Here are a few tips to keep in mind while designing custom tools:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Provide precise, clear descriptions to minimize incorrect invocations by the models&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Implement model-friendly error handling so that models can understand their mistakes and self-correct&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The GCS Local MCP is now also a part of the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/storage/docs/pre-built-tools-with-mcp-toolbox"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;MCP Toolbox for Databases&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, a single open-source repository containing connectors for major data services such as GCS, BigQuery, AlloyDB, Spanner, and Cloud SQL, making it easier to monitor and manage your data ecosystem. The Toolbox offers simplified development with reduced boilerplate code, enhanced security through OAuth2 and OIDC, and end-to-end observability with OpenTelemetry integration.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Get started&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Whether you are optimizing an existing process like Snap or automating workflow creation like Airwallex, your unstructured data is one of your agent's greatest assets.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Explore the generally available &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/storage/docs/use-cloud-storage-mcp"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;GCS Remote MCP Server&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Check out our GCS Local MCP&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;a href="https://github.com/googleapis/gcloud-mcp/tree/main/packages/storage-mcp" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;GitHub repository&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to start building custom tools today, or use it as part of &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/storage/docs/pre-built-tools-with-mcp-toolbox"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;MCP Toolbox for Databases&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="mailto:storage-ai@google.com"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Reach out&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to us to discuss your Agent use case with GCS data.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;</description><pubDate>Tue, 02 Jun 2026 17:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/build-ai-agents-faster-with-gcs-google-cloud-storage-mcp-server/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/Hero-image.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Connecting AI agents with unstructured data using Google Cloud Storage MCP Servers</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/Hero-image.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/build-ai-agents-faster-with-gcs-google-cloud-storage-mcp-server/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Himanshu Kohli</name><title>Product Manager, Google Cloud</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Manjul Sahay</name><title>Product Manager, Google Cloud</title><department></department><company></company></author></item><item><title>Experimenting with TPUs, GKE Managed DRANET, and Multi-cluster Inference Gateway</title><link>https://cloud.google.com/blog/topics/developers-practitioners/experimenting-with-tpus-gke-managed-dranet-and-multi-cluster-inference-gateway/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;What happens when your workload fails in one region but you need access to service? This is a common case for availability and uptime. 
With recent enhancements to the Kubernetes ecosystem and capabilities like &lt;/span&gt;&lt;a href="https://kubernetes.io/docs/concepts/scheduling-eviction/dynamic-resource-allocation/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Dynamic Resource Allocation (DRA)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://gateway-api-inference-extension.sigs.k8s.io/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Inference Gateway&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, I decided to experiment with these capabilities in Google Cloud for a simple test using an AI inference workload.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this blog, we will explore this setup. You can also jump straight into the detailed configs in this codelab: &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/codelabs/gke-inference-gateway-multi-cluster-tpus-dranet#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Build multi-cluster GKE Inference Gateway, with TPUs, Cloud Storage FUSE and managed DRANET&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Building blocks &lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To build out this experiment, use the following products, features, and tools:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Google Kubernetes Engine &lt;/span&gt;&lt;/strong&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/allocate-network-resources-dra"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;&lt;strong&gt;(GKE) managed DRANET&lt;/strong&gt;&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: This is a managed feature that lets you request and share network resources among Pods. It supports &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/allocate-network-resources-dra#use-rdma-interfaces-gpu"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GPUs&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/allocate-network-resources-dra#use-non-rdma-interfaces-tpu"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;TPUs&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. In this test, TPUs were used in two different regions, with networking assigned using managed DRANET.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-multi-cluster-inference-gateway"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;&lt;strong&gt;Multi-cluster GKE Inference Gateway&lt;/strong&gt;&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: Load balances your AI/ML inference workloads across multiple GKE clusters. It also handles failover, which is what my experiment intended to test. The GatewayClass that supports this is the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/gateway-api#gatewayclass"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Multi-cluster Cross-region internal Application Load Balancer&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gke-l7-cross-regional-internal-managed-mc&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/storage/docs/cloud-storage-fuse/overview"&gt;&lt;strong&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Storage FUSE&lt;/span&gt;&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: Provides a way to store data, models, checkpoints, and logs directly in Cloud Storage. To speed up deployment, an open source Gemma model was downloaded to a bucket in advance so the serving Pods can read it directly.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Virtual Private Cloud (VPC)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: The foundational global network providing isolated, secure communication for the internal load balancers and compute nodes.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/fleets-overview"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;GKE Fleets&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: Fleets group the separate regional clusters under a unified management control plane.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/tpu/docs/v6e"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;TPU v6e&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Google's custom AI accelerators that provide the high-performance compute required to serve the model. The machine type used was &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;ct6e-standard-4t&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; in a &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/tpu/docs/v6e#configurations"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;2x2 slice&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Design pattern example&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;The aim is to deploy an LLM (Gemma 3) onto two GKE clusters in different regions. Each cluster uses four TPU v6e chips, and the model is stored in Cloud Storage. The workload is served through the multi-cluster GKE Inference Gateway. Traffic should be routed to the region closest to the user and fail over to the other region if one region fails.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
        &lt;figure class="article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3"&gt;
          &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1-build.max-1000x1000.png" alt="1-build"&gt;
        &lt;/figure&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;div&gt;
&lt;h2 class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="czag4-0-0"&gt;&lt;span data-offset-key="czag4-0-0"&gt;Putting it together&lt;/span&gt;&lt;/h2&gt;
&lt;div class="Draftail-block--unstyled" data-block="true" data-editor="4apjo" data-offset-key="31jqe-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="31jqe-0-0"&gt;&lt;span data-offset-key="31jqe-0-0"&gt;To use TPUs in two regions, make sure your project has the &lt;/span&gt;&lt;a class="TooltipEntity" data-draftail-trigger="true" href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/tpus#ensure-quota-od-spot" role="button"&gt;&lt;span data-offset-key="31jqe-1-0"&gt;necessary quota&lt;/span&gt;&lt;/a&gt;&lt;span data-offset-key="31jqe-2-0"&gt; in each of those regions.&lt;/span&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class="Draftail-block--unstyled" data-block="true" data-editor="4apjo" data-offset-key="9e8ff-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="9e8ff-0-0"&gt; &lt;/div&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="9e8ff-0-0"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Begin:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Set up the environment. &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/vpc/docs/create-modify-vpc-networks#create-custom-network"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;standard VPC&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; with firewall rules and a subnet in the same region as your TPU reservation.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/load-balancing/docs/proxy-only-subnets#proxy_only_subnet_create"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;proxy-only subnet&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. It is used by the internal Application Load Balancer attached to the GKE Inference Gateway.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Set up firewall rules allowing traffic and health checks.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Reserve static internal IP addresses in both regions for the Gateway.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Provision a Cloud Storage FUSE bucket and configure a dedicated IAM Service Account. Bind this to a Kubernetes Workload Identity so your pods can securely mount the bucket and read the model weights directly.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
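&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a minimal sketch of the Workload Identity binding in the last bullet, the manifest below creates a Kubernetes ServiceAccount annotated with the IAM service account it should act as. The names &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gemma-ksa&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gemma-gcs-sa&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;PROJECT_ID&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; are placeholders I made up for illustration, not values from the original experiment.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Hypothetical names: replace gemma-ksa, gemma-gcs-sa and PROJECT_ID with your own.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gemma-ksa
  namespace: default
  annotations:
    # Links this Kubernetes ServiceAccount to the IAM service account
    # that has access to the model bucket.
    iam.gke.io/gcp-service-account: gemma-gcs-sa@PROJECT_ID.iam.gserviceaccount.com&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For the annotation to take effect, the IAM service account also needs a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;roles/iam.workloadIdentityUser&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; binding for this Kubernetes ServiceAccount, plus a Cloud Storage role on the bucket. Repeat the manifest in both clusters so Pods in either region can read the weights.&lt;/span&gt;&lt;/p&gt;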
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Next: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Create standard GKE clusters and node pools.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy two separate GKE clusters, one in each of your chosen regions.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Enable the Gateway API (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;--gateway-api=standard&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) and the&lt;/span&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs/concepts/cloud-storage-fuse-csi-driver"&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Storage FUSE CSI driver&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;--addons GcsFuseCsiDriver&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) during cluster creation.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Create dedicated TPU v6e node pools (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;ct6e-standard-4t&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) for both clusters.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Enable managed DRANET on these &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/allocate-network-resources-dra#enable-dra-driver-tpu"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;TPU node pools&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; by setting the flags &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;--accelerator-network-profile=auto&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;--node-labels=cloud.google.com/gke-networking-dra-driver=true&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Next: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Establish the global mesh via Fleet Registration.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Register both GKE clusters to a unified GKE Fleet by following the&lt;/span&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs/how-to/creating-fleets"&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;fleet creation and registration setup&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Enable Multi-Cluster Service Discovery and Multi-Cluster Ingress on your fleet.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Designate the cluster in your primary region as the configuration hub, which acts as the control point for routing rules across both regions.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Next: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy the AI workload.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Use a temporary Kubernetes job to download the Gemma 3 (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gemma-3-27b-it&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) model weights directly into your Cloud Storage bucket.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Define a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;ResourceClaimTemplate&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; that explicitly requests the managed DRANET device class (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;deviceClassName: netdev.google.com&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) with the allocation mode set to "All".&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;pre&gt;&lt;code&gt;apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: all-netdev
  namespace: default
spec:
  spec:
    devices:
      requests:
      - name: req-netdev
        exactly:
          deviceClassName: netdev.google.com
          allocationMode: All&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
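&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The temporary download Job mentioned in the first bullet could look roughly like the sketch below. It is not the manifest used in the experiment: the Job name, the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gemma-ksa&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; ServiceAccount, the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;hf-secret&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; Secret, the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;MODEL_BUCKET&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; placeholder and the choice of Hugging Face as the download source are all assumptions.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Sketch only: every name below is a placeholder, and Hugging Face is an assumed source.
apiVersion: batch/v1
kind: Job
metadata:
  name: gemma-download
  namespace: default
spec:
  backoffLimit: 2
  template:
    metadata:
      annotations:
        gke-gcsfuse/volumes: "true"      # asks GKE to inject the Cloud Storage FUSE sidecar
    spec:
      serviceAccountName: gemma-ksa      # hypothetical ServiceAccount bound through Workload Identity
      restartPolicy: Never
      containers:
      - name: download
        image: python:3.11-slim
        command: ["/bin/sh", "-c"]
        args:
        - |
          set -e
          pip install --no-cache-dir huggingface_hub
          python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='google/gemma-3-27b-it', local_dir='/models/gemma-3-27b-it')"
        env:
        - name: HF_TOKEN                 # access token for the gated model repository
          valueFrom:
            secretKeyRef:
              name: hf-secret            # hypothetical Secret
              key: hf_token
        volumeMounts:
        - name: model-weights
          mountPath: /models
      volumes:
      - name: model-weights
        csi:
          driver: gcsfuse.csi.storage.gke.io
          volumeAttributes:
            bucketName: MODEL_BUCKET     # placeholder bucket name&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because both clusters read from the same bucket, the Job only needs to run once, in either cluster. The IAM service account behind it needs write access to the bucket for this step.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;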
&lt;div class="block-paragraph_advanced"&gt;&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy your inference server (for example, vLLM) on the TPU nodes in both regions. Ensure the Pod spec uses node selectors for the 2x2 TPU topology, requests exactly 4 TPU chips, and references the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;netdev&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; claim. This ensures your Pods use the dedicated accelerator networking alongside standard Ethernet.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
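&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the Pod spec requirements above concrete, here is a hedged sketch of a Deployment that combines them: TPU node selectors, a request for 4 chips, the DRANET claim from the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;all-netdev&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; template, and the model bucket mounted through Cloud Storage FUSE. The image (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;VLLM_TPU_IMAGE&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;), the bucket (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;MODEL_BUCKET&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;), the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gemma-ksa&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; ServiceAccount, the model path and the vLLM arguments are assumptions for illustration.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Sketch only: image, bucket, ServiceAccount and vLLM arguments are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gemma-server
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gemma-server
  template:
    metadata:
      labels:
        app: gemma-server                # matches the AutoscalingMetric selector
      annotations:
        gke-gcsfuse/volumes: "true"      # asks GKE to inject the Cloud Storage FUSE sidecar
    spec:
      serviceAccountName: gemma-ksa      # hypothetical ServiceAccount bound through Workload Identity
      nodeSelector:
        cloud.google.com/gke-tpu-accelerator: tpu-v6e-slice
        cloud.google.com/gke-tpu-topology: 2x2
      resourceClaims:
      - name: netdev
        resourceClaimTemplateName: all-netdev
      containers:
      - name: vllm
        image: VLLM_TPU_IMAGE            # placeholder for a TPU-enabled vLLM image
        command: ["vllm", "serve", "/models/gemma-3-27b-it"]
        args: ["--tensor-parallel-size=4", "--port=8000"]
        ports:
        - containerPort: 8000
        resources:
          requests:
            google.com/tpu: 4
          limits:
            google.com/tpu: 4
          claims:
          - name: netdev                 # attaches the DRANET network devices to this container
        volumeMounts:
        - name: model-weights
          mountPath: /models
          readOnly: true
      volumes:
      - name: model-weights
        csi:
          driver: gcsfuse.csi.storage.gke.io
          readOnly: true
          volumeAttributes:
            bucketName: MODEL_BUCKET     # placeholder bucket name&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The same manifest would be applied to both clusters. Port 8000 and the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;app: gemma-server&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; label are chosen to line up with the Gateway, health check and metric manifests used for the Inference Gateway.&lt;/span&gt;&lt;/p&gt;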
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Next: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Configure the Multi-Cluster Inference Gateway.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Install the necessary Custom Resource Definitions (CRDs) so Kubernetes can process specialized routing objects like the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;InferenceObjective&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy an &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;AutoscalingMetric&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to track hardware utilization, such as KV cache usage.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Use Helm to group the independent AI deployments from both regions into a single, logical &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;InferencePool&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy the Cross-Region Gateway and its associated &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;HTTPRoute&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to manage incoming global traffic.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Apply health checks and backend policies to the pool to ensure load balancing relies on your custom hardware metrics.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Configure an &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;InferenceObjective&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to instruct the gateway to route prompts to the region with the highest availability, avoiding overloaded TPUs.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;pre&gt;&lt;code&gt;apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: cross-region-gateway
  namespace: default
spec:
  gatewayClassName: gke-l7-cross-regional-internal-managed-mc
  addresses:
  - type: networking.gke.io/named-address-with-region
    value: &amp;quot;regions/europe-west4/addresses/gemma-gateway-ip-europe-west4&amp;quot;
  - type: networking.gke.io/named-address-with-region
    value: &amp;quot;regions/us-east5/addresses/gemma-gateway-ip-us-east5&amp;quot;
  listeners:
  - name: http
    protocol: HTTP
    port: 80
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: gemma-route
  namespace: default
spec:
  parentRefs:
  - name: cross-region-gateway
    kind: Gateway
  rules:
  - backendRefs:
    - group: networking.gke.io
      kind: GCPInferencePoolImport
      name: gemma-pool
      port: 8000
---
apiVersion: networking.gke.io/v1
kind: HealthCheckPolicy
metadata:
  name: gemma-health-check
  namespace: default
spec:
  targetRef:
    group: networking.gke.io
    kind: GCPInferencePoolImport
    name: gemma-pool
  default:
    config:
      type: HTTP
      httpHealthCheck:
        requestPath: /health
        port: 8000
---
apiVersion: networking.gke.io/v1
kind: GCPBackendPolicy
metadata:
  name: gemma-backend-policy
  namespace: default
spec:
  targetRef:
    group: networking.gke.io
    kind: GCPInferencePoolImport
    name: gemma-pool
  default:
    timeoutSec: 100
    balancingMode: CUSTOM_METRICS
    trafficDuration: LONG
    customMetrics:
      - name: gke.named_metrics.tpu-cache
        dryRun: false
        maxUtilizationPercent: 60
---
apiVersion: autoscaling.gke.io/v1beta1
kind: AutoscalingMetric
metadata:
  name: tpu-cache
  namespace: default
spec:
  selector:
    matchLabels:
      app: gemma-server
  endpoints:
  - port: 8000
    path: /metrics
    metrics:
    - name: vllm:kv_cache_usage_perc
      exportName: tpu-cache
---
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferenceObjective
metadata:
  name: gemma-objective
  namespace: default
spec:
  priority: 10
  poolRef:
    name: gemma-pool
    group: &amp;quot;inference.networking.k8s.io&amp;quot;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;div&gt;
&lt;div class="Draftail-block--unstyled" data-block="true" data-editor="cl1on" data-offset-key="czag4-0-0"&gt;
&lt;h3 class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="czag4-0-0"&gt;&lt;span data-offset-key="czag4-0-0"&gt;Testing the Failover&lt;/span&gt;&lt;/h3&gt;
&lt;/div&gt;
&lt;div class="Draftail-block--unstyled" data-block="true" data-editor="cl1on" data-offset-key="9un4f-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="9un4f-0-0"&gt;&lt;span data-offset-key="9un4f-0-0"&gt;Verify the highly available architecture by simulating a primary region outage. Once the primary deployment is taken offline, the Gateway automatically detects the failure and seamlessly reroutes all subsequent user requests to the active secondary cluster, ensuring continuous availability without dropping traffic.&lt;/span&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="ef2kc-0-0"&gt; &lt;/div&gt;
&lt;h2 class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="ef2kc-0-0"&gt;&lt;span data-offset-key="ef2kc-0-0"&gt;Next Steps&lt;/span&gt;&lt;/h2&gt;
&lt;div class="Draftail-block--unstyled" data-block="true" data-editor="cl1on" data-offset-key="1r2f1-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="1r2f1-0-0"&gt;&lt;span data-offset-key="1r2f1-0-0"&gt;For a hands-on codelab and more information on these features, review the following resources.&lt;/span&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;ul class="public-DraftStyleDefault-ul" data-offset-key="6fjff-0-0"&gt;
&lt;li class="Draftail-block--unordered-list-item public-DraftStyleDefault-unorderedListItem public-DraftStyleDefault-reset public-DraftStyleDefault-depth0 public-DraftStyleDefault-listLTR" data-block="true" data-editor="cl1on" data-offset-key="6fjff-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="6fjff-0-0"&gt;&lt;span data-offset-key="6fjff-0-0"&gt;Hands-on Codelab: &lt;/span&gt;&lt;a class="TooltipEntity" data-draftail-trigger="true" href="https://codelabs.developers.google.com/codelabs/gke-inference-gateway-multi-cluster-tpus-dranet" rel="noopener" role="button" target="_blank"&gt;&lt;span data-offset-key="6fjff-1-0"&gt;Build multi-cluster GKE Inference Gateway, with TPUs, Cloud Storage FUSE and managed DRANET&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;li class="Draftail-block--unordered-list-item public-DraftStyleDefault-unorderedListItem public-DraftStyleDefault-depth0 public-DraftStyleDefault-listLTR" data-block="true" data-editor="cl1on" data-offset-key="9ku8e-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="9ku8e-0-0"&gt;&lt;span data-offset-key="9ku8e-0-0"&gt;Document set: &lt;/span&gt;&lt;a class="TooltipEntity" data-draftail-trigger="true" href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/config-auto-net-for-accelerators" role="button"&gt;&lt;span data-offset-key="9ku8e-1-0"&gt;DRANET&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;li class="Draftail-block--unordered-list-item public-DraftStyleDefault-unorderedListItem public-DraftStyleDefault-depth0 public-DraftStyleDefault-listLTR" data-block="true" data-editor="cl1on" data-offset-key="3fjdr-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="3fjdr-0-0"&gt;&lt;span data-offset-key="3fjdr-0-0"&gt;Documentation: &lt;/span&gt;&lt;a class="TooltipEntity" data-draftail-trigger="true" href="https://docs.cloud.google.com/ai-hypercomputer/docs/overview" role="button"&gt;&lt;span data-offset-key="3fjdr-1-0"&gt;AI Hypercomputer&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="Draftail-block--unstyled" data-block="true" data-editor="cl1on" data-offset-key="f0ecg-0-0"&gt;
&lt;div class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr" data-offset-key="f0ecg-0-0"&gt;&lt;span data-offset-key="f0ecg-0-0"&gt;Want to ask a question, find out more or share a thought? Please connect with me on &lt;/span&gt;&lt;a class="TooltipEntity" data-draftail-trigger="true" href="https://www.linkedin.com/in/ammett/" rel="noopener" role="button" target="_blank"&gt;&lt;span data-offset-key="f0ecg-1-0"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;span data-offset-key="f0ecg-2-0"&gt;.&lt;/span&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;/div&gt;</description><pubDate>Tue, 02 Jun 2026 07:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/experimenting-with-tpus-gke-managed-dranet-and-multi-cluster-inference-gateway/</guid><category>Networking</category><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/0-hero-dra.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Experimenting with TPUs, GKE Managed DRANET, and Multi-cluster Inference Gateway</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/0-hero-dra.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/experimenting-with-tpus-gke-managed-dranet-and-multi-cluster-inference-gateway/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Ammett Williams</name><title>Developer Relations Engineer</title><department></department><company></company></author></item><item><title>Developer's guide to Gemini Enterprise and A2UI integration</title><link>https://cloud.google.com/blog/topics/developers-practitioners/guide-to-gemini-enterprise-and-a2ui-integration/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you've built a chatbot, you know this conversation:&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;strong style="vertical-align: baseline;"&gt;User:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; "Book a table for two tomorrow at 7pm." &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; "Okay, for what day?" &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;User:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; "Tomorrow." &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; "What time?"&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A date picker would have ended this in one tap. But until recently, agents had no standard way to render a date picker (or a map, or a multi-select list) inside the chat surface they live in. They could only return plain text or Markdown.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today, we're walking through how to fix that with &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;A2UI&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, an open protocol for agent-driven user interfaces, and how to integrate an A2UI-enabled agent with &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Gemini Enterprise (GE)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; so your agent renders rich and interactive UI natively in the GE chat surface — and in your own custom frontend if you want one. We'll use a working restaurant-finder agent — built with the Google Agent Development Kit (ADK), the A2A protocol, and Gemini — as the reference. The full source is on &lt;/span&gt;&lt;a href="https://github.com/wadave/agent-a2ui-demo" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GitHub&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and there's a &lt;/span&gt;&lt;a href="https://youtu.be/_5AaYwyqVio" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;2-minute demo video.&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=_5AaYwyqVio"
      data-glue-modal-trigger="uni-modal-_5AaYwyqVio-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault_4GYPUpq.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Gemini Enterprise and A2UI integration demo&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-_5AaYwyqVio-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="_5AaYwyqVio"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=_5AaYwyqVio"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;The problem: agents speak text, but users want UI&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Most agent frameworks today return strings. That's fine for short answers, but it breaks down quickly:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Multi-turn slot filling&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (date, time, party size) burns turns and patience.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Choices among options&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (which restaurant? which insurance plan?) become long bulleted lists the user has to copy-paste back.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Spatial information&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (locations, routes, floor plans) is reduced to addresses.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Developers have tried to patch this by sending HTML or JavaScript fragments, but that introduces real risks: cross-site scripting, UI injection from a remote agent you don't fully control, and visual drift from the host app's design system. What's needed is a way to transmit UI that's &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;safe like data and expressive like code&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;What A2UI is&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="https://a2ui.org/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;A2UI&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is an open protocol, &lt;/span&gt;&lt;a href="https://developers.googleblog.com/introducing-a2ui-an-open-project-for-agent-driven-interfaces/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;introduced by Google&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and co-developed with the Flutter team and the product teams behind Gemini Enterprise. Instead of returning text or HTML, an agent returns a JSON payload that describes a UI: a tree of &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;components&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (Card, Text, Button, ChoicePicker, Image, …) and a separate &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;data model&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; holding the values those components display.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Three properties make this useful in practice:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Declarative, not executable.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The payload is data. The client only renders components from a pre-approved &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;catalog&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, so a remote agent can't inject arbitrary code or steal credentials through a UI widget.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Streaming-friendly.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The format is a flat list of small JSON messages, so the LLM can emit them incrementally and the client can paint as they arrive.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Framework-agnostic.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The same agent response renders through Lit, Angular, Flutter, or native mobile. The agent doesn't know — or care — what's on the other end.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A2UI is also &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;transport-agnostic&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. The messages ride inside whatever pipe you already use: A2A JSON-RPC, AG-UI, WebSockets, SSE. In our reference implementation, A2UI rides inside the &lt;/span&gt;&lt;a href="https://a2aprotocol.ai/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;A2A protocol&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; as &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;DataPart&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; objects with the MIME type &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;application/json+a2ui&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
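&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the transport idea concrete, here is a minimal Python sketch of wrapping A2UI messages in DataPart-shaped dictionaries. The MIME type comes from the text above; the field names (kind, data, metadata, mimeType) and both message shapes are illustrative assumptions rather than the official schema, so check the A2A and A2UI specs before relying on them.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;
```python
import json

A2UI_MIME_TYPE = "application/json+a2ui"  # MIME type named in the article


def make_a2ui_part(message):
    """Wrap one A2UI message in a dict shaped like an A2A DataPart.

    Assumption: the field names ("kind", "data", "metadata", "mimeType")
    are illustrative; check the A2A and A2UI specs for the real schema.
    """
    return {
        "kind": "data",
        "data": message,
        "metadata": {"mimeType": A2UI_MIME_TYPE},
    }


def is_a2ui_part(part):
    """Client-side check: is this part A2UI cargo or ordinary content?"""
    return part.get("metadata", {}).get("mimeType") == A2UI_MIME_TYPE


# Hypothetical messages: one describes components, one carries their data.
component_message = {
    "components": [
        {"id": "root", "type": "Card", "children": ["title", "picker"]},
        {"id": "title", "type": "Text", "text": {"path": "/title"}},
        {"id": "picker", "type": "ChoicePicker", "options": {"path": "/restaurants"}},
    ]
}
data_message = {
    "dataModel": {
        "title": "Pick a restaurant",
        "restaurants": ["Luigi's", "Sakura", "Casa Verde"],
    }
}

parts = [make_a2ui_part(m) for m in (component_message, data_message)]
print(json.dumps(parts[0]["metadata"]))
```
&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Keeping the component list and the data model in separate messages mirrors the split described earlier. A client can filter incoming parts on the MIME type and treat everything else as ordinary content.&lt;/span&gt;&lt;/p&gt;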
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Where A2UI sits in the stack&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A2UI is one piece of a four-layer stack. Confusion usually comes from conflating these layers, each of which does a different job:&lt;/span&gt;&lt;/p&gt;
&lt;div align="left"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;&lt;table&gt;&lt;colgroup&gt;&lt;col/&gt;&lt;col/&gt;&lt;col/&gt;&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope="col" style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p style="text-align: left;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Layer&lt;/strong&gt;&lt;/p&gt;
&lt;/th&gt;
&lt;th scope="col" style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p style="text-align: left;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Owns&lt;/strong&gt;&lt;/p&gt;
&lt;/th&gt;
&lt;th scope="col" style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p style="text-align: left;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Examples&lt;/strong&gt;&lt;/p&gt;
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;App experience&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Client shell and conversation state — chat window, input box, message history&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;CopilotKit, AG-UI&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Pixel drawing&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Turning component descriptions into actual rendered UI&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Lit, Flutter, Angular&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Conversation pipeline&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Client–server transport — sending messages, receiving responses&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A2A Protocol&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Cargo (data format)&lt;/strong&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The thing flowing through the pipeline that &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;describes&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; the UI&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;td style="vertical-align: top; border: 1px solid #000000; padding: 16px;"&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A2UI&lt;/span&gt;&lt;/p&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Read top to bottom: &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;CopilotKit/AG-UI&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; owns the app experience. &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Lit/Flutter/Angular&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; own the rendering. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;While &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;CopilotKit&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;AG-UI&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; provide valuable abstractions, they remain strictly optional for implementing &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;A2UI&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;In this architecture, &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;A2A&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; serves as the underlying conversation pipeline, while &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;A2UI&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; represents the structured cargo that actually traverses that pipe.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;That separation is why the same A2UI payload can be rendered, unchanged, in three very different deployment shapes:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Bespoke web app&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; — a custom client shell (like the reference repo's Lit &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;frontend/&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;) plus a custom A2UI renderer.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;CopilotKit / AG-UI app&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; — CopilotKit owns the chat shell, and an A2UI renderer is registered inside it for rich cards.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Gemini Enterprise&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; — GE &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;is&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; the shell, the renderer, and the transport client. You only build the agent.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;So for the GE path, the stack collapses to two layers you control: the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;A2A endpoint&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (your agent) and the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;A2UI cargo&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; it emits. The other two layers are GE's responsibility. CopilotKit and AG-UI are great if you're building a standalone product UI elsewhere — they're just out of scope for embedding an agent inside Gemini Enterprise.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Pattern revisions&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The protocol evolves quickly, and different clients support different revisions. Two patterns are common today:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Inline pattern&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; — the agent sends a component tree with the data baked into each component (the pattern Gemini Enterprise renders today).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Decoupled pattern&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; — the agent sends the component tree and the data model as separate messages, so subsequent turns can update one without re-sending the other. This reduces tokens and latency for long-running conversations and is the direction the protocol is heading.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The reference repo serves &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;both&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; patterns from one backend, picking which to emit per request based on the client's &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;X-A2A-Extensions&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; header. As new revisions ship, you add another catalog and the same negotiation pattern keeps working.&lt;/span&gt;&lt;/p&gt;
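&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The per-request negotiation can be sketched roughly as follows. This is not the reference repo's code: the extension URIs are placeholders, the header is assumed to carry a comma-separated list of URIs, and the plain-text fallback for clients that advertise nothing is an assumption.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;
```python
# Hypothetical extension URIs; the real ones come from the A2UI spec and repo.
INLINE_URI = "https://example.com/a2ui/inline"
DECOUPLED_URI = "https://example.com/a2ui/decoupled"

# Preference order: try the decoupled pattern first, then inline.
SUPPORTED = [(DECOUPLED_URI, "decoupled"), (INLINE_URI, "inline")]


def pick_pattern(headers):
    """Return "decoupled", "inline", or "text" for one request."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("x-a2a-extensions", "")
    # Assumption: the header holds a comma-separated list of extension URIs.
    requested = {uri.strip() for uri in raw.split(",") if uri.strip()}
    for uri, pattern in SUPPORTED:
        if uri in requested:
            return pattern
    return "text"  # client advertised no known pattern: plain-text fallback


print(pick_pattern({"X-A2A-Extensions": INLINE_URI}))
```
&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Supporting a new revision then means adding one more entry to the preference list, which is the property the paragraph above describes.&lt;/span&gt;&lt;/p&gt;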
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;How A2UI works inside Gemini Enterprise&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Gemini Enterprise ships with a built-in A2UI renderer. For the developer, that means the integration story is short:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Build your A2A agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, embedding an A2UI catalog and example payloads alongside the regular tool definitions.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Register the agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; with Gemini Enterprise as an A2A endpoint. (Use &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;make register-gemini-enterprise&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; in the reference repo.)&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;A GE admin shares the agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; with employees, just like any other agent in the GE catalog.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;At runtime, the flow looks like this:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The user types a request in the GE chat. GE calls your agent's A2A endpoint and sends along &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;GE's own A2UI catalog&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; — the list of UI components GE knows how to render.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Your agent decides whether a UI widget is the right response. If yes, it emits an A2UI JSON message (e.g., a &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;ChoicePicker&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; of restaurant options). If no, it falls back to text. Both can coexist in the same response.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;GE receives the JSON, validates it against its catalog, and renders the widget natively in &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;GE's own design language&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; — so it visually matches the rest of the chat surface.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;When the user interacts with the widget (selects three options, picks a date), GE serializes the interaction back into JSON and sends it to your agent as the next turn. Your agent processes structured input, not free-form text.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
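&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Steps 2 through 4 can be approximated on the agent side with a small sketch like the one below. It assumes a simplified catalog (a set of component type names) and a hypothetical user-action event shape; GE's actual catalog format and event schema may differ.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;
```python
# Hypothetical catalog: component types the client says it can render.
CLIENT_CATALOG = {"Card", "Text", "Button", "ChoicePicker", "Image"}


def unsupported_types(components, catalog):
    """Return, sorted, the component types that are not in the catalog."""
    return sorted({c["type"] for c in components} - set(catalog))


def respond(components, fallback_text, catalog=CLIENT_CATALOG):
    """Emit the UI only if every component is renderable, else use text."""
    if unsupported_types(components, catalog):
        return {"kind": "text", "text": fallback_text}
    return {"kind": "a2ui", "components": components}


def selected_options(action):
    """Read the structured selection from a hypothetical user-action event."""
    return list(action.get("context", {}).get("selected", []))


ui = [{"id": "picker", "type": "ChoicePicker"}]
print(respond(ui, "Which restaurant would you like?")["kind"])
```
&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Checking against the catalog before emitting keeps the agent from sending widgets the client would reject, and reading the selection as a list is what lets the next turn work with structured input instead of parsing free-form text.&lt;/span&gt;&lt;/p&gt;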
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;One thing worth flagging: because your agent doesn't ship its own renderer for GE, you don't need to choose a frontend framework to start. Your A2A endpoint can run anywhere — Cloud Run, GKE, on-prem — and GE handles the rendering.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;High-level architecture example&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The reference implementation is an ADK backend on Cloud Run designed to plug seamlessly into Gemini Enterprise.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1-overview.max-1000x1000.jpg"
          alt="High-level architecture overview"&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Gemini Enterprise connects directly to your agent using standard A2A JSON-RPC calls.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The agent serves the inline message pattern expected by the Gemini Enterprise managed UI.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Custom components like &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;GoogleMap&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; render via Google Maps Embed iframes, with the API key injected server-side so the LLM never sees it.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
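&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The server-side key injection in the last bullet might look something like this sketch. The embed URL follows the Maps Embed API place mode; the component fields (query, embedUrl) and the MAPS_API_KEY environment variable name are assumptions for illustration, not the reference repo's actual names.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;
```python
import os
from urllib.parse import urlencode

EMBED_BASE = "https://www.google.com/maps/embed/v1/place"


def inject_map_url(component, api_key):
    """Return a copy of a GoogleMap component with a keyed embed URL added.

    Assumption: the model emits only a "query" field; "embedUrl" is a
    hypothetical property name for the renderer's iframe source.
    """
    if component.get("type") != "GoogleMap":
        return component  # leave every other component untouched
    params = urlencode({"key": api_key, "q": component["query"]})
    return {**component, "embedUrl": EMBED_BASE + "?" + params}


# The key comes from the server environment, never from the prompt.
api_key = os.environ.get("MAPS_API_KEY", "demo-key")
from_llm = {"type": "GoogleMap", "query": "Luigi's, San Francisco"}
print(inject_map_url(from_llm, api_key)["embedUrl"])
```
&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because the URL is assembled after the model has produced its output, the key never appears in the prompt or in the model's response.&lt;/span&gt;&lt;/p&gt;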
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The following example shows Google Maps working as a live, interactive component within Gemini Enterprise rather than as a static image. Because A2UI messages can be streamed, the agent updates the map view in real time, dropping pins and adjusting coordinates incrementally as results arrive from the Maps API.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2-maps-ge.max-1000x1000.png"
          alt="Google Maps component rendered inside Gemini Enterprise"&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;See it running, then build your own&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Detailed implementation guide&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;a href="https://github.com/wadave/agent-a2ui-demo/blob/main/docs/implementation_details_guide.md" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;implementation_details_guide.md on GitHub&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Demo video&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (2 minutes, end-to-end with both the Lit shell and Gemini Enterprise): &lt;/span&gt;&lt;a href="https://youtu.be/_5AaYwyqVio" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;https://youtu.be/_5AaYwyqVio&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;A2UI spec and component reference&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;a href="https://a2ui.org/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;a2ui.org&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Gemini Enterprise updates&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, including the A2UI renderer: &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/whats-new-in-gemini-enterprise"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;What's new in Gemini Enterprise&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;A2UI generative UI announcement&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;a href="https://developers.googleblog.com/a2ui-v0-9-generative-ui/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Introducing A2UI generative UI&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you're already building agents on Google Cloud, the fastest path is to clone the reference repo, run &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;make local-backend&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; for a local smoke test, and then &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;make register-gemini-enterprise&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; to wire it into GE. From there, swap in your own catalog, your own tools, and your own domain. The next time a user asks your agent for "a table for two tomorrow at 7pm," the answer can be a date picker instead of another question.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Fri, 29 May 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/guide-to-gemini-enterprise-and-a2ui-integration/</guid><category>AI &amp; Machine Learning</category><category>Developers &amp; Practitioners</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Developer's guide to Gemini Enterprise and A2UI integration</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/guide-to-gemini-enterprise-and-a2ui-integration/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Dave Wang</name><title>Forward Deployed Engineer, Google Cloud</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Yuan Tian</name><title>Software Engineer, Google Cloud AI</title><department></department><company></company></author></item><item><title>A Guide to AI Cold Starts on Cloud Run</title><link>https://cloud.google.com/blog/topics/developers-practitioners/a-guide-to-ai-cold-starts-on-cloud-run/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;I saw a 
developer asking on Reddit if there &lt;/span&gt;&lt;a href="https://www.reddit.com/r/googlecloud/comments/1s8yzn1/is_there_a_sane_way_to_manage_cloud_run_cold/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;was any “sane way” to manage Cloud Run cold starts for AI across multiple regions.&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; They were experiencing startup latencies of up to 20 seconds, a frustrating gap where the infrastructure is spinning up while the user waits for a response.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The discussion was full of developers who had almost given up on serverless GPUs, with some even migrating back to GKE just to escape the latency. I decided it was time to dive deep into the mechanics of AI cold starts and see if we could find that "sane way."&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;During my research into &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/run/docs/tutorials/gpu-gemma-with-ollama"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;hosting models like Gemma 4 on Cloud Run&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, I had the privilege of co-presenting at Google Cloud Next '26 with Oded Shahar (Senior Engineering Manager for Cloud Run) and our guest speaker Ajay Nair (Global VP of Platform at Elastic). &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In our session,&lt;/span&gt; "Build AI architectures with custom models on Cloud Run&lt;span style="vertical-align: baseline;"&gt;," Ajay shared the production-hardened strategies that allow Elastic to serve millions of daily requests across 17+ model variants, all while maintaining the 'scale-to-zero' efficiency of Cloud Run. &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=7L5gQHcinzE"
      data-glue-modal-trigger="uni-modal-7L5gQHcinzE-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        &lt;img src="//img.youtube.com/vi/7L5gQHcinzE/maxresdefault.jpg"
             alt="A YouTube video that discusses the production-hardened strategies that allow Elastic to serve millions of daily requests across 17+ model variants, all while maintaining the &amp;#x27;scale-to-zero&amp;#x27; efficiency of Cloud Run."/&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
      &lt;figcaption class="article-video__caption h-c-page"&gt;
        
          &lt;h4 class="h-c-headline h-c-headline--four h-u-font-weight-medium h-u-mt-std"&gt;Build AI architectures with custom models on Cloud Run&lt;/h4&gt;
        
        
      &lt;/figcaption&gt;
    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-7L5gQHcinzE-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="7L5gQHcinzE"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=7L5gQHcinzE"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Ajay showed us that the secret isn't just in the model, but in treating GPUs as fungible compute rather than infrastructure to manage.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;I realized then that minimizing cold start latency isn't just about the model; it's about the infrastructure patterns and architectural decisions that keep it fast, scalable, and secure.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;The anatomy of an AI cold start&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/services/gpu-best-practices"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;official Google Cloud GPU best practices&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; explain, an AI cold start is a different problem from a standard web microservice cold start. You aren't just booting code; you're moving gigabytes of weights into a specialized physical accelerator.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Think of it as a four-phase race. If you don't optimize each step, you're going to lose your users.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Phase 1: Infrastructure Provisioning (~5s)&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud Run allocates the physical GPU and injects pre-installed NVIDIA drivers. Since Google manages the drivers for you, you don't have to bloat your Dockerfile.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Phase 2: Block-Level Container Image Streaming (1-2s)&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud Run uses "image streaming," meaning it pulls only the blocks needed to boot. Your 15GB CUDA image can actually start as fast as a tiny Node.js app!&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Phase 3: Engine Initialization (5-15s)&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This is where your inference engine (vLLM, Ollama) warms up. This is a massive CPU-heavy task, and it's where most people get throttled without realizing it.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Phase 4: Model Loading &amp;amp; VRAM Transfer &lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This is the final hurdle: moving those model weights from storage into the GPU memory. Unlike standard web apps where CPU is king, GPU memory is your primary constraint here. If your &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/decoding-high-bandwidth-memory-a-practical-guide-to-gpu-memory-for-fine-tuning-ai-models/?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;model’s weights don’t fit entirely within the GPU memory&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, performance degrades significantly as it swaps to slower system RAM.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Best practices for handling AI cold starts&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To build a "sane" production environment, here are a few crucial levers you can pull, informed by the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/services/gpu-best-practices"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;official Google Cloud documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; on AI inference with GPUs.&lt;/span&gt;&lt;/p&gt;
&lt;h2 role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Optimize Phase 4&lt;/span&gt;&lt;/h2&gt;
&lt;h3 role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Pick the Right Deployment Option&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Phase 4 is the "final hurdle" where you move gigabytes of weights from storage into GPU memory. Your &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/services/gpu-best-practices#loading-storing-models-tradeoff"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;choice of storage&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; determines how fast this transfer happens:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Storage (Concurrent Download) - Fastest:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Using the Google Cloud CLI &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;(&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud storage cp&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;) &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; allows you to download model files in parallel. This is the &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/codelabs/cloud-run/cloud-run-gpu-rtx-pro-6000?content_ref=can%20complete%20the%20steps%20within%20limited%20storage%20environments%20like%20cloud%20shell%20this%20codelab%20demonstrates%20how%20to%20load%20the%20model%20concurrently%20from%20cloud%20storage%20during%20container%20startup#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;recommended method&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for massive weights because it maximizes network throughput and drastically reduces transfer time. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Storage (FUSE) - Easiest:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This requires "zero-code" changes because it mounts a bucket as a local file system. However, because it does not parallelize the initial download, it is significantly slower for large model weights.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Container Image - Best for &amp;lt;10GB: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Baking weights into your image is efficient for smaller models thanks to Cloud Run's Image Streaming. For models over 10GB, however, the import and streaming overhead can become a bottleneck.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Internet: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Avoid this. It is the slowest and least predictable path for production inference.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
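&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the idea of a concurrent download concrete, here is a minimal Python sketch of the general pattern: fetch several weight shards in parallel instead of one after another. It is not how the gcloud CLI works internally. The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;fetch&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; callable and the shard names are hypothetical placeholders; in practice you would wrap whichever storage client you use.&lt;/span&gt;&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor


def download_all(object_names, fetch, max_workers=8):
    """Fetch every object concurrently and return results in input order.

    `fetch` is a hypothetical callable that takes one object name and
    downloads it, for example a thin wrapper around your storage client.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() spreads fetch() across worker threads and preserves input order.
        return list(pool.map(fetch, object_names))


if __name__ == "__main__":
    # Stand-in fetch function so the sketch runs without network access.
    shards = ["model-00001.safetensors", "model-00002.safetensors"]
    print(download_all(shards, fetch=lambda name: f"downloaded {name}"))
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Threads are a reasonable fit here because the work is network-bound; the right worker count depends on your instance's CPU and network limits, so treat the default of 8 as an assumption to tune.&lt;/span&gt;&lt;/p&gt;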
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Model Format &amp;amp; Size&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Optimizing your model's format and size is a direct "hack" to shorten Phase 4 (Model Loading &amp;amp; VRAM Transfer). Because this phase is constrained by how fast you can move gigabytes of data into VRAM, smaller and more efficient files are critical.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;4-bit Quantization: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;This is the ultimate cold start hack. Smaller weights mean fewer gigabytes to pull from storage, which directly accelerates the download and transfer portion of Phase 4.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Fast Formats: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Pick a model format with fast load times like &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;GGUF&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to minimize startup time. For the fastest performance, move away from Python "pickle" files and use Safetensors for zero-copy loading.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Ensure VRAM Fit: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Use quantized models to ensure the weights fit entirely within the GPU memory. If the &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/decoding-high-bandwidth-memory-a-practical-guide-to-gpu-memory-for-fine-tuning-ai-models/?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;model exceeds VRAM&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, Phase 4 will stall as the system swaps to significantly slower RAM. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
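&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A rough back-of-the-envelope calculation shows why quantization matters so much for Phase 4. The sketch below estimates raw weight size as parameter count times bits per weight, and the VRAM left over afterwards. The 7B parameter count, the 24 GB GPU, and the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;overhead_gb&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; parameter are illustrative assumptions; real quantized files also carry scales and metadata, so treat the result as an approximation.&lt;/span&gt;&lt;/p&gt;

```python
def weights_size_gb(num_params_billions, bits_per_weight):
    """Approximate size of the raw weights in GB (1 GB = 1e9 bytes)."""
    bytes_per_weight = bits_per_weight / 8
    return num_params_billions * bytes_per_weight


def vram_headroom_gb(num_params_billions, bits_per_weight, vram_gb, overhead_gb=0.0):
    """VRAM left after loading the weights; a negative value means no fit.

    overhead_gb is a placeholder for KV cache and runtime buffers, which
    depend on the engine and the workload.
    """
    needed = weights_size_gb(num_params_billions, bits_per_weight) + overhead_gb
    return vram_gb - needed


if __name__ == "__main__":
    # Hypothetical 7B-parameter model on a hypothetical 24 GB GPU.
    for bits in (16, 8, 4):
        size = weights_size_gb(7, bits)
        headroom = vram_headroom_gb(7, bits, vram_gb=24)
        print(f"{bits}-bit: {size:.1f} GB of weights, {headroom:.1f} GB headroom")
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Under these assumptions, going from 16-bit to 4-bit cuts the data to move from about 14 GB to about 3.5 GB, which shrinks both the download and the VRAM transfer.&lt;/span&gt;&lt;/p&gt;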
&lt;h3 role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Optimize Phases 3 &amp;amp; 4: Infrastructure &amp;amp; Network Levers&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These infrastructure settings provide the necessary resources to accelerate the most demanding parts of the startup process.&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/services/cpu#startup-boost"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Startup CPU Boost (Accelerates Phase 3) &lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This feature temporarily gives your instance extra CPU during startup. For example, a 1 vCPU instance boosts to 2 vCPUs for the duration of startup and the first 10 seconds of serving. It is essential for Phase 3, because engine initialization is a CPU-heavy task.&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/networking-best-practices?content_ref=for%20the%20best%20networking%20performance%20for%20cloud%20run%20services%20use%20the%20second%20generation%20execution%20environment%20when%20routing%20traffic%20with%20direct%20vpc%20egress#direct-vpc-throughput"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Direct VPC Egress &amp;amp; PGA (Accelerates Phase 4)&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Utilizing&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; Direct VPC Egress&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; with &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Private Google Access (PGA)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; ensures your model weight traffic stays on Google’s internal high-speed backbone. This optimizes the network path to shorten the time spent moving gigabytes of weights into VRAM.&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Concurrency Tuning (Cold Start Avoidance)&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In Cloud Run, "&lt;/span&gt;&lt;a href="https://docs.cloud.google.com/run/docs/about-instance-autoscaling?content_ref=request%20concurrency%20calculates%20the%20number%20of%20instances%20by%20averaging%20the%20request%20concurrency%20per%20second%20over%20a%201%20minute%20and%2010%20minute%20period%20and%20divides%20this%20by%20the%20maximum%20concurrency"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;concurrency&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;" refers to the maximum number of requests a single instance can handle before the platform scales out to start a new one. For AI workloads, you must tune this setting in tandem with your model engine's internal parallelism flags (e.g., &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;--max-num-seqs &lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for vLLM or &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;OLLAMA_NUM_PARALLEL&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for Ollama). &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Use the official &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/services/gpu-best-practices#max-concurrent-requests"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud formula&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to find your ideal Cloud Run concurrency:&lt;/span&gt;&lt;/p&gt;
&lt;p style="text-align: center;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;(number of model instances × parallel queries per model) + (number of model instances × ideal batch size)&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Example: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;If your instance loads 3 model instances onto the GPU, and each model instance can handle 4 parallel queries with an ideal batch size of 4, you would set your Cloud Run maximum concurrent requests to 24: &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;(3×4)+(3×4) = 24.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;How the math works: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;The goal is to keep the GPU fully saturated while ensuring users aren't stuck in a long queue. In this example, the total of 24 concurrent requests is split into two functional groups:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Active Processing (12 requests): &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Calculated as (3 instances×4 queries), this represents the total number of requests the GPU can actively process at any given moment.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The "Next Batch" Buffer (12 requests): &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Calculated as (3 instances×4 batch size), these are the requests waiting "on deck" inside the container. As soon as the GPU finishes the first batch, it immediately picks up these waiting requests.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By tuning this value as high as your VRAM allows (usually 10-20 users), one warm instance can serve many requests without triggering a new scale-out event and the cold start that comes with it.&lt;/span&gt;&lt;/p&gt;
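&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The formula above can be sketched as a small Python helper. The function and parameter names are my own; only the arithmetic comes from the formula.&lt;/span&gt;&lt;/p&gt;

```python
def max_concurrent_requests(model_instances, parallel_queries_per_model, ideal_batch_size):
    """Cloud Run concurrency = requests being processed + requests waiting on deck."""
    # Requests the GPU can actively process at any given moment.
    active = model_instances * parallel_queries_per_model
    # Requests queued inside the container, ready for the next batch.
    next_batch = model_instances * ideal_batch_size
    return active + next_batch


if __name__ == "__main__":
    # The worked example from the text: 3 model instances, 4 parallel queries, batch size 4.
    print(max_concurrent_requests(3, 4, 4))  # prints 24
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Whatever value you compute, keep the engine's own parallelism flag in step with it, so the container does not accept more requests than the engine is configured to batch.&lt;/span&gt;&lt;/p&gt;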
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;Scaling Controls (Tuning the Threshold)&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While the formula above defines your maximum capacity, you can also tune when Cloud Run decides to start the next instance. Cloud Run's autoscaler typically targets 60% utilization, but for long-running AI cold starts, you can increase this threshold to 80% or 90% via &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/scaling-controls"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Scaling Controls&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Concurrency Target&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Increasing this allows you to "pack" more requests into a single warm instance before triggering a scale-out.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;CPU Target&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Increasing the CPU target prevents the platform from starting a new instance just because initialization or high-intensity inference spiked the CPU utilization.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Scaling &amp;amp; Reliability Strategies&lt;/span&gt;&lt;/h2&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/Gemini_Generated_Image_nc8hjhnc8hjhnc8h.max-1000x1000.png"
        
          alt="Gemini_Generated_Image_nc8hjhnc8hjhnc8h"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="znx9k"&gt;Sometimes the best way to handle a cold start is to avoid it entirely or manage it proactively.&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;The Single-Region "Always-On" Tradeoff&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you are deploying globally, the cost of keeping minimum instances set to 1 in every region adds up. Instead, consider an 'always-on' service in just one region. A 100ms global network delay is a much better user experience than a 20s local cold start.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;The 15-Minute Grace Period: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;A common question is 'How long will my instance stay warm after a request?' Cloud Run generally keeps instances alive for up to &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;15 minutes &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;after they become idle (processing zero requests). If your traffic is predictable and arrives every 10–12 minutes, you might not even need an 'always-on' service; the platform’s default shutdown policy will usually keep a warm instance ready for your next user.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt;Note&lt;/strong&gt;: While this idle time is "free" for standard request-based services, remember that GPU services require instance-based billing, so you will be billed for the duration the instance remains warm between requests.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;The "Wake-Up Call" Strategy &lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Sometimes the best way to handle a cold start is to proactively mask it. If your UI can predict an upcoming request, for example, when a user clicks "New Chat" or begins hovering over a text area, you can send a lightweight health check to your service immediately. By the time the user finishes typing their prompt, the first two phases of the cold start (Infrastructure Provisioning and Container Image Streaming) are already finished in the background. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Pro-Tip: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Use &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Non-Inference Endpoints. &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;To make this "wake-up call" as fast as possible, always use a non-inference endpoint rather than sending a dummy prompt like "hi". &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Why it’s faster:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Non-inference endpoints (like &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;/v1/models&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; for vLLM or &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;/api/tags &lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;for Ollama) are handled by the container’s web server the moment it starts. They don’t have to wait for the slow "Phase 4" model loading and VRAM transfer to complete before sending a success response.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;No Chat Pollution: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Because these endpoints don't trigger the model's completion logic, they won't interfere with the user's actual chat history or accidentally trigger session creation in your backend.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Recommended Endpoints:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;vLLM: &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;GET /health&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; or &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;GET /v1/models&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Ollama: &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;GET /api/tags&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; or &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;GET /api/version&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
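&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a minimal sketch of the wake-up call, here is how a Python backend might ping one of the endpoints listed above. The helper names and the 2-second timeout are my own assumptions, and the base URL is whatever your Cloud Run service exposes. The call is deliberately best-effort: it only needs to reach the service, so every failure is swallowed.&lt;/span&gt;&lt;/p&gt;

```python
import urllib.request

# Non-inference endpoints taken from the recommended list above.
WARMUP_PATHS = {"vllm": "/health", "ollama": "/api/version"}


def warmup_url(base_url, engine):
    """Build the wake-up URL for the given engine ("vllm" or "ollama")."""
    return base_url.rstrip("/") + WARMUP_PATHS[engine]


def send_wakeup(base_url, engine, timeout_seconds=2.0):
    """Fire a best-effort GET; return the HTTP status, or None on any error.

    The short timeout is deliberate: the goal is to nudge the service into
    starting an instance, not to wait for the instance to become ready.
    """
    url = warmup_url(base_url, engine)
    try:
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            return response.status
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return None
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In a browser UI you would issue the equivalent request from the front end when the user opens a new chat; the same idea applies, only the HTTP client changes.&lt;/span&gt;&lt;/p&gt;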
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Tune Startup Probes for VRAM &lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AI models take significant time to move gigabytes of weights from storage into GPU memory (Phase 4). If your startup check fails too many times, &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/healthchecks?content_ref=prevents+the+containers+from+being+shut+down+prematurely+before+the+containers+are+up+and+running"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Run will assume your container is broken and kill it&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To prevent this:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Increase the Failure Threshold&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Use a high &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;failureThreshold&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; (e.g., 60 or more). Since the total allowed startup time is &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;failureThreshold × periodSeconds&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, a threshold of 60 with a 5-second period gives your model a healthy 5-minute window to load.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Utilize the 30-Minute Maximum&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: While standard services are limited to 4 minutes, Cloud Run supports a total startup time of up to 30 minutes (1,800 seconds) for intensive workloads.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Avoid False Positives (The Ollama Fix)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Be careful with engines like Ollama, which may open a TCP port as soon as the service starts, but &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;before&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; the model is actually in VRAM. Always ensure you are &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;preloading models&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; during the container's entrypoint script to ensure the startup probe only passes once the model is truly ready for inference.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
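&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The probe arithmetic from the list above can be sketched in a few lines of Python. The function names are my own, and the sketch only applies the 30-minute overall ceiling mentioned above; check the Cloud Run health check documentation for any per-field limits before copying the numbers into a service definition.&lt;/span&gt;&lt;/p&gt;

```python
import math

MAX_STARTUP_SECONDS = 1800  # the 30-minute ceiling mentioned above


def startup_window_seconds(failure_threshold, period_seconds):
    """Total time the startup probe allows before the container is killed."""
    return failure_threshold * period_seconds


def failure_threshold_for(load_seconds, period_seconds):
    """Smallest failureThreshold covering load_seconds, capped at the ceiling."""
    capped = min(load_seconds, MAX_STARTUP_SECONDS)
    return math.ceil(capped / period_seconds)


if __name__ == "__main__":
    # The example from the text: 60 failures at a 5-second period.
    print(startup_window_seconds(60, 5))  # prints 300, i.e. 5 minutes
    # A hypothetical model that needs 7 minutes to load, probed every 10 seconds.
    print(failure_threshold_for(420, 10))  # prints 42
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Measure your real load time first, then add a safety margin before deriving the threshold; a window that is barely long enough will fail on a slow day.&lt;/span&gt;&lt;/p&gt;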
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Lessons from Elastic’s strategy&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In our Next '26 session, Ajay Nair highlighted three architectural decisions that allowed Elastic to treat GPUs as fungible compute, rather than infrastructure to manage:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Bypass the Compilation Tax: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;By setting &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;enforce_eager=True&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; in vLLM, they traded a tiny bit of throughput for cold starts that finish in less than a minute rather than multiple minutes.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Standalone Checkpoints: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;They avoided the latency of runtime adapter-switching by pre-merging each LoRA variant into a standalone checkpoint.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;One Workload, One Service:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Each independently-scalable workload — defined by model, task adapter, and traffic shape — is deployed as its own Cloud Run service. This produces 30+ services across ~15 model families, with some models split by task (e.g., v5 retrieval vs. clustering) or by query/passage role.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Ready to get started?&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Optimizing the cold start process&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; is the difference between a hobby project and a production-ready application. The best part? Cloud Run handles the NVIDIA driver and CUDA installation for you, starting the instance in about 5 seconds.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For a deeper dive, the official documentation is your best friend:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/services/gpu-best-practices"&gt;Best practices: AI inference on Cloud Run with GPUs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/services/gpu"&gt;Configure GPU for Cloud Run services&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/services/cpu#startup-boost" style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Startup CPU boost for Cloud Run&lt;/span&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For the full technical breakdown, I highly recommend watching the recording of the &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=7L5gQHcinzE" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;session&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; from Google Cloud Next '26. It provides the most comprehensive blueprint for hosting high-performance open models on serverless infrastructure.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Happy building!&lt;/span&gt;&lt;/p&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;span style="vertical-align: baseline;"&gt;Special thanks to Sara Ford and Shane Ouchi from the Cloud Run team and to Zac Li from Elastic for the helpful review and feedback on this article.&lt;/span&gt;&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 27 May 2026 17:23:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/a-guide-to-ai-cold-starts-on-cloud-run/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/cold_start.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>A Guide to AI Cold Starts on Cloud Run</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/cold_start.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/a-guide-to-ai-cold-starts-on-cloud-run/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Shir Meir Lador</name><title>Head of AI Engineering, Google Cloud Developer Relations</title><department></department><company></company></author></item><item><title>Shipping features to production just got easier with new feature flags in AppLifecycle Manager</title><link>https://cloud.google.com/blog/products/application-development/new-feature-flags-in-applifecycle-manager/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Many development teams are familiar with the hesitation that comes right before pushing a new feature live. As AI helps developers write code faster, the gap between rapid code generation and safe production deployment continues to grow.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Feature flags offer a practical way to manage this risk by separating the act of deploying code from the act of releasing a feature to users. Instead of a single, high-risk launch event that affects all users simultaneously, teams can ship code to production with new features hidden by default in a controlled manner.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To help teams adopt this workflow, we are announcing the public preview of AppLifecycle Manager Feature Flags (ALM FF). This service provides a rule-based solution to manage software behavior across Google Cloud, helping you support rapid development without sacrificing production stability.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Read on to learn four ways these feature flags will help accelerate your deployment.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;1. Decouple for safety and velocity&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The core mission of ALM FF is to increase development velocity by decoupling your feature releases from your code deployments. Traditionally, releasing a feature requires a binary deployment, a high-risk event that affects all users simultaneously.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With ALM FF, you can ship code to production with new features disabled by default. This allows your team to move faster, deploying code continuously while choosing the exact moment to enable a feature via a toggle. If an issue is detected, the flag acts as an instant kill switch, disabling the problematic feature immediately without the need for a full, time-consuming code rollback.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_YPP5fhI.max-1000x1000.png" alt="1"&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;2. Gradual enablement with precise targeting&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Safety is about precision. ALM FF leverages the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Common Expression Language (CEL)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to implement sophisticated logic for gradual feature enablement.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Percentage feature enablement:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Instead of a global launch, you can ramp up a feature to 1%, 5%, or 50% of your traffic. This allows you to monitor system health and performance metrics incrementally, ensuring stability before reaching your entire user base.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Precise allowlisting:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; You can target specific internal teams, beta testers, or early-access customers by allowlisting their identifiers. This ensures that only the intended audience sees a feature during its initial validation phase.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
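&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To illustrate the idea behind percentage enablement and allowlisting, here is a minimal Python sketch. It is not how ALM FF, CEL, or flagd evaluates rules; the function names, the hashing scheme, and the 100-bucket split are assumptions chosen only to show why a given user gets a stable answer as the percentage ramps up.&lt;/span&gt;&lt;/p&gt;

```python
import hashlib


def bucket_for(flag_key, user_id):
    """Map a (flag, user) pair to a stable bucket in the range 0..99.

    Hypothetical scheme for illustration; real flag engines may hash differently.
    """
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_key, user_id, rollout_percent, allowlist=frozenset()):
    """Allowlisted users always see the feature; others are admitted by bucket."""
    if user_id in allowlist:
        return True
    # range(n) holds exactly the buckets 0..n-1, so n percent of buckets pass.
    return bucket_for(flag_key, user_id) in range(rollout_percent)


if __name__ == "__main__":
    testers = frozenset({"beta-tester-1"})
    # The allowlisted tester is admitted even at a 0% rollout.
    print(is_enabled("new-checkout", "beta-tester-1", 0, testers))
    # Everyone is admitted at 100%.
    print(is_enabled("new-checkout", "some-user", 100))
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because the bucket depends only on the flag key and the user identifier, a user admitted at 5% stays admitted at 50%, which keeps the experience consistent during a ramp-up.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;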
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_AT9I4Zt.max-1000x1000.png" alt="2"&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;3. Dynamic configuration for the AI era&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Beyond simple toggles, ALM FF offers a dynamic way to inject configuration into your applications. By using string-type flags, you can update application behavior, such as system prompts for LLM integrations, in real time. This allows product managers and business owners to tweak AI responses and application logic without requiring any code changes or infrastructure rollouts.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/3_7li94CT.max-1000x1000.png" alt="3"&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;4. Built on open standards&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We believe safety should not mean lock-in. ALM FF is built on the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;OpenFeature&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; standard, utilizing industry-standard SDKs and the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;flagd&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; evaluation engine. This ensures your feature management patterns are portable and follow best practices without adding Google-specific dependencies to your core application code.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Get started&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;ALM FF is now in public preview. To take control of your releases, you can:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Review the docs:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/saas-runtime/docs/flags/flags-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Public Documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Onboard today:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/saas-runtime/docs/flags/flags-quickstart"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Quickstart Guide&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Give us feedback:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Help us &lt;/span&gt;&lt;a href="https://forms.gle/boGXCgKyoB7Lr6yd9" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;shape the future of feature management&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;</description><pubDate>Thu, 21 May 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/application-development/new-feature-flags-in-applifecycle-manager/</guid><category>Developers &amp; Practitioners</category><category>Application Development</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Shipping features to production just got easier with new feature flags in AppLifecycle Manager</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/application-development/new-feature-flags-in-applifecycle-manager/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Erol-Valeriu Chioasca</name><title>Product Manager, Google Cloud</title><department></department><company></company></author></item><item><title>Securing Your Gemini and Google API Keys</title><link>https://cloud.google.com/blog/topics/developers-practitioners/api-keys-are-open-secrets/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today, AI services rely heavily on API keys. To run AI agents, users provide API keys that signify paid tokens, subscriptions, or paid accounts. While API keys are easy to use, it is just as easy to use them unsafely. The result of a hijacked key is a compromised environment that is misused or abused by perpetrators.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;I decided to write this blog post after seeing a thread in the r/googlecloud subreddit asking for a tutorial so users can go and protect themselves. &lt;strong&gt;In this post, you will find a few simple steps you can take to reduce your risks and improve the security of API keys created by Google&lt;/strong&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You use Google API keys to access Gemini and other Google AI products as well as Google Cloud APIs. In fact, a Gemini API key is a standard Google API key behind the scenes. While I will be focusing on Google API key security, you can apply some of these recommendations to API keys and product tokens created elsewhere.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Step 1: Generate a New API Key&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Regardless of where you start, you end up creating a new API key in one of your Google Cloud projects. You will probably use &lt;/span&gt;&lt;a href="https://console.cloud.google.com/apis/credentials"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Credentials&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; under the "APIs &amp;amp; Services" menu in the Cloud console.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/api_services_credentials.max-1000x1000.png" alt="api_services_credentials"&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Alternatively, you can use the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud services api-keys create&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/sdk/gcloud/reference/services/api-keys/create"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;command&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, or another interface that creates a new Google Cloud API key. Regardless of the path and the interface, you need to do the following:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Create the key in a standalone project that is not used for any other purpose.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Restrict API access and client applications for the new API key.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These steps limit the potential reach of the key and greatly simplify troubleshooting activities &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;if something goes wrong&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;API Restrictions&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;API restrictions define exactly which services can be accessed using a specific API key. To keep your environment secure, always limit this list to the absolute minimum set of services required. While the Google Cloud console now prevents the creation of entirely unrestricted keys, it can still be tempting to add extra APIs to "future-proof" or speed up development. However, we strongly advise against this. By strictly adhering to the principle of least privilege, you significantly reduce the potential damage (or "blast radius") if a key is ever accidentally exposed or hijacked.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;It is also important to audit keys generated automatically through integrated developer tools. For example, creating an API key in Firebase restricts the use to 24 APIs including Datastore, Firestore, Cloud SQL Admin and others.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/api_key_restrictions.max-1000x1000.png" alt="api_key_restrictions"&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you use Firebase to host your website, you will probably not need most of them. When you create an API key to use with AI Studio, restrict it to "Gemini API" only.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Attention points:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;If you search for an API that you want to select but it is missing, this API is probably not enabled in the Google Cloud project that you use. Go to the &lt;/span&gt;&lt;a href="https://console.cloud.google.com/apis/library"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;API Library&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in your Cloud console, find the API by name and enable it first.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;You can perform all of these actions using the Cloud console or the gcloud CLI. Other interfaces (e.g., Firebase) may not give you access to all parameters of the API keys.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Application Restrictions&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Just as API restrictions limit which services your key can access, application restrictions limit which applications can use the key. For example, if you create an API key only for use with Google AI Studio, restricting the key to the website "&lt;/span&gt;&lt;a href="https://aistudio.google.com/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;https://aistudio.google.com/&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;" will prevent automations that call Gemini and consume tokens at scale from using your key.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can set up one or more restrictions of one of the following types:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Website&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;/&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Web application&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; using the list of URLs&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Services&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; using the list of IPv4 or IPv6 addresses or subnets&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;iOS applications&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; using the list of Bundle IDs&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Android applications&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; using the list of pairs of the package name and certificate fingerprint&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Note that you can restrict the key to a single application type only. Create a designated API key for each application type. Having a key per application type helps when observing the key usage and investigating potentially compromised keys.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Step 2: Store the API Key&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;I want to reiterate that the API key is not paired with your identity. &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;ANYONE&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; can use it. So, storing the key securely is as important as restricting the key use in Step 1.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The rule is simple: NEVER EVER store the key where it can be easily seen.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;If you use an API key in your application&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, store it in &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/secret-manager/docs/best-practices"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Secret Manager&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; or a similar secret management service. Secret Manager allows you to inject your API key into &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/run/docs/configuring/services/secrets"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Run&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/secret-manager/docs/secret-manager-managed-csi-component"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GKE&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; environments easily. However, to elevate the key protection you may want to read the key in your code instead. See &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/secret-manager/docs/samples/secretmanager-get-secret"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for an example.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;If you use an API key with an external application&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; that asks you to type in the key, take extra steps to explore how the application manages your key. You would need to find out how the key is stored and how it is used in the requests. For Web applications, you may use browser developer tools to inspect application traffic and ensure that the key is never sent in an unencrypted communication channel. For example, Google AI Studio uses encrypted local storage and sends the key via a TLS-encrypted channel.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;If Something Goes Wrong&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;What should you do if you suspect that your key is compromised?&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Treat it the same way you would treat a compromised credit card: first, delete the key. You can do this in the Cloud console or with the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud services api-keys delete&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/sdk/gcloud/reference/services/api-keys/delete"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;command&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. If it turns out to be a false alarm, you can &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/sdk/gcloud/reference/services/api-keys/undelete"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;undelete&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; the key within the next 30 days.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;What if you do not know which key is compromised? &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;In that case you need to do a two-step investigation:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Find all API keys in your organization or project(s).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Check the API consumption graph for the APIs that each key is allowed to access.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Find all your API keys&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;There is more than one way to find your API key resources. You can use &lt;/span&gt;&lt;a href="https://console.cloud.google.com/iam-admin/asset-inventory/dashboard"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Asset Inventory&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in the Cloud console and filter the dashboard by the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Resource type&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to check &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;apikeys.Key&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. If you do not see this resource type, find and click on "View more…" to expand the resource type list. Note that the list shows deleted API keys as well.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you prefer the CLI and know the specific project(s), you can use the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud services api-keys list&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/sdk/gcloud/reference/services/api-keys/list"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;command&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To see all active keys in your organization, you will need to use the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud asset search-all-resources&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/sdk/gcloud/reference/asset/search-all-resources"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;command&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and query its JSON output to filter out deleted keys:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;gcloud asset search-all-resources \
  --scope='organizations/123456789012' \
  --asset-types='apikeys.googleapis.com/Key' \
  --read-mask="name,displayName,versionedResources" \
  --format=json \
  --order-by='createTime' \
| jq '.[] | select(.versionedResources | all(.resource.data.deleteTime == null))'&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
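&lt;p&gt;If jq is not available, the same filter can be applied in Python. The following is a minimal sketch, assuming the JSON output of the command has already been parsed into a list of dictionaries. The &lt;code&gt;active_keys&lt;/code&gt; helper and the sample entries are hypothetical and only mimic the fields that the jq expression above relies on.&lt;/p&gt;

```python
import json


def active_keys(search_results):
    """Return the entries whose versioned resources all lack a deleteTime."""
    active = []
    for entry in search_results:
        # Assumption: a missing or null list is treated as empty.
        versions = entry.get("versionedResources") or []
        # A missing deleteTime is treated the same as an explicit null.
        if all(
            ((v.get("resource") or {}).get("data") or {}).get("deleteTime") is None
            for v in versions
        ):
            active.append(entry)
    return active


# Hypothetical sample shaped like the --format=json output (values are made up).
sample = json.loads("""
[
  {"name": "example-key-1",
   "displayName": "web-key",
   "versionedResources": [{"resource": {"data": {"uid": "aaa"}}}]},
  {"name": "example-key-2",
   "displayName": "old-key",
   "versionedResources": [
     {"resource": {"data": {"uid": "bbb", "deleteTime": "2026-01-01T00:00:00Z"}}}
   ]}
]
""")

for key in active_keys(sample):
    print(key["displayName"])
```

&lt;p&gt;In practice you would likely load the real command output with &lt;code&gt;json.load&lt;/code&gt; from a file or a pipe instead of the inline sample.&lt;/p&gt;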
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Find out API consumption&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can track the usage of an API key with the Cloud Monitoring &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/apis/docs/monitoring#expandable-1"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;metric&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;serviceruntime.googleapis.com/api/request_count&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. This metric shows the number of times different services have been invoked. To see the number of service requests for a particular API key, use the metric's &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;credential_id&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; label and filter it by the API key's unique ID. You can view the metric data in &lt;/span&gt;&lt;a href="https://console.cloud.google.com/monitoring/metrics-explorer"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Metrics explorer&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; or query the Monitoring API with the following &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/monitoring/promql"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;PromQL&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; expression:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;sum(
  rate({
    "__name__"="serviceruntime.googleapis.com/api/request_count",
    "monitored_resource"="consumed_api",
    "credential_id"="apikey:00000000-0000-0000-0000-000000000000"
  }[${__interval}])
)&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can further filter this metric by the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;service_name&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; label using the API name (e.g. &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;mapstools.googleapis.com&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;).&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To find the API key ID, use one of the following methods:&lt;/span&gt;&lt;/p&gt;
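&lt;p&gt;When you check several keys or services, it can help to generate the expression instead of editing it by hand. The sketch below is one possible way to do that in Python. The &lt;code&gt;request_rate_query&lt;/code&gt; helper is hypothetical, and it only assembles the query string from the labels discussed above; it does not call the Monitoring API.&lt;/p&gt;

```python
def request_rate_query(key_uid, service_name=None, window="${__interval}"):
    """Build the PromQL expression for one API key (string assembly only)."""
    labels = [
        ("__name__", "serviceruntime.googleapis.com/api/request_count"),
        ("monitored_resource", "consumed_api"),
        ("credential_id", "apikey:" + key_uid),
    ]
    # Optionally narrow the query to a single API by its service name.
    if service_name:
        labels.append(("service_name", service_name))
    selector = ",".join('"{}"="{}"'.format(name, value) for name, value in labels)
    return "sum(rate({" + selector + "}[" + window + "]))"


print(request_rate_query("00000000-0000-0000-0000-000000000000"))
print(request_rate_query("00000000-0000-0000-0000-000000000000",
                         service_name="mapstools.googleapis.com"))
```

&lt;p&gt;The default &lt;code&gt;window&lt;/code&gt; keeps the &lt;code&gt;${__interval}&lt;/code&gt; placeholder from the expression above; pass a fixed duration such as &lt;code&gt;5m&lt;/code&gt; if your tooling does not substitute that variable.&lt;/p&gt;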
&lt;ul&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Using the Cloud console, &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;open the &lt;/span&gt;&lt;a href="https://console.cloud.google.com/apis/credentials"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Credentials&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; page and select the API key that you want. Inspect the URL of the API key page in the browser, which looks like: &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;https://console.cloud.google.com/apis/credentials/key/[KEY_ID]?project=[PROJECT_ID]&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. Copy the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;[KEY_ID]&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; part.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Using gcloud CLI&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, run the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud services api-keys list --format='value(displayName,uid)'&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; command and find the key by its display name. Copy the UID next to the display name.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/ul&gt;
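&lt;p&gt;If you copy the console URL rather than picking the ID out by eye, a few lines of Python can extract it. This is only a sketch, assuming the URL follows the pattern shown in the first method; the &lt;code&gt;parse_key_url&lt;/code&gt; helper and the &lt;code&gt;my-project&lt;/code&gt; value are made up for illustration.&lt;/p&gt;

```python
from urllib.parse import urlparse, parse_qs


def parse_key_url(url):
    """Split a Credentials page URL into (key_id, project_id)."""
    parsed = urlparse(url)
    prefix = "/apis/credentials/key/"
    if not parsed.path.startswith(prefix):
        raise ValueError("not an API key page URL: " + url)
    key_id = parsed.path[len(prefix):].strip("/")
    # The project query parameter may be absent; return None in that case.
    project = parse_qs(parsed.query).get("project", [None])[0]
    return key_id, project


key_id, project = parse_key_url(
    "https://console.cloud.google.com/apis/credentials/key/"
    "00000000-0000-0000-0000-000000000000?project=my-project"
)
# The credential_id label value prefixes the key ID with "apikey:".
print("apikey:" + key_id, project)
```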
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;An abnormally high level of API invocations usually indicates that the API key was compromised and is being used by a malicious party to access the API.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Step 3: API key management hygiene&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Whether you are an engineer, an experienced cloud user, or just here to experiment, good API key hygiene is important to keep your environment from being hijacked.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you already use Google API keys, do the following right now:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Find all the API keys that you have&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Delete all keys that you no longer use or do not recognize (do not worry, you can restore them within the next 30 days)&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Restrict each API key to only the APIs that you intend to use. If you can, also narrow the list of clients that can use the APIs&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;If you administer your Google Cloud projects or organization, consider setting up the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/api-keys/docs/custom-constraints"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;apikeys.googleapis.com/Key&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; org policy to minimize wrangling API keys&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Consider periodically rotating (refreshing) your API keys by replacing them with newly created ones that share the exact same restrictions. Before deleting an existing key, track down and update every place where it is used so that you do not unexpectedly break your application or abruptly lose access to it.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Wrapping up&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;Securing API keys is a vital step in protecting your cloud ecosystem. Implementing strict API and application restrictions, utilizing secure storage, and proactively monitoring consumption are highly effective ways to prevent unauthorized access. These practices safeguard your development environment from exploitation and prevent unexpected billing charges.&lt;/p&gt;
&lt;p&gt;To help you implement these practices, here are a few practical tools and resources you can explore next:&lt;/p&gt;
&lt;ul&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Learn more about APIs:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Review &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/docs/authentication/api-keys-best-practices"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Best practices for managing API keys&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and practice with the &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/search-for-and-select-google-apis#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Search for and use Google APIs&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; codelab.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Watch a quick tutorial:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Check out this great Google Cloud Tech video on &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=JIE89dneaGo&amp;amp;t=91s" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Manage your Cloud Run secrets securely with Secret Manager&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to see secure storage concepts in action.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Get hands-on with a Codelab:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Practice fetching credentials safely in a guided environment by trying the Secret Manager codelabs for &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/codelabs/secret-manager-python#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Python&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; or &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/codelabs/cloud-spring-cloud-gcp-secret-manager#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Spring Boot&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/ul&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Dive deeper into the docs:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Learn how to &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/monitoring/charts/metrics-selector"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;select metrics&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/monitoring/charts/metrics-explorer"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;create charts&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/monitoring/alerts"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;set up alerts&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to observe your API consumption.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt; &lt;/p&gt;&lt;/div&gt;</description><pubDate>Thu, 21 May 2026 10:19:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/api-keys-are-open-secrets/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/hero_image_aJLug1s.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Securing Your Gemini and Google API Keys</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/hero_image_aJLug1s.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/api-keys-are-open-secrets/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Leonid Yankulin</name><title>Senior Developer Relations Engineer</title><department></department><company></company></author></item><item><title>What Google I/O '26 means for developing agents on Google Cloud</title><link>https://cloud.google.com/blog/topics/developers-practitioners/io26-news-for-agent-developers-on-google-cloud/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;At Google I/O, we introduced a unified development toolkit featuring &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Antigravity 2.0&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; and the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Managed Agents API&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, giving developers better ways to build locally and deploy securely to the cloud on a shared protocol layer. 
In this blog, we’re going to show you how Gemini Enterprise Agent Platform and the &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/innovations-from-google-io-26-on-google-cloud"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;new developer tools shared at I/O&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; fit together, unpack the spectrum of choice for building, and share what we’d actually try first.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Following the &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/introducing-gemini-enterprise-agent-platform?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;evolution&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; of Vertex AI into the Gemini Enterprise Agent Platform – &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;a comprehensive platform to build, scale, govern, and optimize agents&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; with new features like session memory and centralized governance – we are now extending these capabilities directly into your local development tools. Our goal is to bridge the gap between high-speed prototyping and secure, compliant corporate deployment, offering a modular approach where you can choose between quick-start workflows or full production control to fit your stack's specific needs.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Here’s how those pieces now lay out across the entire spectrum of choice.&lt;/strong&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;The four rungs: The spectrum of how to build agents&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We like to think of the agent development ecosystem as four rungs on a ladder, designed to give you a clear slider between out-of-the-box configuration and complete code-first control. They're deliberately additive, meaning that starting fast on the lower rungs never locks you out of graduating to the deeper customization of the rungs above.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Underneath all four rungs is the &lt;/span&gt;&lt;a href="https://google.github.io/A2A/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;A2A protocol&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. This interoperability ensures that an agent built on the first rung can be called as a sub-agent on the fourth rung, allowing your entire architecture to scale seamlessly on the same infrastructure.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/blog-ladder-1.max-1000x1000.png"
        
          alt="blog-ladder-1"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Rung one: Agent Studio (low code)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A visual workspace inside Agent Platform. You discover models in Model Garden, engineer prompts, wire up tools, and ship an agent without writing code. Best for business-facing teams and rapid prototyping. The agent you build here runs on the exact same runtime as everything below it.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Rung two: Managed Agents API&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;New at I/O, the&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/build/managed-agents"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Managed Agents API&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is for technical teams who want to &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;“manage the mission, not the machine.”&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; It allows you to define agentic behavior and let Google Cloud handle the heavy lifting, acting as an agent-as-a-service with nothing to manage.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You use the Managed Agents API to &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;configure&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; your agent, and the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Interactions API&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;invoke&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; it. You package your instructions, skills, and tools, POST them, and Gemini builds and runs the agent.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;What makes this deployable is the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Google Cloud sandbox,&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; which is secure by design. The agent harness runs on &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;our&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; servers, and each agent has its own ephemeral sandbox provisioned with your skills, Model Context Protocol (MCP) servers, and server-side tools. Full integration with A2A and with Agent Platform governance and security is coming soon.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=eFot-mAWwiw"
      data-glue-modal-trigger="uni-modal-eFot-mAWwiw-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-2_VJ69eVE.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Agents API over A2A with Gemini Enterprise&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-eFot-mAWwiw-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="eFot-mAWwiw"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=eFot-mAWwiw"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Rung three: Antigravity and friends&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://antigravity.google/blog/introducing-google-antigravity-2-0" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Antigravity&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;is our primary solution for developers looking to leverage AI for coding tasks and agent orchestration, enabling teams to transform how apps are built and deployed. We've consolidated our developer-facing coding strategy into this single, powerful harness shared across multiple surfaces.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;It’s co-optimized with the Gemini family of models, offering high efficiency to speed up development cycles and reduce costs. Skills you develop with Antigravity are intended to be portable across different surfaces.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/antigravity_-_3.max-1000x1000.png"
        
          alt="antigravity - 3"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This is for development teams who want to utilize Google's advanced reasoning capabilities within their coding workflows, implement custom development loops, and transform how they build, deploy, and manage applications.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today, we are expanding this with new tools:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Antigravity 2.0:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; A new standalone desktop application providing a centralized workspace to steer, customize, and orchestrate coding agents. Developers can use this to manage complex tasks, such as orchestrating agents to refactor code, generate unit tests, or even scaffold new service components based on a specification. Agents can spin up subagents from a single prompt, while multi-agent orchestration allows tasks to run in parallel.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Antigravity CLI:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This brings the full Antigravity experience to the command line: same harness, same agent, same quality of intelligence as Antigravity 2.0, with a product experience tailored for the terminal. It's optimized for speed and lower overhead, and adapts entirely to you. The CLI is tightly integrated with the desktop app, sharing authentication, context, skills, and configurations, providing a consistent experience across both interfaces. Use the Antigravity SDK to build your own runtime.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Enterprise security and compliance:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Google Cloud customers can now use Antigravity 2.0 and Antigravity CLI with their Gemini Enterprise Agent Platform project. All you have to do is log in with Cloud OAuth and set your Agent Platform project ID and region. All agent inference then runs via Agent Platform models within your secure cloud boundary, inheriting Google Cloud’s standard data privacy protections and Terms of Service. Your customer data stays in your control, and you can use regional model endpoints.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/antigravity_-4.max-1000x1000.png"
        
          alt="antigravity -4"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Integrating other coding agents&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While Antigravity is our recommended agentic coding solution, Google Cloud is designed to work well with any coding agent you choose. Our platform is open, and we provide tools to ensure flexibility:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agents CLI&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Development Kit (ADK)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; allow you to build and interact with agents from various sources, including tools like &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Claude Code&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. This means developers can often keep their preferred interfaces while running the underlying AI inference on Google Cloud, so your workflows benefit from Google Cloud's security, compliance, and infrastructure.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Our &lt;/span&gt;&lt;a href="https://github.com/google/skills" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Skills for Google products&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, launched at Next, are designed to be compatible with multiple coding tools, enabling you to enhance different agents with a consistent set of capabilities.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This flexibility allows teams to integrate their existing favorite tools and models, ensuring seamless and compliant operation within their established workflows. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Rung four: Agent Development Kit (ADK 2.0)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Code-first, low floor, high ceiling. If Managed Agents are configuration-first, ADK is &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;engineering-first.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This is for software engineers who want to build custom agent meshes from the ground up - any architecture, any model, unconstrained.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="http://adk.dev" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;ADK&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; enhancements launched at Google Cloud Next are now available for everyone.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; ADK 2.0 introduces a unified graph-based engine that gives you a slider from dynamic, model-led reasoning to strict, deterministic workflows. The framework handles the heavy lifting of multi-agent coordination, managing how sub-agents, tools, and data pass between one another.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Collaborative workflows (Python v2.0.0):&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Previously called the Task-based Agent Collaboration API, this is how you build self-managing agent teams. A coordinator delegates to subagents using explicit operating modes:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;chat&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;: Full user interaction with a manual return to the parent. This is the “hand off the conversation to sub-agents” mode.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;task&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;: User interaction for clarifications with an automatic return to the parent. This is the new “collaborate for this assignment” mode, which combines the best of the other two.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;single-turn&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;: No user interaction, parallel execution, and an automatic return. This is the “agent as tool” mode.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Dynamic workflows:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Dynamic workflows in ADK allow you to put aside graph-based path structures and use the full power of your chosen programming language to build workflows. With Dynamic workflows, you can create workflows with simple decorators, invoke workflow nodes as functions, and build complex routing logic.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;ADK Kotlin (Beta):&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; "ADK for Android." Kotlin support joins Python, Go, and Java, increasing language coverage so your on-device mobile agents can seamlessly coordinate with your backend Python agents.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
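&lt;p&gt;To make the difference between the three operating modes concrete, here is a small plain-Python sketch that tabulates the traits listed above. It deliberately does not use the ADK API: &lt;code&gt;ModeTraits&lt;/code&gt;, &lt;code&gt;MODES&lt;/code&gt;, and &lt;code&gt;pick_mode&lt;/code&gt; are hypothetical names, and the selection rule is only one assumed way to reason about which mode fits a sub-agent.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeTraits:
    user_interaction: str  # "full", "clarifications", or "none"
    return_to_parent: str  # "manual" or "automatic"
    summary: str


# Traits as described in the list above; not an ADK data structure.
MODES = {
    "chat": ModeTraits("full", "manual", "hand off the conversation to sub-agents"),
    "task": ModeTraits("clarifications", "automatic", "collaborate for this assignment"),
    "single-turn": ModeTraits("none", "automatic", "agent as tool"),
}


def pick_mode(needs_user_input, user_owns_conversation):
    """Assumed rule of thumb: choose a mode name from two yes/no questions."""
    if not needs_user_input:
        return "single-turn"
    return "chat" if user_owns_conversation else "task"


for name in ("chat", "task", "single-turn"):
    traits = MODES[name]
    print(name, traits.user_interaction, traits.return_to_parent)
```

&lt;p&gt;For example, a sub-agent that may need to ask the user a clarifying question but should hand its result straight back to the coordinator maps to &lt;code&gt;task&lt;/code&gt; under this rule.&lt;/p&gt;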
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Finally, the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agents CLI&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; packages Google's expert skills for ADK, eval, deploy, observability, and publishing - turning any AI coding agent (like Antigravity, Gemini CLI, Claude Code, or Cursor) into an expert at &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;agent app building&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; as well as &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;agent ops&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. It gives your AI agent the skills to understand the Google Cloud agent stack, turning an expansive ecosystem into a seamless assembly line for developers hillclimbing their agent builds.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=GDd-Mhm2gcc"
      data-glue-modal-trigger="uni-modal-GDd-Mhm2gcc-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-3_QY4p08W.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Agents CLI speedrun&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-GDd-Mhm2gcc-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="GDd-Mhm2gcc"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=GDd-Mhm2gcc"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;What we'd actually try first&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If we were starting today, here's the order we'd reach for things:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Start with the &lt;/strong&gt;&lt;a href="https://antigravity.google/blog/introducing-google-antigravity-2-0" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Antigravity 2.0 desktop app&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Explore the interface, add a pre-built agent, and interact with it to understand the core functionality. This provides a more intuitive entry point before diving into API specifics.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Build a mesh: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Explore the Managed Agents API through the &lt;/span&gt;&lt;a href="https://github.com/google/skills/tree/main/skills/cloud/gemini-agents-api" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agents API skill&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://github.com/google/skills/tree/main/skills/cloud/gemini-interactions-api" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Interactions API skill&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;When you start hitting routing decisions you want to make explicit, or need complex multi-agent orchestration, port your logic to &lt;/span&gt;&lt;a href="http://adk.dev" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;ADK&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; 2.0. The graph model is worth the learning curve as soon as you have more than two branching paths. You don't need to string together a bunch of separate pieces to make this happen - this is exactly where the &lt;/span&gt;&lt;a href="https://github.com/google/agents-cli" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agents CLI&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; shines. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Govern and reuse shared domain logic: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Check out &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/build/skill-registry"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Skill Registry&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;(public preview):&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; A centralized catalog to govern and promote the reuse of packaged domain logic. Skills are accessible via the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/build/managed-agents"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Managed Agents API&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Agent Platform &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;SDK, and ADK (via &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;SkillToolset&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;). Skill Registry will be part of &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/agent-registry/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Registry&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; shortly.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Evaluate:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Use the Gemini Enterprise Agent Platform &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/optimize/evaluation/agent-evaluation"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;evaluation suite&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to move beyond basic text-matching vibe checks. Leverage synthetic user simulation to auto-generate multi-turn testing scenarios and safely mock API environments to pressure-test tool resilience. Finally, utilize its LLM-based autoraters and trace logging to evaluate complex logic, group failures, and continuously optimize your agent.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Secure the pipeline:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Leverage Gemini Enterprise Agent Platform governance capabilities like &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/govern/agent-identity-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Identity&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/govern/gateways/agent-gateway-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Gateway&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, Agent Security, and Agent Registry to secure your deployment. Once CodeMender is released, add it to your CI/CD pipeline to proactively secure the code your human (and AI) developers are pushing.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Note: You can do this whole loop on a &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/docs/starter-tier"&gt;&lt;span style="font-style: italic; text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Starter Tier&lt;/span&gt;&lt;/a&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt; account without a billing account attached. The first two app deployments are on us.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;We’re excited and hope you are, too&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The agent space is evolving rapidly. Agent Platform offers a secure and adaptable foundation. Core components like the Agent Gateway, identity management, and the Skill Registry work together to ensure a robust and controlled environment for your agents, enabling you to innovate flexibly without vendor lock-in.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Pick the rung that fits the project. Bring whatever coding agent your team prefers. The platform you graduate to is the same one either way, and the data stays inside your Cloud project the whole time.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you only read one set of docs after this post, make it the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agents overview in the Agent Platform documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. If you build something interesting, show us - the best examples will land in the next round of templates.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We can’t wait to see what you build!&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 19 May 2026 17:45:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/io26-news-for-agent-developers-on-google-cloud/</guid><category>AI &amp; Machine Learning</category><category>Developers &amp; Practitioners</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>What Google I/O '26 means for developing agents on Google Cloud</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/io26-news-for-agent-developers-on-google-cloud/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Addy Osmani</name><title>Director, Google Cloud AI</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Alan Blount</name><title>Product Manager, Google Cloud</title><department></department><company></company></author></item><item><title>Gemini Live Agent Challenge: Announcing the winners and highlights</title><link>https://cloud.google.com/blog/topics/developers-practitioners/winners-and-highlights-of-the-gemini-live-agent-challenge/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The Gemini Live Agent Challenge is officially in the books! We challenged developers worldwide to break out of the traditional 'text box' paradigm by building next-generation AI agents. 
From our &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/training-certifications/join-the-gemini-live-agent-challenge?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;initial announcement&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to amassing 11,878 participants and &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;1,536&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; submitted projects from &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;151 &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;countries, the results were nothing short of spectacular.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The mission was to seamlessly integrate multimodal capabilities, building agents that help you see, hear, speak, and create in real time, using the Gemini Live API, the Agent Development Kit (ADK), and the robust infrastructure of Google Cloud. Participants pushed the boundaries of interactive AI across three distinct categories: The Live Agent, The Creative Storyteller, and The UI Navigator.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Congratulations to the builders who took home the top prizes! These winning teams combined technical precision with bold imagination, completely redefining how users can interact with and experience agents. Two of these standout developers were even recognized in person at Google Cloud Next 2026. Here’s a look at their experience, alongside the complete list of winning agents.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Celebrating our category winners at Google Cloud Next ’26&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Category winners Jeremiah Somoine and Bryen Param were invited to attend Google Cloud Next 2026 in Las Vegas, where they shared their experiences and insights with the broader developer community. Both winners presented Lightning Talks at the Developer Theatre on the expo floor and sat down for exclusive interviews in the Creator Studio Pod at the GDE and Certified Lounge. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;During his time at the event, Bryen discussed the core inspiration behind &lt;/span&gt;&lt;a href="https://devpost.com/software/drone-copilot" rel="noopener" target="_blank"&gt;&lt;span style="font-style: italic; text-decoration: underline; vertical-align: baseline;"&gt;drone-copilot&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. He explained that his project was driven by the question of "what if a model could interact with the real world?", showcasing how multimodal capabilities can bridge the gap between AI and physical environments. &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/bryen.max-1000x1000.jpg"
        
          alt="Bryen Param"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Jeremiah, currently a college student, reflected on the development process behind &lt;/span&gt;&lt;a href="https://devpost.com/software/sankofa-y47f9p" rel="noopener" target="_blank"&gt;&lt;span style="font-style: italic; text-decoration: underline; vertical-align: baseline;"&gt;Sankofa&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, noting that "the best response to a technical limitation was a creative one." When asked what advice he would give to other students looking to build the next generation of AI applications, he emphasized the importance of jumping at any opportunity to get hands-on with the technology. "The best way to learn is by doing," he said, encouraging aspiring developers to simply dive in and start building.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/jeremiah_edited.max-1000x1000.jpg"
        
          alt="Jeremiah Somoine"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Winners&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Grand Prize winner: &lt;/span&gt;&lt;a href="https://devpost.com/software/orion-operating-room-intelligent-orchestration-node" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;ORION - Operating Room Intelligent Orchestration Node&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Aditya Shukla&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;ORION, or Operating Room Intelligent Orchestration Node, is a voice-directed surgical co-pilot for robotic surgery. Surgeons can speak naturally and instantly receive answers, live data on display, and real-time visual assistance - all without breaking scrub.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=AnxII9COzjo"
      data-glue-modal-trigger="uni-modal-AnxII9COzjo-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault_0lhMev0.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Orion - Voice Directed Surgical AI Assistant | Gemini Live Agent Hackathon&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-AnxII9COzjo-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="AnxII9COzjo"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=AnxII9COzjo"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;The Live Agent winner: &lt;/span&gt;&lt;a href="https://devpost.com/software/drone-copilot" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;drone-copilot&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Bryen Param&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Drone-copilot transforms how users interact with hardware by enabling natural, real-time conversations with a drone instead of using a joystick or complex menus. Simply by speaking, users can instruct the drone to navigate, perform autonomous visual inspections, or describe its surroundings, while the drone verbally responds and confirms its actions in real time.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=_FCgmYjGCVs"
      data-glue-modal-trigger="uni-modal-_FCgmYjGCVs-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault_C6lpyed.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Drone Copilot: Voice-Controlled Drone + Autonomous Inspection with Gemini Live API&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-_FCgmYjGCVs-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="_FCgmYjGCVs"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=_FCgmYjGCVs"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Creative Storyteller winner: &lt;/span&gt;&lt;a href="https://devpost.com/software/sankofa-y47f9p" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Sankofa&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Jeremiah Somoine&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Sankofa acts as a multimodal AI "griot"—a traditional West African storyteller—transforming fragmented family histories into deeply immersive narratives. Based on just a few user details, it weaves together rich voice narration, watercolor imagery, and ambient soundscapes into a historical story, allowing users to engage in a real-time voice conversation with the storyteller to explore their roots further.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=urV3ckRYRC8"
      data-glue-modal-trigger="uni-modal-urV3ckRYRC8-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-1_1ApjCQc.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Sankofa Demo Video&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-urV3ckRYRC8-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="urV3ckRYRC8"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=urV3ckRYRC8"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;UI Navigator winner: &lt;/span&gt;&lt;a href="https://devpost.com/software/moonwalk-tojsay" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Moonwalk&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Enaiho Uwas Paul and Aman Kumar Sah&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Moonwalk is a conversational, hands-free desktop assistant that helps users intuitively navigate their computer and complete complex tasks using just their voice. By remembering personal preferences and past interactions, it acts as an intelligent co-pilot that can seamlessly control your mouse and keyboard to execute everyday workflows—like booking flights or managing spreadsheets—while you simply sit back and speak.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=u3QoaT3pIMs"
      data-glue-modal-trigger="uni-modal-u3QoaT3pIMs-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-2_djltYYE.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Moonwalk Demo Video #geminiliveagentchallenge&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-u3QoaT3pIMs-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="u3QoaT3pIMs"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=u3QoaT3pIMs"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Best multimodal integration and user experience winner: &lt;/span&gt;&lt;a href="https://devpost.com/software/wand-a-live-agent-that-sees-browses-and-clicks-with-you" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Wand&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: David Li&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Wand is a voice-first, pointer-aware browser assistant that helps you seamlessly navigate and interact with any website using a combination of natural speech and hand gestures. When you point at your screen and speak (for example, asking it to "play this video" or "zoom in here"), this live agent helps you instantly execute clicks, searches, and commands without ever needing to touch a mouse or keyboard.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=t9dyesmxlH8"
      data-glue-modal-trigger="uni-modal-t9dyesmxlH8-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-3_EsDTsNv.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Wand -- A live agent that sees, browses, and clicks with you&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-t9dyesmxlH8-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="t9dyesmxlH8"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=t9dyesmxlH8"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Best technical execution and agent architecture winner: &lt;/span&gt;&lt;a href="https://devpost.com/software/johnkeats-ai" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;JohnKeats.AI&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Matthew Keats&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;JohnKeats.AI is a voice-first emotional companion designed to actively listen and hold space for users without rushing to offer solutions. By processing subtle vocal cues like pitch, pacing, and tone, it reacts naturally to a user's emotional state in real time to provide a deeply reflective and empathetic conversational experience.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=zNKhR3e2ym4"
      data-glue-modal-trigger="uni-modal-zNKhR3e2ym4-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-4_DmxDSNY.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;JohnKeats.AI — The First AI Agent Built to Know When to Shut Up&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-zNKhR3e2ym4-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="zNKhR3e2ym4"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=zNKhR3e2ym4"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Best innovation and thought leadership winner: &lt;/span&gt;&lt;a href="https://devpost.com/software/rayan-memory" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Rayan Memory&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Yusuf Elnady&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Rayan Memory tackles the universal problem of forgetting by turning your daily learnings into a fully explorable 3D "memory palace." A background agent passively listens to your real-world audio to extract important ideas as physical artifacts, allowing you to walk through themed virtual rooms and converse with a dedicated AI companion to easily retrieve your exact memories.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=G05WfE5Zcsg"
      data-glue-modal-trigger="uni-modal-G05WfE5Zcsg-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-5_rlthVRd.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Rayan - A 3D memory palace that listens, remembers, and speaks back&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-G05WfE5Zcsg-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="G05WfE5Zcsg"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=G05WfE5Zcsg"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Honorable mention: &lt;/span&gt;&lt;a href="https://devpost.com/software/nagardrishti" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;NagarDrishti&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Nikita Dongre and Omkar Dongre&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;NagarDrishti tackles dangerous road conditions by allowing citizens to safely report potholes and waterlogging using a hands-free voice assistant while driving. These real-time reports instantly populate an interactive dashboard, where city officials can use natural language to easily identify hazard hotspots and manage critical repairs.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=Rn7eJxBdWe4"
      data-glue-modal-trigger="uni-modal-Rn7eJxBdWe4-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-6_LY4Wry4.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;NagarDrishti&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-Rn7eJxBdWe4-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="Rn7eJxBdWe4"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=Rn7eJxBdWe4"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Honorable mention: &lt;/span&gt;&lt;a href="https://geminiliveagentchallenge.devpost.com/submissions/970955-ekaette" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Ekaette&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Bassey John&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Ekaette revolutionizes customer service by replacing frustrating hold queues with a conversational, multimodal AI assistant that operates across live phone calls and text messaging. Customers can speak naturally with the agent over a standard phone line while seamlessly sharing photos, reviewing product options, or completing payments via WhatsApp, c&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=0BeLDppNGks"
      data-glue-modal-trigger="uni-modal-0BeLDppNGks-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-7_WUG5wng.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Ekaette - A multimodal AI Voice and Messaging Assistant&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-0BeLDppNGks-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="0BeLDppNGks"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=0BeLDppNGks"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Honorable mention: &lt;/span&gt;&lt;a href="https://geminiliveagentchallenge.devpost.com/submissions/949057-vibecat" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;VibeCat&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Sejun Kim and Michael Chang&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;VibeCat is a proactive macOS desktop companion that continuously watches your screen, understands your context, and suggests helpful actions before you even ask. Instead of waiting for a command, it speaks up first — like offering to fix a missing line of code or execute a terminal command — and completes the task only after receiving your permission.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=j1zzfoDr7qA"
      data-glue-modal-trigger="uni-modal-j1zzfoDr7qA-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-8_FyBBOlB.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;vibeCat - Your proactive desktop companion&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-j1zzfoDr7qA-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="j1zzfoDr7qA"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=j1zzfoDr7qA"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Honorable mention: &lt;/span&gt;&lt;a href="https://geminiliveagentchallenge.devpost.com/submissions/945801-call-my-parts" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Call My Parts&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Sugam Palav, Nikhil Lohar, Siddhant Panday, and Vishal Parekh&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Call My Parts automates the tedious, time-consuming process of sourcing used vehicle parts by doing the research and vendor outreach for you. Users simply speak their part request, and the AI agent autonomously searches vendor websites, calls suppliers to check pricing and inventory, and compiles the best options into a ranked, easy-to-read dashboard.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=8pcRbVBRMqw"
      data-glue-modal-trigger="uni-modal-8pcRbVBRMqw-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-9.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Call My Parts AI Tool : Hackathon Gemini Live 2026&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-8pcRbVBRMqw-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="8pcRbVBRMqw"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=8pcRbVBRMqw"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;hr/&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Honorable mention: &lt;/span&gt;&lt;a href="https://geminiliveagentchallenge.devpost.com/submissions/967879-relay-real-time-voice-vision-lab-tutor-for-electronics" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Relay&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;By: Faith Ogundimu&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Relay is an interactive AI lab partner that uses your webcam to watch and guide your physical electronics projects in real time. It provides step-by-step voice instructions to help you build circuits, catches wiring mistakes before they happen, and reinforces your skills with a built-in 3D simulation sandbox and adaptive quizzes.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=lTwos-2TW_A"
      data-glue-modal-trigger="uni-modal-lTwos-2TW_A-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault-10.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Relay — Real-Time Voice &amp;amp; Vision AI Tutor for Electronics | Gemini Live API + Google Cloud&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-lTwos-2TW_A-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="lTwos-2TW_A"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=lTwos-2TW_A"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Keep the momentum going&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Inspired by these incredible projects? Start building and stay connected with the community through our latest programs and events:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Join &lt;/span&gt;&lt;a href="https://developers.google.com/program/gear?utm_source=cgc-blog&amp;amp;utm_medium=blog&amp;amp;utm_campaign=FY-26-Q2-GEAR-sign-up&amp;amp;utm_content=hackathon-winner-promo&amp;amp;utm_term=-" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Enterprise Agent Ready (GEAR)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, designed to help developers and decision-makers build and deploy production-ready AI agents.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Catch up on Google Cloud Next 2026: We just wrapped up an amazing Google Cloud Next! If you weren't able to join us in person — or simply want to relive the energy — take a look at our &lt;/span&gt;&lt;a href="https://www.instagram.com/reels/DXxFTSjiTmM/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;social&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=N7N0TU9tkzw" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;livestream&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; recaps to catch up on some of the exciting developer activations straight from the expo floor.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Tune in on Tuesdays: Want to be the first to hear about new tools, product updates, and upcoming hackathons? Join us for our &lt;/span&gt;&lt;a href="https://goo.gle/GoogleCloudTech" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;weekly livestream&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; every Tuesday 9:00 A.M. PDT / 12:00 P.M. EDT for the latest in all things Google Cloud.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Congratulations again to all of our winners and participants. We can't wait to see what you build next!&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Fri, 15 May 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/winners-and-highlights-of-the-gemini-live-agent-challenge/</guid><category>AI &amp; Machine Learning</category><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/Landscape_16x9_rxRY4RH.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Gemini Live Agent Challenge: Announcing the winners and highlights</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/Landscape_16x9_rxRY4RH.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/winners-and-highlights-of-the-gemini-live-agent-challenge/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Dilasha Panigrahi</name><title>Product Marketing Manager, Google Cloud</title><department></department><company></company></author></item><item><title>Ship code within minutes with the Gemini CLI DevOps Extension</title><link>https://cloud.google.com/blog/topics/developers-practitioners/ship-code-within-minutes-with-the-gemini-cli-devops-extension/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With AI coding tools like Antigravity and Claude Code, I can build a working web app in record time. But deploying it? That's where I'd historically lose the rest of the afternoon to Dockerfiles, IAM bindings, and YAML. So I'd take the shortcut most developers take: I just wouldn't do it. The app would stay on my laptop, and my work would never ship.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This is the classic tension between the &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/application-development/transform-your-developer-experience-with-google-cloud"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;inner loop&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: the fast, local cycle of writing and testing code, and the &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/application-development/richard-seroter-on-shifting-down-vs-shifting-left"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;outer loop&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: containerization, CI/CD pipelines, and production infrastructure. Most developers are productive in one but not the other, and the gap between them is where projects stall.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;a href="https://github.com/gemini-cli-extensions/cicd" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini CLI Extension for CI/CD&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; bridges this gap. It handles both quick deployments and full pipeline generation from a single terminal interface. Let me show you how.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Building the Cosmic Guestbook App&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To demonstrate this workflow, we need an app. Let's start from an empty directory and use our agent to "vibe code" a brand new project: the &lt;a href="https://github.com/kweinmeister/cosmic-guestbook" rel="noopener" target="_blank"&gt;Cosmic Guestbook&lt;/a&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We want a full-stack architecture: a React frontend and a Node.js Express backend API. Instead of scaffolding this by hand, we can ask our agent to jumpstart the app:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;"Build a 'Cosmic Guestbook' web app. I need a dynamic Node.js Express backend and a React frontend utilizing Vite. Make the frontend look like a beautiful, glassmorphic sci-fi interface."&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Within moments, our agent scaffolds the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;backend/&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; directory with &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;server.js&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;frontend/&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; directory with a fully styled React app. We now have a functioning, two-tier web app sitting on our laptop.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/guest_book.max-1000x1000.png"
        
          alt="guest_book"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Installing the Extension&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;But code on a laptop isn't shipping. To get this guestbook online, we need to equip our chosen environment with the CI/CD extension. Regardless of your setup, start by ensuring that you have the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/sdk/docs/install-sdk"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;gcloud CLI&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; installed and authenticate using Application Default Credentials: &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud auth application-default login&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now, install the extension in your preferred development environment:&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;For Gemini CLI&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Run the following command directly in your terminal:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;gemini extensions install https://github.com/gemini-cli-extensions/cicd&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;For Claude Code&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Add the marketplace and install the plugin directly from the terminal:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;# 1. Add the Marketplace
claude plugin marketplace add https://github.com/gemini-cli-extensions/cicd.git

# 2. Install the Plugin
claude plugin install cicd&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;For Antigravity and agents supported by npx skills&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can enable the extension's MCP Server as custom MCP and add skills to your workspace:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;# Add the Skills
npx skills add https://github.com/gemini-cli-extensions/cicd --global --all --agent antigravity&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;How It Works&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The CI/CD extension is a powerful three-tier system designed to translate your intent into secure, production-ready infrastructure in all these agent environments:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Skills&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Specialized AI skills like &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;google-cicd-deploy&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;google-cicd-pipeline-design&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; are defined in the extension. These instruct your AI agent (Gemini CLI, Claude Code, or Antigravity) on how to think—helping it analyze your code, ask the right questions, and handle errors gracefully.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;CI/CD MCP server&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Running in the background is a specialized Go-based &lt;/span&gt;&lt;a href="https://modelcontextprotocol.io/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Model Context Protocol (MCP)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; server. This server provides a suite of tools that gives your agent the hands it needs to actually manipulate Google Cloud: everything from scanning for secrets to provisioning Cloud Run services.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Local knowledge base&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: To ensure the most accurate answers, the system includes a pre-indexed retrieval-augmented generation (RAG) database containing verified architecture patterns, which lets the agent ground its design decisions in the source of truth.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Your chosen AI assistant orchestrates these tools and patterns into a cohesive deployment lifecycle.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;The Inner Loop&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When you're building a prototype or testing a new feature, you don't need a massive, multi-environment CI/CD pipeline. You just need a public URL to test your webhook or show a stakeholder. This is the inner loop, and it needs to be fast.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The traditional approach involves manually writing a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;Dockerfile&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, authenticating with a container registry, building the image, pushing it, and finally deploying it. The CI/CD extension turns this into a single natural language prompt: &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gemini "Deploy this application to Google Cloud using the google-cicd-deploy skill"&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. If you're using Claude Code, you can prompt it exactly the same way via &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;claude -p "Deploy this application..."&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, and in Antigravity, simply type your deployment request.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When you run this prompt, your AI agent analyzes your local workspace to figure out the best deployment approach.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Step 1: Pre-Deployment Security Scan&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Leaked secrets are one of the most common and expensive security failures in software. GitGuardian's &lt;/span&gt;&lt;a href="https://www.gitguardian.com/state-of-secrets-sprawl-report-2025" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;2025 State of Secrets Sprawl&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; report found 23.8 million new credentials exposed on public GitHub in a single year; 70% of secrets that were leaked in 2022 are still active today. It happens fast: you hardcode a database password during local testing, forget to remove it, and push.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The extension catches this before it becomes a problem. Before any code leaves your machine, it runs a secret check across your workspace. If it finds a Stripe API key or a database credential sitting in your source, the agent halts the deployment and warns you. No secrets ship to the cloud by accident.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you accidentally commit a database password or a Stripe API key while hacking on localhost, your assistant catches it and prevents the deployment. This is what true &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/application-development/richard-seroter-on-shifting-down-vs-shifting-left"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;shift-left&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; security looks like in practice.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Step 2: Application Analysis &amp;amp; Containerization&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Next, your agent checks your &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;package.json&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; or &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;go.mod&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to figure out your framework. It automatically decides whether to use &lt;/span&gt;&lt;a href="https://cloud.google.com/storage"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Storage&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (for static sites) or &lt;/span&gt;&lt;a href="https://cloud.google.com/run"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Run&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (for dynamic services).&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you're building a dynamic service that doesn't have a Dockerfile, the extension leverages &lt;/span&gt;&lt;a href="https://cloud.google.com/docs/buildpacks/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud's buildpacks&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to containerize it for you on the fly.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Step 3: Clarification and Deployment&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Conversational agents don't blindly guess how you want your app configured. Before making any changes to your Google Cloud environment, the assistant pauses to ask clarifying questions, prioritized by what it found during its analysis.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For our Cosmic Guestbook, I didn't specify a region or whether the service should be public. The assistant paused and asked me about it:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;"I have completed the analysis and prepared the application for a unified deployment to Cloud Run. Since the Node.js buildpack requires a single entry point, I created a root &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;package.json&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; with a custom build script that builds the React frontend and allows the Express backend to serve it statically.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Additionally, the secret scan completed successfully, and no sensitive credentials were found.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Before we deploy using the Cloud Run source deployment option (buildpacks), I need to clarify a few deployment parameters:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Region/Location:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The gcloud defaults do not specify a location. Which Google Cloud region would you like to deploy to (e.g., &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;us-central1&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;europe-west1&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;)?&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Access:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Should this application be publicly accessible or private?&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Service Name:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; I propose naming the service &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;cosmic-guestbook&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. Does that sound good to you?"&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This conversational pause ensures that even in the fast inner loop, you retain complete control over your cloud architecture. After you confirm the details, the agent pushes the code live and returns the public URL:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;# Final Output:
Your application is now live and publicly accessible at the following URL:
https://cosmic-guestbook-xxxxxxxx-uc.a.run.app&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Behind the scenes, the deployment is handled automatically via &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;cloudrun.deploy_to_cloud_run_from_source&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;The Outer Loop&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A scrappy deployment prompt is perfect for a Tuesday afternoon prototype, but you can't run a production system from your laptop. Eventually, you need the rigors of the outer loop: automated testing, source control integration, and formal continuous deployment.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Writing &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;cloudbuild.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; files and provisioning the necessary infrastructure (like &lt;/span&gt;&lt;a href="https://cloud.google.com/artifact-registry"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Artifact Registry&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; repositories or GitHub connections through &lt;/span&gt;&lt;a href="https://cloud.google.com/developer-connect/docs/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Developer Connect&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;) is notoriously tedious and error-prone. With the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;google-cicd-pipeline-design&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; skill, your AI agent acts as your personal platform engineering consultant.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Instead of writing YAML from scratch, you have a conversation. Your agent will ask you about your testing strategy and where you want to deploy, and then it autonomously provisions the required Google Cloud infrastructure.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Step 1: Architectural Design &amp;amp; Feedback&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You start the process directly in your conversational interface:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;# Prompt your agent to kick off the design process:
gemini &amp;quot;Design a CI/CD pipeline using the google-cicd-pipeline-design skill&amp;quot;
# OR
claude -p &amp;quot;Design a CI/CD pipeline using the google-cicd-pipeline-design skill&amp;quot;&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Your assistant doesn't work in a black box. It retrieves common CI/CD patterns from its knowledge base. With the most relevant knowledge in hand, it proposes a concrete plan in YAML for you to review.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Step 2: Infrastructure Provisioning&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;After you approve the plan, the assistant works sequentially through the required infrastructure steps. For example, it might first create a registry for your containers.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;// Example MCP call to provision the registry
{
  &amp;quot;name&amp;quot;: &amp;quot;create_artifact_repository&amp;quot;,
  &amp;quot;arguments&amp;quot;: {
    &amp;quot;repository_id&amp;quot;: &amp;quot;demo-app-repo&amp;quot;,
    &amp;quot;location&amp;quot;: &amp;quot;us-central1&amp;quot;,
    &amp;quot;format&amp;quot;: &amp;quot;DOCKER&amp;quot;
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;It might then set up a Git connection so that &lt;/span&gt;&lt;a href="https://cloud.google.com/build"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Build&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; can read your source code.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Step 3: Pipeline Generation &amp;amp; Trigger&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Finally, the agent generates the actual &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;cloudbuild.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; file that defines the pipeline stages (test, build, deploy). Here's a snippet of a generated configuration from the repository that highlights the initial build steps:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;steps:
  # Step 1: Install tools (like the linter) and clean the cache.
  - name: &amp;#x27;golang:1.24&amp;#x27;
    id: &amp;#x27;Install Tools&amp;#x27;
    entrypoint: &amp;#x27;sh&amp;#x27;
    args:
      - &amp;#x27;-c&amp;#x27;
      - |
        set -e
        export PATH=/workspace/bin:$$PATH
        echo &amp;quot;Installing golangci-lint...&amp;quot;
        go install github.com/golangci/golangci-lint/cmd/golangci-lint@v1.64.8
        echo &amp;quot;Cleaning module cache...&amp;quot;
        go clean -modcache
    env:
      - &amp;#x27;GOPATH=/workspace&amp;#x27;
    dir: &amp;#x27;devops-mcp-server&amp;#x27;&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With the pipeline defined, we need a way to execute it automatically. The agent finishes by creating a &lt;/span&gt;&lt;a href="https://cloud.google.com/build/docs/automating-builds/create-manage-triggers"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Build trigger&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. The trigger acts as the glue between your GitHub repository and Cloud Build, ensuring that every push to the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;main&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; branch automatically fires off the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;cloudbuild.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; steps.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;// Example MCP call setting the trigger
{
  &amp;quot;name&amp;quot;: &amp;quot;create_build_trigger&amp;quot;,
  &amp;quot;arguments&amp;quot;: {
    &amp;quot;trigger_name&amp;quot;: &amp;quot;main-branch-deploy&amp;quot;,
    &amp;quot;filename&amp;quot;: &amp;quot;cloudbuild.yaml&amp;quot;,
    &amp;quot;branch_pattern&amp;quot;: &amp;quot;^main$&amp;quot;
  }
}&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
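&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The anchors in the branch pattern matter: &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;^main$&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; matches only a branch named exactly &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;main&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, not branches that merely contain the word. The Python sketch below demonstrates that behavior with the standard &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;re&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; module. It is an approximation: Cloud Build evaluates patterns with its own regular expression engine, and the helper &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;trigger_fires&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; is a hypothetical name used only here.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```python
import re

# Pattern taken from the example trigger; the helper is illustrative only.
BRANCH_PATTERN = re.compile(r"^main$")


def trigger_fires(branch):
    """Return True when a push to this branch would match the pattern."""
    return BRANCH_PATTERN.search(branch) is not None


for name in ["main", "main-hotfix", "feature/main"]:
    print(name, trigger_fires(name))
```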
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Security and Control&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AI-assisted infrastructure generation sounds incredible, but it's reasonable to ask: is it safe?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The extension operates strictly within the permissions of your local &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/docs/authentication/application-default-credentials"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Application Default Credentials (ADC)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. It can't do anything that you can't do. Because it uses the &lt;/span&gt;&lt;a href="https://modelcontextprotocol.io/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Model Context Protocol (MCP)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, every action that it takes, from creating an Artifact Registry to modifying a Cloud Build trigger, runs through strongly typed, verifiable tools.&lt;/span&gt;&lt;/p&gt;
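&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make "strongly typed" concrete, the following Python sketch checks a tool call against a declared schema before anything runs. MCP tools advertise a JSON Schema for their inputs; the simplified &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;TOOL_SCHEMAS&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; table and the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;validate_call&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; helper are hypothetical stand-ins for that mechanism, and the field list for &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;create_artifact_repository&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; is inferred from the example call shown earlier, not taken from the extension's source.&lt;/span&gt;&lt;/p&gt;

```python
# Hypothetical, simplified schema table: tool name -> required argument types.
TOOL_SCHEMAS = {
    "create_artifact_repository": {
        "repository_id": str,
        "location": str,
        "format": str,
    },
}


def validate_call(call):
    """Return a list of problems; an empty list means the call is well-typed."""
    schema = TOOL_SCHEMAS.get(call.get("name"))
    if schema is None:
        return ["unknown tool: %r" % call.get("name")]
    arguments = call.get("arguments", {})
    problems = []
    for field, expected_type in schema.items():
        if field not in arguments:
            problems.append("missing argument: " + field)
        elif not isinstance(arguments[field], expected_type):
            problems.append("wrong type for argument: " + field)
    for field in arguments:
        if field not in schema:
            problems.append("unexpected argument: " + field)
    return problems
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A call that names an unknown tool or passes a malformed argument is rejected before it reaches your Google Cloud project, which is what makes each action reviewable.&lt;/span&gt;&lt;/p&gt;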
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you don't like a step in the proposed pipeline, you tell your agent to change it. You're always the "Editor-in-Chief" of your infrastructure. We strongly recommend that you adhere to the &lt;/span&gt;&lt;a href="https://en.wikipedia.org/wiki/Principle_of_least_privilege" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;principle of least privilege&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for both your local ADC and any service accounts that are used by the generated pipelines.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;When Dev and Ops Converge&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The friction between wanting to write code and needing to ship it is finally dissolving. We're moving past the era where deep expertise in YAML formatting was a prerequisite for putting an app on the internet.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By handling the boilerplate of both the scrappy inner loop and the automated outer loop, conversational AI lets developers focus on the business logic that actually matters.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Next Steps&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you want to experience this convergence yourself, here are your immediate next steps:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Get the tools&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Install the &lt;/span&gt;&lt;a href="https://github.com/gemini-cli-extensions/cicd" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;CI/CD Extension for Gemini CLI&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Deploy the inner loop&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Take an existing side project (or ask your chosen agent to scaffold a new one like our &lt;a href="https://github.com/kweinmeister/cosmic-guestbook" rel="noopener" target="_blank"&gt;Cosmic Guestbook&lt;/a&gt;) and prompt it to deploy to Google Cloud to instantly see it live on Cloud Run or Cloud Storage.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Automate the outer loop&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Run a design command against a repository that you're ready to productionize, and watch your agent generate your &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;cloudbuild.yaml&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and provision your infrastructure.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Stop wrestling with configuration files and start shipping. Let me know what you build by reaching out on &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/karlweinmeister/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://x.com/kweinmeister" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;X&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, or &lt;/span&gt;&lt;a href="https://bsky.app/profile/kweinmeister.bsky.social" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Bluesky&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;!&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Fri, 08 May 2026 19:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/ship-code-within-minutes-with-the-gemini-cli-devops-extension/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/gemini_cli_devops_final.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Ship code within minutes with the Gemini CLI DevOps Extension</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/gemini_cli_devops_final.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/ship-code-within-minutes-with-the-gemini-cli-devops-extension/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Karl Weinmeister</name><title>Director, Developer 
Relations</title><department></department><company></company></author></item><item><title>How BASF manages thousands of supply chain decisions with AlphaEvolve’s agentic algorithms</title><link>https://cloud.google.com/blog/products/ai-machine-learning/how-basf-manages-thousands-of-supply-chain-decisions-with-alphaevolve/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The agricultural and crop protection supply chain is one of the most intricate networks in the world. It takes up to two years to turn active ingredients into the final products farmers need, and a single change in weather or regulations can disrupt everything. Planners at &lt;/span&gt;&lt;a href="https://agriculture.basf.com/global/en" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;BASF Agricultural Solutions&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; navigate this reality daily across 180 production sites. To understand how local decisions ripple across their entire global network, BASF turned to &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/alphaevolve-on-google-cloud"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;AlphaEvolve on Google Cloud&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to build a digital twin of their supply chain.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Planning across a two-year lead time&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;BASF Agricultural Solutions manages a network with over 5,000 distinct value chains. Creating a single end product requires a bill of materials that can be over 30 levels deep, moving across different production sites and regions.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Currently, human planners make thousands of local decisions every day. They decide what to produce, when to produce it, and how much safety stock to hold. Because the network is so large, a planner can’t easily see how a localized decision affects the rest of the global supply chain. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;At this scale, local decisions can tie up additional working capital and inventory or cause production imbalances. Traditional mathematical models struggle to capture the dynamic reality of the network that planners navigate based on years of experience.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Building a foundation for decision support&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="https://deepmind.google/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;AlphaEvolve&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is an evolutionary coding agent that generates and refines algorithms autonomously. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;In collaboration with Google Cloud and prognostica GmbH&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;,&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; BASF’s objective was not to replace human decision-making, but to establish a new model for decision support that helps planners handle the real-world complexity of the production network.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The team gave AlphaEvolve a foundational "seed" program. This initial code established a standard planning logic that translated demand forecasts into production schedules, serving as a functional baseline before introducing dynamic, network-wide coordination. From there, they fed the model three years of historical data, including inventory levels, market demand, and actual production outputs. AlphaEvolve then generated variations of the code, mutating the logic to see if it could simulate a supply chain that matched the real-world historical data.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Measuring what good looks like in initial tests&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For AlphaEvolve to improve, it needed a specific goal. The evaluation function scored every new piece of generated code on one primary metric: how closely the simulated inventory levels and production decisions matched the actual historical reality recorded by BASF.&lt;/span&gt;&lt;/p&gt;
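&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The score-and-select idea can be sketched in a few lines of Python. This is a toy analogue under loud assumptions, not BASF's or AlphaEvolve's implementation: AlphaEvolve mutates program code, whereas this sketch perturbs two numeric parameters at random; the planning rule in &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;simulate_inventory&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; is invented for illustration; and mean absolute error is an assumed metric, since the article does not name the one actually used.&lt;/span&gt;&lt;/p&gt;

```python
import random


def simulate_inventory(params, demand):
    """Toy planning logic: produce toward a target stock each period."""
    target_stock, production_cap = params
    inventory, history = 0.0, []
    for units_demanded in demand:
        production = min(production_cap, max(0.0, target_stock - inventory))
        inventory = max(0.0, inventory + production - units_demanded)
        history.append(inventory)
    return history


def score(params, demand, actual_inventory):
    """Higher is better: negative mean absolute error against history."""
    simulated = simulate_inventory(params, demand)
    errors = [abs(s - a) for s, a in zip(simulated, actual_inventory)]
    return -sum(errors) / len(errors)


def evolve(seed, demand, actual_inventory, generations=200, rng=None):
    """Keep a mutated candidate only when it beats the best score so far."""
    rng = rng or random.Random(0)
    best, best_score = seed, score(seed, demand, actual_inventory)
    for _ in range(generations):
        # Random perturbation stands in for an LLM-proposed code change.
        candidate = tuple(max(0.0, p + rng.gauss(0.0, 5.0)) for p in best)
        candidate_score = score(candidate, demand, actual_inventory)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because a candidate replaces the incumbent only when its score improves, the best score can never fall below that of the seed, which mirrors the role of the seed program as a baseline.&lt;/span&gt;&lt;/p&gt;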
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The latest AlphaEvolve runs delivered more than 80% relative improvement in accuracy compared to the initial seed model. With further adjustments, the team expects to push performance even higher — bringing the model to a level of accuracy not achieved with other approaches and making it actionable for operational use.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;The results&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The evolved planning logic delivered immediate, measurable improvements over the initial seed model. The final algorithm successfully mirrored the actual historical performance of the supply chain, significantly reducing the error rate compared to the initial seed.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;“We had several attempts to build a digital twin for our complex supply network using deterministic models, and all of them failed,” said Dr. Goetz Krabbe, vice president for global supply chain at BASF. “By using AlphaEvolve, we can not only map the complex network based on system data, but at the same time understand and copy the human decisions that drive our daily operations. This gives us a highly accurate and easy-to-maintain, data-driven digital twin of the entire network. Using it, we can optimize our inventory levels and respond to market volatility with confidence while avoiding stockouts.”&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;What the evolved algorithm actually does&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By running thousands of experiments, AlphaEvolve developed a clear, human-readable algorithm that explains how the BASF network truly operates. It automatically discovered factually correct, domain-specific supply chain rules that explain the observed production outputs and inventory levels for the tested product value chain:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Production consolidation:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The algorithm learned to group production amounts together, accurately mapping how planners optimize plant time.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Dynamic safety stocks:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; It introduced safety stock parameters to handle volatile and seasonal demand patterns, helping to strictly manage capital costs while preventing out-of-stock situations.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Network-wide coordination:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The model successfully mapped the dependencies between different production tiers, providing a clear foundation for optimizing asset utilization globally.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;What's next&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The initial simulations showed that evolutionary AI can accurately model large-scale, dynamic supply chains. BASF’s objective is to create a digital twin of their entire global production network as a new foundation for simulation, decision support, scenario forecasting and optimization. This will allow the team to continuously simulate operations, identify hidden bottlenecks before they affect throughput, and optimize asset utilization across all global facilities.&lt;/span&gt;&lt;/p&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;sub&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;This project was a collaboration between the BASF SE team including: Benjamin Priese, Michael Arlt, &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Debora Morgenstern and Tobias Hausen as well as Manuel Doerr and Thomas Christ from Prognostica GmbH Würzburg, &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;and the AI for Science team at Google Cloud including (but not limited to): Kartik Sanu, Laurynas Tamulevičius, Nicolas Stroppa, Chris Page, Srikanth Soma, John Semerdjian, Skandar Hannachi, Vishal Agarwal and Anant Nawalgaria as well as &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Christoph Tittelbach from&lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt; the Google account team and &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;partners at Google DeepMind&lt;/span&gt;&lt;/sub&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Thu, 07 May 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/ai-machine-learning/how-basf-manages-thousands-of-supply-chain-decisions-with-alphaevolve/</guid><category>Data Analytics</category><category>Customers</category><category>Developers &amp; Practitioners</category><category>Google Cloud in Europe</category><category>AI &amp; Machine Learning</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/image1_BFm5ksn.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>How BASF manages thousands of supply chain decisions with AlphaEvolve’s agentic 
algorithms</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/image1_BFm5ksn.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/products/ai-machine-learning/how-basf-manages-thousands-of-supply-chain-decisions-with-alphaevolve/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Benjamin Priese</name><title>Senior Digital SC Manager, BASF Agricultural Solutions</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Anant Nawalgaria</name><title>Group AI Product Manager &amp; Engineer, Google</title><department></department><company></company></author></item><item><title>Pioneering AI-assisted code migration: How Google achieved 6x faster migration from TensorFlow to JAX</title><link>https://cloud.google.com/blog/topics/developers-practitioners/6x-faster-migration-from-tensorflow-to-jax/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AI coding agents are rapidly becoming ubiquitous across the software industry, fundamentally changing how developers write, test, and debug daily code. While these tools excel at localized, self-contained tasks, applying them to massive, systemic codebase migrations requires an entirely new approach.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Google is already addressing this challenge by incorporating AI into many migration workflows: &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/systems/using-ai-and-automation-to-migrate-between-instruction-sets"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;x86 to ARM&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (enabling workloads on Google Axion processors); &lt;/span&gt;&lt;a href="https://dl.acm.org/doi/10.1145/3696630.3728542" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;int32 to int64&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; identifiers (to avoid running out of IDs); &lt;/span&gt;&lt;a href="https://arxiv.org/abs/2501.06972" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;JUnit3 to JUnit4&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (for testing); and &lt;/span&gt;&lt;a href="https://arxiv.org/abs/2501.06972" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Joda-Time to java.time&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (a modern time library). However, migrating AI models is a whole new level of complexity and calls for more advanced AI-assisted methods.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Translating a production-grade machine learning model from one framework to another, for example, from TensorFlow (TF) to JAX, is not a simple syntax update. It is a long-horizon task that requires untangling thousands of lines of code, managing complex states across multiple files, and preserving precise mathematical equivalence. Generic, single-agent coding assistants typically struggle under this weight — they frequently lose context over long workflows, hallucinate APIs, or fail to produce buildable code across an entire repository.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Google’s AI and Infrastructure team has pioneered a new approach to this industry-wide problem. The result is 6x faster model migration, a milestone Sundar highlighted in the recent &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=11PBno-cJ1g&amp;amp;t=384s" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Next keynote&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. In this post, we share how we deployed specialized, multi-agent AI systems to migrate some of Google’s largest-scale production models from TF to JAX.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Accelerating the transition from TF to JAX&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For many teams at Google — and across the industry — the future of scalable machine learning is being built on JAX. Designed around a functional, stateless paradigm, JAX is heavily optimized for modern Tensor Processing Unit (TPU) infrastructure and XLA compilation, making it the bedrock of the modern AI stack.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Evolving to this future presents a monumental challenge. Thousands of production models are built on TensorFlow, a framework characterized by object-oriented, stateful layer initialization and static execution graphs. Manually migrating these models to JAX requires a fundamental rethinking of how layers interact, and how state is explicitly managed. Across large organizations, this type of migration alone represents hundreds (if not thousands) of software engineering (SWE) years — time better spent on researching new architectures and driving product innovation.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Overcoming this challenge with AI started as an ambitious experiment within Google’s AI and Infrastructure team, but has evolved into a repeatable blueprint for addressing complex engineering problems across the company.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Moving beyond single-agent coding&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our early experiments with agentic code translation showed promise for simple models. However, when faced with the realities of a Google-scale migration — complex, production-grade models spanning multiple files and thousands of lines of code — generic, single-agent setups struggled. They could not balance high-level structural rules with low-level execution details, resulting in a variety of failures, such as overwriting critical files or skipping necessary functionality. To overcome these common challenges inherent to enterprise migrations, we developed a highly specialized multi-agent architecture that consists of:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The Planner agent:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Using deterministic, compiler-based static analysis, the Planner maps out the codebase's entire dependency tree. It then works alongside other agents to break the migration down into a discrete, step-by-step plan, helping ensure the migration happens logically from the "leaf nodes" (layers without unmigrated dependencies) upward.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The Orchestrator agent:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This agent acts as the project manager. It dynamically groups plan steps into manageable chunks to keep the context window focused, injects the necessary domain knowledge, and handles failure recovery if a step doesn't build.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The Coder agent:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Built as a reasoning and acting agent, the Coder is the workhorse. Integrated directly into our internal IDE tools, it has the ability to read files, write code, run builds, and execute unit tests. Crucially, it operates in a "test-and-fix" loop, self-correcting until it produces a compilable, verifiable component in the target language.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
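&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The leaf-first ordering that the Planner produces can be illustrated with a small topological sort. This is a minimal sketch, not the Planner's actual implementation: the layer names and the dependency map are hypothetical, and Python's standard graphlib module stands in for the compiler-based static analysis described above.&lt;/span&gt;&lt;/p&gt;

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each layer lists the layers it depends on.
deps = {
    "ranking_model": {"tower", "loss_head"},
    "tower": {"embedding", "dense_block"},
    "loss_head": {"dense_block"},
    "embedding": set(),
    "dense_block": set(),
}


def migration_batches(dependencies):
    """Group layers into batches that depend only on earlier batches."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        # Layers whose dependencies have all been migrated already.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


plan = migration_batches(deps)
for step, batch in enumerate(plan, start=1):
    print(step, batch)
# 1 ['dense_block', 'embedding']
# 2 ['loss_head', 'tower']
# 3 ['ranking_model']
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Each batch contains only layers whose dependencies were handled in an earlier batch, so the first batch holds the leaf nodes and the top-level model comes last. Batches like these are one plausible unit for the chunking the Orchestrator performs.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;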
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_-_System_diagram.max-1000x1000.jpg"
        
          alt="2 - System diagram"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="013zu"&gt;Figure: Multi-agent AI system for complex code migrations. Process diagram describing the multi-agent system used to migrate legacy model code to JAX. Image generated with Gemini Nano Banana 2.&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Scalable validation and dynamic Playbooks&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Generative AI models are only as good as the context they are provided. Because source and target architectures rarely map 1-to-1, we engineered a scalable, hierarchical system of Playbooks.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These Playbooks range from general repository instructions to highly specific "golden examples" distilled from successful manual migrations. By feeding the Orchestrator a client-specific Playbook (for instance, one tailored to YouTube's unique ranking model infrastructure), the system avoids generic hallucinations and strictly adheres to internal coding standards. This Playbook architecture is framework-agnostic, meaning it can be adapted to guide migrations between any two programming languages or frameworks.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Furthermore, we instituted rigorous quality metrics to ensure the generated code is actually production-ready:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Quantitative verification:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; For each unit of code, we verify correctness mathematically. In the case of the TF-to-JAX migration, the system utilizes algorithmic gradient ascent to find the maximum error between the original TF layer and the new JAX layer, mathematically verifying functional equivalence.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Qualitative evaluation:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; We also evaluate the migrated code against a set of qualitative standards. In the case of the TF-to-JAX migration, we deploy a blind-audit LLM Judge that scores the migrated code against a framework-agnostic architectural checklist, so that critical, domain-specific logic is completely captured.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
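&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The quantitative check can be sketched in JAX as a gradient-ascent search for the input where two layers disagree most. This is an illustration under stated assumptions rather than the production verifier: both layers are hypothetical JAX stand-ins (an exact GELU and its tanh approximation), whereas the real comparison is between a TF layer and its JAX port, and the step size, step count, and input bound are arbitrary choices.&lt;/span&gt;&lt;/p&gt;

```python
import jax
import jax.numpy as jnp

key_w, key_x = jax.random.split(jax.random.PRNGKey(0))
W = jax.random.normal(key_w, (4, 3))  # hypothetical shared weights


def reference_layer(x):
    # Stand-in for the original layer: exact (erf-based) GELU.
    return jax.nn.gelu(x @ W, approximate=False)


def migrated_layer(x):
    # Stand-in for the migrated layer: tanh-approximated GELU.
    return jax.nn.gelu(x @ W, approximate=True)


def max_abs_error(x):
    return jnp.max(jnp.abs(reference_layer(x) - migrated_layer(x)))


def discrepancy(x):
    # Smooth surrogate for the max error, so every output contributes a gradient.
    return jnp.sum((reference_layer(x) - migrated_layer(x)) ** 2)


def worst_case_error(x0, steps=100, step_size=0.1, bound=3.0):
    """Signed-gradient ascent on the discrepancy, keeping the worst input seen."""
    grad_fn = jax.jit(jax.grad(discrepancy))
    x = x0
    candidates = [(float(max_abs_error(x0)), x0)]
    for _ in range(steps):
        # Step uphill, then keep the input inside the allowed range.
        x = jnp.clip(x + step_size * jnp.sign(grad_fn(x)), -bound, bound)
        candidates.append((float(max_abs_error(x)), x))
    worst_err, worst_x = max(candidates, key=lambda c: c[0])
    return worst_x, worst_err


x0 = 0.1 * jax.random.normal(key_x, (4,))
start_err = float(max_abs_error(x0))
worst_x, worst_err = worst_case_error(x0)
print("error at start:", start_err)
print("largest error found:", worst_err)
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The largest error found can then be compared against a tolerance chosen for the layer. Gradient ascent only reaches local maxima, so a sketch like this would normally be restarted from several initial inputs before the result is trusted.&lt;/span&gt;&lt;/p&gt;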
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Redefining migration velocity&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By deploying this multi-agent system, we dramatically alter the economics of software migration.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In our evaluations on real-world, highly complex YouTube models (featuring thousands of lines of code, hundreds of layers, and deep metric dependencies), the multi-agent system achieved a &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;6.4x to 8x speedup&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; over performing the migration manually. What traditionally took several SWE-months can now be reduced to only a few weeks of AI-assisted code generation, followed by expert human review.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The system effectively handles the boilerplate, identifies target idioms, maps the dependencies, and generates the unit tests, allowing engineers to act as reviewers and architects rather than manual translators.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Looking ahead into the AI-assisted era&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AI is transforming the pace of technological innovation. Without using AI to accelerate our ability to conduct large-scale migrations, it will become increasingly difficult for organizations to adopt the latest breakthroughs and maintain the security, reliability, and performance of their systems.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our work migrating machine learning implementations from one ML framework to another demonstrates that by combining deterministic static analysis, strict testing loops, and specialized multi-agent architectures, we can safely automate some of the most complex software engineering challenges in the industry. A detailed description of the process is published in our &lt;/span&gt;&lt;a href="https://arxiv.org/abs/2603.27296" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;technical paper&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.  &lt;/span&gt;&lt;/p&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;sub&gt;&lt;em&gt;&lt;span style="vertical-align: baseline;"&gt;This work is the result of collaboration across Google. We thank key contributors: Stoyan Nikolov, Niyati Parameswaran, Bernhard Konrad, Moritz Gronbach, Niket Kumar, Ann Yan, Varun Singh, Yaning Liang, Antoine Baudoux, Xevi Miró Bruix, Daniele Codecasa, Madhura Dudhgaonkar, Elian Dumitru, Alex Ivanov, Christopher Milne-O’Grady, Ahmed Omran, Ivan Petrychenko, Assaf Raman, Stefan Schnabl, Yurun Shen, Maxim Tabachnyk, Niranjan Tulpule, Amin Vahdat, and Jeff Zhou.&lt;/span&gt;&lt;/em&gt;&lt;/sub&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 06 May 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/6x-faster-migration-from-tensorflow-to-jax/</guid><category>AI &amp; Machine Learning</category><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_-_Hero_Image.max-600x600_4hJcig4.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Pioneering AI-assisted code migration: How Google achieved 6x faster migration from TensorFlow to JAX</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/1_-_Hero_Image.max-600x600_4hJcig4.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/6x-faster-migration-from-tensorflow-to-jax/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Jamie Rogers</name><title>Head of Product, Domain Applied Machine Learning, AI and Infrastructure</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Parthasarathy Ranganathan</name><title>Google Fellow &amp; Vice President, AI and Infrastructure</title><department></department><company></company></author></item><item><title>Five must-have guides to move agents into 
production with Gemini Enterprise Agent Platform</title><link>https://cloud.google.com/blog/topics/developers-practitioners/five-guides-to-building-and-scaling-production-ready-ai-agents/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Building AI agents that work well in a demo is one thing, but running them in production requires serious infrastructure. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;At Google Cloud Next '26, we introduced &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/introducing-gemini-enterprise-agent-platform"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Enterprise Agent Platform&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to help developers build, deploy, scale, govern, and optimize autonomous AI agents. From managing long-running state and enforcing security with the Agent Governance Stack, to orchestrating complex workflows using Agent Development Kit, these tools help you treat your agent fleet with the same rigor as your engineering organization. Here is a look back at our five-part series covering the architecture patterns and best practices you need to move your agents into production.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;1. Agent design patterns for long-running AI agents &lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Developers spend weeks perfecting prompt engineering, tool calling, and response latency. But none of that matters when your agent loses its reasoning chain over a five-day task. At Next '26, we announced that Agent Runtime now supports long-running agents that maintain state for up to seven days. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this article, we’ll share five essential agent design patterns for building long-running agents with Agent Runtime. You’ll learn how to implement checkpoint-and-resume mechanisms to recover from failures without starting over. We also cover how to build delegated approval workflows where the agent pauses for human review while consuming zero compute resources. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://x.com/GoogleCloudTech/status/2046989964077146490?s=20" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Read the full guide on long-running agents here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;2. The agent governance stack &lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A misconfigured SaaS tool leaks data passively, but a misconfigured agent takes bad actions &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;actively. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;The pattern we saw with shadow IT in 2015 is repeating itself with AI agents. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To manage this risk, we explain why you must treat your agent fleet with the same rigor as your engineering organization. We outline a five-layer governance stack designed to provide your security team with precise visibility and control. The foundation begins with Agent Identity, assigning every agent a unique cryptographic badge to isolate access. From there, we explore how to use Agent Registry for centralized tool governance and Agent Gateway to enforce natural language security policies across your fleet. The stack concludes with behavioral anomaly detection and a unified security dashboard to monitor your overall risk.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://x.com/googlecloudtech/status/2047120160100860290?s=46&amp;amp;t=B2lIFwfuun9SYmzePZf3ig" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Read the full guide on the agent governance stack here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;3. Must-have multi-agent orchestration patterns in ADK &lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Building a single AI skill is relatively straightforward, but orchestrating multiple skills across different agents is notoriously difficult. With the new updates to Agent Development Kit (ADK), we introduced graph-based workflows, collaborative agents, and a formalized skills framework to solve these orchestration failures. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our third guide details five multi-agent orchestration patterns you can use to build reliable systems. You will find code examples for building hybrid graphs that combine hard-coded business rules with flexible AI reasoning. We also show how to use the coordinator-specialist pattern to avoid building monolithic, unpredictable agents. The guide concludes with deep dives into skill composition, cross-language pipelines, and secure sandboxed executors for running arbitrary code. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://x.com/GoogleCloudTech/status/2047367046070161674?s=20" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Read the full guide on ADK multi-agent patterns here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;4. Deep dive: How A2A and MCP work together &lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Organizations will rarely build every AI agent they need entirely from scratch. The real value comes when agents built by different teams, in different languages, and across different organizations can securely discover and collaborate with each other. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In our fourth guide, we explore five integration patterns using the Agent-to-Agent (A2A) and Model Context Protocol (MCP) standards. You will see how Agent Cards allow agents to publish their capabilities so coordinator agents can find them through the Agent Registry. We also show how MCP acts as a universal tool bridge to connect your agents to databases and enterprise systems without custom integration code. The article finishes with strategies for cross-organization federation, in which agents from different organizations collaborate on shared tasks using the Agent Gallery in Gemini Enterprise, and for building ambient event meshes for agents that react to events continuously in the background, without waiting for user requests.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://x.com/GoogleCloudTech/status/2047567704807346675?s=20" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Read the full guide on agent interoperability here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;5. Atomic agent blueprints on Google Cloud’s Agent Garden&lt;/strong&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Building multi-agent systems from scratch presents complex design challenges, including finding the right design pattern for your use case, handling orchestration failures, and building evaluation loops. You can spend weeks reinventing the wheel trying to get your agents ready for production, or you can start with architectures that already work: our new Atomic Agents in Agent Garden.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://x.com/GoogleCloudTech/status/2048066787233943773" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Read the full guide to learn about pre-built Agent Blueprints in Agent Garden&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Watch the complete Agent Platform explainer&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To see these architectural patterns in practice, watch this technical walkthrough of the Gemini Enterprise Agent Platform. This deep dive covers the complete agent lifecycle, showing you exactly how to move from initial code to secure, scalable AI agents in production.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=j8qW5poBkEU"
      data-glue-modal-trigger="uni-modal-j8qW5poBkEU-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault_32APulJ.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;What is Gemini Enterprise Agent Platform?&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-j8qW5poBkEU-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="j8qW5poBkEU"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=j8qW5poBkEU"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Dive into the code with Agent Platform samples on GitHub&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Access our curated repository of code samples and tutorials for the Gemini Enterprise Agent Platform. This &lt;/span&gt;&lt;a href="https://github.com/Google-Cloud-AI/agent-platform" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GitHub repository&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; provides practical examples for the entire agent lifecycle, giving you the exact code needed to build, scale, govern, and optimize your autonomous fleets.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Get started with Gemini Enterprise Agent Platform &lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Moving agents into production requires both robust infrastructure and the flexibility to choose the right reasoning engine for the task. The Gemini Enterprise Agent Platform bridges this gap, allowing you to build, govern, and scale autonomous workflows with complete enterprise control.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Through first-class integration with Model Garden, your agent fleet has direct access to more than 200 leading models. You can route tasks to the best available option, whether that is a first-party model like Gemini 3.1 Pro or Lyria 3, an open model like Gemma 4, or third-party models like Anthropic’s Claude Opus, Sonnet, or Haiku.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Visit &lt;/span&gt;&lt;a href="https://console.cloud.google.com/agent-platform/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Platform&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in the Google Cloud console to explore new features and start building today.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 05 May 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/five-guides-to-building-and-scaling-production-ready-ai-agents/</guid><category>AI &amp; Machine Learning</category><category>Developers &amp; Practitioners</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Five must-have guides to move agents into production with Gemini Enterprise Agent Platform</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/five-guides-to-building-and-scaling-production-ready-ai-agents/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Addy Osmani</name><title>Director, Google Cloud AI</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Shubham Saboo</name><title>Senior AI Product Manager, Google Cloud AI</title><department></department><company></company></author></item><item><title>Cloud Engineer’s AI Toolkit: Sign up Now for a Developer Workshop Near You!</title><link>https://cloud.google.com/blog/topics/developers-practitioners/cloud-engineers-ai-toolkit-sign-up-now-for-a-developer-workshop-near-you/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The world of AI is rapidly shifting from experimental Large Language Models to an era of Agentic AI. 
In the agentic era, autonomous software agents act on behalf of employees and consumers—driving a fundamental change in commerce and business operations. This transition presents a critical new challenge: how to securely &lt;strong&gt;build&lt;/strong&gt;, &lt;strong&gt;deploy&lt;/strong&gt;, and &lt;strong&gt;govern&lt;/strong&gt; agents at enterprise scale.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;That’s why we’re hitting the road across North America for a series of hands-on workshops designed to move you past the theory and into production. This isn't a sit-and-listen lecture; it’s a "bring your laptop and build" session. You will gain hands-on experience with the exact skills required to transition your organization into this new agentic era. Whether you're looking to harden your Kubernetes clusters for AI or transform your data warehouse into a powerful engine for autonomous agents, this program will equip you with the practical toolkit used by leading enterprises.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Who should attend&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This program is designed for Platform and Security Engineers, as well as Data Practitioners.  &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;For Platform and Security Engineers&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you’re a &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Platform, Security, or DevOps Engineer&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, this is your toolkit for securing and scaling the next generation of workloads.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;What you need:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Foundational knowledge of GKE and containerization.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;What you’ll do:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Master defense-in-depth strategies, secure inference endpoints, and automate cluster operations using natural language.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;For the Data Practitioners&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Calling all &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Data Engineers, Analysts, and Scientists&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. If you’re ready to bridge the gap between traditional analytics and Agentic AI, this track is for you.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;What you need:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Some cloud experience and basic SQL. A little Python knowledge goes a long way, but we’ll guide you through the rest!&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;What you’ll do:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Build governed pipelines, unlock multimodal insights from unstructured data, and deploy autonomous agents that turn dashboards into action.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;What you’ll build&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The labs are designed to take you from raw infrastructure to a full-stack agentic application.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Track 1: Agents &amp;amp; GKE&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;AI-Augmented Ops:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Use Gemini and MCP servers to manage clusters with natural language instead of manual slogging.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Secure Sandboxing:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Deploy AI agents in isolated environments to safely execute AI-generated code.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Data at Scale:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Use GKE to process massive datasets and create complex knowledge graphs.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Track 2: Data Engineering &amp;amp; Analytics&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Governed Ingestion:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Use Spark and Knowledge Catalog to build a unified, governed data layer for business-ready insights.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Conversational Analytics:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Integrate vector search and multimodal data (images/logs) to create a "talk to your data" experience.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Graph-Powered Agents:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Use BigQuery Graph, Knowledge Catalog and the Agent Development Kit (ADK) to build agents that understand complex relationships.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Note:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; These sessions are interactive. You &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;must&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; bring your own laptop and power cable to participate.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Ready to build?&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While theoretical knowledge is valuable, nothing beats hands-on experience guided by Google experts. Registration is officially open for the upcoming sessions listed below. Come build with us! Note that different dates host different tracks.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt; &lt;/p&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;
&lt;div style="color: #5f6368; overflow-x: auto; overflow-y: hidden; width: 100%;"&gt;&lt;table border="1" style="border-collapse: collapse; width: 97.2684%; height: 89.6024px;"&gt;
&lt;tbody&gt;
&lt;tr style="height: 22.4006px;"&gt;
&lt;td style="width: 24.9971%; height: 22.4006px; text-align: center;"&gt;&lt;strong&gt;Track&lt;/strong&gt;&lt;/td&gt;
&lt;td style="width: 24.9971%; height: 22.4006px; text-align: center;"&gt;&lt;strong&gt;Location&lt;/strong&gt;&lt;/td&gt;
&lt;td style="width: 24.9971%; height: 22.4006px; text-align: center;"&gt;&lt;strong&gt;Date&lt;/strong&gt;&lt;/td&gt;
&lt;td style="width: 24.9971%; height: 22.4006px; text-align: center;"&gt;&lt;strong&gt;Registration&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style="height: 22.4006px;"&gt;
&lt;td style="width: 24.9971%; height: 22.4006px;"&gt;&lt;span style="vertical-align: baseline;"&gt;Agents &amp;amp; GKE&lt;/span&gt;&lt;/td&gt;
&lt;td style="width: 24.9971%; height: 22.4006px;"&gt;&lt;span style="vertical-align: baseline;"&gt;New York, NY&lt;/span&gt;&lt;/td&gt;
&lt;td style="width: 24.9971%; height: 22.4006px;"&gt;&lt;span style="vertical-align: baseline;"&gt;May 26, 2026&lt;/span&gt;&lt;/td&gt;
&lt;td style="width: 24.9971%; height: 22.4006px;"&gt;&lt;a href="https://rsvp.withgoogle.com/events/google-cloud-labs-ai-toolkit-new-york" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Register Now&lt;/span&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style="height: 22.4006px;"&gt;
&lt;td style="width: 24.9971%; height: 22.4006px;"&gt;&lt;span style="vertical-align: baseline;"&gt;Agents &amp;amp; GKE&lt;/span&gt;&lt;/td&gt;
&lt;td style="width: 24.9971%; height: 22.4006px;"&gt;Austin, TX&lt;/td&gt;
&lt;td style="width: 24.9971%; height: 22.4006px;"&gt;June 2, 2026&lt;/td&gt;
&lt;td style="width: 24.9971%; height: 22.4006px;"&gt;&lt;a href="https://rsvp.withgoogle.com/events/google-cloud-labs-ai-toolkit-austin" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Register Now&lt;/span&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style="height: 22.4006px;"&gt;
&lt;td style="width: 24.9971%; height: 22.4006px;"&gt;&lt;span style="vertical-align: baseline;"&gt;Agents &amp;amp; GKE&lt;/span&gt;&lt;/td&gt;
&lt;td style="width: 24.9971%; height: 22.4006px;"&gt;Sunnyvale, CA&lt;/td&gt;
&lt;td style="width: 24.9971%; height: 22.4006px;"&gt;June 9, 2026&lt;/td&gt;
&lt;td style="width: 24.9971%; height: 22.4006px;"&gt;&lt;a href="https://rsvp.withgoogle.com/events/google-cloud-labs-ai-toolkit-sunnyvale" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Register Now&lt;/span&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="width: 24.9971%;"&gt;&lt;span style="vertical-align: baseline;"&gt;Agents &amp;amp; GKE&lt;/span&gt;&lt;/td&gt;
&lt;td style="width: 24.9971%;"&gt;&lt;span style="vertical-align: baseline;"&gt;Seattle, WA&lt;/span&gt;&lt;/td&gt;
&lt;td style="width: 24.9971%;"&gt;June 11, 2026&lt;/td&gt;
&lt;td style="width: 24.9971%;"&gt;&lt;a href="https://rsvp.withgoogle.com/events/google-cloud-labs-ai-toolkit-seattle" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Register Now&lt;/span&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="width: 24.9971%;"&gt;&lt;span style="vertical-align: baseline;"&gt;Agents &amp;amp; GKE&lt;/span&gt;&lt;/td&gt;
&lt;td style="width: 24.9971%;"&gt;&lt;span style="vertical-align: baseline;"&gt;Toronto, ON&lt;/span&gt;&lt;/td&gt;
&lt;td style="width: 24.9971%;"&gt;June 24, 2026&lt;/td&gt;
&lt;td style="width: 24.9971%;"&gt;&lt;a href="https://rsvp.withgoogle.com/events/google-cloud-labs-ai-toolkit-toronto" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Register Now&lt;/span&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="width: 24.9971%;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Data Engineering &amp;amp; Analytics&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td style="width: 24.9971%;"&gt;&lt;span style="vertical-align: baseline;"&gt;Toronto, ON&lt;/span&gt;&lt;/td&gt;
&lt;td style="width: 24.9971%;"&gt;June 25, 2026&lt;/td&gt;
&lt;td style="width: 24.9971%;"&gt;&lt;a href="https://rsvp.withgoogle.com/events/google-cloud-labs-data-cloud-toronto" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Register Now&lt;/span&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="width: 24.9971%;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Data Engineering &amp;amp; Analytics&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td style="width: 24.9971%;"&gt;&lt;span style="vertical-align: baseline;"&gt;Chicago, IL&lt;/span&gt;&lt;/td&gt;
&lt;td style="width: 24.9971%;"&gt;June 30, 2026&lt;/td&gt;
&lt;td style="width: 24.9971%;"&gt;&lt;a href="https://rsvp.withgoogle.com/events/google-cloud-labs-data-cloud-chicago" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Register Now&lt;/span&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;/div&gt;</description><pubDate>Tue, 05 May 2026 15:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/cloud-engineers-ai-toolkit-sign-up-now-for-a-developer-workshop-near-you/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/Blog_Banners.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Cloud Engineer’s AI Toolkit: Sign up Now for a Developer Workshop Near You!</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/Blog_Banners.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/cloud-engineers-ai-toolkit-sign-up-now-for-a-developer-workshop-near-you/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Olivier Bourgeois</name><title>Developer Relations Engineer</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Lucia Subatin</name><title>Developer Advocate</title><department></department><company></company></author></item><item><title>Agent Factory Recap: How Gemma 4 Taught Itself Physics</title><link>https://cloud.google.com/blog/topics/developers-practitioners/agent-factory-recap-how-gemma-4-taught-itself-physics/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this episode of The Agent Factory, &lt;/span&gt;&lt;span data-rich-links='{"per_n":"Vlad Kolesnikov","per_e":"vladkol@google.com","type":"person"}' style="vertical-align: baseline;"&gt;Vlad Kolesnikov&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; and I sat down with &lt;/span&gt;&lt;span data-rich-links='{"per_n":"Omar Sanseviero","per_e":"osanseviero@google.com","type":"person"}' style="vertical-align: baseline;"&gt;Omar 
Sanseviero&lt;/span&gt;&lt;span data-rich-links='{"per_n":"Omar Sanseviero","per_e":"osanseviero@google.com","type":"person"}' style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;from the Developer Experience team at Google DeepMind. We explored the groundbreaking release of Gemma 4: a new family of open models designed to bring high-level intelligence and agentic capabilities directly to consumer hardware and mobile devices. Since its launch last month, Gemma 4 has had &lt;strong&gt;over 50 million downloads&lt;/strong&gt;!&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=ST9mJuTnFqU"
      data-glue-modal-trigger="uni-modal-ST9mJuTnFqU-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        &lt;img src="//img.youtube.com/vi/ST9mJuTnFqU/maxresdefault.jpg"
             alt="A screen recording demonstrating Gemma 4."/&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-ST9mJuTnFqU-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="ST9mJuTnFqU"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=ST9mJuTnFqU"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This post guides you through the key ideas from our conversation. Use it to quickly recap topics or dive deeper into specific segments with links and timestamps.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;Gemma 4 - What is it?&lt;/h2&gt;
&lt;p&gt;Gemma 4 is the latest generation of open models from Google DeepMind, built on the same foundational research as Gemini 3. The family is designed to deliver exceptional "intelligence per parameter" across a range of deployment scenarios, from mobile phones to powerful workstations. The Gemma 4 model family now spans three distinct architectures:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Small Sizes (E2B &amp;amp; E4B):&lt;/strong&gt; Optimized for ultra-mobile, edge, and browser deployment (such as Pixel or Chrome).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Dense (31B):&lt;/strong&gt; A powerful 31-billion parameter model that provides server-grade performance for local execution on consumer GPUs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Mixture-of-Experts (26B MoE):&lt;/strong&gt; A highly efficient architecture designed for high-throughput tasks and advanced reasoning.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;With the shift to an &lt;strong&gt;Apache 2 license&lt;/strong&gt;, these models provide developers and startups with the flexibility to build, modify, and commercialize applications while maintaining full control over their infrastructure.&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Omar Sanseviero on how Gemma 4 changes the landscape for agent developers&lt;/span&gt;&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Timestamp: &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=ST9mJuTnFqU&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=1&amp;amp;t=100s" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;1:40&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Omar highlighted that Gemma 4 brings "very high intelligence per parameter," making it possible to run agentic workflows entirely offline. We saw examples of multiple Gemma instances running locally to generate SVGs (&lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=ST9mJuTnFqU&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=1&amp;amp;t=113s" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;1:53&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;) &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;and an Android-based agent picking specific skills, like playing the piano, to complete tasks (&lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=ST9mJuTnFqU&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=1&amp;amp;t=165s" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;2:45&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;). As Omar noted, "This means that you can run very powerful things with very little hardware overhead...even in the phone that you have in your pocket."&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/Untitled2.max-1000x1000.jpg"
        
          alt="Untitled2"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;The Factory Floor&lt;/span&gt;&lt;/h2&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Building a Local Food Tour Agent&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Timestamp: &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=ST9mJuTnFqU&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=1&amp;amp;t=329s" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;5:29&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We showcased a food tour agent powered by Gemma 4 using the Agent Development Kit (ADK) and a Google Maps MCP server. We demonstrated how a local model can handle complex, multi-step reasoning tasks.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The agent identified the best ramen spots in Seattle under a $30 budget.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;It verified that the locations were within walking distance of each other.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;It processed search results to provide specific tips on what to order and what to avoid.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Autonomous Python Code Execution&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Timestamp: &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=ST9mJuTnFqU&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=1&amp;amp;t=483s" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;8:03&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this demo, we pushed Gemma 4’s coding capabilities to the limit by asking it to express itself through animation. Using a sandbox execution environment, the model performed the following:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Wrote Python code using the Matplotlib library.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Attempted to build a physics engine to simulate a bouncing ball.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Self-corrected&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; when the initial execution environment lacked certain CPU features, finding an alternative path to successfully generate the animation.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Demonstrated a deep understanding of real-world physics and gravity through code.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
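&lt;p&gt;To make the bouncing-ball idea concrete, here is a minimal sketch of that kind of simulation in plain Python. It is not the code Gemma 4 wrote in the demo: the &lt;code&gt;Ball&lt;/code&gt; class, the &lt;code&gt;step&lt;/code&gt; and &lt;code&gt;simulate&lt;/code&gt; helpers, the time step, and the restitution value are all illustrative assumptions.&lt;/p&gt;

```python
from dataclasses import dataclass

GRAVITY = 9.81  # m/s^2, pulls the ball toward the floor at height 0


@dataclass
class Ball:
    height: float    # metres above the floor
    velocity: float  # metres per second, positive is up


def step(ball: Ball, dt: float, restitution: float = 0.8) -> Ball:
    """Advance the ball by one time step using semi-implicit Euler."""
    velocity = ball.velocity - GRAVITY * dt
    height = ball.height + velocity * dt
    if height >= 0.0:
        return Ball(height, velocity)
    # The ball crossed the floor: reflect it and lose some energy.
    return Ball(-height * restitution, -velocity * restitution)


def simulate(start_height: float, seconds: float, dt: float = 0.01) -> list[float]:
    """Return the ball's height at every time step (hypothetical helper)."""
    ball = Ball(start_height, 0.0)
    heights = [ball.height]
    for _ in range(round(seconds / dt)):
        ball = step(ball, dt)
        heights.append(ball.height)
    return heights


# Drop the ball from 2 metres and track it for 3 seconds.
heights = simulate(start_height=2.0, seconds=3.0)
```

&lt;p&gt;The resulting list of heights is the kind of data you could hand to a plotting library such as Matplotlib to draw each animation frame.&lt;/p&gt;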
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;The Shift to Apache 2 Licensing&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Timestamp: &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=ST9mJuTnFqU&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=1&amp;amp;t=245s" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;4:05&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A major theme of the conversation was the community-driven decision to move Gemma 4 to an Apache 2 license. This change provides developers and startups with maximum flexibility to build, modify, and commercialize applications. Omar emphasized that this was a direct response to developer feedback, aiming to unlock a new wave of innovation in the open models ecosystem.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Developer Q&amp;amp;A&lt;/span&gt;&lt;/h2&gt;
&lt;h3&gt;Architectural Decisions and Mixture of Experts (MoE)&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Timestamp: &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=ST9mJuTnFqU&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=1&amp;amp;t=1043s" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;17:23&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Omar explained the technical shifts that make Gemma 4 so efficient. For the first time, the Gemma family includes a Mixture of Experts (MoE) architecture, which optimizes for extremely low latency in production. Additionally, the smaller E2B and E4B models utilize per-layer embeddings to remain "cheap" to run on GPUs. For vision tasks, the model now supports variable aspect ratios, allowing it to understand images of various sizes more accurately than previous fixed-resolution versions.&lt;/span&gt;&lt;/p&gt;
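&lt;p&gt;One way to build intuition for why a Mixture of Experts can be cheap at inference time is a toy router that runs only a few experts per token. The sketch below shows generic top-k routing in plain Python. The expert count, the &lt;code&gt;top_k&lt;/code&gt; value, and the toy "experts" are assumptions for illustration only and do not describe Gemma 4's actual configuration.&lt;/p&gt;

```python
import math
import random


def softmax(scores: list[float]) -> list[float]:
    """Turn raw router scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_scores: list[float], top_k: int) -> dict[int, float]:
    """Pick the top_k experts for one token and renormalize their weights."""
    probs = softmax(router_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    kept = sum(probs[i] for i in chosen)
    return {i: probs[i] / kept for i in chosen}


def moe_layer(token: float, experts, router_scores: list[float], top_k: int = 2) -> float:
    """Run only the selected experts and blend their outputs."""
    weights = route(router_scores, top_k)
    return sum(w * experts[i](token) for i, w in weights.items())


# Hypothetical toy setup: 8 "experts", each a simple scaling function.
NUM_EXPERTS = 8
experts = [lambda x, scale=i + 1: x * scale for i in range(NUM_EXPERTS)]
random.seed(0)
router_scores = [random.uniform(-1.0, 1.0) for _ in range(NUM_EXPERTS)]

weights = route(router_scores, top_k=2)          # two experts, weights sum to 1
output = moe_layer(1.0, experts, router_scores)  # only those two experts run
```

&lt;p&gt;In this toy setup only two of the eight expert functions are evaluated for the token, which is the basic reason sparse routing can reduce per-token compute compared with running every expert.&lt;/p&gt;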
&lt;h3&gt;Comparing Gemma to Gemini&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Timestamp: &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=ST9mJuTnFqU&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=1&amp;amp;t=1191s" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;19:51&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;  &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When asked how Gemma stacks up against its larger sibling, Gemini, Omar clarified that they serve different purposes. While Gemini excels at massive-scale tasks and deep "world knowledge" due to its size, Gemma is the "best open model that can run on a single consumer GPU." It is specifically optimized for instruction following, coding, and agentic use cases where local deployment or fine-tuning is required.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;Fine-Tuning for Specialized Industries&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Timestamp: &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=ST9mJuTnFqU&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=1&amp;amp;t=1271s" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;21:10&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The conversation touched on the importance of "Sovereign AI" and privacy. Because Gemma is an open model, developers in regulated industries, like healthcare or finance, can &lt;/span&gt;&lt;a href="https://medium.com/google-cloud/fine-tuning-gemma-4-with-cloud-run-jobs-serverless-gpus-nvidia-rtx-6000-pro-for-pet-breed-d408c7e24be2" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;fine-tune the model on their private data&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and deploy it within their own air-gapped infrastructure. This gives developers full control over their data and the model's specialized expertise.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Conclusion&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Gemma 4 marks a turning point for agentic development, proving that you don't always need a massive cloud cluster to build something smart. Whether it's running a physics simulation on a laptop or a travel guide on a phone, the barrier to entry for high-performance AI has never been lower. We are entering an era where the "conductor" of the AI orchestra can be any developer with a single GPU and a great idea.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Your turn to build&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now that you've seen what Gemma 4 can do, it's time to start building. Check out the resources in our show notes, including &lt;/span&gt;&lt;a href="https://goo.gle/3OinTFh" rel="noopener" target="_blank"&gt;the food tour agent&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://goo.gle/4dBDNEY" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;the coding agent&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, explore the &lt;/span&gt;&lt;a href="https://adk.dev/agents/models/google-gemma/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;ADK support&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for Gemma, and try running &lt;/span&gt;&lt;a href="https://blog.google/innovation-and-ai/technology/developers-tools/gemma-4/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemma 4&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; on your local machine or on &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/run/docs/run-gemma-on-cloud-run"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Run&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. We can't wait to see what agents you create!&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Watch more of The Agent Factory → &lt;/span&gt;&lt;a href="https://www.youtube.com/watch?v=qBOvM7SiDa4&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=1" rel="noopener" target="_blank"&gt;Reinforcement learning &amp;amp; fine-tuning on TP...  &lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Subscribe to Google Cloud Tech → &lt;/span&gt;&lt;a href="https://goo.gle/GoogleCloudTech" rel="noopener" target="_blank"&gt;https://goo.gle/GoogleCloudTech&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Connect with us&lt;/span&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Shir Meir Lador → &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/shirmeirlador/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://x.com/shirmeir86" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;X&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Vlad Kolesnikov → &lt;/span&gt;&lt;a href="http://www.linkedin.com/in/vkolesnikov/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://x.com/vladkol" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;X&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Omar Sanseviero → &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/omarsanseviero/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://x.com/osanseviero" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;X&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;</description><pubDate>Tue, 05 May 2026 11:54:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/agent-factory-recap-how-gemma-4-taught-itself-physics/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/Past_AF_Blog_Recap_hero_images_1_1.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Agent Factory Recap: How Gemma 4 Taught Itself Physics</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/Past_AF_Blog_Recap_hero_images_1_1.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/agent-factory-recap-how-gemma-4-taught-itself-physics/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Shir Meir Lador</name><title>Head of AI Engineering, Google Cloud Developer Relations</title><department></department><company></company></author></item><item><title>Next '26 Hands-On: 10 Codelabs to Build Featured Tech</title><link>https://cloud.google.com/blog/topics/developers-practitioners/next-26-hands-on-10-codelabs-to-build-featured-tech/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;em&gt;&lt;span style="vertical-align: baseline;"&gt;Significant contributors to this article include &lt;strong&gt;Megan O'Keefe&lt;/strong&gt;, Senior Staff Developer Advocate, and &lt;/span&gt;&lt;strong&gt;Karl Weinmeister&lt;/strong&gt;, Director of Developer Relations.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Whether you are joining us in person in Las Vegas or tuning in virtually from around the world, Google Cloud Next '26 offers a deep look into the practical evolution of AI. With &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;89% of sessions&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; this year dedicated to artificial intelligence, the focus has shifted from high-level concepts to the "Day 2" reality of building and maintaining agentic systems.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We've assembled &lt;strong&gt;55+ new codelabs&lt;/strong&gt; across Cloud at Next, and we want to share 10 highlights with you. The following curated list of codelabs is designed to help you translate the announcements from the talks and demos into functional code. These labs provide a structured way to explore the latest in multi-agent orchestration, data grounding, and enterprise security for your own workflows.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Dive into Codelabs!&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;1&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;—&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Build Rich Agent Experiences (ADK + A2UI)&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; | &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/next26/adk-a2ui/#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Codelab&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Improve user interaction&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;through intuitive, high-quality interfaces that allow users to interact with agentic systems seamlessly.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;2&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;—&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Building a Multi-Agent System&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; | &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/next26/multi-agent-system#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Codelab&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Build&lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;the&lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;architecture required to make multiple agents work together to achieve a shared goal.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;3&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;—&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Beyond the Simple SELECT: AlloyDB NL2SQL&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; | &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/next26/alloydb-querydata#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Codelab&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Democratize data access by building systems that allow users to query complex databases using natural language, supported by high-speed vector search.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;4&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;—&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Beat Fraud with an AI Shield (Spanner &amp;amp; BigQuery Graph)&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; | &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/next26/spanner-bigquery-graph/#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Codelab&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Implement real-time reasoning with Spanner and BigQuery Graph databases. Analyze complex relationships in your data to prevent fraud at the point of transaction.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;5&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;—&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Building Secure Agents: Protecting Access and Data&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; | &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/next26/showcase-build-secure-agent/#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Codelab&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Protect the reasoning engine with&lt;strong&gt; &lt;/strong&gt;&lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Model Armor&lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Identity and Access Management (IAM) &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;to manage agent access and ensure that sensitive data remains protected during execution.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;6&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;—&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Ground Agents with Google Maps Platform&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; | &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/next26/maps-grounding/#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Codelab&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Build geo-intelligent agents by grounding them in real-world location data, optimizing field operations and logistics in real time.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;7&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;—&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy and Scale Agents on Agent Engine&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; | &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/next26/adk-deploy-scale/#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Codelab&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Deploy agents as containerized microservices that scale dynamically with your workload.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;8&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;—&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;The Ultimate Guide to Cloud Run: From Zero to Production&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; | &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/next26/ultimate-cloud-run-guide/#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Codelab&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Achieve rapid deployment using this lab as a blueprint for moving from a local prototype to a production-ready, auto-scaling platform on Cloud Run.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;9&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;—Developer Keynote: Building Agents with Skills &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;|&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/next26/dev-keynote/building-agents-with-skills#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Codelab&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Learn the ins and outs of AI agent development including Agent Development Kit (ADK), prompting, Agent Skill usage, and MCP. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;10—General Keynote: Forecasting with AI Agents | &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/next26/gen-keynote/raw-data-forecasting#0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Codelab&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Transform unstructured chaos into actionable business intelligence in seconds&lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Start Building Today&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These codelabs will connect you to the heart of the conference. You'll be able to turn the high-level announcements, talks, and demos into hands-on experience with the technology featured at Next '26. Whether you're here in person or attending virtually, these labs provide the concrete skills to drive real-world value during the conference &lt;em&gt;&lt;strong&gt;and&lt;/strong&gt;&lt;/em&gt; long after it ends.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;And there's more!&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Go to the &lt;a href="https://codelabs.developers.google.com/?event=googlecloudnext2026" rel="noopener" target="_blank"&gt;Codelab landing page&lt;/a&gt; to find the &lt;code&gt;Cloud Next '26&lt;/code&gt; tag and access &lt;strong&gt;more than 75 total&lt;/strong&gt; &lt;strong&gt;codelabs&lt;/strong&gt; that support the featured tech at this year's conference.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 22 Apr 2026 13:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/next-26-hands-on-10-codelabs-to-build-featured-tech/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/corrected_updated_final_codelabs_image.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Next '26 Hands-On: 10 Codelabs to Build Featured Tech</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/corrected_updated_final_codelabs_image.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/next-26-hands-on-10-codelabs-to-build-featured-tech/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Mandy Grover</name><title>Strategic Content, Google Cloud</title><department></department><company></company></author></item><item><title>Level Up Your Agents: Announcing Google's Official Skills Repository</title><link>https://cloud.google.com/blog/topics/developers-practitioners/level-up-your-agents-announcing-googles-official-skills-repository/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As AI models improve, technical practitioners are increasingly turning to agentic AI tools to build with Google Cloud products, from Firebase and the Gemini API, 
to BigQuery and GKE. But how can you ensure that the model is equipped with &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;accurate, up-to-date information &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;about these technologies? &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;One way to do this is to plug your AI agent into a grounded, real-time information source. For instance, &lt;/span&gt;&lt;a href="https://developers.google.com/knowledge/mcp" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google offers a Model Context Protocol (MCP) server for its developer documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. But heavily using MCP servers can cause a problem called “context bloat,” where huge amounts of context are loaded into the context window, confusing the model and racking up token costs.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We need a way to equip agents with additional, condensed expertise — and we can do this with &lt;/span&gt;&lt;a href="https://agentskills.io/home" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Skills.&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://agentskills.io/home" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Skills&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; are “a simple, open format for giving agents new capabilities and expertise.” Think of a skill as compact, agent-first documentation for a specific technology or task. Skills are written in Markdown and can contain reference files, code snippets, and other assets. Agents load skill information &lt;/span&gt;&lt;a href="https://agentskills.io/what-are-skills#how-skills-work" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;only as needed,&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; reducing the risk of context bloat. &lt;/span&gt;&lt;/p&gt;
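&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the "only as needed" idea concrete, here is a minimal Python sketch of how an agent runtime might index skills by their short metadata and read the full instructions later. It assumes each skill lives in its own folder with a SKILL.md file that starts with a front matter block containing name and description fields. The function names, the folder layout under the root, and the simplified front matter parsing are illustrative assumptions, not the behavior of any particular agent.&lt;/span&gt;&lt;/p&gt;

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SkillSummary:
    """The small piece of a skill that stays in context all the time."""
    name: str
    description: str
    path: Path


def read_front_matter(skill_file):
    """Return the key: value pairs between the leading '---' markers.

    Simplified on purpose: a real loader would use a YAML parser.
    """
    meta = {}
    lines = skill_file.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return meta
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def discover_skills(root):
    """Index every SKILL.md one level under root using only its metadata."""
    index = []
    for skill_file in sorted(Path(root).glob("*/SKILL.md")):
        meta = read_front_matter(skill_file)
        index.append(
            SkillSummary(
                name=meta.get("name", skill_file.parent.name),
                description=meta.get("description", ""),
                path=skill_file,
            )
        )
    return index


def load_skill_body(summary):
    """Read the full instructions only when the agent decides it needs them."""
    text = summary.path.read_text(encoding="utf-8")
    if not text.startswith("---"):
        return text.strip()
    parts = text.split("---", 2)
    return parts[2].strip() if len(parts) == 3 else text.strip()
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this sketch, only the names and descriptions returned by discover_skills would sit in the context window by default. The longer Markdown body is fetched with load_skill_body when a task actually calls for that skill, which is what keeps context small.&lt;/span&gt;&lt;/p&gt;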
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today, on Day 1 of &lt;/span&gt;&lt;a href="https://www.googlecloudevents.com/next-vegas/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Next 2026,&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; we’re excited to announce the launch of Google’s official Agent Skills repository: &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/google/skills" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;github.com/google/skills&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This repository is starting off with thirteen skills, focused on Google Cloud technologies: &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;A selection of products&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;AlloyDB, BigQuery, Cloud Run, Cloud SQL, Firebase, Gemini API, and Google Kubernetes Engine (GKE).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Three &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/architecture/framework"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Well-Architected Pillar&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;skills: Security, Reliability, and Cost Optimization &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;“Recipe” skills for Google Cloud Onboarding, Authentication, and Network Observability. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
    &lt;figure class="article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3"&gt;
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/image_1_BwwkF6A.max-1000x1000.png" alt="image (1)"&gt;
    &lt;/figure&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Use &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;npx skills install &lt;/code&gt;&lt;a href="http://github.com/google/skills" rel="noopener" target="_blank"&gt;&lt;code style="text-decoration: underline; vertical-align: baseline;"&gt;github.com/google/skills&lt;/code&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to install these skills to your agents of choice, including &lt;/span&gt;&lt;a href="https://antigravity.google/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Antigravity&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://geminicli.com/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini CLI&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and third-party agents. &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
    &lt;figure class="article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3"&gt;
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/agent_skills-2.max-1000x1000.png" alt="agent_skills-2"&gt;
    &lt;/figure&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Stay tuned as we launch additional skills in this repo in the coming weeks and months! &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now get building!&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 22 Apr 2026 13:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/level-up-your-agents-announcing-googles-official-skills-repository/</guid><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/Agent_Skills_Blog_-_Hero.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Level Up Your Agents: Announcing Google's Official Skills Repository</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/Agent_Skills_Blog_-_Hero.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/level-up-your-agents-announcing-googles-official-skills-repository/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Megan O'Keefe</name><title>Senior Staff Developer Advocate</title><department></department><company></company></author></item><item><title>What’s new with the Cross-Cloud Network at Next ‘26</title><link>https://cloud.google.com/blog/products/networking/whats-new-in-cloud-networking-at-next26/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While generative AI sparked a revolution, the true paradigm shift is the rapid evolution from standalone AI models to multi-agent autonomous systems. In this new era, the network transcends basic connectivity to become the critical integration layer for your agentic enterprise.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As AI agents and services surge, your core applications remain as vital as ever. To thrive in this rapidly evolving landscape, you need a planet-scale network to connect, protect, govern, deliver, and secure all your users, data, agents, AI services, and core applications across clouds and on-premises.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Google Cloud's Cross-Cloud Network provides this unified foundation. It is now used by 65% of the Fortune 100 and handles up to 27 exabytes of data per month. At Google Cloud Next, we are introducing networking innovations to accelerate your AI infrastructure, strengthen security, and simplify operations. &lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Optimized networking infrastructure for AI &lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As we move toward an agentic world, the network must support massive-scale inference paired with reinforcement learning. At Google, we’ve spent years refining this cycle to power our own global AI services. Today, we’re announcing AI infrastructure network innovations that bring this same architecture directly to your workloads, across &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;agents&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;inference&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;training&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, and beyond.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Networking for agents&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/introducing-gemini-enterprise-agent-platform"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Enterprise Agent Platform&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is a comprehensive enterprise environment designed to build, scale, govern, and optimize the next generation of autonomous agents. Key innovations being announced in preview include: &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Gateway:&lt;/strong&gt;&lt;strong style="vertical-align: baseline;"&gt; Air-traffic control for agentic traffic&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Agent Gateway understands MCP and A2A agentic protocols and provides an open, extensible, scalable way to enforce centralized governance policies to securely connect agents, models, and tools across runtimes.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Ambient networking: &lt;/strong&gt;&lt;strong style="vertical-align: baseline;"&gt;A seismic shift in service-to-service connectivity&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Ambient networking, a new integrated data plane for Google Kubernetes Engine (GKE) and Cloud Run, provides service discovery, zero-trust access, and traffic management without the need for complex and resource-heavy sidecar proxies. It reduces operational overhead and enables up to a 10x reduction in GKE resource usage for layer 4 (L4) mesh capabilities.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Ambient networking underpins two new capabilities:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Service bindings &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;automatically establish service-to-service connectivity, allowing developers to move faster to build and scale their agentic applications and services.&lt;/span&gt;&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Network Services Monitoring&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; bridges application and network observability gaps, resulting in faster root-cause analysis and simplified troubleshooting.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Rich partner integrations and customizations&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With the help of &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/service-extensions/docs/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Service Extensions&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, we are developing solutions&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;for identity, governance, and AI security for agent-to-anywhere traffic. Coming soon in preview to Agent Gateway are:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Identity and governance administration&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Offering delegated authorization to Cloud IAM and partner services from Okta, Ping, Saviynt, and Silverfort to enforce real-time, contextual governance policies based on application and business context.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Runtime security:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Serving as a universal enforcement point that integrates with Google Cloud’s Model Armor and partner solutions from Broadcom, Check Point, Cisco, CrowdStrike, Exabeam, F5, Netskope, Palo Alto Networks, Thales, and Zscaler. Together, these can help secure agentic communications against emerging AI attack vectors.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These innovations are built on an open foundation including Envoy and Kubernetes, providing strong, integrated governance in multicloud environments using standard Kubernetes Gateway APIs.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Networking for inference&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;At Google, we run inference at scale with optimized use of distributed GPU and TPU resources, automatic failover between regions for high availability, and optimized global request routing for fast end-user performance. GKE Inference Gateway delivers these capabilities to our cloud customers, including the following innovations:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Multi-region support &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;allows scaling inference services across regions, enabling cross-regional failover, optimized utilization, and reduced global latency (preview).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Predictive latency boost&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; improves utilization with intelligent request routing based on predefined performance targets (preview).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Disaggregated serving&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; leverages &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;llm-d’s SGLang support, offering the flexibility to choose between vLLM and SGLang for model serving (&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;GA).&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-pull_quote"&gt;&lt;div class="uni-pull-quote h-c-page"&gt;
  &lt;section class="h-c-grid"&gt;
    &lt;div class="uni-pull-quote__wrapper h-c-grid__col h-c-grid__col--8 h-c-grid__col-m--6 h-c-grid__col-l--6
      h-c-grid__col--offset-2 h-c-grid__col-m--offset-3 h-c-grid__col-l--offset-3"&gt;
      &lt;div class="uni-pull-quote__inner-wrapper h-c-copy h-c-copy"&gt;
        &lt;q class="uni-pull-quote__text"&gt;Gemini Enterprise Agent Platform reduced Time to First Token (TTFT) latency by over 35% for Qwen3-Coder by using GKE Inference Gateway.&lt;/q&gt;

        
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/section&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;“Before GKE Inference Gateway, managing our inference stack with Ray Serve created a complex, dual-orchestration layer that was a significant burden on our small operations team. Moving to the Inference Gateway and native Kubernetes deployments was the 'North Star' architecture we needed to simplify management and achieve robust production stability with a GKE-native batteries-included solution.”&lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;- &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Mikhail Lubinets, Lead HPC Engineer, Technology Innovation Institute&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Networking for training&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;At Google, we build and run the largest AI models in the world — and we built a network to support that. Some of the new enhancements are:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Massive scale with &lt;/strong&gt;&lt;strong style="vertical-align: baseline;"&gt;Virgo Network&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This new &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;non-blocking data center fabric&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; removes latency barriers: &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span&gt;&lt;strong style="vertical-align: baseline;"&gt;Virgo&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; can link up to 134,000 chips with 47 petabits/sec of non-blocking bisection bandwidth to deliver 1.7K exaflops of compute. &lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;With &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;enhancements in Pathways and JAX&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, you can further connect these Virgo fabrics to scale to over 1 million TPU chips in a single training cluster.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;We are also making Virgo Network&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; available on NVIDIA Vera Rubin NVL72&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, supporting up to 960,000 GPUs.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
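&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For a rough sense of scale, the headline figures above can be turned into per-chip numbers. The Python sketch below is back-of-the-envelope arithmetic only. It assumes decimal units, spreads the bisection bandwidth and compute evenly across all 134,000 chips (a simplification, since bisection bandwidth describes traffic across a cut of the fabric rather than a per-port rate), and treats "over 1 million" as exactly 1,000,000 to get a lower bound on the number of fabrics. The function names are illustrative.&lt;/span&gt;&lt;/p&gt;

```python
import math

# Figures quoted in the text for a single Virgo fabric.
CHIPS_PER_FABRIC = 134_000
BISECTION_PETABITS_PER_SEC = 47
COMPUTE_EXAFLOPS = 1_700  # "1.7K exaflops"


def per_chip_bandwidth_gbps(total_petabits_per_sec, chips):
    """Evenly divide fabric bandwidth across chips (1 Pb/s = 1,000,000 Gb/s)."""
    return total_petabits_per_sec * 1_000_000 / chips


def per_chip_petaflops(total_exaflops, chips):
    """Evenly divide fabric compute across chips (1 exaflop = 1,000 petaflops)."""
    return total_exaflops * 1_000 / chips


def fabrics_needed(target_chips, chips_per_fabric):
    """Smallest whole number of fabrics that reaches the target chip count."""
    return math.ceil(target_chips / chips_per_fabric)


if __name__ == "__main__":
    print(round(per_chip_bandwidth_gbps(BISECTION_PETABITS_PER_SEC, CHIPS_PER_FABRIC)))
    print(round(per_chip_petaflops(COMPUTE_EXAFLOPS, CHIPS_PER_FABRIC), 1))
    print(fabrics_needed(1_000_000, CHIPS_PER_FABRIC))
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Under these assumptions, the figures work out to roughly 351 Gb/s of bisection bandwidth and about 12.7 petaflops per chip, and reaching 1 million chips would take at least 8 connected fabrics.&lt;/span&gt;&lt;/p&gt;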
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For more on Virgo Network, check out this &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/networking/introducing-virgo-megascale-data-center-fabric"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;blog post&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Accelerator network profiles&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;It’s easier than ever to handle the complex networking prerequisites for accelerator-equipped GKE node pools with &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/networking/introducing-managed-dranet-in-google-kubernetes-engine"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;DRANET&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, which improves bandwidth for distributed AI/ML workloads by up to 60% (GA).&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;AI-native Cloud Interconnect&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;SLA-backed and optimized for efficiency, Cloud Interconnect supports petabit-scale data transfers and is available with a fixed-price option. Cloud Interconnect now supports:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;400 Gbps circuits with up to 3.2 Tbps in a single connection (GA)&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Partner Cross-Cloud Interconnect for AWS (GA), CoreWeave (in preview soon), and Lumen (in preview soon)&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
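&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes a 3.2 Tbps connection is an aggregate of 400 Gbps circuits and that the full line rate is usable with no protocol overhead; both are simplifying assumptions, and the function name is hypothetical.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;
```python
CIRCUIT_GBPS = 400        # per-circuit rate quoted above
CONNECTION_GBPS = 3200    # 3.2 Tbps expressed in Gbps

# Assumption: the connection is a simple aggregate of equal circuits.
circuits_per_connection = CONNECTION_GBPS // CIRCUIT_GBPS


def ideal_transfer_seconds(size_bytes, rate_gbps):
    """Idealized transfer time: payload bits divided by line rate, no overhead."""
    return (size_bytes * 8) / (rate_gbps * 10**9)


one_petabyte = 10**15  # decimal petabyte, in bytes
print(circuits_per_connection)
print(ideal_transfer_seconds(one_petabyte, CONNECTION_GBPS) / 60, "minutes")
```
&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Under these assumptions, eight circuits make up one connection, and moving one petabyte takes roughly 42 minutes. Real transfers will take longer once protocol overhead and end-host limits are included.&lt;/span&gt;&lt;/p&gt;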
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Cross-Cloud Network for AI and core applications&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The Cross-Cloud Network helps ensure you can securely connect users, data, locations, applications, services, and infrastructure anywhere in the world, at planetary scale. We designed our global multi-shard network to scale horizontally to meet the demands of the AI era and enable us to accommodate our 10x WAN traffic growth from 2020 to 2025.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These are some of the improvements we’re making to the Cross-Cloud Network: &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Ultra Low Latency Solution for financial exchanges &lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In partnership with CME Group, we are bringing the world's leading derivatives marketplace to Google Cloud. To support CME Group’s performance requirements, we developed an ultra low latency (ULL) networking and compute solution. This fully managed cloud environment will allow CME Group and its clients to migrate its core trading systems to Google Cloud. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now in preview, the solution is designed to meet the unique and exacting requirements of running financial exchanges in the cloud. It includes several new technologies:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Deterministic high-performance compute &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;powered by ULL networking, with bare metal and VM form factors, delivers a comprehensive portfolio for your trading compute needs. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Scalable multicast data distribution &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;with hardware-based ultra-low latency enables reliable one-to-many market data sharing.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Nanosecond-level clock sync &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;enabled by &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/networking/understanding-the-firefly-clock-synchronization-protocol/"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Firefly&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, a novel clock synchronization system. Firefly achieves sub-10ns NIC-to-NIC synchronization to support high-frequency trading.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Advanced network observability &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;with 64-bit nanosecond timestamps, support for multiple traffic-mirroring destinations and multicast traffic, and support for auditing and regulatory requirements.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Low-latency inference &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;allowing exchange participants to connect their AI-driven services to the exchange’s infrastructure. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
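&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For readers who will process nanosecond-stamped capture data, the following Python sketch shows one way to handle a 64-bit nanosecond timestamp. The encoding is not specified above, so the sketch assumes an unsigned count of nanoseconds since the Unix epoch; the function names are hypothetical. Python’s datetime type only resolves microseconds, so the sub-second nanoseconds are kept as a separate integer.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;
```python
from datetime import datetime, timedelta, timezone

NANOS_PER_SECOND = 1_000_000_000
# Assumption: timestamps count nanoseconds from the Unix epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def split_timestamp(ts_ns):
    """Return (UTC datetime at whole-second precision, leftover nanoseconds)."""
    seconds, nanos = divmod(ts_ns, NANOS_PER_SECOND)
    return EPOCH + timedelta(seconds=seconds), nanos


def range_in_years(bits=64):
    """Approximate span, in 365.25-day years, of an unsigned nanosecond counter."""
    return 2**bits / NANOS_PER_SECOND / (365.25 * 24 * 3600)


when, nanos = split_timestamp(1_000_000_000_000_000_007)
print(when.isoformat(), nanos)
print(round(range_in_years(), 1), "years")
```
&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A 64-bit counter at nanosecond resolution spans roughly 584 years, so it can carry both the date and the sub-microsecond detail in a single integer field.&lt;/span&gt;&lt;/p&gt;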
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;“&lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;The Google Cloud Ultra Low Latency Solution provides the level of performance necessary for CME Group futures and options markets to run in the cloud, expanding access to clients worldwide.” &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;- Sunil Cutinho, CIO, CME Group&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Cross-cloud observability for networks, applications, and agents&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Whether you’re running core applications or new AI agents, you need visibility into your network infrastructure. &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Network Insights&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, now in preview, offers network performance monitoring (NPM) and digital experience monitoring (DEM) to dramatically reduce the mean time to detect and mitigate network-related agent, application, and API issues.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud Network Insights is enabled by technologies from Broadcom’s AppNeta and powered by AI, enabling natural language queries through Gemini Cloud Assist.&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"In an environment as complex and high-scale as Sabre’s, total visibility isn't just a luxury — it's a requirement for operational resilience. Cloud Network Insights will enable us to further shift our posture from reactive troubleshooting to proactive optimization. By providing granular, real-time telemetry across our global cloud footprint, it helps eliminate the traditional 'black box' of the network, allowing our teams to resolve bottlenecks before they impact the traveler experience."&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; - Alfredo Rodriguez, VP Cloud Platform Infrastructure, Sabre Corporation&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;“Cloud Network Insights closes the 'visibility gap' between the private corporate network and the public cloud, empowering our joint customers to pinpoint performance bottlenecks in seconds rather than hours.”&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; - Alan Davidson, CIO, Broadcom&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Cross-Cloud Network for distributed applications&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Multicloud and hybrid networks require secure, reliable, and high-performance connectivity. New enhancements for our foundational networking services and tools include:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Private Service Connect &lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Private Service Connect traffic volume grew 4x in 2025 and it now supports &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/vpc/docs/private-service-connect-compatibility"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;40+ Google and third-party published services&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, enabling secure private global access to your managed services. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Private Service Connect endpoint-based security &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;allows for granular authorization policies for producer-to-consumer service communications (preview).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Gemini Cloud Assist for Private Service Connect&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; provides automated troubleshooting (preview).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud-native IP address management (IPAM)&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Number Registry &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;is an IPAM solution powered by agentic technologies. Network admins can easily find free IP ranges, track utilization, and allocate resources (preview). It also integrates with Infoblox Universal DDI for Cross-Cloud Network IPAM discovery and enforcement.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Hybrid Subnets&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; allow you to migrate legacy workloads from on-premises to a VPC without needing to change hard-coded IP addresses (GA).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud NAT &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;allows you to connect your IPv6-only workloads to private IPv4 destinations using the combined power of DNS64 and private NAT64 (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
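&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a rough illustration of how DNS64 and NAT64 work together, the Python sketch below follows the address mapping from RFC 6052: DNS64 answers an IPv6-only client by embedding the IPv4 destination in the low 32 bits of a /96 IPv6 prefix, and NAT64 later recovers the IPv4 address from it. The sketch uses the RFC’s well-known prefix 64:ff9b::/96 and a documentation address purely for illustration. RFC 6052 reserves that prefix for global IPv4 addresses, so a private NAT64 deployment would use a network-specific prefix instead; the prefix Cloud NAT uses is not stated here, and the function names are hypothetical.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;
```python
import ipaddress

# Assumption: the RFC 6052 well-known prefix, used here only for illustration.
NAT64_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def synthesize_aaaa(ipv4_text, prefix=NAT64_PREFIX):
    """DNS64 step: embed an IPv4 address in the low 32 bits of a /96 prefix."""
    ipv4 = ipaddress.IPv4Address(ipv4_text)
    return ipaddress.IPv6Address(int(prefix.network_address) | int(ipv4))


def extract_ipv4(ipv6_text, prefix=NAT64_PREFIX):
    """NAT64 step: recover the IPv4 address embedded in a synthesized address."""
    ipv6 = ipaddress.IPv6Address(ipv6_text)
    if ipv6 not in prefix:
        raise ValueError("address is outside the NAT64 prefix")
    return ipaddress.IPv4Address(int(ipv6) % 2**32)


synthesized = synthesize_aaaa("192.0.2.33")
print(synthesized)
print(extract_ipv4(str(synthesized)))
```
&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The IPv6-only workload only ever sees the synthesized IPv6 address; the translation back to IPv4 happens in the NAT64 gateway, which is why no application changes are needed.&lt;/span&gt;&lt;/p&gt;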
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Network Connectivity Center (NCC)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Partner Cross-Cloud Interconnect for AWS is available as a connectivity type in NCC (preview).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Support for static routes using an &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;internal load balancer as the next hop&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; allows the integration of Secure Web Proxy and third-party network security virtual appliances (GA).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Support for &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;privately used public IP&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (PUPI) allows the exchange of PUPI IPv4 addresses with VPC spokes and producer VPC spokes (GA).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Granular networking charge visibility&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cost Explorer and the new App Optimize API now provide attribution of associated Data Transfer costs to the originating resources for Google Cloud products (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Cross-Cloud Network for internet-facing services&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As part of Cross-Cloud Network, the &lt;/span&gt;&lt;a href="https://cloud.google.com/solutions/cross-cloud-network#deliver-internet-facing-apps-and-content"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Global Front End&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; simplifies how you deliver, scale, and protect web, API, and AI workloads. New capabilities include: &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Global Front End Enterprise&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; simplifies consumption by combining capabilities from global Cloud Load Balancing, Google Cloud Armor, Cloud CDN, and Service Extensions, with up to 15% lower TCO (in preview soon). &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Post-quantum cryptography &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;(PQC) helps secure your workloads with industry-standard algorithms that provide a layered defense against both classical and quantum adversaries.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Google tag gateway&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; enables advertisers to serve tags from their own domain, which can significantly improve the accuracy and resilience of measurement signals (GA soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In addition, &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud CDN&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, an important part of the Global Front End, now offers:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Built-in image optimization &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;to help you deliver content that best fits your end users’ screens and saves on bandwidth costs (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;GKE Gateway support&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; so you can enable and manage caching services using GKE APIs (GA).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Cross-Cloud Network’s Cloud WAN for global enterprises&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud WAN is a fully managed, reliable global backbone to connect your enterprise. New capabilities include:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Expanded geographic reach: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Our network spans more than 10 million kilometers of terrestrial and subsea fiber, and Network Connectivity Center’s site-to-site data transfer is now available in over &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/network-connectivity/docs/network-connectivity-center/concepts/locations"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;25 countries&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;NCC Gateway &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;enables third-party secure service edge (SSE) integrations from Palo Alto Networks (GA soon) and Symantec (preview).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The Verified Peering Provider program&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, which offers highly reliable internet connectivity to Google, now has dramatically expanded availability through &lt;/span&gt;&lt;a href="https://peering.google.com/#/options/verified-peering-provider" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;175+ providers worldwide&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Last mile connectivity&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Provision site-to-cloud private connectivity in minutes with preferred partners from the Google Cloud console (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;“Cloud WAN enables Dun &amp;amp; Bradstreet to evolve our global network via composable, cloud-native constructs. Leveraging NCC, we’ve built a resilient, high-performance platform that simplifies operations and optimizes costs. This foundation supports continued modernization and AI-driven workloads. We expect to extend this architecture as new patterns emerge, maintaining our blueprints-first approach.”&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; - Josh Barry, VP, Network Engineering, Dun &amp;amp; Bradstreet&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;AI-powered security against evolving threats&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AI-driven attacks are making the threat landscape evolve faster than ever, and staying ahead requires the latest defenses. Cross-Cloud Network relies on Cloud NGFW and Cloud Armor for advanced security capabilities. Here’s the latest on those offerings.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud NGFW &lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Advanced malware sandbox &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;uses AI models trained on data from 70k+ customers &lt;/span&gt;&lt;a href="https://www.paloaltonetworks.com/apps/pan/public/downloadResource?pagePath=/content/pan/en_US/resources/datasheets/advanced-wildfire" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;to stop 99% of known and unknown malware&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, including evasive zero-days. Advanced malware sandbox is powered by Palo Alto Networks Advanced WildFire (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Internal Application and proxy Network Load Balancer &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;support helps to enforce consistent, service-centric security for abstracted services like GKE, Cloud Run, and Private Service Connect traffic (preview).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Project-level policies &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;allow for creating and managing Cloud NGFW endpoints, security profiles, and security profile groups at the project level (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Armor &lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Managed rules, &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;built-in rulesets across 15 threat categories, deliver automated threat protection against a broad set of attacks and zero-day CVEs. This is powered by Thales Imperva, based on visibility into &lt;/span&gt;&lt;a href="https://engage-cybersec.thalesgroup.com/rs/727-WRL-406/images/EMEA-2025-Partner-Connect-05-Shailes-Nanda.pdf" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;1.5 trillion web requests each month&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Google Cloud Fraud Defense integration&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; helps to &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;discern the legitimacy and authorization of bots, humans, and agents. &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/introducing-google-cloud-fraud-defense-the-next-evolution-of-recaptcha"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Fraud Defense&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is the evolution of reCAPTCHA, which protects over 14 million domains from fraud and abuse.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Adaptive protection for Network Load Balancers &amp;amp; VMs&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; brings advanced machine learning to L3/L4 traffic, to detect and mitigate volumetric DDoS attacks (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;A simplified user experience&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; with a visual rule builder makes custom rule creation easier (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;strong style="vertical-align: baseline;"&gt;AI-powered network operations&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Finally, new AI-powered specialist agents in &lt;/span&gt;&lt;a href="https://cloud.google.com/products/gemini/cloud-assist"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Cloud Assist&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; can help automate manual tasks, ease troubleshooting, predict reliability issues, improve security, and optimize your network, reducing toil and improving reliability. These include:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;A network security agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; that streamlines network security operations by assisting with policy generation, recommendations, and impact analysis (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;A network agent &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;that optimizes workload placement for performance and reliability, and also provides advanced cost estimation for observability services (in preview soon).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Additionally, to enable customers and partners to build their own agents, we are releasing &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Network observability MCP tools and agent skills.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; These let their agents run connectivity tests and query VPC Flow Logs in natural language (both in preview).&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;strong style="vertical-align: baseline;"&gt;The network that scales with you&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We built our Cross-Cloud Network on the same global infrastructure that powers Google’s largest AI and internet services. This provides you with a blazing-fast, planet-scale foundation that is both secure by design and open by principle, allowing you to integrate your trusted partners across any environment.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As we move into the agentic era, our flexible, future-proof solutions ensure you can quickly adopt the latest AI technologies while maintaining the reliability of your core applications. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Whatever comes next, we’ve built the network to help you lead it. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Attend our networking sessions at Next ’26, or learn more about the &lt;/span&gt;&lt;a href="https://cloud.google.com/solutions/cross-cloud-network?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cross-Cloud Network&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;!&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 22 Apr 2026 12:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/networking/whats-new-in-cloud-networking-at-next26/</guid><category>Hybrid &amp; Multicloud</category><category>Infrastructure Modernization</category><category>Developers &amp; Practitioners</category><category>Google Cloud Next</category><category>Networking</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/GCN26_102_BlogHeader_2436x1200_Opt_5_Dark.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>What’s new with the Cross-Cloud Network at Next ‘26</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/GCN26_102_BlogHeader_2436x1200_Opt_5_Dark.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/products/networking/whats-new-in-cloud-networking-at-next26/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Rob Enns</name><title>VP/GM of Cloud Networking</title><department></department><company></company></author></item><item><title>Introducing Gemini Enterprise Agent Platform, powering the next wave of agents</title><link>https://cloud.google.com/blog/products/ai-machine-learning/introducing-gemini-enterprise-agent-platform/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span 
style="vertical-align: baseline;"&gt;In the early days of generative AI, building safe and reliable business tools took massive engineering effort and a high tolerance for trial and error. We helped solve that with &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Vertex AI&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;, our trusted AI development platform. But today, we’re managing a different level of complexity, with agents interacting across multiple systems — and often without security and governance guardrails. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To move toward a truly autonomous enterprise, one where agents can act with the same independence and reliability as a member of your team, you need a foundation that can sustain that level of trust. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;What’s new: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Today, we’re launching &lt;/span&gt;&lt;a href="https://console.cloud.google.com/agent-platform/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Enterprise Agent Platform&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; — our new, comprehensive platform to build, scale, govern, and optimize agents. It’s the evolution of Vertex AI, bringing the model selection, model building, and agent building capabilities that customers love, together with new features for agent integration, DevOps, orchestration, and security. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Agent Platform provides a single destination for your technical teams to build agents that can transform your products, services, and operations. These agents can be seamlessly delivered to your employees through the &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/whats-new-in-gemini-enterprise"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Enterprise app&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, all while remaining tightly integrated with your IT operations to help ensure control, governance, and security as you scale.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The platform also provides first-class access to more than 200 of the world’s leading models through Model Garden. This includes our latest first-party breakthroughs like &lt;/span&gt;&lt;a href="https://deepmind.google/models/gemini/pro/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini 3.1 Pro&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://deepmind.google/models/gemini-image/flash/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini 3.1 Flash Image&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;a href="https://deepmind.google/models/lyria/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Lyria 3&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, alongside our open models like &lt;/span&gt;&lt;a href="https://deepmind.google/models/gemma/gemma-4/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemma 4&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. And, of course, customers have full flexibility to use the best model for the job, with support for third-party models like Anthropic’s Claude Opus, Sonnet, and Haiku. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Moving forward, all Vertex AI services and roadmap evolutions will be delivered exclusively through the Agent Platform, rather than as a standalone service, to power the next generation of agent development.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Why Agent Platform matters for your business: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Agent Platform helps you move from managing individual AI tasks to delegating business outcomes with total confidence. You can: &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Build:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Choose the right environment for the job — from the low-code, visual interface of the new &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Studio,&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to the code-first logic of the upgraded &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Development Kit (ADK)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. We’ve simplified the entire lifecycle with AI-native coding capabilities to help you ship production-grade agents faster.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Scale:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Clear the path to production with the re-engineered &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Runtime&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. This supports long-running agents that maintain state for days at a time and are backed by &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Memory Bank&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; for persistent, long-term context.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Govern: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Establish centralized control with &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Identity, Agent Registry, and Agent Gateway&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. These capabilities help ensure every agent — whether built on Agent Platform or sourced from our partner ecosystem — has a trackable identity and operates within enterprise-grade guardrails. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Optimize:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Guarantee quality with &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Simulation, Agent Evaluation, &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;and&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; Agent Observability&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. These tools provide full execution traces and a real-time lens into agent reasoning to help ensure your agents always hit their goals.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_0_gemini_enterprise_agent_platform.max-1000x1000.jpg"
        
          alt="1 gemini enterprise agent platform"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Get started with Agent Platform: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Visit &lt;/span&gt;&lt;a href="https://console.cloud.google.com/agent-platform/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Platform&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in the Google Cloud console to explore new features and start building today. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Keep reading for a deeper look at our latest releases and how Agent Platform helps you deliver the production-ready agents you can trust at every stage of the journey.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;How customers are achieving more with Gemini Enterprise Agent Platform&lt;/span&gt;&lt;/h3&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/3_gemini_enterprise_agent_platform.max-1000x1000.jpg"
        
          alt="2- GEAP Logo Wall"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;"Burns &amp;amp; McDonnell uses Agent Platform to transform how organizational knowledge is applied across the enterprise. Using ADK, we are building an AI agent that turns decades of project data into real-time, actionable intelligence. Agent Platform enables this innovation to scale responsibly by combining deterministic business rules with probabilistic reasoning — making AI a trusted operational capability, not just a productivity tool. With Agent Platform, we aren’t just managing knowledge; we are activating experience to drive faster, more confident decisions." &lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;– Matt Olson, Chief Innovation Officer, Burns &amp;amp; McDonnell&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;“Color Health uses Agent Platform to power our Virtual Cancer Clinic, delivering end-to-end care. By building our Color Assistant with the Agent Development Kit (ADK) and scaling it via Agent Runtime, we are helping more women get screened for breast cancer. The Color Assistant engages users to check screening eligibility, connects them to clinicians, and helps schedule appointments. The power of the agent lies in the scale it enables — helping us reach more people and respond to individual risk and eligibility in real time.” &lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;– Jayodita Sanghvi, PhD., Head of AI Platform, Color&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;“By rebuilding Comcast’s Xfinity Assistant with Agent Development Kit (ADK), we’ve moved beyond simple scripted automation to conversational generative intelligence that delivers personalized troubleshooting and self-service support to our customers. Agent Runtime has been a massive accelerator, allowing us to deploy a sophisticated multi-agent architecture that increases digital containment while ensuring secure, grounded interactions via Gemini. We aren't just reducing repeat interactions by solving customers’ issues the first time; we're redefining the customer experience at scale.” &lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;– Rick Rioboli, Chief Technical Officer, Connectivity &amp;amp; Platforms, Comcast&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;“Geotab uses Agent Platform to rapidly accelerate our AI Agent Center of Excellence. Google's Agent Development Kit (ADK) provides the flexibility to orchestrate various frameworks under a single, governable path to production, while offering an exceptional developer experience that dramatically speeds up our build-test-deploy cycle. For Geotab, ADK is the foundation that allows us to rapidly and safely scale our agentic AI solutions across the enterprise.” &lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;– Mike Branch, Vice President, Data &amp;amp; Analytics, Geotab&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;“Gurunavi uses Agent Platform to power 'UMAME!', an AI restaurant discovery app that leverages Memory Bank to achieve a deep understanding of user context. Unlike conventional prompt-based systems, our agent remembers a user's past actions and preferences to proactively present the best options. This eliminates the need for manual searches and creates a seamless experience that will improve user satisfaction by 30% or more. We view this memory function as a non-negotiable feature for the future of new culinary experiences.” &lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;– Toshiaki Iwamoto, CTO, Gurunavi&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;"At L'Oréal, Beauty Tech is not just a support function — it is a powerful catalyst to create the beauty that moves the world. To live up to that ambition, we decided to build our own proprietary Beauty Tech Agentic Platform, powered by Google Cloud. Leveraging Agent Development Kit (ADK), we are leading a fundamental shift: moving from deterministic workflow automation to autonomous, outcome-oriented agent orchestration. Our agents are not locked in a vacuum — through Model Context Protocol (MCP), they are securely connected to our single sources of truth, including our Beauty Tech Data Platform and core operational applications. Google Cloud gives us the resilience, the multi-LLM flexibility, and the enterprise-grade trust framework we need to scale this platform globally, while keeping human oversight at the center."&lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt; – Etienne BERTIN, Group CIO, L'Oréal&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;“Payhawk uses Agent Platform to transform our AI agents from simple task executors into genuine financial assistants. By leveraging Memory Bank, we have moved from stateless interactions to long-term context retention. Our agents now act like dedicated team members, autonomously recalling user-specific constraints and history. For example, our Financial Controller Agent now remembers a user’s habits to auto-submit expenses, reducing submission time by over 50%. This shift allows our agents to anticipate needs based on past behavior rather than just reacting to prompts.”&lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt; – Diyan Bogdanov, Principal Applied AI Engineer, Payhawk&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;"PayPal uses Agent Platform to rapidly build and deploy agents in production. Specifically, we use Agent Development Kit (ADK) and visual tools to inspect agent interactions and manage multi-agent workflows. This provides the step-by-step visibility we need to visualize the flow of intent and payment mandates. Finally, Agent Payment Protocol (AP2) on Agent Platform provides the critical foundation for trusted agent payments, helping our ecosystem accelerate the shipping of secure agent-based commerce experiences." &lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;– Nitin Sharma, Principal Engineer, AI, PayPal&lt;/strong&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Build AI agents&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Build agents quickly and easily by empowering your developers, business users and everyone in between to build and deploy agents at scale.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Build smarter agents, faster&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;A major upgrade to ADK: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;More than six trillion tokens are processed monthly on Gemini models through &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/build/adk"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;ADK&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Unlock more powerful reasoning by organizing agents into a network of sub-agents. This new, graph-based framework allows you to define clear, reliable logic for how agents work together to solve complex problems.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Workspaces are secure-by-design: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Give agents a hardened, sandboxed environment to run bash commands and manage files safely, isolated from your core systems.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Multimodal streaming:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Bring human-like stability to real-time interactions with multimodal support for live audio and video cues.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
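&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the idea of organizing agents into a graph of sub-agents more concrete, here is a minimal Python sketch. It does not use the ADK API: SubAgent, AgentGraph, and the three example agents are hypothetical names chosen for illustration, and each agent is a plain function rather than a model call. The point is only to show how explicit edges between agents give you a predictable hand-off order.&lt;/span&gt;&lt;/p&gt;

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Hypothetical stand-ins for illustration only; these are NOT ADK classes.
@dataclass
class SubAgent:
    name: str
    handle: Callable[[str], str]  # takes the running result, returns a new one


@dataclass
class AgentGraph:
    agents: Dict[str, SubAgent] = field(default_factory=dict)
    edges: Dict[str, str] = field(default_factory=dict)  # node name -> next node name

    def add(self, agent: SubAgent) -> None:
        self.agents[agent.name] = agent

    def connect(self, source: str, target: str) -> None:
        # Reject edges that point at agents the graph does not know about.
        if source not in self.agents or target not in self.agents:
            raise KeyError("both agents must be added before connecting them")
        self.edges[source] = target

    def run(self, start: str, task: str) -> List[str]:
        """Walk the graph from `start` and return the ordered trace of outputs."""
        trace: List[str] = []
        visited = set()
        current = start
        payload = task
        while current is not None:
            if current in visited:
                raise ValueError("cycle detected at " + current)
            visited.add(current)
            payload = self.agents[current].handle(payload)
            trace.append(current + ": " + payload)
            current = self.edges.get(current)  # None ends the walk
        return trace


graph = AgentGraph()
graph.add(SubAgent("researcher", lambda text: "notes on " + text))
graph.add(SubAgent("writer", lambda text: "draft from " + text))
graph.add(SubAgent("reviewer", lambda text: "approved " + text))
graph.connect("researcher", "writer")
graph.connect("writer", "reviewer")

trace = graph.run("researcher", "Q3 churn")
for step in trace:
    print(step)
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because the edges are declared up front, the order of hand-offs is fixed and easy to audit, and the cycle check stops a misconfigured graph from looping forever.&lt;/span&gt;&lt;/p&gt;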
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Connect your agents to the enterprise&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Securely access any system:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Use plug-and-play architecture with Native Ecosystem Integrations to connect agents to your internal data and tools without custom coding.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Automate background operations:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Activate your data in BigQuery and Pub/Sub with Batch &amp;amp; Event-driven agents. This way, you can run massive, asynchronous tasks like content evaluation or data analysis in the background.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Go from idea to production in hours&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Enable AI-driven development: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Give coding agents a programmatic interface to Google’s complete suite of agentic capabilities, so they can build, evaluate, and deploy production-ready agents on your behalf.&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Bringing agent building directly to Agent Studio:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Now, you can move seamlessly from building simple prompts to deploying complex agents in &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/agent-studio/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Studio&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Once you're ready for deep customization, export your logic directly into ADK to continue development in a full-code environment.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Get a head start with pre-built agents: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Access a curated set of agent templates in &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/build/agent-garden"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Garden&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; — including code modernization, financial analysis, economic research, invoice processing, and more — that serve as immediate building blocks for your multi-agent systems.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Scale AI agents&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To move from a proof-of-concept to a live environment, you need a platform that can handle the performance, state, and security requirements of real-world work. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Powering high-performance agent execution&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The latest Agent Runtime: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Our revamped &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/build/runtime"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Runtime&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; delivers sub-second cold starts and allows you to provision new agents in seconds.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Support for multi-day workflows:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; You can now deploy long-running agents that run autonomously for days at a time. This allows your agents to manage complex, multi-step workflows and deep reasoning tasks that require extended persistence, like managing a sales prospecting sequence. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Autonomous action with security-by-design environments:&lt;/strong&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/scale/sandboxes"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/a&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/scale/sandbox/code-execution-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Sandbox&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;provides a hardened environment to safely execute model-generated code and perform computer use tasks like browser-based automation without risk to your host systems.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent-to-agent orchestration:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Agents can seamlessly delegate tasks to one another, with support for complex generative and deterministic orchestration patterns. For critical flows such as compliance, deterministic patterns help ensure your agents follow well-specified paths every time.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Move beyond temporary session data to high-accuracy context&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Personalize interactions:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/scale/memory-bank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Memory Bank&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; dynamically generates and curates long-term memories from conversations. Using new Memory Profiles, agents can recall high-accuracy details with low latency, ensuring context is never lost.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Link AI interactions to your existing records: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Store and manage history using &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/scale/sessions"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Sessions&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. With Custom Session IDs, you can use your own unique identifiers to track sessions and map them directly to your internal database and CRM records.&lt;/span&gt;&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Enable real-time, human-like interactions:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Using the WebSocket protocol for Bidirectional Streaming, you can help ensure your agents are highly responsive during live customer or employee interactions, processing audio and video without lag.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
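&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a rough illustration of mapping sessions to your own records with custom session IDs, the Python sketch below derives a stable session ID from a CRM record ID and a channel. Everything here is an assumption made for the example: session_id_for, SessionStore, and the ACCT-1042 record are hypothetical, the store is an in-memory dictionary, and none of it reflects the actual Agent Sessions API.&lt;/span&gt;&lt;/p&gt;

```python
import hashlib
from typing import Dict, List


def session_id_for(crm_record_id: str, channel: str) -> str:
    """Derive a stable, opaque session ID from identifiers you already own."""
    raw = (crm_record_id + ":" + channel).encode("utf-8")
    return "crm-" + hashlib.sha256(raw).hexdigest()[:16]


class SessionStore:
    """In-memory stand-in for a session service (hypothetical, illustration only)."""

    def __init__(self) -> None:
        self._events: Dict[str, List[str]] = {}
        self._crm_index: Dict[str, str] = {}  # session ID -> CRM record ID

    def append(self, crm_record_id: str, channel: str, event: str) -> str:
        # The same record and channel always land in the same session.
        sid = session_id_for(crm_record_id, channel)
        self._events.setdefault(sid, []).append(event)
        self._crm_index[sid] = crm_record_id
        return sid

    def history(self, sid: str) -> List[str]:
        return list(self._events.get(sid, []))

    def crm_record(self, sid: str) -> str:
        return self._crm_index[sid]


store = SessionStore()
sid = store.append("ACCT-1042", "chat", "user: where is my order?")
store.append("ACCT-1042", "chat", "agent: it ships tomorrow")

print(sid, store.crm_record(sid), store.history(sid))
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Hashing the record ID is one design choice among several: it keeps raw CRM identifiers out of logs while still letting you recompute the session ID from data you already hold. Using the CRM ID directly would also work if exposure is not a concern.&lt;/span&gt;&lt;/p&gt;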
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Govern AI agents&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Govern with a secure-by-design architecture that applies enterprise rigor to every agent in your fleet – from the ones you build on Agent Platform to the ones you source from our partner ecosystem.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Manage all of your agents through a single source of truth for identity and access.&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Assign every agent a verifiable identity: &lt;/strong&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/scale/runtime/agent-identity"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Identity&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; improves the security posture of your agents by ensuring every agent receives a unique cryptographic ID. This creates a clear, auditable trail for every action an agent takes, mapped back to defined &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/scale/runtime/agent-identity"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;authorization policies&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Maintain a central library of approved tools:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Our new &lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/govern/agent-registry"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Registry&lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; provides a single source of truth for your enterprise. It indexes every internal agent, tool, and skill, simplifying discovery and ensuring only governed, approved assets are available to your users.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Manage your agent fleet from one control point:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/govern/gateways/agent-gateway-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Gateway&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; acts as the air traffic control for your agent ecosystem. It provides secure, unified connectivity between agents and tools across any environment, while enforcing consistent security policies and &lt;/span&gt;&lt;a href="https://cloud.google.com/security/products/model-armor?e=48754805&amp;amp;hl=en"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Model Armor&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; protections to safeguard against prompt injection and data leakage.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
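&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The pattern described above, a registry of approved tools plus a single control point that enforces policy and records who did what, can be sketched in a few lines of Python. This is a simplified model, not the Agent Registry or Agent Gateway API: RegisteredTool, ToolRegistry, Gateway, and the example tool and agent names are all hypothetical.&lt;/span&gt;&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


# Hypothetical types for illustration only; not part of any Google Cloud SDK.
@dataclass(frozen=True)
class RegisteredTool:
    name: str
    approved: bool
    call: Callable[[str], str]


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: Dict[str, RegisteredTool] = {}

    def register(self, tool: RegisteredTool) -> None:
        self._tools[tool.name] = tool

    def get(self, name: str) -> Optional[RegisteredTool]:
        return self._tools.get(name)


class Gateway:
    """Single choke point: every tool call is checked and logged here."""

    def __init__(self, registry: ToolRegistry) -> None:
        self._registry = registry
        self.audit_log: List[Tuple[str, str, str]] = []

    def invoke(self, agent_id: str, tool_name: str, argument: str) -> str:
        tool = self._registry.get(tool_name)
        allowed = tool is not None and tool.approved
        # Record the decision before acting, so denied calls are auditable too.
        self.audit_log.append((agent_id, tool_name, "allowed" if allowed else "denied"))
        if not allowed:
            raise PermissionError(tool_name + " is not an approved tool")
        return tool.call(argument)


registry = ToolRegistry()
registry.register(RegisteredTool("lookup_invoice", True, lambda arg: "invoice " + arg))
registry.register(RegisteredTool("delete_records", False, lambda arg: "deleted " + arg))

gateway = Gateway(registry)
result = gateway.invoke("billing-agent", "lookup_invoice", "INV-7")
print(result)
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Routing every call through one object means unknown and unapproved tools are rejected the same way, and the audit log captures both allowed and denied attempts per agent.&lt;/span&gt;&lt;/p&gt;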
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Use AI-powered insights to detect hidden risks and suspicious behavior before they impact your business.&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Detect suspicious behavior in real-time:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Agent Anomaly Detection uses statistical models and an LLM-as-a-judge framework to flag unusual reasoning. This works alongside &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/govern/view-security-findings"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Threat Detection&lt;/span&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;to provide visibility into malicious activity, such as reverse shells or connections to known bad IP addresses.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Uncover vulnerabilities automatically:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; A new &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/govern/view-security-findings"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Security&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; dashboard, powered by &lt;/span&gt;&lt;a href="https://cloud.google.com/security/products/security-command-center"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Security Command Center&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, unifies threat detection and risk analysis. It allows your teams to map relationships between agents and models, automate asset discovery, and scan for vulnerabilities in the underlying operating system and language packages.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Optimize AI agents &lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Agent Platform gives you the visibility needed to understand how your agents are performing, making it easier to refine their logic so they get smarter over time.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Test your agents before they ship&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Simulate realistic conversations: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Use &lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/optimize/evaluation/evaluate-simulated"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Simulation&lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; to test agents against human-like synthetic user interactions and virtualized tools in a controlled environment. Agents are automatically scored based on task success and safety across multi-step conversations.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Monitor and improve in production&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Track live performance: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Use &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/optimize/evaluation/agent-evaluation"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Evaluation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to continuously score agents against live traffic using multi-turn autoraters that can evaluate the logic of an entire conversation, not just a single response. With turnkey dashboards and &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/optimize/observability/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Observability&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, you can visually trace complex reasoning to debug issues as they happen.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Automate agent refinement: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Instead of manually digging through logs, &lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/optimize/evaluation/optimize-agent"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Optimizer&lt;/span&gt;&lt;/a&gt; &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;automatically clusters real-world failures and suggests refined system instructions to improve accuracy.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Detailed technical guides and a full list of updates are available in our updated &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;documentation&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;and &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/release-notes"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;release notes&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;a href="http://cloud.google.com/products/gemini-enterprise-agent-platform"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Platform &lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;is the new standard for enterprise agent development, built to help you move from experimentation to production-scale impact, starting today.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 22 Apr 2026 12:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/ai-machine-learning/introducing-gemini-enterprise-agent-platform/</guid><category>Developers &amp; Practitioners</category><category>Google Cloud Next</category><category>AI &amp; Machine Learning</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/0_gemini_enterprise_agent_platform.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Introducing Gemini Enterprise Agent Platform, powering the next wave of 
agents</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/0_gemini_enterprise_agent_platform.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/products/ai-machine-learning/introducing-gemini-enterprise-agent-platform/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Michael Gerstenhaber</name><title>VP, Product Management, Cloud AI</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Michael Bachman</name><title>VP/GM, Cloud Foundations</title><department></department><company></company></author></item><item><title>Next ‘26: Redefining security for the AI era with Google Cloud and Wiz</title><link>https://cloud.google.com/blog/products/identity-security/next26-redefining-security-for-the-ai-era-with-google-cloud-and-wiz/</link><description>&lt;div class="block-aside"&gt;&lt;dl&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The AI era demands a new security era. Organizations are facing the dual challenge of harnessing the potential of AI while &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/threat-intelligence/defending-enterprise-ai-vulnerabilities?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;defending against its malicious use&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and Google Cloud can help you adapt and thrive.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The latest research from Google Cloud shows that adversaries are using AI to &lt;/span&gt;&lt;a href="https://cloud.google.com/transform/new-mandiant-report-boost-basics-with-ai-to-counter-adversaries/"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;accelerate the speed, scale, and sophistication of attacks&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;a href="https://cloud.google.com/security/resources/m-trends?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;M-Trends 2026&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; also shows that increased threat actor coordination has cut the time between initial access and hand-off to a secondary threat actor from eight hours to 22 seconds over the last three years.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today at Google Cloud Next, we are showcasing how Google Cloud can help you defend against increasingly sophisticated threats at machine speed, protect AI and multicloud environments, and secure cloud workloads at scale. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Delivering agentic defense &lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our full-stack AI approach, from the chips to the models, gives you a competitive advantage through tighter integration and greater velocity. Not only do we act on insights from the world’s largest threat observatory and Mandiant frontline experts, but we also bring cutting-edge insights and breakthroughs from Google DeepMind to help make your platforms more secure. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today we are introducing three new agents in &lt;/span&gt;&lt;a href="https://cloud.google.com/security/products/security-operations"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Google Security Operations&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to help you defend at the speed of AI. &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Threat Hunting agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, now in preview, can help teams proactively hunt for novel attack patterns and stealthy adversary behaviors that bypass traditional defenses. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Detection Engineering agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, now in preview, can identify coverage gaps and create new detections for threat scenarios, reducing toil and transforming detection creation from a manual craft into an automated science. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Third-Party Context agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, coming soon to preview, can enrich your workflows with contextual data from third-party content. &lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/1_-_Threat_Hunt_Initiation.gif"
        
          alt="1 - Threat Hunt Initiation"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="mhwgf"&gt;Initiating a threat hunt with the Threat Hunting agent&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Triage and Investigation agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; processed over &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;5 million alerts&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; in the last year, reducing a typical 30-minute manual analysis to 60 seconds with Gemini.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;“Operational resilience and cybersecurity are the bedrock of customer trust at BBVA. By integrating advanced artificial intelligence, such as the Triage and Investigation agent, we are able to scale in new ways,” said Diego Martinez Blanco, head of Security Technology, BBVA. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;“It handles the initial heavy lifting and filters out false positives so we can prioritize issues that require human attention. The agent's transparent explanations allow our team to understand recommendations and ultimately dedicate our resources to more complex investigations,” he said.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can build your own security agents with &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;remote Google Cloud Model Context Protocol (MCP) server support for Google Security Operations&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, now generally available. To make it even easier, you can also access the MCP server client directly from the Google Security Operations chat interface, available in preview. &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-pull_quote"&gt;&lt;div class="uni-pull-quote h-c-page"&gt;
  &lt;section class="h-c-grid"&gt;
    &lt;div class="uni-pull-quote__wrapper h-c-grid__col h-c-grid__col--8 h-c-grid__col-m--6 h-c-grid__col-l--6
      h-c-grid__col--offset-2 h-c-grid__col-m--offset-3 h-c-grid__col-l--offset-3"&gt;
      &lt;div class="uni-pull-quote__inner-wrapper h-c-copy h-c-copy"&gt;
        &lt;q class="uni-pull-quote__text"&gt;Organizations leveraging an intelligence-led, AI-augmented approach to modern security operations with Google Cloud&amp;#x27;s agentic defense can realize a strong ROI.&lt;/q&gt;

        
          &lt;cite class="uni-pull-quote__author"&gt;
            
            
              &lt;span class="uni-pull-quote__author-meta"&gt;
                
                  &lt;strong class="h-u-font-weight-medium"&gt;Christopher Kissel&lt;/strong&gt;&lt;br /&gt;
                
                
                  Research Vice President, IDC
                
              &lt;/span&gt;
            
          &lt;/cite&gt;
        
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/section&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/2_-_Threat_Hunt_report.gif"
        
          alt="2 - Threat Hunt report"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="mhwgf"&gt;Findings report created by the Threat Hunting agent&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Security teams can also automate response actions with &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/rsac-26-supercharging-agentic-ai-defense-with-frontline-threat-intelligence"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;agentic automation&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;in Google Security Operations. To further move teams from manual triage to agentic defense, we introduced &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/bringing-dark-web-intelligence-into-the-ai-era"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;dark web intelligence&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in Google Threat Intelligence, now in preview. Internal tests show it can &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;analyze millions of daily external events with 98% accuracy&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to elevate threats that truly matter.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;"IDC found that organizations experienced measurable operational gains, including substantial reductions in mean time to detect and mean time to respond, fewer false positives, and higher analyst productivity with AI-powered context and automation. These operational improvements translate into significant &lt;/span&gt;&lt;a href="https://services.google.com/fh/files/misc/gti_idc_business_value_report.pdf" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;business outcomes&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, such as shorter disruption periods, lower incident-related costs, and improved executive confidence in security posture and decision-making," said Christopher Kissel, research vice president, IDC. "Organizations leveraging an intelligence-led, AI-augmented approach to modern security operations with Google Cloud's agentic defense can realize a strong ROI." &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;New partner-supported workflows for Google Security Operations&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today, we are also announcing a robust cohort of &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/next26-announcing-new-partner-supported-workflows-for-google-security-operations"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;new partner integrations&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for Google Security Operations. Designed to deliver high-fidelity security workflows right out of the box, our latest participating Google Cloud Security integration ecosystem partners include Darktrace, Gigamon, and SAP.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Protecting AI and cloud applications across any infrastructure&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AI and cloud applications are built across multiple platforms and models. To protect them end-to-end, we want to make it easier and faster to mitigate risk, regardless of where and how you build. This support includes major cloud environments like Amazon Web Services, Google Cloud, Microsoft Azure, and Oracle Cloud; software-as-a-service (SaaS) environments like OpenAI; and even custom hosted environments. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/google-completes-acquisition-of-wiz?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Wiz, now a part of Google Cloud&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, expands and deepens our ability to protect the apps you build and run. Wiz empowers you to quickly and securely adopt AI, while also helping protect the AI development lifecycle. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Wiz announced its &lt;/span&gt;&lt;a href="https://www.wiz.io/blog/introducing-wiz-ai-app" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;AI-Application Protection Platform&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (AI-APP) at the RSA Conference, providing deep visibility, risk posture, and runtime analysis for your AI applications. Wiz also announced &lt;/span&gt;&lt;a href="https://www.wiz.io/blog/introducing-wiz-agents" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Wiz Security Agents&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://www.wiz.io/blog/introducing-wiz-workflows" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Wiz Workflows&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, helping you identify and respond to risks and threats at machine speed.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today, we’re taking our commitment to secure customers in any cloud, platform, and AI environment further. Wiz now &lt;/span&gt;&lt;a href="https://www.wiz.io/blog/wiz-databricks-security-graph" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;supports Databricks&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; as well as new agent studios like AWS Agentcore, Gemini Enterprise Agent Platform, Microsoft Azure Copilot Studio, and Salesforce Agentforce, so customers gain visibility however their teams choose to build.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In addition, Wiz continues to support security ecosystems with integrations to the outer layer of the cloud, including &lt;/span&gt;&lt;a href="http://wiz.io/blog/wiz-apigee-integration-for-api-discovery" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Apigee&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://www.cloudflare.com/press/press-releases/2026/cloudflare-partners-with-wiz-to-secure-the-global-ai-attack-surface/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloudflare AI Security for Apps&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and the &lt;/span&gt;&lt;a href="https://www.wiz.io/blog/introducing-wiz-vercel-integration" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Vercel platform&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, further extending the power of the Wiz Security Graph. We’ve also updated how we integrate security detections from Wiz Defend with Google Security Operations and Mandiant Threat Defense to help analysts more easily configure automatic threat information forwarding.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Wiz is also announcing new capabilities designed to secure the AI-native development lifecycle, helping teams innovate faster and more securely:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Secure vibe-coded applications: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Wiz is announcing a new integration, generally available in May, that runs Wiz security scanning directly inside the Lovable platform so vulnerabilities, secrets, and misconfigurations caught by Wiz surface in Lovable's built-in security view, right where teams are already building.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Secure AI-generated code&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Wiz removes risks from AI-generated code the moment it is created. Inline AI security hooks integrate directly into IDEs and agent workflows to evaluate prompts and scan AI-generated output instantly, injecting security guardrails before the code is ever committed.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent-based remediation&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Wiz Skills equip coding agents and AI-native IDEs with full code-to-cloud context and validated attack surface findings from the Wiz Security Graph. These capabilities enable teams to trigger automated, agent-driven remediation workflows either locally from the developer's individual IDE or globally at the repository and pull request level within your version control system.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Eliminate shadow AI&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Wiz’s dynamic &lt;/span&gt;&lt;a href="https://www.wiz.io/academy/ai-security/ai-bom-ai-bill-of-materials" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;AI-Bill of Materials&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (AI-BOM)&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; automatically inventories all AI frameworks, models, and IDE extensions across your environment. This provides complete visibility into what is writing code across your stack, allowing you to track sanctioned corporate tools like Gemini Code Assist and GitHub Copilot while simultaneously uncovering unapproved shadow AI plugins.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can learn more about the &lt;/span&gt;&lt;a href="https://wiz.io/blog/wiz-at-google-cloud-next" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Wiz announcements here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Securing your agents and the agentic web&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In addition to securing your cloud and AI workloads, Google Cloud’s secure-by-design foundation can help you innovate at the speed of AI — from agents to fraud defense to the web.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Securing and governing agents with the Gemini Enterprise Agent Platform&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;To build, orchestrate, govern, and optimize agents&lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;today we are announcing &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/introducing-gemini-enterprise-agent-platform"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Enterprise Agent Platform&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; including:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Identity&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to enable access management and &lt;/span&gt;&lt;a href="https://cloud.google.com/transform/these-4-ai-governance-tips-help-counter-shadow-agents"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;AI governance at scale&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Our new&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;capability provides agents unique identities to operate autonomously with specific authentication flows, and with scoped human delegation.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent Gateway, &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;which&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;enables policy enforcement for all agent-to-agent and agent-to-tool connections. It governs your enterprise agent traffic and understands agent protocols like MCP and Agent2Agent (A2A) to inspect and secure every agent interaction.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Model Armor&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;,&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;our runtime protection for model and agent interactions, now integrates with Agent Gateway, &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/agent-builder/agent-engine/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Runtime&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/model-armor/model-armor-langchain-integration"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;LangChain&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, available in preview, and with &lt;/span&gt;&lt;a href="https://firebase.google.com/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Firebase&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, generally available, to help developers add inline enforcement and sanitization of agent traffic and interactions without changing code. These integrations expand Model Armor's protection against runtime risks such as prompt injections, tool poisoning, and sensitive data leakage across &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/model-armor/integrations"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud services and our AI portfolio&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Securing the agentic web with Google Cloud Fraud Defense and Chrome Enterprise&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Today, we are evolving reCAPTCHA with the &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/introducing-google-cloud-fraud-defense-the-next-evolution-of-recaptcha"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;launch of &lt;/span&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Fraud Defense&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, generally available. This comprehensive platform is designed to discern the legitimacy and authorization of bots, humans, and agents. Using the same scale and signals that protect Google’s own ecosystem, Fraud Defense will soon offer, in preview, agent-specific capabilities that can help secure the digital commerce journey for human users and AI agents, from account creation and login to payment and checkout.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our commitment to securing AI extends to the browser, a vital endpoint for interacting with AI. &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/chrome-enterprise/new-ways-to-navigate-the-ai-era-with-googles-enterprise-platforms-and-devices"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Chrome Enterprise&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; provides comprehensive data protection for the AI era with the visibility and controls needed to embrace AI safely without compromising corporate data:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;AI-aware extension threat detections&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, now in preview, can surface advanced extension telemetry that helps security teams detect and respond to anomalous AI agent activity. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;New &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;shadow AI reporting&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, generally available soon, can help you gain visibility into the shadow AI landscape by flagging employee use of unsanctioned web-based AI and SaaS applications. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;What’s new in Trusted Cloud&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We continue to offer new security controls and enhance capabilities across identity, data, and networking on our cloud platform to help you secure your environments. Today we’re announcing the following updates:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Simplifying permissions with modern IAM&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;To help achieve least privilege quickly and simply, we’ve streamlined our predefined roles catalog with easy-to-use administrator, editor, and viewer roles, alongside capabilities such as the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/iam/docs/role-picker-gemini"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;IAM role picker&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and the ability to &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/docs/authentication/reauthentication"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;re-authenticate sensitive actions&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Data security&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;We are announcing several new capabilities for our cloud platform data security portfolio to help protect your most sensitive data and accelerate AI transformation.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Confidential Computing&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: In partnership with NVIDIA, today we’re announcing &lt;/span&gt;&lt;a href="https://cloud.google.com/security/products/confidential-computing"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Confidential Computing&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; support for G4 VMs&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, featuring NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs on Google Compute Engine (GCE) Confidential G4 VMs, available in preview globally, to help strengthen confidentiality and integrity for a wide spectrum of sensitive AI workloads. In partnership with Intel, we’re also introducing the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;preview of C4 Confidential VMs&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, bringing Intel TDX to 6th Gen Xeon processors to help protect diverse AI and &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/compute/c4-vms-based-on-intel-6th-gen-xeon-granite-rapids-now-ga"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;analytics workloads&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; while providing industry-leading compute density and performance.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Key Management Services (KMS)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: We are announcing the new &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Confidential External Key Manager (cEKM)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; in preview, giving you the flexibility to host and protect external keys in any region and maintain verifiable control within a confidential environment.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Post-quantum cryptography (PQC)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: We are introducing &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;KMS Quantum Safe Key Imports&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, available&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;in preview, to help you bring your own keys with quantum-safe algorithms. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Secret Manager&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: To help prevent password leaks and mitigate prompt injection risks, we are announcing the general availability of the native integration of our &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Secret Manager with Agent Development Kit&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Network security &lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Google Cloud’s Cross-Cloud Network security products offer several new capabilities:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud NGFW: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;We’re announcing the &lt;/span&gt;&lt;a href="https://cloud.google.com/security/products/firewall?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud NGFW&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;advanced malware sandbox&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, in preview later this year, to help defend against highly evasive zero-day threats. This capability is powered by &lt;/span&gt;&lt;a href="https://www.paloaltonetworks.com/apps/pan/public/downloadResource?pagePath=/content/pan/en_US/resources/datasheets/advanced-wildfire" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Palo Alto Networks Advanced Wildfire&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, trained on data from &lt;/span&gt;&lt;a href="https://www.paloaltonetworks.com/apps/pan/public/downloadResource?pagePath=/content/pan/en_US/resources/datasheets/advanced-wildfire" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;more than 70,000 Palo Alto Networks customers to stop 99% of known and unknown malware&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Armor: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;We have released new &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/armor/docs"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Armor&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; managed rules, powered by Thales Imperva&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;and&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;available in preview, to detect Layer 7 application attacks and zero-day CVEs (like &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/identity-security/responding-to-cve-2025-55182"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;React2Shell&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;). &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Advancing Google Cloud security with SCC&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;As our Google Cloud-native security solution, Security Command Center (SCC) establishes a cloud security baseline to protect both your traditional and AI applications on Google Cloud:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;AI agents, models, and MCP servers are secured by providing continuous discovery and comprehensive risk analysis to identify threats, vulnerabilities, and misconfigurations.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;SCC will add deep runtime visibility to uncover shadow AI for your Google Cloud workloads. Coming soon in preview, SCC will automatically discover unmanaged agentic workloads — including agents, MCP servers hosted on Cloud Run, GKE, and inference endpoints running on GKE, and surface those as posture findings in SCC.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Our enhanced &lt;/span&gt;&lt;a href="https://cloud.google.com/security/products/security-command-center?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Security Command Center Standard tier&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; provides data security posture management, compliance, vulnerability management, and risk analysis to help any Google Cloud customer establish strong security, compliance and risk coverage from the start at no additional costs. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Take the next step&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When you make Google part of your security team, you gain the power of an intelligence-driven, AI-native defense; the freedom of an open cloud that’s secure-by-design; and the industry's most-battle tested experts as an extension of your organization. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For more on these new innovations and how you can secure what’s next, &lt;/span&gt;&lt;a href="https://www.googlecloudevents.com/next-vegas/session-library?session_id=3818847&amp;amp;name=secure-what&amp;amp;" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;tune in to watch our security spotlight&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. And be sure to check out the many great security breakout sessions — live and on-demand — to learn more about all of our Next ‘26 announcements.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 22 Apr 2026 12:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/identity-security/next26-redefining-security-for-the-ai-era-with-google-cloud-and-wiz/</guid><category>AI &amp; Machine Learning</category><category>Networking</category><category>Developers &amp; Practitioners</category><category>Google Cloud Next</category><category>Security &amp; Identity</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/GCN26_102_BlogHeader_2436x1200_Opt_3_Dark.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Next ‘26: Redefining security for the AI era with Google Cloud and Wiz</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/GCN26_102_BlogHeader_2436x1200_Opt_3_Dark.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/products/identity-security/next26-redefining-security-for-the-ai-era-with-google-cloud-and-wiz/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Francis deSouza</name><title>COO, Google Cloud and President, Security Products</title><department></department><company></company></author></item></channel></rss>