<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:media="http://search.yahoo.com/mrss/"><channel><title>Networking</title><link>https://cloud.google.com/blog/products/networking/</link><description>Networking</description><atom:link href="https://cloudblog.withgoogle.com/blog/products/networking/rss/" rel="self"></atom:link><language>en</language><lastBuildDate>Fri, 10 Apr 2026 16:00:09 +0000</lastBuildDate><image><url>https://cloud.google.com/blog/products/networking/static/blog/images/google.a51985becaa6.png</url><title>Networking</title><link>https://cloud.google.com/blog/products/networking/</link></image><item><title>Migrating to Google Cloud’s Application Load Balancer: A practical guide</title><link>https://cloud.google.com/blog/products/networking/migrate-on-prem-application-load-balancing-to-google-cloud/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Migrating your existing application load balancer infrastructure from an on-premises hardware solution to Cloud Load Balancing offers substantial advantages in scalability, cost-efficiency, and tight integration within the Google Cloud ecosystem. Yet, a fundamental question often arises: "What about our current load balancer configurations?"&lt;/span&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Existing on-premises load balancer configurations often contain years of business-critical logic for traffic manipulation. The good news is that not only can you fully migrate existing functionalities, but this migration also presents a significant opportunity to modernize and simplify your traffic management.&lt;/span&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;This guide outlines a practical approach for migrating your existing load balancer to Google Cloud’s Application Load Balancer. It addresses common functionalities, leveraging both its declarative configurations and the innovative, event-driven Service Extensions edge compute capability.&lt;/span&gt;&lt;/p&gt;
&lt;h3 style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;A simple, phased approach to migration&lt;/span&gt;&lt;/h3&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Transitioning from an imperative, script-based system to a cloud-native, declarative-first model requires a structured plan. We recommend a straightforward, four-phase approach.&lt;/span&gt;&lt;/p&gt;
&lt;h4 style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Phase 1: Discovery and mapping&lt;/span&gt;&lt;/h4&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Before commencing any migration, you must understand what you have. Analyze and categorize your current load balancer configurations. What is each rule's intent? Is it performing a simple HTTP-to-HTTPS redirect? Is it engaged in HTTP header manipulation (addition or removal)? Or is it handling complex, custom authentication logic? &lt;/span&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Most configurations typically fall into two primary categories:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="2" style="list-style-type: circle; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Common patterns:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Logic that is common to most web applications, such as redirects, URL rewrites, basic header manipulation, and IP-based access control lists (ACLs).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="2" style="list-style-type: circle; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Bespoke business logic:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Complex logic unique to your application, like custom proprietary token authentication, advanced header extraction / replacement, dynamic backend selection based on HTTP attributes, or HTTP response body manipulation. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Phase 2: Choose your Google Cloud equivalent&lt;/span&gt;&lt;/h4&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Once your rules are categorized, the next step involves mapping them to the appropriate Google Cloud feature. This is not a one-to-one replacement; it's a strategic choice.&lt;/span&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Option 1: The declarative path (for ~80% of rules)&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;For the majority of common patterns, leveraging the Application Load Balancer's built-in declarative features is usually the best approach. Instead of a script, you define the desired state in a configuration file. This is simpler to manage, version-control, and scale.&lt;/span&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Common patterns to declarative feature mapping:  &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="3" style="list-style-type: square; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Redirects/rewrites&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; -&amp;gt; &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Application Load Balancer URL maps&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="3" style="list-style-type: square; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;ACLs/throttling&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; -&amp;gt; &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Google Cloud Armor security policies&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="3" style="list-style-type: square; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Session persistence&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; -&amp;gt; &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;backend service configuration&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Option 2: The programmatic path (for complex, bespoke rules)&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;When dealing with complex, bespoke business logic, you have a programmatic equivalent: &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/service-extensions/docs/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Service Extensions&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, a powerful edge compute capability that allows you to inject custom code (written in Rust, C++ or Go) directly into the load balancer's data path. This approach gives you flexibility in a modern, managed, and high-performance framework.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/image1_bkebSe1.max-1000x1000.jpg"
        
          alt="image1"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="s1mli"&gt;This flowchart helps you decide the appropriate Google Cloud feature for each configuration&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
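&lt;div class="block-paragraph_advanced"&gt;&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;The mapping in Phase 2 can be sketched in a few lines of Python. This is only an illustration of the decision, not a migration tool: the rule names and intent labels are hypothetical, and any intent that is not in the declarative table falls through to Service Extensions so that it gets a manual review.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;pre&gt;
```python
from dataclasses import dataclass

# Declarative mapping taken from the list in Phase 2.
# The intent keys are hypothetical labels chosen for this sketch.
DECLARATIVE_FEATURES = {
    "redirect": "Application Load Balancer URL map",
    "rewrite": "Application Load Balancer URL map",
    "acl": "Google Cloud Armor security policy",
    "throttling": "Google Cloud Armor security policy",
    "session_persistence": "Backend service configuration",
}


@dataclass(frozen=True)
class Rule:
    name: str    # hypothetical rule name from the legacy load balancer
    intent: str  # what the rule does, as recorded during discovery


def choose_feature(rule):
    """Return the declarative feature for a common pattern, else Service Extensions."""
    return DECLARATIVE_FEATURES.get(rule.intent, "Service Extensions")


def plan(rules):
    """Group rule names by the Google Cloud feature chosen for them."""
    grouped = {}
    for rule in rules:
        grouped.setdefault(choose_feature(rule), []).append(rule.name)
    return grouped


if __name__ == "__main__":
    inventory = [
        Rule("force-https", "redirect"),
        Rule("block-bad-ips", "acl"),
        Rule("sticky-cart", "session_persistence"),
        Rule("legacy-token-auth", "custom_auth"),
    ]
    for feature, names in plan(inventory).items():
        print(f"{feature}: {', '.join(names)}")
```
&lt;/pre&gt;&lt;/div&gt;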
&lt;div class="block-paragraph_advanced"&gt;&lt;h4 style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Phase 3: Test and validate&lt;/span&gt;&lt;/h4&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Once you’ve chosen the appropriate path for your configurations, you are ready to &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;deploy your new Application Load Balancer configuration in a staging environment that mirrors your production setup. Thoroughly test all application functionality, paying close attention to the migrated logic. Use a combination of automated testing and manual QA to validate that the redirects, security policies, and custom Service Extensions logic behave as expected.&lt;/span&gt;&lt;/p&gt;
&lt;h4 style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Phase 4: Phased cutover (canary deployment)&lt;/span&gt;&lt;/h4&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Don't flip a single switch for all your traffic; instead, implement a phased migration strategy. Start by routing a small percentage of production traffic (e.g., 5-10%) to your new Google Cloud load balancer. During this initial period, monitor key metrics like latency, error rates, and application performance. As you gain confidence, progressively increase the percentage of traffic routed to the Application Load Balancer. Always have a clear rollback plan to revert to the legacy infrastructure if you encounter critical issues.&lt;/span&gt;&lt;/p&gt;
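&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;One way to make the ramp-up and rollback rule explicit is a small helper like the sketch below. The step sizes after the initial 5-10%, the error-rate and latency thresholds, and the function name are all assumptions made for illustration; pick values that match your own service-level objectives.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;
```python
def next_traffic_percent(current, error_rate, p95_latency_ms,
                         max_error_rate=0.01, max_latency_ms=500.0,
                         steps=(5, 10, 25, 50, 100)):
    """Return the next share of traffic (in percent) for the new load balancer.

    Returns 0 (roll back to the legacy load balancer) when either threshold
    is breached. Otherwise advances to the next step, capped at the last one.
    All thresholds and steps are illustrative assumptions.
    """
    if error_rate > max_error_rate or p95_latency_ms > max_latency_ms:
        return 0
    for step in steps:
        if step > current:
            return step
    return steps[-1]


if __name__ == "__main__":
    # Healthy canary at 5%: move on to the next step.
    print(next_traffic_percent(5, error_rate=0.001, p95_latency_ms=120.0))
    # Error rate above the threshold at 10%: roll back.
    print(next_traffic_percent(10, error_rate=0.05, p95_latency_ms=120.0))
```
&lt;/pre&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Keeping the decision in one function means the rollback condition is written down before the cutover starts, rather than decided under pressure.&lt;/span&gt;&lt;/p&gt;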
&lt;h3 style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Best practices for a smooth migration&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Drawing from our practical experience, we have compiled the following recommendations to assist you in planning your load balancer migrations. &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Analyze first, migrate second:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; A thorough analysis of your existing configurations is the most critical step. Don't "lift and shift" logic that is no longer needed.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Prefer declarative:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Always default to Google Cloud's managed, declarative features (URL Maps, Cloud Armor) first. They are simpler, more scalable, and require less maintenance.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Use Service Extensions strategically:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Reserve Service Extensions for the complex, bespoke business logic that declarative features cannot handle.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Monitor everything:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Continuously monitor both your existing load balancers and Google Cloud load balancers during the migration. Watch key metrics like traffic volume, latency, and error rates to detect and address issues quickly.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Train your team:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Ensure your team is trained on Cloud Load Balancing concepts. This will empower them to effectively operate and maintain the new infrastructure.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Migrating from existing on-premises load balancer infrastructure is more than just a technical task; it's an opportunity to modernize your application delivery. By thoughtfully mapping your current load balancing configurations and capabilities to either declarative Application Load Balancer features or programmatic Service Extensions, you can build a more scalable, resilient, and cost-effective infrastructure that is ready for future demands.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To get started, review the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/load-balancing/docs/application-load-balancer"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Application Load Balancer&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/service-extensions/docs/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Service Extensions&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; features and advanced capabilities to come up with the right design for your application. For more guidance and complex use cases, contact your &lt;/span&gt;&lt;a href="https://cloud.google.com/contact"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud team&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Fri, 10 Apr 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/networking/migrate-on-prem-application-load-balancing-to-google-cloud/</guid><category>Cloud Migration</category><category>Developers &amp; Practitioners</category><category>Networking</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Migrating to Google Cloud’s Application Load Balancer: A practical guide</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/networking/migrate-on-prem-application-load-balancing-to-google-cloud/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Gopinath Balakrishnan</name><title>Customer Engineer, Google Cloud</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Xiaozang Li</name><title>Customer Engineer, Google 
Cloud</title><department></department><company></company></author></item><item><title>Experimenting with GPUs: GKE managed DRANET and Inference Gateway AI Deployment</title><link>https://cloud.google.com/blog/topics/developers-practitioners/experimenting-with-gpus-gke-managed-dranet-and-inference-gateway-ai-deployment/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Building and serving models on infrastructure is a strong use case for businesses. In Google Cloud, you can design your AI infrastructure to suit your workloads. Recently, I experimented with Google Kubernetes Engine &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/allocate-network-resources-dra"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;(GKE) managed DRANET&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; while deploying a model for inference with NVIDIA B200 GPUs on GKE. This post walks through that setup in easy-to-follow steps.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;What is DRANET?&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;a href="https://kubernetes.io/docs/concepts/scheduling-eviction/dynamic-resource-allocation/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Dynamic Resource Allocation (DRA)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is a feature that lets you request and share resources among Pods. DRANET allows you to request and allocate networking resources for your Pods, including network interfaces that support TPUs and Remote Direct Memory Access (RDMA). In my case, that meant RDMA networking for high-end GPUs.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;How the GPU RDMA VPC works&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/vpc/docs/rdma-network-profiles#overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;RDMA network&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is set up as an isolated VPC, which is regional and assigned a network profile type. In this case, the network profile type is RoCEv2. This VPC is dedicated to GPU-to-GPU communication. The GPU VM families have RDMA-capable NICs that connect to the RDMA VPC. GPUs on different nodes communicate over this low-latency, high-speed, rail-aligned setup.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Design pattern example&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our aim was to deploy an LLM (DeepSeek) onto a GKE cluster with &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/compute/docs/accelerator-optimized-machines#a4-vms"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;A4 nodes&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; that support 8 B200 GPUs and serve it privately via &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-gke-inference-gateway"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GKE Inference Gateway&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. To set up an &lt;a href="https://docs.cloud.google.com/ai-hypercomputer/docs/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;AI Hypercomputer&lt;/span&gt;&lt;/a&gt; GKE cluster, you can use the Cluster Toolkit, but in my case, I wanted to test how &lt;span style="vertical-align: baseline;"&gt;GKE managed &lt;/span&gt;DRANET dynamically sets up the networking that supports RDMA for GPU communication.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1-archgpu.max-1000x1000.png"
        
          alt="1-archgpu"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This design utilizes the following services to provide an end-to-end solution:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;VPC:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Three VPCs in total: one created manually and two created automatically by &lt;span style="vertical-align: baseline;"&gt;GKE managed &lt;/span&gt;DRANET (one standard and one for RDMA).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;GKE:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; To deploy the workload.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;GKE Inference Gateway:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; To expose the workload internally using a regional internal Application Load Balancer of type gke-l7-rilb.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;A4 VMs:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; These support RoCEv2 with NVIDIA B200 GPUs.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Putting it together &lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To get access to the A4 VMs, I used a &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/ai-hypercomputer/docs/consumption-models#comparison"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;future reservation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, which is linked to a specific zone.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Begin:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Set up the environment &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/vpc/docs/create-modify-vpc-networks#create-custom-network"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;standard VPC&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; with firewall rules and a subnet in the same region as the reservation's zone.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/load-balancing/docs/proxy-only-subnets#proxy_only_subnet_create"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;proxy-only subnet&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. It is used by the regional internal Application Load Balancer attached to the GKE Inference Gateway.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Next&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Create a standard GKE cluster with a default node pool.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;pre&gt;&lt;code&gt;gcloud container clusters create $CLUSTER_NAME \
    --location=$ZONE \
    --num-nodes=1 \
    --machine-type=e2-standard-16 \
    --network=${GVNIC_NETWORK_PREFIX}-main \
    --subnetwork=${GVNIC_NETWORK_PREFIX}-sub \
    --release-channel rapid \
    --enable-dataplane-v2 \
    --enable-ip-alias \
    --addons=HttpLoadBalancing,RayOperator \
    --gateway-api=standard \
    --enable-ray-cluster-logging \
    --enable-ray-cluster-monitoring \
    --enable-managed-prometheus \
    --enable-dataplane-v2-metrics \
    --monitoring=SYSTEM&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Once that is complete you can connect to your cluster:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;pre&gt;&lt;code&gt;gcloud container clusters get-credentials $CLUSTER_NAME --zone $ZONE --project $PROJECT&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Create a &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/allocate-network-resources-dra#enable-dra-driver-gpu"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GPU node pool&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (this example uses A4 VMs with a reservation) with these additional flags: &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;code style="vertical-align: baseline;"&gt;--accelerator-network-profile=auto&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; (GKE automatically adds the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gke.networks.io/accelerator-network-profile: auto&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; label to the nodes)&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;code style="vertical-align: baseline;"&gt;--node-labels=cloud.google.com/gke-networking-dra-driver=true&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; (enables DRA for high-performance networking)&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;pre&gt;&lt;code&gt;gcloud beta container node-pools create $NODE_POOL_NAME \
  --cluster $CLUSTER_NAME \
  --location $ZONE \
  --node-locations $ZONE \
  --machine-type a4-highgpu-8g \
  --accelerator type=nvidia-b200,count=8,gpu-driver-version=latest \
  --enable-autoscaling --num-nodes=1 --total-min-nodes=1 --total-max-nodes=3 \
  --reservation-affinity=specific \
  --reservation=projects/$PROJECT/reservations/$RESERVATION_NAME/reservationBlocks/$BLOCK_NAME \
  --accelerator-network-profile=auto \
  --node-labels=cloud.google.com/gke-networking-dra-driver=true&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Next:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Create a ResourceClaimTemplate, which will be used to attach the networking resources to your deployments. The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;deviceClassName: mrdma.google.com &lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;is used for GPU workloads:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;pre&gt;&lt;code&gt;apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: all-mrdma
spec:
  spec:
    devices:
      requests:
      - name: req-mrdma
        exactly:
          deviceClassName: mrdma.google.com
          allocationMode: All&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy model and inference&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Now that the cluster and node pool are set up,&lt;/span&gt; we can deploy a model and serve it via Inference Gateway. In my experiment I used DeepSeek, but this could be any model.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy model and services&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt; nodeSelector: gke.networks.io/accelerator-network-profile: auto &lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;setting schedules the Pod onto the GPU node.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt; resourceClaims: &lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;section attaches the networking resource we defined in the ResourceClaimTemplate.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Create a secret (&lt;/span&gt;&lt;a href="https://huggingface.co/docs/hub/security-tokens#how-to-manage-user-access-tokens" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;I used a Hugging Face&lt;/span&gt;&lt;/a&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;strong&gt; token)&lt;/strong&gt;:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;pre&gt;&lt;code&gt;kubectl create secret generic hf-secret \
  --from-literal=hf_token=${HF_TOKEN}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;Deployment&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;pre&gt;&lt;code&gt;apiVersion: apps/v1
kind: Deployment
metadata:
  name: deepseek-v3-1-deploy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: deepseek-v3-1
  template:
    metadata:
      labels:
        app: deepseek-v3-1
        ai.gke.io/model: deepseek-v3-1
        ai.gke.io/inference-server: vllm
        examples.ai.gke.io/source: user-guide
    spec:
      containers:
      - name: vllm-inference
        image: us-docker.pkg.dev/vertex-ai/vertex-vision-model-garden-dockers/pytorch-vllm-serve:20250819_0916_RC01
        resources:
          requests:
            cpu: "190"
            memory: "1800Gi"
            ephemeral-storage: "1Ti"
            nvidia.com/gpu: "8"
          limits:
            cpu: "190"
            memory: "1800Gi"
            ephemeral-storage: "1Ti"
            nvidia.com/gpu: "8"
          claims:
          - name: rdma-claim
        command: ["python3", "-m", "vllm.entrypoints.openai.api_server"]
        args:
        - --model=$(MODEL_ID)
        - --tensor-parallel-size=8
        - --host=0.0.0.0
        - --port=8000
        - --max-model-len=32768
        - --max-num-seqs=32
        - --gpu-memory-utilization=0.90
        - --enable-chunked-prefill
        - --enforce-eager
        - --trust-remote-code
        env:
        - name: MODEL_ID
          value: deepseek-ai/DeepSeek-V3.1
        - name: HUGGING_FACE_HUB_TOKEN
          valueFrom:
            secretKeyRef:
              name: hf-secret
              key: hf_token
        volumeMounts:
        - mountPath: /dev/shm
          name: dshm
        livenessProbe:
          httpGet:
            path: /health
            port: 8000
          initialDelaySeconds: 1800
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health
            port: 8000
          initialDelaySeconds: 1800
          periodSeconds: 5
      volumes:
      - name: dshm
        emptyDir:
          medium: Memory
      nodeSelector:
        gke.networks.io/accelerator-network-profile: auto
      resourceClaims:
      - name: rdma-claim
        resourceClaimTemplateName: all-mrdma
---
apiVersion: v1
kind: Service
metadata:
  name: deepseek-v3-1-service
spec:
  selector:
    app: deepseek-v3-1
  type: ClusterIP
  ports:
    - protocol: TCP
      port: 8000
      targetPort: 8000
---
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: deepseek-v3-1-monitoring
spec:
  selector:
    matchLabels:
      app: deepseek-v3-1
  endpoints:
  - port: 8000
    path: /metrics
    interval: 30s&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy GKE Inference Gateway&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/deploy-gke-inference-gateway#prepare-environment"&gt;install needed Custom Resource Definitions (CRDs) in your GKE cluster:&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For GKE versions &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;1.34.0-gke.1626000&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; or later, install only the alpha &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;InferenceObjective&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; CRD:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/v1.0.0/config/crd/bases/inference.networking.x-k8s.io_inferenceobjectives.yaml&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f17446fc910&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Create Inference pool  &lt;/span&gt;&lt;/h3&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;helm install deepseek-v3-pool \\\r\n  oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool \\\r\n  --version v1.0.1 \\\r\n  --set inferencePool.modelServers.matchLabels.app=deepseek-v3-1 \\\r\n  --set provider.name=gke \\\r\n  --set inferenceExtension.monitoring.gke.enabled=true&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f17446fcc70&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Create the Gateway, HTTPRoute and InferenceObjective&lt;/span&gt;&lt;/h3&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;# 1. The Regional Internal Gateway (ILB)\r\napiVersion: gateway.networking.k8s.io/v1\r\nkind: Gateway\r\nmetadata:\r\n  name: deepseek-v3-gateway\r\n  namespace: default\r\nspec:\r\n  gatewayClassName: gke-l7-rilb\r\n  listeners:\r\n  - name: http\r\n    protocol: HTTP\r\n    port: 80\r\n    allowedRoutes:\r\n      namespaces:\r\n        from: Same\r\n---\r\n# 2. The HTTPRoute (Routing to the Pool)\r\napiVersion: gateway.networking.k8s.io/v1\r\nkind: HTTPRoute\r\nmetadata:\r\n  name: deepseek-v3-route\r\n  namespace: default\r\nspec:\r\n  parentRefs:\r\n  - name: deepseek-v3-gateway\r\n  rules:\r\n  - matches:\r\n    - path:\r\n        type: PathPrefix\r\n        value: /\r\n    backendRefs:\r\n    - group: inference.networking.k8s.io\r\n      kind: InferencePool\r\n      name: deepseek-v3-pool\r\n---\r\n# 3. The Inference Objective (Performance Logic)\r\napiVersion: inference.networking.x-k8s.io/v1alpha2\r\nkind: InferenceObjective\r\nmetadata:\r\n  name: deepseek-v3-objective\r\n  namespace: default\r\nspec:\r\n  poolRef:\r\n    name: deepseek-v3-pool&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f17446fcd00&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Once complete, you can create a test VM in your main VPC and make a call to the IP address of the GKE Inference Gateway:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;curl -N -s -X POST &amp;quot;http://$GATEWAY_IP/v1/chat/completions&amp;quot; \\\r\n  -H &amp;quot;Content-Type: application/json&amp;quot; \\\r\n  -d \&amp;#x27;{\r\n    &amp;quot;model&amp;quot;: &amp;quot;deepseek-ai/DeepSeek-V3.1&amp;quot;,\r\n    &amp;quot;messages&amp;quot;: [{&amp;quot;role&amp;quot;: &amp;quot;user&amp;quot;, &amp;quot;content&amp;quot;: &amp;quot;Box A: red. Box B: blue. Box C: empty. Move A to C, Move B to A, Swap B and C. Where is red?&amp;quot;}],\r\n    &amp;quot;stream&amp;quot;: true\r\n  }\&amp;#x27; | stdbuf -oL grep &amp;quot;data: &amp;quot; | sed -u \&amp;#x27;s/^data: //\&amp;#x27; | grep -v &amp;quot;\\[DONE\\]&amp;quot; | \\\r\n  jq --unbuffered -rj \&amp;#x27;.choices[0].delta | (.reasoning_content // .reasoning // .content // empty)\&amp;#x27;&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f17446fca30&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
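&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The curl command above reads the Gateway address from a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;GATEWAY_IP&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; variable. As a minimal sketch, assuming the Gateway from the manifest above (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;deepseek-v3-gateway&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; in the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;default&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; namespace) has already been assigned an address, you could read it from the Gateway status. The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gateway_ip&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; helper name is hypothetical and not part of the original walkthrough.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;

```shell
# Hypothetical helper: print the first address reported in a Gateway's status.
# $1 = Gateway name, $2 = namespace (defaults to "default").
gateway_ip() {
  kubectl get gateways.gateway.networking.k8s.io "$1" \
    --namespace "${2:-default}" \
    --output jsonpath='{.status.addresses[0].value}'
}

# Usage (assumes the Gateway has finished provisioning):
#   GATEWAY_IP=$(gateway_ip deepseek-v3-gateway)
```

&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If the command prints nothing, the Gateway has probably not been assigned an address yet, so wait and retry before running the curl test.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;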
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;span style="vertical-align: baseline;"&gt;Next Steps&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Take a deeper dive into GKE managed DRANET and GKE Inference Gateway, review the following.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Blog: &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/containers-kubernetes/kubernetes-device-management-with-dra-dynamic-resource-allocation?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;DRA: A new era of Kubernetes device management with Dynamic Resource Allocation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Document set: &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/config-auto-net-for-accelerators"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;DRANET&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Documentation: &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/ai-hypercomputer/docs/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;AI Hypercomputer&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Want to ask a question, find out more or share a thought? Please connect with me on &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/ammett/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Linkedin&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 08 Apr 2026 10:05:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/developers-practitioners/experimenting-with-gpus-gke-managed-dranet-and-inference-gateway-ai-deployment/</guid><category>Networking</category><category>Developers &amp; Practitioners</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/0-hero-dranet.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Experimenting with GPUs: GKE managed DRANET and Inference Gateway AI Deployment</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/0-hero-dranet.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/developers-practitioners/experimenting-with-gpus-gke-managed-dranet-and-inference-gateway-ai-deployment/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Ammett Williams</name><title>Developer Relations Engineer</title><department></department><company></company></author></item><item><title>See beyond the IP and secure URLs with Google Cloud NGFW</title><link>https://cloud.google.com/blog/products/identity-security/see-beyond-the-ip-and-secure-urls-with-google-cloud-ngfw/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In a cloud-first world, traditional IP-based defenses are no longer enough to protect your perimeter. 
As services migrate to shared infrastructure and content delivery networks, relying on static IP addresses and FQDNs can create security gaps.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because single IP addresses can host multiple services, and IPs addresses can change frequently, we are introducing domain filtering with a wildcard capability in Cloud Next Generation Firewall (NGFW) Enterprise. This new capability provides increased security and granular policy controls.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Why domain and SNI filtering matters&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The Cloud NGFW URL filtering service performs deep inspections of HTTP payloads to secure workloads against threats from both public and internal networks. This service elevates security controls to the application layer and helps restrict access to malicious domains. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Key use cases include: &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Granular egress control&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: This capability enables the precise allowing and blocking of connections based on domain names and SNI information found in egress HTTP(S) messages. By inspecting Layer 7 (L7) headers, it offers significantly finer control than traditional filtering based solely on IP addresses and FQDNs, which can be inefficient when a single IP hosts multiple services.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Control access without decrypting&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: For organizations that prefer not to perform full TLS decryption on their traffic, Cloud NGFW can still enforce security policies by controlling traffic based on SNI headers provided during the TLS handshake. This allows for effective domain-level filtering while maintaining end-to-end encryption for privacy or compliance reasons.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Reduced operational overhead&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Implementing domain-based filtering helps reduce the constant maintenance typically required to track frequently changing IP addresses and DNS records. By focusing on stable domain identities rather than dynamic network attributes, security teams can minimize the manual effort involved in updating firewall rulebases.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Flexible matching&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: The service utilizes matcher strings within URL lists, supporting limited wildcard domains to define criteria for both domains and subdomains. For example, using a wildcard like &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;*.example.com&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; allows a single filter to cover all associated subdomains, providing a more scalable solution than defining thousands of individual FQDN entries.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Improved security: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;URL filtering significantly enhances the security posture by protecting against sophisticated flaws like SNI header spoofing. By evaluating L7 headers before allowing access to an application, Cloud NGFW ensures that attackers cannot bypass security controls by simply spoofing lower-layer identifiers. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
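&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the wildcard idea concrete, here is a minimal sketch of suffix-style matching in shell. It is an illustration only: the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;domain_matches&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; function is hypothetical, and the sketch assumes that a wildcard matcher covers subdomains but not the bare domain. The matching rules that Cloud NGFW actually applies are defined in its documentation and may differ.&lt;/span&gt;&lt;/p&gt;

```shell
# Illustrative only: return 0 when a hostname satisfies a matcher string.
# A matcher of the form "*.example.com" covers any subdomain of example.com;
# any other matcher must equal the hostname exactly.
domain_matches() {
  host=$1
  matcher=$2
  case "$matcher" in
    '*.'*)
      suffix=${matcher#'*'}        # "*.example.com" becomes ".example.com"
      case "$host" in
        ?*"$suffix") return 0 ;;   # at least one character before the suffix
        *) return 1 ;;
      esac
      ;;
    *)
      [ "$host" = "$matcher" ]
      ;;
  esac
}
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With this sketch, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;domain_matches api.example.com '*.example.com'&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; succeeds, while &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;badexample.com&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; fails because the match requires the dot before the domain.&lt;/span&gt;&lt;/p&gt;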
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;How Cloud NGFW URL filtering works&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The URL filtering service functions by inspecting traffic at L7 using a distributed architecture. &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/image1_zzP0Xt6.max-1000x1000.png"
        
          alt="image1"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="6nmqq"&gt;Cloud NGFW URL filtering service&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can get started with URL filtering in three simple steps.&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Deploy Cloud NGFW endpoints&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;ol&gt;
&lt;li aria-level="2" style="list-style-type: lower-alpha; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The first step is to &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/configure-firewall-endpoints#create-firewall-endpoint"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;create and deploy a Cloud NGFW endpoint&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in a zone. The &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/about-firewall-endpoints"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;NGFW endpoint&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is an organization level resource. Please ensure you have the right permission before deploying the endpoint.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="2" style="list-style-type: lower-alpha; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Once the endpoint is deployed you can &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/configure-firewall-endpoint-associations#create-end-assoc-network"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;associate it to one or more VPCs&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; of your choice.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Create security profiles and security profile groups:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;ol&gt;
&lt;li aria-level="2" style="list-style-type: lower-alpha; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/about-security-profiles#url-filtering-profile"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;URL filtering security profile&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; holds the URL filters with matcher strings and an action (allow or deny).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="2" style="list-style-type: lower-alpha; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/about-security-profile-groups"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;security profile group&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; acts as a container for these security profiles, which is then referenced by a firewall policy rule. &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/configure-urlf-security-profiles#create-urlf-security-profile"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Create URL filtering security profiles&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; with desired URLs, wildcard FQDNs and &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/configure-security-profile-groups#create-security-profile-group"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;add them to a security profile group&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="2" style="list-style-type: lower-alpha; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Once the security profile group is created, you will need to reference the security profile group in firewall policies.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Policy enforcement:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;ol&gt;
&lt;li aria-level="2" style="list-style-type: lower-alpha; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;You enable the service by configuring a hierarchical or global network firewall policy rule using the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;apply_security_profile_group&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; action, specifying the name of your security profile group. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For more information about configuring a firewall policy rule, see the following:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/using-firewall-policies#create-ingress-rule-target-vm"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Create an ingress hierarchical firewall policy rule&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/using-firewall-policies#create-egress-rule-target-vm"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Create an egress hierarchical firewall policy rule&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/use-network-firewall-policies#create-ingress-rule-target-vm"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Create an ingress global network firewall policy rule&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/use-network-firewall-policies#create-egress-rule-target-vm"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Create an egress global network firewall policy rule&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Getting started&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Get started with Cloud NGFW URL filtering by visiting our &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/firewall/docs/about-url-filtering"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/cloud-ngfw-enterprise-urlf" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;codelab&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 07 Apr 2026 17:30:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/identity-security/see-beyond-the-ip-and-secure-urls-with-google-cloud-ngfw/</guid><category>Networking</category><category>Developers &amp; Practitioners</category><category>Security &amp; Identity</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>See beyond the IP and secure URLs with Google Cloud NGFW</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/identity-security/see-beyond-the-ip-and-secure-urls-with-google-cloud-ngfw/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Uttam Ramesh</name><title>Product Manager</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Susan Wu</name><title>Outbound Product Manager</title><department></department><company></company></author></item><item><title>Envoy: A future-ready foundation for agentic AI networking</title><link>https://cloud.google.com/blog/products/networking/the-case-for-envoy-networking-in-the-agentic-ai-era/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In today's agentic AI environments, the network has a new 
set of responsibilities.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In a traditional application stack, the network mainly moves requests between services. But as discussed in a recent white paper,&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://services.google.com/fh/files/misc/cloud_infrastructure_in_the_agent_native_era.pdf" rel="noopener" target="_blank"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Cloud Infrastructure in the Agent-Native Era&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;,&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; in an agentic system the network sits in the middle of model calls, tool invocations, agent-to-agent interactions, and policy decisions that can shape what an agent is allowed to do. The rapid proliferation of agents, often built on diverse frameworks, necessitates a consistent enforcement of governance and security across all agentic paths at scale. To achieve this, the enforcement layer must shift from the application level to the underlying infrastructure. That means the network can no longer operate as a blind transport layer. It has to understand more, enforce better, and adapt faster. This shift is precisely where Envoy comes in.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a high-performance distributed proxy and universal data plane, Envoy is built for massive scale. Trusted by demanding enterprise environments, including Google Cloud, it supports everything from single-service deployments to complex service meshes using Ingress, Egress, and Sidecar patterns. Because of its deep extensibility, robust policy integration, and operational maturity, Envoy is uniquely suited for an era where protocols change quickly and the cost of weak control is steep. For teams building agentic AI, Envoy is more than a concept: it's a practical, production-ready foundation.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_xPxMxF4.max-1000x1000.jpg"
        
          alt="1"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Agentic AI changes the networking problem&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Agentic workloads still often use HTTP as a transport, but they break some of the assumptions that traditional HTTP intermediaries rely on. Protocols such as&lt;/span&gt;&lt;a href="https://modelcontextprotocol.io/docs/getting-started/intro" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Model Context Protocol&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (MCP) and&lt;/span&gt;&lt;a href="https://github.com/google/A2A" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent2agent&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (A2A) use&lt;/span&gt;&lt;a href="https://www.jsonrpc.org/specification" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;JSON-RPC&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; or&lt;/span&gt;&lt;a href="https://grpc.io" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;gRPC&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; over HTTP, adding protocol-level phases such as MCP initialization, where client and server exchange their capabilities, on top of standard HTTP request/response semantics. The key aspects of agentic systems that require intermediaries to adapt include:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Diverse enterprise governance imperatives. &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;The primary challenge is satisfying the wide spectrum of non-negotiable enterprise requirements for safety, security, data privacy, and regulatory compliance. These needs often go beyond standard network policies and require deep integration with internal systems, custom logic, and the ability to rapidly adapt to new organizational rules or external regulations. This demands a highly extensible framework where enterprises can plug in their specific governance models.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Policy attributes live inside message bodies, not headers.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Unlike traditional web traffic where policy inputs like paths and headers are readily accessible, agentic protocols frequently bury critical attributes (e.g., model names, tool calls, resource IDs) deep within JSON-RPC or gRPC payloads. This shift requires intermediaries to possess the ability to parse and understand message contents to apply context-aware policies.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Handling diverse and evolving protocol characteristics. &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Agentic protocols are not uniform. Some, like MCP with Streamable HTTP, can introduce stateful interactions requiring session management across distributed proxies (e.g., using &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;Mcp-Session-Id&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;). The need to support such varied behaviors, along with future protocol innovations, reinforces the necessity of an inherently adaptable and extensible networking foundation.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
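&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The session-management point in the third item can be illustrated with a small sketch. Assuming a fixed number of backends, hashing the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;Mcp-Session-Id&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; value lets any proxy instance pick the same backend for a given session without shared state. The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;backend_for_session&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; function is hypothetical and uses plain modulo hashing; it does not describe how Envoy implements session handling.&lt;/span&gt;&lt;/p&gt;

```shell
# Illustrative only: choose a backend index for an MCP session so that every
# request carrying the same Mcp-Session-Id value lands on the same backend.
# $1 = session ID, $2 = number of backends (a positive integer).
backend_for_session() {
  # cksum prints "CRC SIZE"; keep only the CRC as the hash value.
  checksum=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
  echo $(( checksum % $2 ))
}
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A known trade-off of modulo hashing is that changing the number of backends remaps most sessions, which is why production systems typically prefer consistent hashing or an explicit session store.&lt;/span&gt;&lt;/p&gt;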
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These factors mean enterprises need more than just connectivity. The network must now serve as a central point for enforcing the crucial governance needs mentioned earlier. This includes providing capabilities like centralized security, comprehensive auditability, fine-grained policy enforcement, and dynamic guardrails, all while keeping pace with the rapid evolution of protocols and agent behaviors. Put simply, agentic AI transforms the network from a mere transit path into a critical control point.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Why Envoy fits this shift&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Envoy is a strong fit for agentic AI networking for three reasons. Envoy is:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Battle-tested.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Enterprises already rely on Envoy in high-scale, security-sensitive environments, making it a credible platform to anchor a new generation of traffic management and policy enforcement.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Extensible.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Envoy can be extended through native filters, Rust modules, WebAssembly (Wasm) modules, and &lt;/span&gt;&lt;a href="https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/ext_proc_filter" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;external processing&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; patterns. That gives platform teams room to adopt new protocols without having to rebuild their networking layer every time the ecosystem changes.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Operationally useful today.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Envoy already acts as a gateway, enforcement point, observability layer, and integration surface for control planes. That makes it a practical choice for organizations that need to move now, not after the standards settle.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Building on these core strengths, Envoy has introduced specific architectural advancements to meet the unique demands of agentic networking:&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;1. Envoy understands agent traffic&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The first requirement for agentic networking is simple: The gateway needs to understand what the agent is actually trying to do.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;That’s harder than it sounds. In protocols such as MCP, A2A, and OpenAI-style APIs, important policy signals may live inside the request body. Traditional HTTP proxies are optimized to treat bodies as opaque byte streams. That design is efficient, but it limits what the proxy can enforce. For protocols that use JSON messages, a proxy may need to buffer the entire request body to locate attribute values needed for policy application — especially when those attributes appear at the end of the JSON message. Business logic specific to gen AI protocols, such as rate limiting based on consumed tokens, may also require parsing server responses.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Envoy addresses this by deframing protocol messages carried over HTTP and exposing useful attributes to the rest of the filter chain. The extensibility model for gen AI protocols was guided by two goals:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Easy reuse of existing HTTP extensions that work with gen AI protocols out of the box, such as RBAC or tracers.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Easy access to deframed messages for gen-AI-specific extensions, so that developers can focus on gen AI business logic without needing to deal with HTTP or JSON envelopes.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Based on these goals, new extensions for gen AI protocols are still built as HTTP extensions and configured in the HTTP filter chain. This provides flexibility to mix HTTP-native business logic, such as OAuth or mTLS authorization, with gen AI protocol logic in a single chain. A deframing extension parses the protocol messages carried by HTTP and provides an ambient context with extracted attributes, or even the entirety of parsed messages, to downstream extensions via well-known filter state and metadata values.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Instead of forcing every policy component to parse JSON envelopes or protocol-specific message formats on its own, Envoy makes those attributes available as structured metadata. Once the gateway has deframed protocol messages, existing Envoy extensions such as &lt;/span&gt;&lt;a href="https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/ext_authz_filter" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;ext_authz&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; or RBAC can read protocol properties to evaluate policies using protocol-specific attributes such as tool names for MCP, message attributes for A2A, or model names for OpenAI.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Access logs can include message attributes for enhanced monitoring and auditing. The protocol attributes are also available to the &lt;/span&gt;&lt;a href="https://cel.dev/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Common Expression Language&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (CEL) runtime, simplifying creation of complex policy expressions in RBAC or composite extensions.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
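&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the deframing idea concrete, here is a minimal Python sketch of how a filter might pull policy-relevant attributes out of an MCP JSON-RPC request body. It is an illustration, not Envoy's implementation: the function name &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;extract_mcp_attributes&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; is hypothetical, and the sketch assumes the body is a single JSON object. The output layout mirrors the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;method&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;params.name&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; metadata paths used by the RBAC policy in the case study in this post.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
```python
import json


def extract_mcp_attributes(body: bytes) -> dict:
    """Parse a JSON-RPC request body and return policy-relevant attributes.

    Hypothetical helper: it mimics a deframing step that publishes metadata
    for later filters. It is not Envoy's actual code.
    """
    message = json.loads(body)
    attributes = {"method": message.get("method", "")}
    params = message.get("params")
    # For tools/call requests, the tool name is carried in params.name.
    if isinstance(params, dict) and "name" in params:
        attributes["params"] = {"name": params["name"]}
    return attributes


request_body = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "list_issues", "arguments": {"owner": "example"}},
}).encode("utf-8")

metadata = extract_mcp_attributes(request_body)
```
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A policy component can then read the method and tool name from the returned dictionary instead of parsing the JSON envelope itself.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;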
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_t4lf1kG.max-1000x1000.png"
        
          alt="2"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Buffering and memory management&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Envoy is designed to use as little memory as possible when proxying HTTP requests. However, parsing agentic protocols may require an arbitrary amount of buffer space, especially when extensions require the entire message to be in memory. The flexibility of allowing extensions to use larger buffers needs to be balanced with adequate protection from memory exhaustion, especially in the presence of untrusted traffic.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To achieve this, Envoy now provides a per-request buffer size limit. Buffers that hold request data are also integrated with the overload manager, enabling a full range of protective actions under memory pressure, such as reducing idle timeouts or resetting requests that consume the most memory for an extended duration. These changes pave the way for Envoy to serve as a gateway and policy-enforcement point for gen AI protocols without compromising its resource efficiency.&lt;/span&gt;&lt;/p&gt;
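&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The effect of a per-request limit can be pictured with a small Python sketch. It illustrates the idea only: the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;BoundedBodyBuffer&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; class, the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;BodyTooLarge&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; exception, and the 32-byte limit are hypothetical and do not correspond to Envoy configuration fields or internals.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
```python
class BodyTooLarge(Exception):
    """Raised when a request body exceeds the configured buffer limit."""


class BoundedBodyBuffer:
    """Accumulates body chunks up to a fixed per-request limit."""

    def __init__(self, limit_bytes: int) -> None:
        self.limit_bytes = limit_bytes
        self.chunks = []
        self.size = 0

    def append(self, chunk: bytes) -> None:
        # Reject before storing, so buffered data never exceeds the limit.
        if self.size + len(chunk) > self.limit_bytes:
            raise BodyTooLarge(f"body exceeds {self.limit_bytes} bytes")
        self.chunks.append(chunk)
        self.size += len(chunk)

    def body(self) -> bytes:
        return b"".join(self.chunks)


# Hypothetical 32-byte limit, chosen only to keep the example small.
buffer = BoundedBodyBuffer(limit_bytes=32)
buffer.append(b'{"method":')
buffer.append(b'"ping"}')
```
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Checking the limit before storing each chunk means an oversized message from untrusted traffic is rejected without the buffer ever growing past its bound.&lt;/span&gt;&lt;/p&gt;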
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;2. Envoy enforces policy on things that matter&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Understanding traffic is only useful if the gateway can act on it.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In agentic systems, policy is not just about which service an agent can reach. It’s about which tools an agent can call, which models it can use, what identity it presents, how much it can consume, and what kinds of outputs require additional controls. Those are higher-value decisions than simple layer-4 or path-based controls, and they are exactly the kinds of controls enterprises care about when agents are allowed to take action on their behalf.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Envoy is well-positioned here because it can combine transport-level security with application-aware policy enforcement. Teams can authenticate workloads with mTLS and SPIFFE identities, then enforce protocol-specific rules with RBAC, external authorization, external processing, access logging, and CEL-based policy expressions.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This capability is crucial because it lets platform teams decouple agent development from enforcement. Developers can focus on building useful agents, while operators enforce a consistent zero-trust posture at the network layer, even as tools, models, and protocols continue to change.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A prime example of this zero-trust decoupling is the critical "user-behind-agent" scenario, where an AI agent must execute tasks on a human user's behalf. Traditionally, handing user credentials directly to an application introduces severe security risks: if the agent is compromised or manipulated via prompt injection, an attacker could exfiltrate or misuse those credentials. By offloading identity management to Envoy, the proxy can automatically insert user delegation tokens into outbound requests at the infrastructure layer. Because the agent never directly holds the sensitive credential, a compromised agent has no token to exfiltrate, and its actions remain strictly bound to the user's actual permissions.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Case study: Restricting an agent to specific GitHub MCP tools&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Consider an agent that triages GitHub issues.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The GitHub MCP server may expose dozens of tools, but the agent may only need a small read-only subset, such as &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;list_issues&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;get_issue&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;get_issue_comments&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. In most enterprises, that difference matters. A useful agent should not automatically become an unrestricted one.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With Envoy in front of the MCP server, the gateway can verify the agent identity using SPIFFE during the mTLS handshake, parse the MCP message via &lt;/span&gt;&lt;a href="https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/filters/http/mcp/v3/mcp.proto#envoy-v3-api-msg-extensions-filters-http-mcp-v3-mcp" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;the deframing filter&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, extract the requested method and tool name, and enforce a policy that allows only the approved tool calls for that specific agent identity. RBAC uses metadata created by the MCP deframing filter to check the method and tool name in the MCP message:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;pre&gt;&lt;code&gt;envoy.filters.http.rbac:
  "@type": type.googleapis.com/envoy.extensions.filters.http.rbac.v3.RBACPerRoute
  rbac:
    rules:
      policies:
        github-issue-reader-policy:
          permissions:
            - and_rules:
                rules:
                  - sourced_metadata:
                      metadata_matcher:
                        filter: envoy.http.filters.mcp
                        path: [{ key: "method" }]
                        value: { string_match: { exact: "tools/call" } }
                  - sourced_metadata:
                      metadata_matcher:
                        filter: envoy.http.filters.mcp
                        path: [{ key: "params" }, { key: "name" }]
                        value:
                          or_match:
                            value_matchers:
                              - string_match: { exact: "list_issues" }
                              - string_match: { exact: "get_issue" }
                              - string_match: { exact: "get_issue_comments" }
          principals:
            - authenticated:
                principal_name:
                  exact: "spiffe://cluster.local/ns/github-agents/sa/issue-triage-agent"&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;That’s the real value: Policy is enforced centrally, close to the traffic, and in terms that match the agent's actual behavior.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/3_jtbLCMn.max-1000x1000.png"
        
          alt="3"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Beyond static rules: External authorization&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;A complex compliance policy that can’t be expressed using RBAC rules can be implemented in an external authorization service using the &lt;/span&gt;&lt;a href="https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/ext_authz_filter" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;ext_authz&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; protocol. Envoy provides MCP message attributes along with HTTP headers in the context of the ext_authz RPC. It can also forward the agent's SPIFFE identity from the peer certificate:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;pre&gt;&lt;code&gt;http_filters:
  - name: envoy.filters.http.ext_authz
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.ext_authz.v3.ExtAuthz
      grpc_service:
        envoy_grpc:
          cluster_name: auth_service_cluster
      include_peer_certificate: true
      metadata_context_namespaces:
        - envoy.http.filters.mcp&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This allows external services to make authorization decisions based on the full combination of agent identity, MCP method, tool name, and any other protocol attributes, without the agent or the MCP server needing to be aware of the policy layer.&lt;/span&gt;&lt;/p&gt;
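&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a rough illustration, the decision logic inside such an external service might resemble the following Python sketch. Everything in it is an assumption made for the example: the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;is_allowed&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; function, the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;ALLOWED_OWNERS&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; table, the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;example-org&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; owner, and the idea that tool arguments are available under &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;params.arguments&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. A real service would implement the ext_authz gRPC API and read these values from the check request.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
```python
AGENT = "spiffe://cluster.local/ns/github-agents/sa/issue-triage-agent"

# Hypothetical policy data for this sketch.
READ_ONLY_TOOLS = {"list_issues", "get_issue", "get_issue_comments"}
ALLOWED_OWNERS = {AGENT: {"example-org"}}


def is_allowed(principal: str, mcp_metadata: dict) -> bool:
    """Return True when the agent may make this MCP request."""
    owners = ALLOWED_OWNERS.get(principal)
    if owners is None:
        return False  # unknown agent identity
    if mcp_metadata.get("method") != "tools/call":
        return False  # this sketch only authorizes tool calls
    params = mcp_metadata.get("params", {})
    if params.get("name") not in READ_ONLY_TOOLS:
        return False  # tool is outside the read-only subset
    # A rule that depends on a tool argument: the repository owner must be
    # one that this particular agent is permitted to read.
    return params.get("arguments", {}).get("owner") in owners


request = {
    "method": "tools/call",
    "params": {"name": "get_issue", "arguments": {"owner": "example-org"}},
}
decision = is_allowed(AGENT, request)
```
&lt;/code&gt;&lt;/pre&gt;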
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Protocol-native error responses&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;When Envoy denies a request, the error should be meaningful to the calling agent. For MCP traffic, Envoy can use &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;local_reply_config&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to map HTTP error codes to appropriate JSON-RPC error responses. For example, a 403 Forbidden can be mapped to a JSON-RPC response with &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;isError: true&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and a human-readable message, ensuring the agent receives a protocol-appropriate denial rather than an opaque HTTP status code.&lt;/span&gt;&lt;/p&gt;
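&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The shape of such a denial can be sketched in Python. This shows only the response body, not how &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;local_reply_config&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; would be configured to produce it. The &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;denial_body&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; helper and the message wording are hypothetical, and the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;content&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; layout is assumed to follow the MCP tool-result convention of a list of text items.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
```python
import json


def denial_body(request_id, http_status: int, reason: str) -> str:
    """Build a JSON-RPC response body that reports a denied tool call."""
    response = {
        "jsonrpc": "2.0",
        # Echo the request ID so the agent can correlate the denial.
        "id": request_id,
        "result": {
            "isError": True,
            "content": [
                {"type": "text", "text": f"HTTP {http_status}: {reason}"}
            ],
        },
    }
    return json.dumps(response)


body = denial_body(7, 403, "tool call denied by gateway policy")
```
&lt;/code&gt;&lt;/pre&gt;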
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;3. Envoy supports stateful agent interactions at scale&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Not all agent traffic is stateless. Some protocols, including Streamable HTTP for MCP, can rely on session-oriented behavior. That creates a new challenge for intermediaries, especially when traffic flows through multiple gateway instances to achieve scale and resilience. An MCP session effectively binds the agent to the server that established it, and all intermediaries need to know this to direct incoming MCP connections to the correct server.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If a session is established on one backend, later requests in that conversation need to reach the right destination. That sounds straightforward for a single-proxy deployment, but it becomes more complicated in horizontally scaled systems, where multiple Envoy instances may handle different requests from the same agent.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Passthrough gateway&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;In the simpler passthrough mode, Envoy establishes one upstream connection for each downstream connection. Its primary use is enforcing centralized policies, such as client authorization, RBAC, rate limiting, and authentication, for external MCP servers. The session state transferred between intermediaries needs to include only the address of the server that established the session over the initial HTTP connection, so that all session-related requests are directed to that server.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Session state transfer between different Envoy instances is achieved by appending encoded session state to the MCP session ID provided by the MCP server. Envoy removes the session-state suffix from the session ID before forwarding the request to the destination MCP server. This session stickiness is enabled by configuring Envoy's &lt;/span&gt;&lt;a href="https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/http/stateful_session/envelope/v3/envelope.proto" rel="noopener" target="_blank"&gt;&lt;code style="text-decoration: underline; vertical-align: baseline;"&gt;envoy.http.stateful_session.envelope&lt;/code&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; extension.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
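&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The append-and-strip mechanism can be illustrated with a short Python sketch. The encoding shown here (a dot separator followed by URL-safe base64 of the upstream address) is an assumption chosen for readability and is not necessarily the wire format the Envoy extension uses. The function names are hypothetical as well.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
```python
import base64

# Hypothetical delimiter. URL-safe base64 never contains a dot, so the last
# dot in a wrapped ID always marks the start of the suffix.
SEPARATOR = "."


def wrap_session_id(server_session_id: str, upstream_address: str) -> str:
    """Append the encoded upstream address to the server's session ID."""
    suffix = base64.urlsafe_b64encode(upstream_address.encode("utf-8"))
    return f"{server_session_id}{SEPARATOR}{suffix.decode('ascii')}"


def unwrap_session_id(client_session_id: str) -> tuple:
    """Split a wrapped ID into (server_session_id, upstream_address)."""
    server_session_id, _, suffix = client_session_id.rpartition(SEPARATOR)
    address = base64.urlsafe_b64decode(suffix.encode("ascii"))
    return server_session_id, address.decode("utf-8")


# The gateway hands the wrapped ID to the agent on the response path...
wrapped = wrap_session_id("abc.123", "10.0.0.7:8080")
# ...and any gateway instance can recover the routing target on later requests.
original_id, upstream = unwrap_session_id(wrapped)
```
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because the routing information travels inside the session ID itself, the gateway instances do not need a shared session store in this mode.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;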
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/4_j0wGyAp.max-1000x1000.png"
        
          alt="4"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Aggregating gateway&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;In aggregating mode, Envoy acts as a single MCP server by aggregating the capabilities, tools, and resources of multiple backend MCP servers. In addition to enforcing policies, this simplifies agent configuration and unifies policy application for multiple MCP servers.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Session management in this mode is more complicated because the session state also needs to include mapping from tools and resources to the server addresses and session IDs that advertised them. The session ID that Envoy provides to the agent is created before tools or resources are known, and the mapping has to be established later, after the MCP initialization phases between Envoy and the backend MCP servers are complete.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;One approach, currently implemented in Envoy, is to combine the name of a tool or resource with the identifier and session ID of its origin server. The exact tool or resource names are typically not meaningful to the agent and can carry this additional provenance information. If unmodified tool or resource names are desirable, another approach is to let an Envoy instance that does not have the mapping recreate it by issuing a &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;tools/list&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; request before calling a specific tool. This accepts additional latency in exchange for avoiding the complexity of deploying an external global store of MCP sessions, and is currently in planning based on user feedback.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
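&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The first approach, carrying provenance in the tool name, can be sketched in Python as follows. The double-underscore delimiter, the function names, and the sample backend identifiers are hypothetical. The sketch assumes the delimiter never occurs in backend identifiers or session IDs, and it does not reflect the exact naming scheme Envoy uses.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
```python
# Hypothetical delimiter, assumed absent from backend IDs and session IDs.
DELIMITER = "__"


def qualify_tool_name(backend_id: str, backend_session_id: str, tool_name: str) -> str:
    """Build the tool name advertised to the agent."""
    return DELIMITER.join([backend_id, backend_session_id, tool_name])


def resolve_tool_name(qualified: str) -> tuple:
    """Recover (backend_id, backend_session_id, tool_name) from a qualified name."""
    # Split at most twice so the tool name itself may contain the delimiter.
    backend_id, backend_session_id, tool_name = qualified.split(DELIMITER, 2)
    return backend_id, backend_session_id, tool_name


advertised = qualify_tool_name("github", "s-42", "list_issues")
route = resolve_tool_name(advertised)
```
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With names of this form, any gateway instance that receives a tool call can route it to the right backend and session without consulting shared state.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;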
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/5_61xwM79.max-1000x1000.png"
        
          alt="5"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This matters because it moves Envoy beyond simple traffic forwarding. It allows Envoy to serve as a reliable intermediary for real agent workflows, including those spanning multiple requests, tools, and backends.&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;4. Envoy supports agent discovery&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Envoy is adding support for the A2A protocol and agent discovery via a well-known AgentCard endpoint. AgentCard, a JSON document with agent capabilities, enables discovery and multi-agent coordination by advertising skills, authentication requirements, and service endpoints. The AgentCard can be provisioned statically via direct response configuration or obtained from a centralized agent registry server via xDS or ext_proc APIs. A more detailed description of A2A implementation and agent discovery will be published in a forthcoming blog post.&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;span style="vertical-align: baseline;"&gt;5. Envoy is a complete solution for agentic networking challenges&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Building on the same foundation that enabled policy application for MCP in demanding deployments, Envoy is adding support for OpenAI and transcoding of agentic protocols into RESTful HTTP APIs. This transcoding capability simplifies the integration of gen AI agents with existing RESTful applications, with out-of-the-box support for OpenAPI-based applications and custom options via dynamic modules or Wasm extensions. In addition to transcoding, Envoy is being strengthened in areas critical for production readiness, including advanced policy applications such as quota management, comprehensive telemetry that adheres to&lt;/span&gt;&lt;a href="https://opentelemetry.io/docs/specs/semconv/gen-ai/" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;OpenTelemetry semantic conventions for generative AI systems&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and integrated guardrails for secure agent operation.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Guardrails for safe agents&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;The next significant area of investment is centralized management and application of guardrails for all agentic traffic. Integrating policy enforcement points with external guardrails presently requires bespoke implementation, and this problem area is ripe for standardization.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Control planes make this operational&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The gateway is only part of the story. To achieve this policy management and rollout at scale, a separate control plane is required to dynamically configure the data plane using the xDS protocol, also known as the universal data plane API.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;That is where control planes become important. Cloud Service Mesh, alongside open-source projects such as &lt;/span&gt;&lt;a href="https://aigateway.envoyproxy.io/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Envoy AI Gateway&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://github.com/kubernetes-sigs/kube-agentic-networking" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;kube-agentic-networking&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, uses Envoy as the data plane while giving operators higher-level ways to define and manage policy for agentic workloads.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This combination is powerful: Envoy provides the enforcement and extensibility in the traffic path, while control planes provide the operating model teams need to deploy that capability consistently.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Why this matters now&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The shift towards agentic systems and gen AI protocols such as MCP, A2A, and OpenAI necessitates an evolution in network intermediaries. The primary complexities Envoy addresses include:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Deep protocol inspection.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Protocol deframing extensions extract policy-relevant attributes (tool names, model names, resource paths) from the body of HTTP requests, enabling precise policy enforcement where traditional proxies would only see an opaque byte stream.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Fine-grained policy enforcement.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; By exposing these internal attributes, existing Envoy extensions like RBAC and ext_authz can evaluate policies based on protocol-specific criteria. This allows network operators to enforce a unified, zero-trust security posture, ensuring agents comply with access policies for specific tools or resources.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Stateful transport management.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Envoy supports managing session state for the Streamable HTTP transport used by MCP, enabling robust deployments in both passthrough and aggregating gateway modes, even across a fleet of intermediaries.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Agentic AI protocols are still in their early stages, and the protocol landscape will continue to evolve. That’s exactly why the networking layer needs to be adaptable. Enterprises should not have to rebuild their security and traffic infrastructure every time a new agent framework, transport pattern, or tool protocol gains traction. They need a foundation that can absorb change without sacrificing control.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Envoy brings together three qualities that are hard to get in one place: proven production maturity, deep extensibility, and growing protocol awareness for agentic workloads. By leveraging Envoy as an agent gateway, organizations can decouple security and policy enforcement from agent development code.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;That makes Envoy more than just a proxy that happens to handle AI traffic. It makes Envoy a future-ready foundation for agentic AI networking.&lt;/span&gt;&lt;/p&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;sup&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Special thanks to the additional co-authors of this blog: Boteng Yao, Software Engineer, Google and Tianyu Xia, Software Engineer, Google and Sisira Narayana, Sr Product Manager, Google.&lt;/span&gt;&lt;/sup&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Fri, 03 Apr 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/networking/the-case-for-envoy-networking-in-the-agentic-ai-era/</guid><category>Containers &amp; Kubernetes</category><category>AI &amp; Machine Learning</category><category>GKE</category><category>Developers &amp; Practitioners</category><category>Networking</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Envoy: A future-ready foundation for agentic AI networking</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/networking/the-case-for-envoy-networking-in-the-agentic-ai-era/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Yan Avlasov</name><title>Staff Software Engineer, Google</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Erica Hughberg</name><title>Product and Product Marketing Manager, Tetrate</title><department></department><company></company></author></item><item><title>Introducing multi-cluster GKE Inference Gateway: Scale AI workloads around the world</title><link>https://cloud.google.com/blog/products/containers-kubernetes/multi-cluster-gke-inference-gateway-helps-scale-ai-workloads/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The world of artificial intelligence is moving fast, and so is the need to serve models reliably and at scale. 
Today, we're thrilled to announce the preview of &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;multi-cluster GKE Inference Gateway&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to enhance the scalability, resilience, and efficiency of your AI/ML inference workloads across multiple Google Kubernetes Engine (GKE) clusters — even those spanning different Google Cloud regions.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Built as an extension of the&lt;/span&gt; &lt;a href="https://cloud.google.com/kubernetes-engine/docs/concepts/gateway-api"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GKE Gateway API&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, the multi-cluster Inference Gateway leverages the power of &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/multi-cluster-gateways"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;multi-cluster Gateways&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to provide intelligent, model-aware load balancing for your most demanding AI applications.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_gRilinA.max-1000x1000.jpg"
        
          alt="1"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Why multi-cluster for AI inference?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As AI models grow in complexity and users become more global, single-cluster deployments can face limitations:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Availability risks:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Regional outages or cluster maintenance can impact service.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Scalability caps:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Hitting hardware limits (GPUs/TPUs) within a single cluster or region.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Resource silos:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Underutilized accelerator capacity in one cluster can’t be used by another.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Latency:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Users far from your serving cluster may experience higher latency.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The multi-cluster GKE Inference Gateway addresses these challenges head-on, providing a variety of features and benefits:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Enhanced reliability and fault tolerance:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Intelligently route traffic across multiple GKE clusters, including across different regions. If one cluster or region experiences issues, traffic is automatically re-routed, minimizing downtime.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Improved scalability and optimized resource usage:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Pool and leverage GPU/TPU resources from various clusters. Handle demand spikes by bursting beyond the capacity of a single cluster and efficiently utilize available accelerators across your entire fleet.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Globally optimized, model-aware routing:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The Inference Gateway can make smart routing decisions using advanced signals. With &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;GCPBackendPolicy&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, you can configure load balancing based on real-time custom metrics, such as the model server's KV cache utilization metric, so that requests are sent to the best-equipped backend instance. Other modes like in-flight request limits are also supported.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Simplified operations:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Manage traffic to a globally distributed AI service through a single Inference Gateway configuration in a dedicated GKE "config cluster," while your models run in multiple "target clusters."&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
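&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the idea of model-aware routing concrete, here is a minimal Python sketch of choosing a backend by KV cache utilization while respecting an in-flight request limit. It is an illustration only, not the Inference Gateway's actual algorithm: the Backend type, the pick_backend function, and the max_in_flight parameter are hypothetical names introduced for this example.&lt;/span&gt;&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Backend:
    """Hypothetical view of one model-server backend (not a GKE API type)."""
    name: str
    kv_cache_utilization: float  # fraction in [0.0, 1.0] reported by the model server
    in_flight: int               # requests currently being served


def pick_backend(backends: Sequence[Backend], max_in_flight: int) -> Optional[Backend]:
    """Return the backend with the lowest KV cache utilization that still has
    in-flight headroom, or None if every backend is at its limit."""
    eligible = [b for b in backends if max_in_flight > b.in_flight]
    if not eligible:
        return None
    # Break utilization ties by preferring the backend with fewer in-flight requests.
    return min(eligible, key=lambda b: (b.kv_cache_utilization, b.in_flight))
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this sketch, a backend with a nearly empty KV cache is still skipped once it reaches the in-flight limit, which mirrors the idea of combining a custom metric with a request cap.&lt;/span&gt;&lt;/p&gt;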
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;How it works&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In GKE Inference Gateway there are two foundational resources,&lt;/span&gt; &lt;code style="vertical-align: baseline;"&gt;InferencePool&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;InferenceObjective&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. An&lt;/span&gt; &lt;code style="vertical-align: baseline;"&gt;InferencePool&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; acts as a resource group for pods that share the same compute hardware (like GPUs or TPUs) and model configuration, helping to ensure scalable and high-availability serving. An&lt;/span&gt; &lt;code style="vertical-align: baseline;"&gt;InferenceObjective&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; defines the specific model names and assigns serving priorities, allowing Inference Gateway to intelligently route traffic and multiplex latency-sensitive tasks alongside less urgent workloads.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_ek1kPQE.max-1000x1000.png"
        
          alt="2"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With this release, the system uses Kubernetes Custom Resources to manage your distributed inference service. In each "target cluster," &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;InferencePool&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; resources group the model-server backends. These backends are exported and become visible as &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;GCPInferencePoolImport&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; resources in the "config cluster." Standard &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;Gateway&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;HTTPRoute&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; resources in the config cluster define the entry point and routing rules, directing traffic to these imported pools. Fine-grained load-balancing behaviors, such as balancing on &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;CUSTOM_METRICS&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; or &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;IN_FLIGHT&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; requests, are configured using the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;GCPBackendPolicy&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; resource attached to the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;GCPInferencePoolImport&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This architecture enables use cases like global low-latency serving, disaster recovery, capacity bursting, and efficient use of heterogeneous hardware.&lt;/span&gt;&lt;/p&gt;
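&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The disaster recovery and capacity bursting use cases can be pictured with a small Python sketch: prefer the closest healthy cluster that has spare capacity, and spill over to the next closest one otherwise. This is a simplified model under stated assumptions, not how the gateway is implemented. The Cluster type, its fields, and choose_cluster are hypothetical names for illustration.&lt;/span&gt;&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cluster:
    """Hypothetical summary of one target cluster (illustrative only)."""
    name: str
    healthy: bool
    latency_ms: float   # assumed latency measured from the client's region
    free_capacity: int  # additional requests the cluster can accept


def choose_cluster(clusters: List[Cluster]) -> Optional[Cluster]:
    """Prefer the closest healthy cluster with spare capacity; otherwise
    spill over to the next closest one. Returns None if nothing can serve."""
    candidates = [c for c in clusters if c.healthy and c.free_capacity > 0]
    return min(candidates, key=lambda c: c.latency_ms, default=None)
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A full cluster and an unhealthy cluster are handled the same way here: both drop out of the candidate set, so traffic moves to whichever remaining cluster is closest.&lt;/span&gt;&lt;/p&gt;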
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For more information about GKE Inference Gateway core concepts, check out our &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-gke-inference-gateway#understand_key_concepts"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;guide&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Get started today&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As you scale your AI inference serving workloads to more users in more places, we're excited for you to try multi-cluster GKE Inference Gateway. To learn more and get started, check out the documentation:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs/concepts/about-multi-cluster-inference-gateway"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;About multi-cluster GKE Inference Gateway&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs/how-to/setup-multicluster-inference-gateway"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Set up multi-cluster GKE Inference Gateway&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs/how-to/customize-backend-multicluster-inference-gateway"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Customize backend configurations with GCPBackendPolicy&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;</description><pubDate>Tue, 17 Mar 2026 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/containers-kubernetes/multi-cluster-gke-inference-gateway-helps-scale-ai-workloads/</guid><category>AI &amp; Machine Learning</category><category>GKE</category><category>Networking</category><category>Developers &amp; Practitioners</category><category>Containers &amp; Kubernetes</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Introducing multi-cluster GKE Inference Gateway: Scale AI workloads around the world</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/containers-kubernetes/multi-cluster-gke-inference-gateway-helps-scale-ai-workloads/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Arman Rye</name><title>Senior Product Manager</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Andres Guedez</name><title>Senior Staff Software Engineer</title><department></department><company></company></author></item><item><title>The AI-native core: Highly resilient telco architecture using Google Kubernetes Engine</title><link>https://cloud.google.com/blog/products/networking/gke-for-telco-building-a-highly-resilient-ai-native-core/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;The telecommunications industry has reached a critical tipping point. Traditional, on-premises-heavy data center models are struggling under the weight of escalating infrastructure costs and underutilization caused by availability and compliance requirements. But the AI era demands exponential scale and beyond-nines reliability. 
The question for operators is no longer &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;if&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; they should modernize, but which architectural path will help them do that fastest.&lt;/span&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Modernization isn't a "rip and replace" event; it’s a strategic choice. Today, we’re showcasing how &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Google Kubernetes Engine (GKE)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; can serve as a high-performance foundation for two versatile deployment strategies: &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;cloud-centric evolution&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;strategic hybrid modernization&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3 style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;The two paths to network modernization&lt;/span&gt;&lt;/h3&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Every operator has a unique appetite for risk, regulatory landscape, and investment base, with some prioritizing agility and others emphasizing local control. You can use GKE to support both approaches:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;1. Cloud-centric modernization: Agility at scale&lt;/strong&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;This path is for operators looking to fully harness the cloud's elasticity. Whether you’re migrating your own containerized network functions (CNFs) or building a cloud-native service like &lt;/span&gt;&lt;a href="https://www.ericsson.com/en/core-network/on-demand" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Ericsson-on-Demand&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, the goal is the same: move the heavy lifting to Google Cloud.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;The benefit:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; By running mission-critical workloads like &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Voice Core&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; or &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Policy Control Functions&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; on Google's global fiber backbone, operators can scale instantly for peak events and move toward "zero-human-touch" operations.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;The economics:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Transition from heavy upfront CAPEX to a "pay-as-you-grow" model. You no longer need to over-provision hardware that sits idle; the cloud absorbs the bursts for you.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Time to market:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Launch new services like fixed wireless access, IoT, and private 5G faster.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p style="text-align: justify;"&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt;2. Strategic hybrid modernization: Cloud agility, local control&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;For many telcos, a hybrid approach offers a better balance. Here, operators can selectively move agile control plane components and data analytics to the cloud while keeping latency-sensitive user-plane functions on premises or at the edge.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;The benefit:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Optimize for ultra-low latency and meet strict data sovereignty requirements by keeping data plane traffic local, while still gaining the AI-driven insights and orchestration power of the cloud.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;The versatility:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Using GKE, you can run your control plane workloads in the cloud and data plane services directly in your own data centers or at the network edge, enjoying a unified operational model across your environments.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Engineering the "telco-grade" foundation&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today, we are proud to showcase how GKE has evolved into the industry's most specialized platform for containerized network functions (CNFs), backed by massive momentum from operators and equipment vendor partners.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/5g_workload_optimized_infrastructure.max-1000x1000.png"
        
          alt="5g workload optimized infrastructure"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;It’s achieved this thanks to a variety of capabilities.&lt;/span&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Connectivity and isolation&lt;/strong&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Standard Kubernetes wasn't designed for the complex traffic separation that telcos require. GKE bridges this gap with:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Multi-networking API:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; A native Kubernetes way to manage multiple interfaces per Pod, bringing standard Network Policies to every interface.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Simulated L2 networking:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; A "migration superpower" that allows legacy applications to maintain their Layer-2 operational model while running on a modern cloud-native stack.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;The telco CNI:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Support for &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/multus-ipvlan-whereabouts"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Multus, IPvlan, and Whereabouts&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; on specialized Ubuntu images. This allows operators to isolate management, control, and user planes with surgical precision.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Persistent reachability&lt;/strong&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;In a world of ephemeral containers, telco functions need stability. GKE enables this through:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;GKE IP route:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; We’ve integrated equal-cost multi-path (ECMP)-like functionality directly into the GKE dataplane. If a workload fails, it is automatically and rapidly removed from the service path, providing high availability without complex external router configurations.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Persistent IP:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; GKE provides static IP support, which isn't available on standard Kubernetes, so that 5G core functions remain consistently reachable across their lifecycle without NAT.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
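&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a rough illustration of ECMP-style behavior, the Python sketch below hashes a flow's 5-tuple onto a set of healthy next hops and drops a failed workload from that set. It is a generic textbook-style model, not the GKE dataplane implementation. The function names, the Flow tuple layout, and the sample addresses are assumptions made for this example.&lt;/span&gt;&lt;/p&gt;

```python
import hashlib
from typing import List, Optional, Sequence, Tuple

# Assumed flow key layout: (src_ip, src_port, dst_ip, dst_port, protocol)
Flow = Tuple[str, int, str, int, str]


def pick_next_hop(flow: Flow, healthy_hops: Sequence[str]) -> Optional[str]:
    """Hash the flow's 5-tuple onto one of the healthy next hops, so packets
    of the same flow follow the same path while the hop set is unchanged."""
    if not healthy_hops:
        return None
    key = "|".join(str(field) for field in flow).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(healthy_hops)
    return healthy_hops[index]


def remove_failed(hops: Sequence[str], failed: str) -> List[str]:
    """Drop a failed workload from the service path."""
    return [hop for hop in hops if hop != failed]
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;One design note: simple modulo hashing like this can remap some unaffected flows when the hop set changes. Techniques such as consistent hashing reduce that churn, at the cost of more bookkeeping.&lt;/span&gt;&lt;/p&gt;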
&lt;p style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Sub-second convergence&lt;/strong&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;For telcos, every millisecond of downtime is a lost connection. With &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;HA Policy&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, GKE’s dataplane is optimized for near-zero downtime through &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;ultra-fast failure detection and convergence&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, and operators can choose between self-managed recovery and fully Google-managed failure detection.&lt;/span&gt;&lt;/p&gt;
&lt;h3 style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Shifting from "saving" to "solving" with AI&lt;/span&gt;&lt;/h3&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;For operators, the ultimate goal of modernization is to transition to an &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;autonomous network&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. By running core network functions on a platform adjacent to Google Cloud AI and data platforms such as &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Vertex AI&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;BigQuery&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, they can turn telemetry into actionable changes that optimize the network. Use cases and benefits that modernization enables include:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Predictive AIOps:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Use AI to identify performance degradation and trigger automated healing before a call ever drops. Use the cloud for on-demand burst capacity during sporting events or service launches. Or use the data from your GKE-hosted 5G core to fuel AI-powered automation that anticipates issues before they impact subscribers.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Intent-driven programmability:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Shift away from expensive, reactive operations and cut setup times for new deployments from several weeks to a couple of hours.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation" style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Monetize insights:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Leverage AI on cloud-native data to identify and capture entirely new revenue opportunities in addition to rightsizing your networks.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Your journey, your terms&lt;/span&gt;&lt;/h3&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;The future of telco is intelligent, resilient, and incredibly flexible. Whether you are taking your first step into a hybrid deployment or launching a fully cloud-hosted core, Google Cloud is your strategic partner. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Join us at MWC: Visit booth #2H40 in Hall 2 to see these solutions in action, including live demonstrations of mobile core running on GKE.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 04 Mar 2026 08:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/networking/gke-for-telco-building-a-highly-resilient-ai-native-core/</guid><category>Containers &amp; Kubernetes</category><category>GKE</category><category>Telecommunications</category><category>Networking</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>The AI-native core: Highly resilient telco architecture using Google Kubernetes Engine</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/networking/gke-for-telco-building-a-highly-resilient-ai-native-core/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Abhi Maras</name><title>Senior Product Manager, Google Cloud</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Maciej Skrocki</name><title>Software Engineer, Google Cloud</title><department></department><company></company></author></item><item><title>Designing private network connectivity for RAG-capable gen AI apps</title><link>https://cloud.google.com/blog/products/networking/design-private-connectivity-for-rag-ai-apps/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The flexibility of Google Cloud allows enterprises to build secure and reliable architecture for their AI workloads. 
In this blog we will look at a reference architecture for &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/architecture/private-connectivity-rag-capable-gen-ai"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;private connectivity for retrieval-augmented generation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (RAG)-capable generative AI applications. This architecture is for scenarios where communications of the overall system must use private IP addresses and must not traverse the internet.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;The power of RAG&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;RAG is a powerful technique used to optimize the output of large language models (LLMs) by grounding them in specific, authoritative knowledge bases outside of their original training data. RAG allows an application to retrieve relevant information from your documents, data sources, or databases in real time. This retrieved context is then provided to the model alongside the user’s query, helping to ensure that the AI’s responses are accurate, verifiable, and highly relevant to your business. This improves the quality of responses and reduces hallucinations.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This approach is helpful because it allows you to direct generative AI to use a designated source of truth, rather than relying solely on the model's pre-existing knowledge, and without needing to retrain or fine-tune the model itself. &lt;/span&gt;&lt;/p&gt;
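&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The retrieve-then-prompt flow described above can be sketched in a few lines of Python. This is a toy example under loud assumptions: it ranks documents by shared words instead of the vector similarity search a production RAG system would typically use, it does not call any model, and the function names, sample documents, and prompt wording are all hypothetical.&lt;/span&gt;&lt;/p&gt;

```python
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase and split on whitespace, stripping simple punctuation."""
    return {word.strip(".,?!").lower() for word in text.split()}


def retrieve(query: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Rank documents by how many query words they share (a stand-in for
    the similarity search a real retrieval system would perform)."""
    query_words = tokenize(query)
    scored = [(len(query_words.intersection(tokenize(doc))), doc) for doc in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:top_k] if score > 0]


def build_prompt(query: str, context: List[str]) -> str:
    """Place the retrieved context alongside the user's query."""
    joined = "\n".join("- " + passage for passage in context)
    return "Answer using only this context:\n" + joined + "\n\nQuestion: " + query
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The model itself is never retrained in this flow. Only the prompt changes, which is why the designated source of truth can be updated independently of the model.&lt;/span&gt;&lt;/p&gt;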
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Design pattern example&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To see how to set up private connectivity for a RAG application in a regional design, let's look at the design pattern.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The setup comprises an &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;external network&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (on-prem and other clouds) and &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Google Cloud environments&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; consisting of a &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;routing project&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, a &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Shared VPC host project for RAG&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, and three specialized service projects: &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;data ingestion&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;serving&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;frontend&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This design utilizes the following services to provide an end-to-end solution:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/network-connectivity/docs/interconnect/concepts/overview"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Interconnect&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; or &lt;/strong&gt;&lt;a href="https://docs.cloud.google.com/network-connectivity/docs/vpn/concepts/topologies#vpn-overview"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud VPN&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; To securely connect from your on-premises or other clouds to the routing VPC network&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/network-connectivity/docs/network-connectivity-center/concepts/overview"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Network Connectivity Center&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Used as an orchestration framework to manage connectivity between the routing VPC network and the RAG VPC network via VPC spokes and hybrid spokes&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/network-connectivity/docs/router/concepts/overview"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Router&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; In the routing project, facilitates dynamic BGP route exchange between the external network and Google Cloud&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/vpc/docs/private-service-connect"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Private Service Connect&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Provides a private endpoint in the routing VPC network to reach the Cloud Storage bucket for data ingestion without traversing the public internet&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://docs.cloud.google.com/vpc/docs/shared-vpc"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Shared VPC&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Host project architecture that allows multiple service projects to use a common, centralized VPC network&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Google &lt;/strong&gt;&lt;a href="https://docs.cloud.google.com/armor/docs/cloud-armor-overview"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Armor&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; and Application &lt;/strong&gt;&lt;a href="https://docs.cloud.google.com/load-balancing/docs/application-load-balancer"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Load Balancer&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Placed in the frontend service project to provide security and traffic management for user interaction&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/security/vpc-service-controls"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;VPC Service Controls&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Creates a managed security perimeter around all resources to mitigate data exfiltration risks&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1-rag-gen-ai.max-1000x1000.png" alt="1-rag-gen-ai"&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;The traffic flow &lt;/strong&gt;&lt;/h3&gt;
&lt;h4&gt;&lt;strong style="vertical-align: baseline;"&gt;RAG population flow&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In the diagram, the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;green dashed line&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; shows the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;RAG population flow&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, which describes how data travels from data engineers to vector storage.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;From the external network, data travels over Cloud Interconnect or Cloud VPN.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;In the routing project, the data uses the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Private Service Connect endpoint&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to reach the Cloud Storage bucket.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;From the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Storage bucket&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; in the Data Ingestion service project, the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;data ingestion subsystem&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; processes the raw data. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The AI model creates vectors from the resulting chunks and returns them to the data ingestion subsystem, which writes them to the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;RAG datastore&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; in the serving service project.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong style="vertical-align: baseline;"&gt;Inference flow&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In the diagram, the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;orange dashed line&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; shows the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;inference flow&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, which describes customer or user requests.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The request travels over Cloud Interconnect or Cloud VPN to the routing VPC network and then over the VPC spoke to the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;RAG VPC network&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The request reaches the Application Load Balancer, which is protected by &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Armor&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. Once the request is allowed, the load balancer passes it to the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;frontend subsystem&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The frontend subsystem forwards the request to the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;serving subsystem&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, which augments the prompt with data from the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;RAG datastore&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; and generates a response via the AI model.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The grounded response is then returned along the same path to the requestor.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong style="vertical-align: baseline;"&gt;Management and routing&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In the diagram, the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;blue dotted lines&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; represent the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Network Connectivity Center hybrid and VPC spokes&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; that manage the control plane and route orchestration between the routing network and the RAG VPC network. This ensures that routes learned from the external network are appropriately propagated across the environment.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Please read the entire architecture document &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/architecture/private-connectivity-rag-capable-gen-ai"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Private connectivity for RAG-capable generative AI applications&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to understand the specifics, including IAM permissions, VPC Service Controls, and deployment considerations.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Next steps&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Take a deeper dive into the Cross-Cloud Network, and other guides about generative AI with RAG:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Document set: &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/architecture/rag-reference-architectures"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Generative AI with RAG&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Document: &lt;/span&gt;&lt;a href="https://cloud.google.com/architecture/ccn-distributed-apps-design"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cross-Cloud Network for distributed applications &lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Blog: &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/build-your-first-adk-agent-workforce?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Build Your First ADK Agent Workforce&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Want to ask a question, find out more, or share a thought? Please connect with me on &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/ammett/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Mon, 02 Mar 2026 17:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/networking/design-private-connectivity-for-rag-ai-apps/</guid><category>AI &amp; Machine Learning</category><category>Hybrid &amp; Multicloud</category><category>Developers &amp; Practitioners</category><category>Networking</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/0-rag-hero.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Designing private network connectivity for RAG-capable gen AI apps</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/0-rag-hero.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/products/networking/design-private-connectivity-for-rag-ai-apps/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Ammett Williams</name><title>Developer Relations Engineer</title><department></department><company></company></author></item><item><title>Firefly: Illuminating the path to nanosecond-level clock sync in the data center</title><link>https://cloud.google.com/blog/products/networking/understanding-the-firefly-clock-synchronization-protocol/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;From the high-frequency trading floors of Wall Street to the orchestration of cloud data centers, the ability to synchronize events with nanosecond accuracy is critical. 
Yet, achieving this level of temporal precision across thousands of interconnected devices in a modern data center is fraught with challenges like clock drift, network jitter, and path asymmetries. And doing so on cloud-hosted infrastructure has traditionally been impossible, preventing a certain class of applications from running there. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This is where Firefly, a clock synchronization system developed by researchers and engineers at Google, comes in. Firefly isn't just a clock synchronization protocol; it's a software-driven approach that combines theoretical insights and practical engineering to deliver ultra-accurate, scalable, and cost-effective time synchronization on commodity hardware within a demanding data center environment.&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;The nanosecond race: Why precise timing matters&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Precise clock synchronization is the foundation of distributed systems. It is non-negotiable in financial exchanges, where regulatory requirements mandate sub-100µs external synchronization to Coordinated Universal Time, or UTC, and fairness demands sub-10ns internal clock synchronization. In high-frequency trading, a minuscule timing advantage can translate to significant financial gains, making accurate timestamping critical for market integrity. Beyond finance, numerous data center operations, including database consistency, distributed logging, virtual machine management, and network telemetry, rely on accurate temporal ordering of events. And as data centers scale, the need for a robust, scalable synchronization solution becomes even more important.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;But achieving nanosecond-level synchronization in a dynamic data center environment is difficult. Several factors conspire to undermine precision:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Clock drift:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Crystal oscillators, which are fundamental to the clocks in commodity hardware, have inherent imperfections that cause them to gradually deviate over time. Deviations that were once considered minor become substantial when the target is sub-10ns synchronization.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Jitter:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Network components such as switches and network interface cards (NICs) introduce unpredictable delays. These delays, often stemming from queuing in network buffers or the intricate processing of packets, can manifest as jitter, disrupting the timing of synchronization messages.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Asymmetry:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The network path between two devices is rarely symmetrical. Differences in cable lengths, the number of hops, or the internal workings of network equipment can cause signals to take different amounts of time to travel in opposite directions. This asymmetry can introduce significant errors when estimating one-way delays and clock offsets.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Scalability:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; As data centers expand to house tens of thousands of servers, any synchronization solution must be able to scale efficiently without becoming a bottleneck or requiring disproportionate resources.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Fault tolerance:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; In a distributed system, failures are inevitable. A synchronization protocol must be resilient to the loss or misbehavior of individual nodes or network links, so that the overall synchronization accuracy is not compromised.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
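&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the asymmetry problem concrete, here is the standard two-way exchange arithmetic used by probe-based synchronization in general (a textbook derivation, not taken from the Firefly paper). The symbols are assumptions introduced for illustration: clock A sends a probe at t_1, clock B receives it at t_2 and replies at t_3, A receives the reply at t_4, d_fwd and d_rev are the one-way delays, and theta is B's true offset from A.&lt;/span&gt;&lt;/p&gt;

```latex
% t_1 and t_4 are read on clock A; t_2 and t_3 are read on clock B.
\[
  t_2 = t_1 + d_{\mathrm{fwd}} + \theta, \qquad
  t_4 = t_3 + d_{\mathrm{rev}} - \theta
\]
% Solving for the offset under the (usually false) assumption d_fwd = d_rev:
\[
  \hat{\theta} = \frac{(t_2 - t_1) - (t_4 - t_3)}{2}
               = \theta + \frac{d_{\mathrm{fwd}} - d_{\mathrm{rev}}}{2}
\]
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The estimate is therefore wrong by half the path asymmetry. Assuming roughly 5 ns of propagation delay per meter of fiber, a 2-meter difference between the two directions produces about 10 ns of asymmetry and about 5 ns of offset error, which already consumes half of a 10 ns budget.&lt;/span&gt;&lt;/p&gt;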
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Firefly: Bridging software and theory&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Firefly uses a multi-faceted strategy to tackle these challenges, distinguishing itself from prior synchronization protocols. Its core innovations lie in its architectural design and its theoretical underpinnings.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/1-architecture_v1.jpg" alt="1-architecture"&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;1. &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Layered synchronization:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Firefly employs a novel layered synchronization technique. Instead of relying on a central clock, which can be a single point of failure or introduce delays, it first establishes tight internal synchronization among NICs within the data center. Each NIC in the network constantly communicates with a set of its peers, comparing times and making adjustments. From this "swarm" of devices emerges a highly stable and accurate consensus time that the entire group agrees upon. This internal synchronization is rapid and robust, and it is effectively shielded from external timing disturbances. Concurrently, Firefly synchronizes the entire swarm to UTC. Decoupling these two processes is crucial, as it prevents external factors like time-server jitter or drift from directly impacting internal synchronization.&lt;/span&gt;&lt;/p&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;2. &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Distributed consensus over random graphs:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Unlike traditional hierarchical approaches that can be brittle and susceptible to single points of failure, Firefly uses a distributed consensus algorithm built on a d-regular random graph. This means each NIC communicates with a randomly selected set of 'd' peers. Theoretical analysis, as presented in &lt;/span&gt;&lt;a href="https://dl.acm.org/doi/10.1145/3718958.3750502" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;the Firefly research paper&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, demonstrates that such random graphs offer significant advantages:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Faster convergence: Random graphs promote a more rapid dissemination of clock information across the network, leading to quicker synchronization.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Scalability: The theoretical bounds show that random graphs can maintain synchronization accuracy even as the size of the network grows, provided the number of peers ('d') scales logarithmically with the total number of nodes.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Resilience to asymmetry: The diverse probing paths inherent in random graphs help to average out and mitigate the impact of path asymmetries.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
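&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The idea of consensus over a d-regular random graph can be illustrated with a toy simulation. This is a sketch under stated assumptions, not Firefly's actual protocol: the graph is built as a union of random rings, each node applies a plain averaging rule with its peers, and the node count, degree, and initial clock errors are made-up values.&lt;/span&gt;&lt;/p&gt;

```python
import random


def random_regular_graph(n, d, rng):
    """Build a d-regular graph as the union of d // 2 random rings.

    Retries until no edge repeats, so every node ends up with exactly d peers.
    This is one simple construction, not a uniform sample of all d-regular graphs.
    """
    assert d % 2 == 0 and n > d
    while True:
        peers = {i: set() for i in range(n)}
        simple = True
        for _ in range(d // 2):
            ring = list(range(n))
            rng.shuffle(ring)
            for i in range(n):
                a, b = ring[i], ring[(i + 1) % n]
                if b in peers[a]:
                    simple = False  # duplicate edge: discard this attempt
                peers[a].add(b)
                peers[b].add(a)
        if simple:
            return peers


def consensus_round(offsets, peers):
    """Each node replaces its offset with the average of itself and its peers."""
    return [
        (offsets[i] + sum(offsets[j] for j in peers[i])) / (len(peers[i]) + 1)
        for i in range(len(offsets))
    ]


def spread(offsets):
    """Largest disagreement between any two clocks."""
    return max(offsets) - min(offsets)


rng = random.Random(7)
peers = random_regular_graph(64, 4, rng)
# Made-up initial clock errors in nanoseconds.
initial_offsets = [rng.uniform(-500.0, 500.0) for _ in range(64)]

final_offsets = list(initial_offsets)
for _ in range(100):
    final_offsets = consensus_round(final_offsets, peers)

print("spread before:", round(spread(initial_offsets), 3), "ns")
print("spread after: ", round(spread(final_offsets), 3), "ns")
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Because every node has the same degree, this averaging rule preserves the mean of all clocks while shrinking their disagreement, and no node acts as a central time source.&lt;/span&gt;&lt;/p&gt;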
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;3. &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Mitigating jitter and asymmetry in practice: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Beyond the theoretical advantages of random graphs, Firefly incorporates practical techniques to further refine accuracy:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;RTT filtering: By analyzing round-trip time (RTT) measurements, Firefly can identify and discard probe samples that are likely affected by queuing jitter, thereby improving the accuracy of delay estimations.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Path profiling: Firefly actively probes network paths to identify and favor those with minimal asymmetry. This proactive approach helps to select the most reliable paths for synchronization.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Leveraging hardware: Where available, Firefly can utilize features like &lt;/span&gt;&lt;a href="https://docs.commscope.com/bundle/fastiron-10010-managementguide/page/GUID-A2A87D89-1224-4694-817A-D91F70D5F850.html" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Transparent Clock (TC)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in network switches to accurately account for in-switch delays, further reducing measurement error.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
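&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;RTT filtering can be sketched as follows. This is a generic minimum-RTT filter of the kind used with two-way probes, not Firefly's actual filter: the four-timestamp probe layout, the 20 ns margin, and the sample values are assumptions chosen for illustration (true offset of 100 ns, 1,000 ns one-way delay, with queuing added to two of the probes).&lt;/span&gt;&lt;/p&gt;

```python
from statistics import median


def rtt(probe):
    """Round-trip time with the responder's processing time removed."""
    t1, t2, t3, t4 = probe
    return (t4 - t1) - (t3 - t2)


def offset(probe):
    """Two-way offset estimate; exact only when both directions take equally long."""
    t1, t2, t3, t4 = probe
    return ((t2 - t1) + (t3 - t4)) / 2


def filtered_offset(probes, margin_ns=20):
    """Keep only probes whose RTT is close to the minimum, then take the median offset."""
    floor = min(rtt(p) for p in probes)
    kept = [p for p in probes if floor + margin_ns >= rtt(p)]
    return median(offset(p) for p in kept)


# (t1, t2, t3, t4) in ns: t1 and t4 on the sender's clock, t2 and t3 on the peer's clock.
probes = [
    (0, 1100, 1150, 2050),          # clean probe
    (10000, 11500, 11550, 12450),   # 400 ns of queuing on the forward path
    (20000, 21100, 21150, 22050),   # clean probe
    (30000, 31100, 31150, 32350),   # 300 ns of queuing on the reverse path
]

print([offset(p) for p in probes])  # prints [100.0, 300.0, 100.0, -50.0]
print(filtered_offset(probes))      # prints 100.0
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Queuing can only add delay, so probes with inflated RTTs are the ones most likely to carry a biased offset; discarding them leaves the samples that reflect the uncongested path.&lt;/span&gt;&lt;/p&gt;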
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;4. &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Robustness and fault tolerance:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Firefly’s use of distributed consensus, combined with its averaging mechanisms, makes it inherently resilient to failures. By not relying on a single time server or a fixed hierarchical structure, the system can gracefully handle the loss or misbehavior of individual nodes.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Performance in the real world&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The results discussed in our &lt;/span&gt;&lt;a href="https://dl.acm.org/doi/10.1145/3718958.3750502" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Firefly research paper&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; are compelling:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Internal synchronization:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Firefly consistently achieves sub-10ns NIC-to-NIC synchronization when used in conjunction with Google's latest data center fabric technology. This can be used to determine the order of events such as packets, logs, and remote procedure calls (RPCs) across machines.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;External synchronization:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The system also delivers significantly better synchronization to UTC than the 100µs regulatory requirement for financial exchanges.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2-graph_h5KX17K.max-1000x1000.jpg" alt="2-graph"&gt;
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="ry130"&gt;The offset between a pair of clocks that are six hops away in a Firefly-synced network, measured by an oscilloscope via 1 pulse per second.&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The accompanying video illustrates the accuracy of NIC-to-NIC synchronization, as quantified by an oscilloscope utilizing a one-pulse-per-second (1PPS) signal from the NICs. Each row corresponds to a NIC clock, with the rising edge indicating the precise moment the NIC clock attains an integer second. The oscilloscope observations confirm that all measured NICs exhibit close synchronization, maintaining alignment within a few nanoseconds.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=KB3z34OO9QU"
      data-glue-modal-trigger="uni-modal-KB3z34OO9QU-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault_GLx4Roj.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Firefly: Sub-10ns NIC-to-NIC clock synchronization for datacenters&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;


&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These results are particularly impressive given that Firefly operates purely in software on commodity hardware, avoiding the need for expensive, specialized synchronization equipment. This makes ultra-accurate time synchronization accessible to a broader range of data center applications.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;A foundation for future applications&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Firefly's success in delivering nanosecond-level accuracy in a scalable and cost-effective manner has far-reaching implications:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Democratizing high-precision timing: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Firefly allows cloud-hosted financial services, which traditionally rely on expensive dedicated hardware, to achieve the required precision using standard cloud infrastructure.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Enabling new applications:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The availability of precise, synchronized clocks across data center devices can unlock new possibilities in areas like fine-grained network telemetry and congestion control, time-coordinated distributed systems, and deterministic fabric for ML workloads.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Transforming data center operations:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; By creating a tightly integrated and precisely timed computing entity, Firefly can enhance data centers’ overall efficiency, reliability, and performance.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In conclusion, Firefly represents a significant advancement in the field of clock synchronization. By ingeniously combining theoretical insights into graph theory and consensus algorithms with practical network engineering techniques, it overcomes the long-standing challenges of achieving nanosecond-level precision in complex, distributed environments. As data centers continue to evolve, systems like Firefly will be instrumental in building the high-performance, reliable, and fair infrastructure of the future.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-aside"&gt;&lt;dl&gt;
&lt;/dl&gt;&lt;/div&gt;</description><pubDate>Mon, 23 Feb 2026 17:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/networking/understanding-the-firefly-clock-synchronization-protocol/</guid><category>Infrastructure</category><category>Systems</category><category>Networking</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Firefly: Illuminating the path to nanosecond-level clock sync in the data center</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/networking/understanding-the-firefly-clock-synchronization-protocol/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Rohit Dalal</name><title>Product Manager, Google</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Yuliang Li</name><title>Software Engineer</title><department></department><company></company></author></item><item><title>Google Distributed Cloud brings public-cloud-like networking to air-gapped environments</title><link>https://cloud.google.com/blog/products/networking/google-distributed-cloud-gdc-air-gapped-1-15-networking/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Organizations in highly regulated industries often struggle to balance the rigid security of air-gapped environments with the need for the agility and flexibility that the cloud provides. 
To address this, Google Distributed Cloud (GDC) air-gapped 1.15 introduces new networking features in preview that give you more direct control and visibility without compromising your security posture, as well as a new IPAM feature in general availability that simplifies &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/distributed-cloud/hosted/docs/latest/gdch/platform/pa-user/subnets-overview#subnet-groups"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;subnet management&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. These preview features are Cloud NAT, enhanced connectivity for standard clusters, and advanced HTTP and HTTPS health checks in load balancers. Together, they make it easier for you to manage complex workloads in a secure environment. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Manage outbound traffic with Cloud NAT&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud NAT for GDC air-gapped replaces previous egress solutions and gives you more control over how your instances communicate with other networks, on par with public cloud functionality. Cloud NAT provides several benefits:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Configurable egress IPs:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; You can assign and manage multiple egress IP addresses for your outbound traffic so you can identify exactly which workloads are communicating.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Customizable timeouts:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Manage connection lifecycles by adjusting timeouts for different types of traffic.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Granular control:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Administrators can create specific subnets for egress IPs, while application operators define how pods and VMs route their traffic.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Connect standard clusters directly to your organization&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In a secure environment, isolation should not result in disconnected silos. With the latest release, standard clusters include networking updates that help you communicate across your organization while maintaining strict security boundaries, helping you manage your environment more effectively. The updates include:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Direct pod communication:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Your standard cluster pods can now communicate directly with workloads in your organization’s Default VPC. This simplifies how you connect standard clusters and shared clusters.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Flexible firewall policies:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; You can use both Project Network Policy and Kubernetes Network Policy APIs to set granular rules for traffic entering and leaving your pods and nodes.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Managed load balancing:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; You can create internal and external load balancers using standard Kubernetes Service APIs, while GDC manages the underlying configuration for you.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Pods within a standard cluster can now connect to other pods directly or through a ClusterIP. While traffic to the Infra VPC remains blocked, you can send traffic to shared cluster workloads through GDC internal load balancers. This ensures your applications can reach necessary services quickly.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Improve reliability with Load Balancer HTTP and HTTPS health checks&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Previously, L4 load balancing health checks only monitored basic TCP connectivity, only confirming if a port was open. GDC air-gapped load balancers now support HTTP and HTTPS health checks, which allow you to verify if an application is actually functioning correctly. By checking status codes and response content, you can:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Confirm application health: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Verify that services are responding correctly, not just that the server is powered on.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Increase reliability:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Automatically detect and route traffic away from applications experiencing internal errors.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Improve visibility:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Access better data regarding the health of your VM-based workloads to manage performance before issues arise.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
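&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The difference between the two kinds of check can be sketched in a few lines of Python. This is only an illustration of the idea, not the GDC health check API: the function names, the expected 200 status, and the "ok" body marker are all assumptions made for the example.&lt;/span&gt;&lt;/p&gt;

```python
# Hypothetical sketch: names and defaults are illustrative, not a GDC API.

def tcp_check(port_open):
    """L4-style check: healthy as soon as the port accepts connections."""
    return bool(port_open)


def http_check(port_open, status_code, body, expected_text="ok"):
    """L7-style check: also require a 200 status and an expected body marker."""
    if not port_open:
        return False
    return status_code == 200 and expected_text in body


# A backend whose process is up but failing internally.
backend = {"port_open": True, "status_code": 500, "body": "internal error"}

print(tcp_check(backend["port_open"]))  # True: the port is open
print(http_check(**backend))            # False: the application is unhealthy
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this sketch, the TCP-style check would keep sending traffic to the failing backend, while the HTTP-style check would mark it unhealthy.&lt;/span&gt;&lt;/p&gt;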
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Make subnet management easier with subnet groups&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Previously, a child subnet could only reference a single parent subnet. With the introduction of the subnet group, a child subnet can now reference a subnet group that may contain multiple parent subnets. This provides the following benefits:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Overcome the challenges of immutable subnet CIDR: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;While subnet CIDR range is immutable, subnet group simplifies scaling up IP resources by attaching a new subnet to a subnet group. You can reference a subnet group instead of a single parent subnet for easy scale-up.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Automatically identify a parent subnet:&lt;/strong&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Now you can reference a subnet group as parent rather than as a single subnet. By using a subnet group in this way, you don't need to manually identify a parent subnet that has available IP resources: inste&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;ad, GDC IPAM automatically finds a subnet in the subnet group with enough available IP space as its parent.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Start with smaller CIDRs for easier planning&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Using subnet groups to scale IP resources also means that you can start with smaller and discontinuous CIDRs when creating new parent subnets, making IP resource utilization more efficient and the planning process easier.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
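&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the parent-selection idea concrete, here is a minimal Python sketch built on the standard ipaddress module. It is not the GDC IPAM implementation: the pick_parent function, the first-fit strategy, and the example CIDRs are assumptions chosen for illustration.&lt;/span&gt;&lt;/p&gt;

```python
import ipaddress


def pick_parent(subnet_group, allocated, prefix_len):
    """Return (parent, child) for the first parent with a free block, else None.

    Hypothetical first-fit search; a real IPAM may choose differently.
    """
    for parent in subnet_group:
        try:
            for candidate in parent.subnets(new_prefix=prefix_len):
                if not any(candidate.overlaps(used) for used in allocated):
                    return parent, candidate
        except ValueError:
            # The requested block is larger than this parent; try the next one.
            continue
    return None


# A subnet group with two non-contiguous parents; the first is fully allocated.
group = [
    ipaddress.ip_network("10.0.0.0/24"),
    ipaddress.ip_network("10.20.0.0/24"),
]
in_use = [
    ipaddress.ip_network("10.0.0.0/25"),
    ipaddress.ip_network("10.0.0.128/25"),
]

# The child is carved from the second parent without naming that parent.
print(pick_parent(group, in_use, 26))
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this sketch, scaling up amounts to appending another network to the group list; callers that reference the group do not change.&lt;/span&gt;&lt;/p&gt;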
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Get started&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To learn more about these features, please refer to our &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/distributed-cloud/hosted/docs/latest/gdch/platform/pa-user/networking-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; or contact your Google Cloud account team.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 10 Feb 2026 17:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/networking/google-distributed-cloud-gdc-air-gapped-1-15-networking/</guid><category>Hybrid &amp; Multicloud</category><category>Networking</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Google Distributed Cloud brings public-cloud-like networking to air-gapped environments</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/networking/google-distributed-cloud-gdc-air-gapped-1-15-networking/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Michael Yitayew</name><title>Product Manager</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Philip Bai</name><title>Product Manager</title><department></department><company></company></author></item><item><title>A gRPC transport for the Model Context Protocol</title><link>https://cloud.google.com/blog/products/networking/grpc-as-a-native-transport-for-mcp/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AI agents are moving from test environments to the core of enterprise operations, where they must interact reliably with external tools and systems to execute complex, multi-step goals. 
The &lt;/span&gt;&lt;a href="https://modelcontextprotocol.io/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Model Context Protocol (MCP)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is the standard that makes this agent to tool communication possible. In fact, just last month we announced the release of &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/announcing-official-mcp-support-for-google-services?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;fully-managed, remote MCP servers&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Developers can now simply point their AI agents or standard MCP clients like Gemini CLI to a globally-consistent and enterprise-ready endpoint for Google and Google Cloud services.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;MCP uses &lt;/span&gt;&lt;a href="https://www.jsonrpc.org/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;JSON-RPC&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; as its standard transport. This brings many benefits as it combines an action-oriented approach with natural language payloads that can be directly relayed by agents in their communication with foundational models. Yet many organizations rely on &lt;/span&gt;&lt;a href="https://grpc.io/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;gRPC&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, a high-performance, open source implementation of the remote procedure call (RPC) model. Enterprises that have adopted the gRPC framework must adapt their tooling to be compatible with the JSON-RPC transport used by MCP. Today, these enterprises need to deploy transcoding gateways to translate between JSON-RPC MCP requests and their existing gRPC-based services. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span style="vertical-align: baseline;"&gt;An interesting alternative to MCP transcoding is to use gRPC as the custom transport for MCP. Many gRPC users are actively experimenting with this option by implementing their own custom MCP servers. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;At Google Cloud, we use gRPC extensively to enable services and offer APIs at a global scale, and we’re committed to sharing the technology and expertise that has resulted from this pervasive use of gRPC. Specifically, we’re committed to supporting gRPC practitioners in their journey to adopt MCP in production, and &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;we’re actively working with the MCP community to explore mechanisms to support gRPC as a transport for MCP. The MCP core maintainers have arrived at an &lt;/span&gt;&lt;a href="https://blog.modelcontextprotocol.io/posts/2025-12-19-mcp-transport-future/#official-and-custom-transports" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;agreement to support pluggable transports in the MCP SDK&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and in the near future, Google Cloud will contribute and distribute a gRPC transport package to be plugged into the MCP SDKs. A community-backed transport package will enable gRPC practitioners to deploy MCP with gRPC in a consistent and interoperable manner.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span style="vertical-align: baseline;"&gt;The  use of gRPC as a transport avoids the need for transcoding and helps maintain operational consistency for environments that are actively using gRPC. In the rest of this post, we explore the benefits of using gRPC as a  transport for MCP and how Google Cloud is supporting this journey.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;The choice of RPC transport&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span&gt;&lt;span style="vertical-align: baseline;"&gt;For organizations already using gRPC for their services, gRPC support allows them to continue to use their existing tooling to access services via MCP without altering the services or implementing transcoding proxies. These organizations are on a journey to keep the benefits of gRPC as MCP becomes the mechanism for agents to access services.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;“Because gRPC is our standard protocol in the backend, we have invested in experimental support for MCP over gRPC internally. And we already see the benefits: ease of use and familiarity for our developers, and reducing the work needed to build MCP servers by using the structure and statically typed APIs.”&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; -  &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Stefan Särne, Senior Staff Engineer and Tech Lead for Developer Experience, Spotify &lt;/strong&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Benefits of gRPC&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span&gt;&lt;span style="vertical-align: baseline;"&gt;Using gRPC as a transport aligns MCP with the best practices of modern gRPC-based distributed systems, improving performance, security, operations, and developer productivity.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Performance and efficiency&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The performance advantages of gRPC provide a big boost in efficiency, thanks to the following attributes:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Binary encoding (protocol buffers)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: gRPC uses protocol buffers (Protobufs) for binary encoding, shrinking message sizes by up to 10x compared to JSON. This means less bandwidth consumption and faster serialization/deserialization, which translates to lower latency for tool calls, reduced network costs, and a much smaller resource footprint.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Full duplex bidirectional streaming&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: gRPC natively supports the client (the agent) and the server (the tool), sending continuous data streams to each other simultaneously over a single, persistent connection. This feature is a game-changer for agent-tool interaction, opening the door to truly interactive, real-time agentic workflows without requiring application-level connection synchronization. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Built-in flow control (backpressure)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: gRPC includes native flow control to prevent a fast-sending tool from overwhelming the agent.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Enterprise-grade security and authorization&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;gRPC treats security as a first-class citizen, with enterprise-grade features built directly into its core, including:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Mutual TLS (mTLS)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Critical for Zero Trust architectures, mTLS authenticates both the client and the gRPC-powered server, preventing spoofing and helping to ensure only trusted services communicate.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Strong authentication&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: gRPC offers native hooks for integrating with industry-standard token-based authentication (JWT/OAuth), providing verifiable identity for every AI agent.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Method-level authorization&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: You can enforce authorization policies directly on specific RPC methods or MCP tools (e.g., an agent is authorized to &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;ReadFile&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; but not &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;DeleteFile&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;), helping to ensure strict adherence to the principle of least privilege and combating "excessive agency."&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
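&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a rough sketch of method-level authorization, the Python below applies a default-deny policy table to the ReadFile and DeleteFile example. The agent names, the POLICY table, and the authorize function are hypothetical; in a real gRPC server, a check like this would typically run in a server interceptor before the handler is invoked.&lt;/span&gt;&lt;/p&gt;

```python
# Hypothetical policy table: agent identities and method names are illustrative.
POLICY = {
    "reporting-agent": {"ReadFile"},
    "admin-agent": {"ReadFile", "DeleteFile"},
}


def authorize(agent_id, method):
    """Return a gRPC-style status name; unknown agents are denied by default."""
    allowed = POLICY.get(agent_id, set())
    return "OK" if method in allowed else "PERMISSION_DENIED"


print(authorize("reporting-agent", "ReadFile"))    # OK
print(authorize("reporting-agent", "DeleteFile"))  # PERMISSION_DENIED
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Denying by default keeps an agent limited to the methods it was explicitly granted, which is the least-privilege behavior described above.&lt;/span&gt;&lt;/p&gt;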
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Operational maturity and developer productivity&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;gRPC provides a powerful, integrated solution that helps offload resiliency measures and improves developer productivity through extensibility and reusability. Some of its capabilities include:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Unified observability&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Native integration with distributed tracing (&lt;/span&gt;&lt;a href="https://opentelemetry.io/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;OpenTelemetry&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;) and structured error codes provides a complete, auditable trail of every tool call. Developers can trace a single user prompt through every subsequent microservice interaction.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Robust resiliency&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Features like deadlines, timeouts, and automatic flow control prevent a single unresponsive tool from causing system-wide failures. These features allow a client to specify a policy for a tool call that the framework automatically cancels if exceeded, preventing a cascading failure.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Polyglot development&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: gRPC generates code for 11+ languages, allowing developers to implement MCP Servers in the best language for the job while maintaining a consistent, strongly-typed contract.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Schema-based input validation&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Protobuf's strict typing mitigates injection attacks and simplifies the development task by rejecting malformed inputs at the serialization layer.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Error handling and metadata&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: The framework provides a standardized set of error codes (e.g., &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;UNAVAILABLE&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;PERMISSION_DENIED&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;) for reliable client handling, and clients can send and receive out-of-band information as key-value pairs in &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;metadata&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (e.g., for tracing IDs) without cluttering the main request.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Get started&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a founding member of the &lt;/span&gt;&lt;a href="https://aaif.io/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agentic AI Foundation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and a core contributor to the MCP specification, Google Cloud, along with other members of the community, has championed the inclusion of pluggable transport interfaces in the MCP SDK. Participate and communicate your interest in having gRPC as a transport for MCP:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Express your interest in enabling gRPC as an MCP transport. Contribute to the active &lt;/span&gt;&lt;a href="https://github.com/modelcontextprotocol/python-sdk/pull/1591" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;pull request&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for pluggable transport interfaces for the Python MCP SDK. &lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Join the community that is shaping the future of communications for AI and help advance the Model Context Protocol. &lt;/span&gt;&lt;a href="https://modelcontextprotocol.io/community/communication" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Contributor Communication - Model Context Protocol&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="mailto:mcp-grpc-external@google.com"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Contact us&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. We want to learn from your experience and support your journey.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;</description><pubDate>Tue, 13 Jan 2026 17:30:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/networking/grpc-as-a-native-transport-for-mcp/</guid><category>AI &amp; Machine Learning</category><category>Application Development</category><category>Developers &amp; Practitioners</category><category>Networking</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>A gRPC transport for the Model Context Protocol</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/networking/grpc-as-a-native-transport-for-mcp/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Victor Moreno</name><title>Solutions Product Manager</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Mark D. Roth</name><title>Senior Staff Software Engineer</title><department></department><company></company></author></item><item><title>How Hackensack Meridian Health de-risked network migration using VPC Flow Logs</title><link>https://cloud.google.com/blog/products/networking/using-vpc-flow-logs-to-de-risk-network-migration/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Network administrators rely heavily on &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/vpc/docs/flow-logs"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;VPC Flow Logs&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for visibility into their network traffic. 
Last year, we updated VPC Flow Logs to offer &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/networking/vpc-flow-logs-for-cross-cloud-network?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;expanded network traffic visibility&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, extending beyond subnets to include VLAN attachments and VPN tunnels. This enhancement provides comprehensive monitoring of network traffic across your on-premises and multi-cloud environments.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now, with VPC Flow Logs for VLAN attachments, you can export detailed telemetry data for your network traffic traversing &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/network-connectivity/docs/interconnect/concepts/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Interconnect&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. This data includes source and destination IP addresses, ports, protocols, bytes and packets transferred, timestamps, and other relevant metadata. These logs support a variety of use cases, including network traffic analysis, troubleshooting, capacity planning, and maintaining compliance and security. You can then use &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/network-intelligence-center/docs/flow-analyzer/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Flow Analyzer&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to analyze your VPC Flow Logs and gain insight into your network without writing complex SQL queries.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Sounds great, but how do you use it? Hackensack Meridian Health (HMH) is a leading not-for-profit healthcare organization and the largest hospital system in New Jersey. For a network of hospitals, urgent care centers, and physician practices, system reliability is critical, and it is a cornerstone value at HMH. In this blog post, we show how HMH used VPC Flow Logs and Flow Analyzer to analyze their Cloud Interconnect traffic before migrating their Google Cloud network to a new architecture design.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Let’s jump in.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Using VPC Flow Logs to prepare for migration&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Last year, HMH was getting ready to migrate their critical, large-scale network to a newer Google Cloud network design. Before a migration of this scale, they wanted to use &lt;/span&gt;&lt;a href="https://en.wikipedia.org/wiki/Sankey_diagram" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Sankey diagrams&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to get a clear understanding of their most important hybrid traffic patterns. This analysis was the only way to accurately identify, and proactively plan for, the biggest risks that could cause disruption during the cutover.&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"Getting a clear picture of our interconnect traffic always felt like a black box. Enabling VPC Flow Logs and feeding it into Flow Analyzer finally gave us the 'who-is-talking-to-what' map we needed. Identifying those critical traffic flows before we changed any routes was key to de-risking the entire migration." - &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Randall Brokaw, Cloud Engineering Manager, Hackensack Meridian Health&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To collect the necessary data, HMH enabled VPC Flow Logs on all of their VLAN attachments, then leveraged Flow Analyzer to easily aggregate the ingress and egress data. The following query components were used for ingress analysis:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
    &lt;figure class="article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3"&gt;
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/1_4ySTHoU.jpg" alt="Flow Analyzer query"&gt;
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="k86f2"&gt;Flow Analyzer query&lt;/p&gt;&lt;/figcaption&gt;
    &lt;/figure&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Source&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="2" style="list-style-type: circle; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Filter: Gateway type = INTERCONNECT_ATTACHMENT&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="2" style="list-style-type: circle; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Organize Flows By: Gateway location, Gateway VPC network&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Destination&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="2" style="list-style-type: circle; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Organize Flows By: GCE Instance Project, Google service type&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These selections filter VPC Flow Logs to ingress traffic over Cloud Interconnect VLAN attachments, and aggregate the source traffic volume by Google Cloud region and VPC network.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The destination data was grouped by Compute Engine instance project to easily identify the destination application, since each application is deployed into a dedicated &lt;/span&gt;&lt;a href="https://cloud.google.com/vpc/docs/shared-vpc"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;service project&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. However, since not all traffic is sent to Compute Engine VMs, incorporating the Google service type enabled them to account for traffic destined for Google APIs and Google VPC hosted services.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In your environment, the best flow parameters and destination grouping for this analysis will depend on how your organization deploys applications on Google Cloud. For example, you can group by any of the &lt;/span&gt;&lt;a href="https://cloud.google.com/vpc/docs/about-flow-logs-records"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;available fields&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in VPC Flow Logs records, such as IP address and port, VPC subnet, GKE cluster, and more.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;HMH then transformed the VPC Flow Logs traffic volumes into Sankey diagrams. This required formatting each traffic flow as multiple three-column rows of {source, destination, weight}, one row per layer of the diagram. For this analysis, the weight was the traffic volume displayed in Flow Analyzer, and each {source, destination} pair corresponded to one layer of the Sankey visualization, in the following order:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Data center to Google Cloud region&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Google Cloud region to VPC network&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;VPC network to application&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
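&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The reshaping step described above can be sketched in a few lines of JavaScript. This is a minimal illustration, not HMH's actual tooling: the record field names (&lt;code&gt;dataCenter&lt;/code&gt;, &lt;code&gt;region&lt;/code&gt;, &lt;code&gt;network&lt;/code&gt;, &lt;code&gt;application&lt;/code&gt;, &lt;code&gt;bytes&lt;/code&gt;), the sample values, and the function name &lt;code&gt;toSankeyRows&lt;/code&gt; are all hypothetical and do not reflect the schema of a Flow Analyzer or Log Analytics export.&lt;/span&gt;&lt;/p&gt;

```javascript
// Hypothetical flow records. Field names and values are illustrative
// assumptions, not the schema of a Flow Analyzer or Log Analytics export.
const flows = [
  { dataCenter: 'On Premises', region: 'us-central1', network: 'Prod Network', application: 'app-a', bytes: 4 },
  { dataCenter: 'On Premises', region: 'us-east1', network: 'Prod Network', application: 'app-a', bytes: 2 },
  { dataCenter: 'On Premises', region: 'us-east1', network: 'Shared Network', application: 'app-b', bytes: 5 },
];

// The three Sankey layers, in order. Each pair names the record fields
// used as source and destination for one layer of the diagram.
const layers = [
  ['dataCenter', 'region'],
  ['region', 'network'],
  ['network', 'application'],
];

// Sum the weight of every (source, destination) pair in every layer and
// return rows shaped like [source, destination, weight].
function toSankeyRows(records) {
  const totals = new Map();
  for (const record of records) {
    for (const [fromField, toField] of layers) {
      const key = JSON.stringify([record[fromField], record[toField]]);
      totals.set(key, (totals.get(key) || 0) + record.bytes);
    }
  }
  return Array.from(totals, ([key, weight]) => [...JSON.parse(key), weight]);
}

const rows = toSankeyRows(flows);
console.log(rows);
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Rows in this shape match the three columns (From, To, Weight) that the Google Charts data table expects, so they can be passed to &lt;code&gt;data.addRows(rows)&lt;/code&gt;.&lt;/span&gt;&lt;/p&gt;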
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Selecting “&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;View the query in Log Analytics&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;” from the Flow Analyzer console lets you export the traffic flows to Google Sheets and combine them into rows for the diagram. Then, using &lt;/span&gt;&lt;a href="https://developers.google.com/chart/interactive/docs/gallery/sankey" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Charts&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, HMH created the Sankey diagram:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;var data = new google.visualization.DataTable();

data.addColumn('string', 'From');
data.addColumn('string', 'To');
data.addColumn('number', 'Weight');
data.addRows([
     [ 'On Premises', 'us-central1', 28 ],
     [ 'On Premises', 'us-east1', 7 ],
     [ 'us-east1', 'Prod Network', 2 ],
     [ 'us-east1', 'Shared Network', 9 ],
     [ 'us-central1', 'Prod Network', 4 ],
     ...
]);&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
    &lt;figure class="article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3"&gt;
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_z5SDPA1.max-1000x1000.png" alt="Sankey diagram of Cloud Interconnect traffic"&gt;
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="k86f2"&gt;Google Charts Sankey diagram - Analysis of Cloud Interconnect traffic&lt;/p&gt;&lt;/figcaption&gt;
    &lt;/figure&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Using VPC Flow Logs, HMH network engineers pinpointed critical cutover moments in their plan, allowing them to de-risk the migration through proactive monitoring and preparedness. This preparation proved its value when a migration issue was detected in 3 minutes and resolved in just 5, a process that previously could have taken hours. This readiness was fundamental to the migration's success.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This implementation uses Flow Analyzer, which requires VPC Flow Logs to be stored in Cloud Logging. Alternatively, you can forward VPC Flow Logs straight to BigQuery, bypassing Cloud Logging. From there, you can use visualization services like Looker to build custom dashboards and gain further insights.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;VPC Flow Logs and Flow Analyzer for the win&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;HMH used VPC Flow Logs and Flow Analyzer to facilitate their network migration. But by providing granular visibility into your Cloud Interconnect traffic, VPC Flow Logs can enable many other use cases, such as capacity planning, cost attribution, and more. Enable VPC Flow Logs on your &lt;/span&gt;&lt;a href="https://cloud.google.com/vpc/docs/using-flow-logs#enable-vlan-attachment"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;VLAN attachments&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; today and use Flow Analyzer for insights into your traffic flow patterns.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To learn more, check out the VPC Flow Logs &lt;/span&gt;&lt;a href="https://cloud.google.com/vpc/docs/flow-logs"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; or get started with &lt;/span&gt;&lt;a href="https://cloud.google.com/network-intelligence-center/docs/flow-analyzer/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Flow Analyzer&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to analyze your logs at no additional cost.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Fri, 09 Jan 2026 17:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/networking/using-vpc-flow-logs-to-de-risk-network-migration/</guid><category>Customers</category><category>Networking</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>How Hackensack Meridian Health de-risked network migration using VPC Flow Logs</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/networking/using-vpc-flow-logs-to-de-risk-network-migration/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Adam Cole</name><title>GCC Network Specialist</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Neha Chhabra</name><title>Product Manager</title><department></department><company></company></author></item><item><title>Responding to CVE-2025-55182: Secure your React and Next.js workloads</title><link>https://cloud.google.com/blog/products/identity-security/responding-to-cve-2025-55182/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;Editor's note&lt;/strong&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;: This 
blog was updated on Dec. 4, 5, 7, and 12, 2025, with additional guidance on Cloud Armor WAF rule syntax, and WAF enforcement across App Engine Standard, Cloud Functions, and Cloud Run.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Earlier today, Meta and Vercel publicly disclosed two vulnerabilities that expose services built using the popular open-source frameworks &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;React&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Server Components&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (&lt;/span&gt;&lt;a href="https://www.cve.org/CVERecord?id=CVE-2025-55182" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;CVE-2025-55182&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;) and &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Next.js &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;to remote code execution risks when used for some server-side use cases. At Google Cloud, we understand the severity of these vulnerabilities, also known as &lt;/span&gt;&lt;a href="https://react2shell.com/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;React2Shell&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and our security teams have shared their recommendations to help our customers take immediate, decisive action to secure their applications.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Vulnerability background&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;React Server Components framework&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; is commonly used for building user interfaces. On Dec. 3, 2025, &lt;/span&gt;&lt;a href="http://cve.org" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;CVE.org&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; assigned this vulnerability the identifier &lt;/span&gt;&lt;a href="https://www.cve.org/CVERecord?id=CVE-2025-55182" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;CVE-2025-55182&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. The official Common Vulnerability Scoring System (CVSS) base score is 10.0, which is rated Critical.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Vulnerable versions&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: React 19.0, 19.1.0, 19.1.1, and 19.2.0&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Patched&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; in React 19.2.1&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Fix&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;a href="https://github.com/facebook/react/commit/7dc903cd29dac55efb4424853fd0442fef3a8700" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;https://github.com/facebook/react/commit/7dc903cd29dac55efb4424853fd0442fef3a8700&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Announcement&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;a href="https://react.dev/blog/2025/12/03/critical-security-vulnerability-in-react-server-components" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;https://react.dev/blog/2025/12/03/critical-security-vulnerability-in-react-server-components&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
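&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For a quick triage of a dependency manifest, the React version list above can be encoded as a lookup. The following is a minimal sketch that assumes exact, resolved version strings such as those pinned in a lockfile. The function name &lt;code&gt;isVulnerableReactVersion&lt;/code&gt; is hypothetical, and treating the shorthand "19.0" as "19.0.0" is an assumption. A &lt;code&gt;false&lt;/code&gt; result only means the version is not on this list; confirm against the official announcement before drawing conclusions.&lt;/span&gt;&lt;/p&gt;

```javascript
// React versions listed as vulnerable to CVE-2025-55182 in this post.
// Assumption: the shorthand "19.0" corresponds to the release "19.0.0".
const VULNERABLE_REACT_VERSIONS = new Set(['19.0.0', '19.1.0', '19.1.1', '19.2.0']);

// Returns true when an exact, resolved version string is on the list above.
// Version ranges (for example "^19.1.0") are not resolved here; check the
// version that your lockfile actually pins.
function isVulnerableReactVersion(version) {
  const parts = version.trim().split('.');
  // Pad shorthand such as "19.0" to three components before comparing.
  while (3 > parts.length) {
    parts.push('0');
  }
  return VULNERABLE_REACT_VERSIONS.has(parts.join('.'));
}

console.log(isVulnerableReactVersion('19.1.1')); // listed as vulnerable
console.log(isVulnerableReactVersion('19.2.1')); // the patched release
```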
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Next.js is a web development framework that depends on React, and is also commonly used for building user interfaces. (The Next.js vulnerability was referenced as &lt;/span&gt;&lt;a href="https://www.cve.org/CVERecord?id=CVE-2025-66478" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;CVE-2025-66478&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; before being marked as a duplicate.)&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Vulnerable versions&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Next.js 15.x, Next.js 16.x, Next.js 14.3.0-canary.77 and later canary releases&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Patched&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; versions are listed &lt;/span&gt;&lt;a href="https://nextjs.org/blog/CVE-2025-66478#required-action" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Fix&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;a href="https://github.com/vercel/next.js/commit/6ef90ef49fd32171150b6f81d14708aa54cd07b2" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;https://github.com/vercel/next.js/commit/6ef90ef49fd32171150b6f81d14708aa54cd07b2&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Announcement&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: &lt;/span&gt;&lt;a href="https://nextjs.org/blog/CVE-2025-66478" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;https://nextjs.org/blog/CVE-2025-66478&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Google Threat Intelligence Group (GTIG) has also published a new report to help understand the &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/topics/threat-intelligence/threat-actors-exploit-react2shell-cve-2025-55182"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;specific threats exploiting React2Shell&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We strongly encourage organizations that manage environments relying on the React and Next.js frameworks to update to the latest version, and to take the mitigation actions outlined below.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Mitigating CVE-2025-55182&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We have created and rolled out a new &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Armor web application firewall (WAF) rule&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; designed to detect and block exploitation attempts related to CVE-2025-55182. This new rule is &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;available now&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; and is intended to help protect your internet-facing applications and services that use global or regional Application Load Balancers. We recommend deploying this rule as a temporary mitigation while your vulnerability management program patches and verifies all vulnerable instances in your environment.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For customers using &lt;/span&gt;&lt;a href="https://cloud.google.com/appengine/"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;App Engine Standard&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://cloud.google.com/functions/"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Functions&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://cloud.google.com/run/"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Run&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://firebase.google.com/products/hosting" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Firebase Hosting&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; or &lt;/span&gt;&lt;a href="https://firebase.google.com/products/app-hosting" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Firebase App Hosting&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, we provide an additional layer of defense for serverless workloads by automatically enforcing platform-level WAF rules that can detect and block the most common exploitation attempts related to CVE-2025-55182.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For &lt;/span&gt;&lt;a href="https://support.projectshield.google/s/article/Protecting-Your-Website-From-Known-Vulnerabilities" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Project Shield&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; users, we have deployed WAF protections for all sites and no action is necessary to enable these WAF rules. For long-term mitigation, you will need to patch your origin servers as an essential step to eliminate the vulnerability (see additional guidance below).&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud Armor and the Application Load Balancer can be used to deliver and protect your applications and services regardless of whether they are deployed on Google Cloud, on-premises, or on another infrastructure provider. If you are not yet using Cloud Armor and the Application Load Balancer, please follow the guidance further down to get started.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While these platform-level rules and the optional Cloud Armor WAF rules (for services behind an Application Load Balancer) help mitigate the risk from exploits of the CVE, we continue to strongly recommend updating your application dependencies as the primary long-term mitigation.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Deploying the cve-canary WAF rule for Cloud Armor&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To configure Cloud Armor to detect and protect against CVE-2025-55182, you can use the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/armor/docs/waf-rules#cves_and_other_vulnerabilities"&gt;&lt;code style="text-decoration: underline; vertical-align: baseline;"&gt;cve-canary&lt;/code&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt; preconfigured WAF rule&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; with the new rule IDs that we have added for this vulnerability. This rule is opt-in only, and must be added to your policy even if you are already using the cve-canary rules.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In your Cloud Armor backend security policy, create a new rule and configure the following match condition:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;(has(request.headers['next-action']) || has(request.headers['rsc-action-id']) || request.headers['content-type'].contains('multipart/form-data') || request.headers['content-type'].contains('application/x-www-form-urlencoded')) &amp;amp;&amp;amp; evaluatePreconfiguredWaf('cve-canary',{'sensitivity': 0, 'opt_in_rule_ids': ['google-mrs-v202512-id000001-rce','google-mrs-v202512-id000002-rce']})&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can create this rule from the Google Cloud console by navigating to Cloud Armor and either modifying an existing policy or creating a new one.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
    &lt;figure class="article-image--medium h-c-grid__col h-c-grid__col--4 h-c-grid__col--offset-4"&gt;
        &lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/20251205_11am_rule_1.max-1000x1000.png" alt="Cloud Armor rule creation in the Google Cloud console"&gt;
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="5admg"&gt;Cloud Armor rule creation in the Google Cloud console.&lt;/p&gt;&lt;/figcaption&gt;
    &lt;/figure&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;Alternatively, the gcloud CLI can be used to create or modify a policy with the requisite rule:&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;gcloud compute security-policies rules create PRIORITY_NUMBER \
    --security-policy SECURITY_POLICY_NAME \
    --expression "(has(request.headers['next-action']) || has(request.headers['rsc-action-id']) || request.headers['content-type'].contains('multipart/form-data') || request.headers['content-type'].contains('application/x-www-form-urlencoded')) &amp;amp;&amp;amp; evaluatePreconfiguredWaf('cve-canary',{'sensitivity': 0, 'opt_in_rule_ids': ['google-mrs-v202512-id000001-rce','google-mrs-v202512-id000002-rce']})" \
    --action=deny-403&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Additionally, if you are managing your rules with Terraform, you may implement the rule via the following syntax:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;rule {
  action   = "deny(403)"
  priority = "PRIORITY_NUMBER"
  match {
    expr {
      expression = "(has(request.headers['next-action']) || has(request.headers['rsc-action-id']) || request.headers['content-type'].contains('multipart/form-data') || request.headers['content-type'].contains('application/x-www-form-urlencoded')) &amp;amp;&amp;amp; evaluatePreconfiguredWaf('cve-canary',{'sensitivity': 0, 'opt_in_rule_ids': ['google-mrs-v202512-id000001-rce','google-mrs-v202512-id000002-rce']})"
    }
  }
  description = "Applies protection for CVE-2025-55182 (React/Next.js)"
}&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
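&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the structure of the match condition easier to read, the sketch below mirrors only its header pre-filter in JavaScript. The rule can match only when at least one of the four header conditions holds and the &lt;code&gt;cve-canary&lt;/code&gt; signatures also match. This is an illustration, not Cloud Armor's evaluation engine: the function name &lt;code&gt;passesHeaderPrefilter&lt;/code&gt; and the lowercase-keyed &lt;code&gt;headers&lt;/code&gt; object are assumptions, and the signature evaluation itself is not reproduced.&lt;/span&gt;&lt;/p&gt;

```javascript
// Mirrors the header checks in the rule expression, for illustration only.
// Assumption: `headers` is a plain object keyed by lowercase header name.
function passesHeaderPrefilter(headers) {
  const contentType = headers['content-type'] || '';
  return (
    'next-action' in headers ||
    'rsc-action-id' in headers ||
    contentType.includes('multipart/form-data') ||
    contentType.includes('application/x-www-form-urlencoded')
  );
}

// A request carrying a next-action header reaches the signature check.
console.log(passesHeaderPrefilter({ 'next-action': 'abc', 'content-type': 'text/plain' }));
// A plain JSON request with neither action header does not.
console.log(passesHeaderPrefilter({ 'content-type': 'application/json' }));
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Requests that fail this pre-filter can never match the rule, whatever the signatures would have found.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;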
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Verifying WAF rule safety for your application and consuming telemetry&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud Armor rules can be &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/armor/docs/security-policy-overview#preview_mode"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;configured in preview mode&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, a logging-only mode to test or monitor the expected impact of the rule without Cloud Armor enforcing the configured action. We recommend that the new rule described above first be deployed in preview mode in your production environments so that you can see what traffic it would block. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Once you verify that the new rule is behaving as desired in your environment, then you can disable preview mode to allow Cloud Armor to actively enforce it.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud Armor per-request WAF logs are emitted as part of the Application Load Balancer logs to Cloud Logging. To see what Cloud Armor’s decision was on every request, load balancer logging first &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/load-balancing/docs/https/https-logging-monitoring"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;needs to be enabled on a per backend service basis&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Once it is enabled, all subsequent Cloud Armor decisions will be logged and can be found in Cloud Logging by &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/armor/docs/request-logging"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;following these instructions&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Interaction of Cloud Armor rules with &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;vulnerability&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; scanning tools&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;There has been a proliferation of scanning tools designed to help identify vulnerable instances of React and Next.js in your environments. Many of those scanners are designed to identify the version number of relevant frameworks in your servers and do so by crafting a &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;legitimate&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; query and inspecting the response from the server to detect the version of React and &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Next.js&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; that is running. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our WAF rule is designed to detect and prevent exploit attempts of &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;CVE-2025-55182&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;. As the scanners discussed above are not attempting an exploit, but sending a safe query to &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;elicit&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; a response revealing indications of the version of the software, &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;the above Cloud Armor rule will not detect or block such scanners. &lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If the findings of these scanners indicate a vulnerable instance of software protected by Cloud Armor, that does not mean that an actual exploit attempt of the vulnerability will successfully get through your Cloud Armor security policy. Instead, such findings mean that the version React or Next.js detected is known to be vulnerable and should be patched.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;How to get started with Cloud Armor for new users&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If your workload is already using an Application Load Balancer to receive traffic from the internet, you can configure Cloud Armor to protect your workload from this and other application-level vulnerabilities (as well as DDoS attacks) by following &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/armor/docs/configure-security-policies"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;these instructions&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you are not yet using an Application Load Balancer and Cloud Armor, you can get started with the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/load-balancing/docs/https"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;external Application Load Balancer overview&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/armor/docs/security-policy-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Armor overview&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/armor/docs/best-practices"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Armor best practices&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If your workload is using &lt;/span&gt;&lt;a href="http://docs.cloud.google.com/run/"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Run&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://cloud.google.com/functions"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Run functions&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, or &lt;/span&gt;&lt;a href="https://cloud.google.com/appengine"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;App Engine&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and receives traffic from the internet, you must first &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/load-balancing/docs/https/setup-global-ext-https-serverless"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;set up an Application Load Balancer in front of your endpoint&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to leverage Cloud Armor security policies to protect your workload. You will then need to &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/armor/docs/integrating-cloud-armor#serverless"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;configure the appropriate controls&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to ensure that Cloud Armor and the Application Load Balancer can’t be bypassed.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Best practices and additional risk mitigations&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Once you configure Cloud Armor, we recommend consulting our &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/armor/docs/best-practices"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;best practices guide&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Be sure to account for &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/armor/docs/security-policy-overview#limitations"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;limitations&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;discussed in the documentation to minimize risk and optimize performance while ensuring the safety and availability of your workloads. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Serverless platform protections&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Google Cloud is enforcing platform-level protections across App Engine Standard, Cloud Functions, and Cloud Run to automatically help protect against common exploit attempts of CVE-2025-55182. This protection supplements the protections already in place for Firebase Hosting and Firebase App Hosting.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;What this means for you:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Applications deployed to those serverless services benefit from these WAF rules that are enabled by default to help provide a base level of protection without requiring manual configuration.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;These rules are designed to block known malicious payloads targeting this vulnerability.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Important considerations:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Patching is still critical:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; These platform-level defenses are intended to be a temporary mitigation. The most effective long-term solution is to update your application's dependencies to non-vulnerable versions of React and Next.js, and redeploy them.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Potential impacts:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; While unlikely, if you believe this platform-level filtering is incorrectly impacting your application's traffic, please contact &lt;/span&gt;&lt;a href="https://support.google.com/cloud/answer/6282346" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Support&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and reference issue number 465748820.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Long-term mitigation: Mandatory framework update and redeployment&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While WAF rules provide critical frontline defense, the most comprehensive long-term solution is to patch the underlying frameworks.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;While Google Cloud is providing platform-level protections and Cloud Armor options, we urge all customers running React and Next.js applications on Google Cloud to immediately update their dependencies to the latest stable versions (React 19.2.1 or the relevant version of Next.js listed &lt;/strong&gt;&lt;a href="https://nextjs.org/blog/CVE-2025-66478#required-action" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;), and redeploy their services.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This applies specifically to applications deployed on:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Run, Cloud Run functions, or App Engine&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Update your application dependencies with the updated framework versions and redeploy.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Google Kubernetes Engine (GKE)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Update your container images with the latest framework versions and redeploy your pods.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Compute Engine&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;:&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;The public OS images provided by Google Cloud do not have React or Next.js packages installed by default. If you have installed a custom OS with the affected packages, update your workloads to include the latest framework versions and enable WAF rules in front of all workloads.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Firebase&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;:&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;If you’re using Cloud Functions for Firebase, Firebase Hosting, or Firebase App Hosting, update your application dependencies with the updated framework versions and redeploy. Firebase Hosting and App Hosting are also automatically enforcing a rule to limit exploitation of CVE-2025-55182 through requests to custom and default domains.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
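&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a rough aid for the dependency update, the sketch below compares an installed React version against the 19.2.1 target named above. The helper names version_ge and check_react are hypothetical, the comparison relies on the version sort in GNU coreutils, and it only covers the version named in this post: other release lines may have their own patched versions, and the Next.js version should be taken from the advisory linked above.&lt;/span&gt;&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Target taken from the text above; adjust it to the advisory for your release line.
TARGET_REACT="19.2.1"

# Succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report whether a given React version meets the target.
# The installed version can be read from the output of `npm ls react`.
check_react() {
  if version_ge "$1" "$TARGET_REACT"; then
    echo "react $1: at or above $TARGET_REACT"
  else
    echo "react $1: below $TARGET_REACT, update and redeploy"
  fi
}

check_react "19.2.0"
check_react "19.2.1"
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A check like this can run in a build pipeline before each redeploy, but it is no substitute for following the framework advisories.&lt;/span&gt;&lt;/p&gt;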
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Patching your applications is an essential step to eliminate the vulnerability at its source and ensure the continued integrity and security of your services.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We will continue to monitor the situation closely and provide further updates and guidance as necessary. Please refer to our official &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/support/bulletins#gcp-2025-072"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Security advisories&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for the most current information and detailed steps.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you have any questions or require assistance, please contact &lt;/span&gt;&lt;a href="https://support.google.com/cloud/answer/6282346" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Support&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and reference issue number 465748820.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 03 Dec 2025 23:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/identity-security/responding-to-cve-2025-55182/</guid><category>DevOps &amp; SRE</category><category>Application Development</category><category>Networking</category><category>Serverless</category><category>Security &amp; Identity</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Responding to CVE-2025-55182: Secure your React and Next.js workloads</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/identity-security/responding-to-cve-2025-55182/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Tim April</name><title>Security Reliability Engineer</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Emil Kiner</name><title>Senior Product Manager</title><department></department><company></company></author></item><item><title>Gain Cross-Cloud Network traffic insights with VPC Flow Logs and Flow Analyzer</title><link>https://cloud.google.com/blog/products/networking/vpc-flow-logs-for-cross-cloud-network/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Gaining visibility into your network traffic is crucial, particularly with hybrid environments encompassing both on-premises and cross-cloud infrastructure. 
&lt;/span&gt;&lt;a href="https://cloud.google.com/vpc/docs/flow-logs"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;VPC Flow Logs&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; have long been a staple to obtain detailed records of network traffic to, from, and within your Google Cloud subnets. But with the rise of more complex network topologies enabled by the &lt;/span&gt;&lt;a href="https://cloud.google.com/solutions/cross-cloud-network"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cross-Cloud Network&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, we knew we needed to expand VPC Flow Logs to give you a more complete picture.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;That's why we're excited to share that you can &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;now enable VPC Flow Logs&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; directly on your &lt;/span&gt;&lt;a href="https://cloud.google.com/vpc/docs/using-flow-logs#enable-vpn-tunnel"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud VPN tunnels&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://cloud.google.com/vpc/docs/using-flow-logs#enable-vlan-attachment"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;VLAN attachments&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for Cloud Interconnect and Cross-Cloud Interconnect. This enhancement provides comprehensive monitoring of critical network traffic moving between your on-prem infrastructure, cross-cloud resources, and Google Cloud. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;With this new capability, you can:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Gain granular insights:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Log network flows passing through Cloud Interconnect and Cloud VPN with 5-tuple granularity (source/destination IP, source/destination port, protocol).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Optimize performance:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Quickly identify "elephant flows" (high-bandwidth flows) that might be congesting a specific VPN tunnel or VLAN attachment, enabling you to better plan and manage capacity. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Audit Shared VPC usage:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; In &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/vpc/docs/shared-vpc"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Shared VPC &lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;environments, identify which service projects are consuming the most hybrid bandwidth.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Map utilization to flows:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Understand exactly &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;how&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; your hybrid connections are being utilized by mapping high-level bandwidth graphs to specific application flows. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Diagnose connectivity issues:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; When an on-prem/cross-cloud application can't reach a Google Cloud resource, use logs to check if the traffic is arriving at the Google Cloud gateway (VLAN attachment or VPN tunnel).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Finetune your application awareness on Cloud Interconnect policy configurations:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Monitor and verify that your applications are marking &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/network-connectivity/docs/interconnect/how-to/cci/configure-traffic-differentiation"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;differentiated services field codepoints&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;  (DSCP) correctly.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To provide more context to these flows, we've also added "&lt;/span&gt;&lt;a href="https://cloud.google.com/vpc/docs/about-flow-logs-records#gateway-details"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;gateway&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;" annotations to VPC Flow Logs. Think of a gateway as the entry or exit point for traffic traveling between your Google Cloud VPC and an external network.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When you inspect a flow log of Cross-Cloud Network traffic, you'll now see two key &lt;/span&gt;&lt;a href="https://cloud.google.com/vpc/docs/about-flow-logs-records#gateway-details"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;new fields&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;reporter&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: This field tells you the &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;direction&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; of the traffic, relative to the gateway.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;ul&gt;
&lt;li aria-level="2" style="list-style-type: circle; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;SRC_GATEWAY&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: The traffic was observed &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;entering&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; Google Cloud through Cloud Interconnect or Cloud VPN (e.g., on-prem to Google Cloud).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="2" style="list-style-type: circle; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;DEST_GATEWAY&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: The traffic was observed &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;exiting&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; Google Cloud through Cloud Interconnect or Cloud VPN (e.g., Google Cloud to on-prem).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;gateway&lt;/strong&gt;&lt;strong style="vertical-align: baseline;"&gt; object&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: This JSON payload provides the full context of the gateway itself, including its &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;name&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;type&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; (VPN_TUNNEL or INTERCONNECT_ATTACHMENT), &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;project_id&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;location&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
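&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To illustrate how the reporter annotation might be used, here is a minimal sketch that builds a Cloud Logging filter for one traffic direction. The function name gateway_flow_filter is hypothetical, and the sketch assumes the field appears as jsonPayload.reporter in the log entry; confirm the exact record layout in the VPC Flow Logs documentation linked above.&lt;/span&gt;&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Build a Cloud Logging filter on the reporter annotation.
#   ingress: traffic entering Google Cloud through a gateway (SRC_GATEWAY)
#   egress:  traffic exiting Google Cloud through a gateway (DEST_GATEWAY)
# Assumes the annotation is exposed as jsonPayload.reporter.
gateway_flow_filter() {
  case "$1" in
    ingress) printf 'jsonPayload.reporter="SRC_GATEWAY"\n' ;;
    egress)  printf 'jsonPayload.reporter="DEST_GATEWAY"\n' ;;
    *)
      echo "usage: gateway_flow_filter ingress|egress" > /dev/stderr
      return 1
      ;;
  esac
}

# Example: print the filter for traffic arriving from on-prem.
# It could then be passed to: gcloud logging read "FILTER" --limit 10 --format json
gateway_flow_filter ingress
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Filtering on the reporter value first narrows the results to one direction, after which the gateway details in each record identify the specific VPN tunnel or VLAN attachment.&lt;/span&gt;&lt;/p&gt;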
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Analyze your logs with Flow Analyzer&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To help you analyze your flow logs without writing-complex SQL queries, we’ve also integrated the new gateway annotations directly into &lt;/span&gt;&lt;a href="https://cloud.google.com/network-intelligence-center/docs/flow-analyzer/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Flow Analyzer&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, a native tool for performing deep network traffic analysis on your VPC Flow Logs stored in Cloud Logging at no additional cost. Using Flow Analyzer, you can:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Quickly identify top talkers in your network with 5-tuple granularity. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/network-intelligence-center/docs/flow-analyzer/run-connectivity-tests"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Run Connectivity Tests &lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;in-context to understand how your configurations (ie. firewall policies) impact traffic flowing through your network.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Use Gemini Cloud Assist to construct &lt;/span&gt;&lt;a href="https://cloud.google.com/network-intelligence-center/docs/flow-analyzer/write-queries-gemini"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;natural language queries&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Analyze and compare current network flows with historical data (e.g., last hour, day, or week).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/IC_GIF_200000.gif"
        
          alt="IC GIF 200000"&gt;
        
        &lt;/a&gt;
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="n4nm9"&gt;Flow Analyzer providing Cloud Interconnect traffic insights&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Achieve essential visibility across the Cross-Cloud Network&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you're running a Cross-Cloud Network, enabling VPC Flow Logs on your VLAN attachments and VPN tunnels provides the essential telemetry you need to manage, secure, and scale your interconnected networks. You can enable this feature on your new and existing &lt;/span&gt;&lt;a href="https://cloud.google.com/vpc/docs/using-flow-logs#enable-vlan-attachment"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;VLAN attachments&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://cloud.google.com/vpc/docs/using-flow-logs#enable-vpn-tunnel"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;VPN tunnels&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; using CLI, API, Terraform, or directly from the&lt;/span&gt; &lt;a href="https://console.cloud.google.com/networking/vpc-flow-logs"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud console&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To learn more, check out the VPC Flow Logs &lt;/span&gt;&lt;a href="https://cloud.google.com/vpc/docs/flow-logs"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; or get started with &lt;/span&gt;&lt;a href="https://console.cloud.google.com/net-intelligence/flow-analyzer"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Flow Analyzer&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Mon, 01 Dec 2025 17:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/networking/vpc-flow-logs-for-cross-cloud-network/</guid><category>Developers &amp; Practitioners</category><category>Networking</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Gain Cross-Cloud Network traffic insights with VPC Flow Logs and Flow Analyzer</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/networking/vpc-flow-logs-for-cross-cloud-network/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Mary Colley</name><title>Product Manager</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Neha Chhabra</name><title>Product Manager</title><department></department><company></company></author></item><item><title>AWS and Google Cloud collaborate to simplify multicloud networking</title><link>https://cloud.google.com/blog/products/networking/aws-and-google-cloud-collaborate-on-multicloud-networking/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As organizations increasingly adopt multicloud architectures, the need for interoperability between cloud service providers has never been greater. 
Historically, however, connecting these environments has been a challenge, forcing customers to take a complex "do-it-yourself" approach to managing global multi-layered networks at scale.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To address these challenges and advance a more open cloud environment, Amazon Web Services (AWS) and Google Cloud collaborated to transform how cloud service providers could connect with one another in a simplified manner. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today, AWS and Google Cloud are excited to announce a jointly engineered multicloud networking solution that uses both &lt;/span&gt;&lt;a href="https://aws.amazon.com/interconnect/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;AWS Interconnect - multicloud&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://cloud.google.com/hybrid-connectivity#multicloud-networking-connectivity"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud’s Cross-Cloud Interconnect&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. This collaboration also introduces a new open specification for network interoperability, enabling customers to establish private, high-speed connectivity between Google Cloud and AWS with high levels of automation and speed.&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;“Integrating Salesforce Data 360 with the broader IT landscape requires robust, private connectivity. AWS Interconnect - multicloud allows us to establish these critical bridges to Google Cloud with the same ease as deploying internal AWS resources, utilizing pre-built capacity pools and the tools our teams already know and love. This native, streamlined experience — from provisioning through ongoing support — accelerates our customers' ability to ground their AI and analytics in trusted data, regardless of where it resides.”&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;strong&gt;- Jim Ostrognai, SVP Software Engineering, Salesforce&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Previously, to connect cloud service providers, customers had to manually set up complex networking components, including physical connections and equipment. This approach required lengthy lead times and coordination with multiple internal and external teams, and could take weeks or even months. AWS had a vision for developing this capability as a unified specification that could be adopted by any cloud service provider, and collaborated with Google Cloud to bring it to market.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now, this new solution reimagines multicloud connectivity by moving away from physical infrastructure management toward a managed, cloud-native experience. By integrating AWS with Google Cloud’s Cross-Cloud Network architecture, we are abstracting the complexity of physical connectivity, network addressing, and routing policies. Customers no longer need to wait weeks for circuit provisioning: they can now provision dedicated bandwidth on demand and establish connectivity in minutes through their preferred cloud console or API. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Reliability and security are the cornerstones of this collaboration. We jointly designed this solution to deliver high resiliency by using quad-redundancy across physically redundant interconnect facilities and routers. Both providers engage in continuous monitoring to proactively detect and resolve issues. The solution is also built on a foundation of trust, using MACsec encryption between the Google Cloud and AWS edge routers. &lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;“This collaboration between AWS and Google Cloud represents a fundamental shift in multicloud connectivity. By defining and publishing a standard that removes the complexity of any physical components for customers, with high availability and security fused into that standard, customers no longer need to worry about any heavy lifting to create their desired connectivity. When they need multicloud connectivity, it's ready to activate in minutes with a simple point and click.”&lt;/span&gt;&lt;strong&gt;&lt;span style="vertical-align: baseline;"&gt; - Robert Kennedy, VP of Network Services, AWS&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;“We are excited about this collaboration which enables our customers to move their data and applications between clouds with simplified global connectivity and enhanced operational effectiveness. Today's announcement further delivers on Google Cloud’s Cross-Cloud Network solution focused on delivering an open and unified multicloud experience for customers.”&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;strong&gt;- Rob Enns, VP/GM of Cloud Networking, Google Cloud&lt;/strong&gt; &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This collaboration between AWS and Google Cloud is more than a multicloud solution: it’s a step toward a more open cloud environment. The &lt;/span&gt;&lt;a href="https://github.com/aws/AWSInterconnect" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;API specifications&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; developed for this product are open for other providers and partners to adopt, as we aim to simplify global connectivity for everyone. We invite you to explore this new capability today. To learn more about how to streamline your multicloud operations please visit the in-depth &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/networking/extending-cross-cloud-interconnect-to-aws-and-partners"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Cross-Cloud Interconnect blog&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and the &lt;/span&gt;&lt;a href="https://aws.amazon.com/interconnect/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;AWS Interconnect - multicloud website&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to get started.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Sun, 30 Nov 2025 19:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/networking/aws-and-google-cloud-collaborate-on-multicloud-networking/</guid><category>Hybrid &amp; Multicloud</category><category>Infrastructure Modernization</category><category>Partners</category><category>Networking</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>AWS and Google Cloud collaborate to simplify multicloud 
networking</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/networking/aws-and-google-cloud-collaborate-on-multicloud-networking/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Rob Enns</name><title>VP/GM of Cloud Networking, Google Cloud</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Robert Kennedy</name><title>VP of Network Services, Amazon Web Services</title><department></department><company></company></author></item><item><title>Expanding Google Cloud’s Cross-Cloud Network with a groundbreaking AWS collaboration</title><link>https://cloud.google.com/blog/products/networking/extending-cross-cloud-interconnect-to-aws-and-partners/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today, we &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/networking/aws-and-google-cloud-collaborate-on-multicloud-networking"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;announced&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; a significant collaboration with Amazon Web Services (AWS)&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;to offer a managed, private, secure, on-demand solution for &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;cross-cloud connectivity&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;. This solution is designed to enable customers to easily build enterprise-grade applications that span both Google Cloud and AWS environments. This collaboration is particularly timely, as the adoption of multicloud applications is rapidly accelerating, driven in part by the rise of AI. 
A Forbes &lt;/span&gt;&lt;a href="https://www.forbes.com/sites/rscottraynovich/2024/12/03/its-been-a-big-year-for-multicloud-networking-2024-will-be-bigger/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;survey&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; highlighted that 82% of respondents anticipate that the arrival of AI services will increase the demand for multicloud networking due to the scarcity of specialized accelerator resources and the availability of diverse AI agents across different vendors. The surge in multicloud adoption is a strategic imperative for organizations looking to build agentic AI applications, optimize workloads, access best-of-breed services, meet data residency requirements, and ensure the necessary resiliency for modern hybrid and multicloud applications.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To address the inherent network infrastructure challenges introduced by multicloud deployments, we designed the &lt;/span&gt;&lt;a href="https://cloud.google.com/solutions/cross-cloud-network"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cross-Cloud Network&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to simplify and optimize networking between Google Cloud and other providers. This commitment to multicloud integration has led to over 50% of the Fortune 500 currently using the Cross-Cloud Network, and this collaboration provides a significant boost. Importantly, this new jointly engineered solution with AWS is being published under an &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;open specification&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;, creating an opportunity to expand the reach, allowing other providers to contribute and implement this solution in their own environments, further benefiting mutual customers.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Introducing the Cross-Cloud Interconnect for AWS&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today marks a major step in simplifying and securing the multicloud journey. We are thrilled to announce a first-of-its-kind &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;open specification&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; that fundamentally streamlines private network connections between customers' environments across different cloud providers. This groundbreaking joint specification has culminated in the preview of partner &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Cross-Cloud Interconnect for AWS, a powerful expansion of our Cloud Interconnect portfolio. This innovation allows you to build &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;on-demand connections in minutes&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; between your Google Cloud and AWS VPCs, transforming multicloud networking from a complex build into a simple, managed service.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1-A_groundbreaking_collaboration.max-1000x1000.png"
        
          alt="1-A groundbreaking collaboration"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This is more than just a connection — it's a complete shift in how you adopt multicloud solutions. We are delivering &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;substantial value&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; to our mutual customers:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Simplicity and speed:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Say goodbye to complex networking builds. This is a &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;fully managed, cloud-native experience&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; where a cross-cloud connection is as easy as peering two VPCs. We're cutting end-to-end setup time from &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;days to mere minutes&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;, with flexible, on-demand bandwidth starting at &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;1 Gbps&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; during preview and scaling up to &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;100 Gbps&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; at general availability.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Secure by default:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Your data's security is paramount. All connections between the two clouds' edge routers are MACsec-encrypted&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;— providing &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;line-rate performance with always-on encryption &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;— for a more secure foundation.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Inherently resilient:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Benefit from an inherently resilient architecture that provides layers of protection against facility, network, and software failures, ensuring your critical applications remain online.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Open and optimized:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The foundation is an &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;open specification&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; for seamless adoption across the industry. You can also benefit from an optimized &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;total cost of ownership&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; through vendor consolidation and an on-demand service model that lets you provision exactly what you need, when you need it.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This service is launching with availability in key locations like N. Virginia, Oregon, London, and Frankfurt, with rapid expansion planned to more locations globally.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Simplifying cross-cloud connectivity&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Before today's jointly engineered solution, building applications that spanned multiple cloud environments was a significant undertaking, often becoming a barrier to multicloud adoption. Customers faced a complex, multi-layered process that involved cross-functional teams and substantial lead times.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A typical deployment required several intricate steps:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Procurement:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Acquiring physical connections, whether dedicated or through a shared partner offering, and then building and managing the necessary infrastructure to ensure network availability and separation of fate.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Logical configuration:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Establishing basic connectivity by meticulously assigning and negotiating non-overlapping link-local IP addresses and setting up VLANs.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Routing setup:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Configuring BGP sessions, assigning Autonomous System (AS) numbers, and creating complex routing policies to meet specific performance and reliability requirements.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Security implementation:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Conducting thorough security reviews and implementing custom solutions to encrypt traffic between the distinct cloud environments.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
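&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To give a sense of the logical configuration step listed above, here is a minimal Python sketch of carving non-overlapping link-local subnets for a set of point-to-point links. The /30 prefix length, the reserved range, and the helper names are illustrative assumptions, not values taken from either provider.&lt;/span&gt;&lt;/p&gt;

```python
import ipaddress

# IPv4 link-local space (RFC 3927).
LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")


def allocate_link_subnets(count, prefix_len=30, reserved=()):
    """Pick `count` non-overlapping link-local subnets, skipping reserved ranges.

    `prefix_len` and `reserved` are hypothetical inputs for illustration.
    """
    reserved_nets = [ipaddress.ip_network(r) for r in reserved]
    chosen = []
    for candidate in LINK_LOCAL.subnets(new_prefix=prefix_len):
        # Skip any candidate that collides with a range already in use.
        if any(candidate.overlaps(r) for r in reserved_nets):
            continue
        chosen.append(candidate)
        if len(chosen) == count:
            return chosen
    raise ValueError("not enough free link-local space")


def peer_addresses(subnet):
    """Return the two usable host addresses of a point-to-point subnet."""
    local, remote = list(subnet.hosts())[:2]
    return local, remote


# Four hypothetical links, with one range assumed to be taken already.
links = allocate_link_subnets(4, reserved=["169.254.0.0/29"])
for net in links:
    print(net, *peer_addresses(net))
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In a manual build, both sides would have to agree on each of these subnets before any BGP session could come up, which is part of the coordination overhead described above.&lt;/span&gt;&lt;/p&gt;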
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The integrated &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;partner Cross-Cloud Interconnect&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; offering completely abstracts away this complexity. Customers can now bypass all the manual steps and instantly leverage pre-built physical connections with built-in security and resiliency, achieving streamlined, on-demand connectivity between their Google Cloud VPCs and AWS.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Building this powerful cross-cloud connection is now remarkably simple. Customers configure a single &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;"transport"&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; resource in Google Cloud and accept it in AWS. This &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;transport&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; is an innovative, managed construct that completely abstracts and provisions the underlying physical interconnects, VLAN attachments, and Cloud Router instances. This profound simplification enables end-to-end connectivity&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; in minutes&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, transforming multicloud deployment from a days-long engineering project into a simple, rapid configuration task.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/2-Simplifying_Cross-Cloud_connectivity.gif"
        
          alt="2-Simplifying Cross-Cloud connectivity"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Under the hood: a secure and resilient foundation&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We co-designed our solution to deliver a secure and resilient foundation for cross-cloud applications, with a simple new service that doesn’t compromise on core enterprise availability tenets. &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Privacy and security&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: All peering relationships are built between link-local addresses, facilitating connectivity between IPv4 and IPv6 private address spaces across both environments. All underlying physical connections between Google Cloud and AWS edge routers are MACsec-encrypted, and both providers manage key rotation to meet enterprise security requirements.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Quad-redundancy:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Connectivity between a Google Cloud region and an AWS region uses quad-redundant connections, providing both facility redundancy and edge-router redundancy. This design helps protect against multiple simultaneous failure scenarios and provides high resiliency levels for joint customers.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Managed operations:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Managed operations are key to enabling integrated solutions. The new solution not only streamlines the physical and logical builds on behalf of joint customers, it also relies on proactive monitoring that detects and reacts to failures before customers are affected. The system relies on coordinated maintenance to avoid overlaps that could impact end-to-end service availability, and streamlines support operations to address potential issues on behalf of customers.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
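&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The reasoning behind quad-redundancy can be sketched with a toy model. The exact topology is not described here, so the layout below (two facilities, each with two edge routers, one link per router) and all names are assumptions made purely for illustration.&lt;/span&gt;&lt;/p&gt;

```python
from itertools import combinations, product

FACILITIES = ("facility-a", "facility-b")  # hypothetical facility names
ROUTERS = ("router-1", "router-2")         # hypothetical edge routers per facility

# Each link is identified by the (facility, router) pair it terminates on.
LINKS = list(product(FACILITIES, ROUTERS))


def surviving_links(failed_facilities=(), failed_links=()):
    """Return the links still up after the given failures."""
    return [
        link
        for link in LINKS
        if link[0] not in failed_facilities and link not in failed_links
    ]


# Losing a whole facility still leaves both links in the other facility.
for facility in FACILITIES:
    print(facility, "down, links left:", len(surviving_links([facility])))

# Worst case across every combination of three simultaneous link failures.
worst_case = min(
    len(surviving_links(failed_links=combo))
    for combo in combinations(LINKS, 3)
)
print("links left after any three link failures:", worst_case)
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Under these assumptions, at least one path remains even when a facility and one router in the other facility fail at the same time.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;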
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/3-Under_the_hood.max-1000x1000.png"
        
          alt="3-Under the hood"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;A variety of multicloud workloads&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These new streamlined network connections between Google Cloud and AWS enable application teams to automate network builds for a variety of interesting applications. Consider the following scenarios:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Infrastructure and AI deployments supporting active-active or active-standby disaster recovery strategies. With basic connectivity between two peer services — e.g., agentic AI applications or database replicas — applications can synchronize&lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt; state&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; across the cloud boundary as if they are co-located, supporting maximum application resilience and operational consistency.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;AWS customers issuing inbound requests into Google Cloud, allowing a service running in AWS to securely and privately access a Google Cloud API. Examples include reaching custom applications running on Compute Engine or a critical data warehouse hosted in BigQuery while bypassing the public internet for enhanced security and performance. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Google Cloud customers issuing outbound requests toward AWS, where a data pipeline orchestrated in Google Cloud can privately pull large datasets from an AWS datastore such as S3 or an RDS instance. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/4-A_variety_of_multicloud_workloads.max-1000x1000.png"
        
          alt="4-A variety of multicloud workloads"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph"&gt;&lt;p data-block-key="v0fgz"&gt;&lt;b&gt;Build applications across Google Cloud and AWS today&lt;/b&gt;&lt;/p&gt;&lt;p data-block-key="19pah"&gt;Regardless of your use case, if your organization would benefit from simple, secure, and robust on-demand connectivity between your Google Cloud and AWS environments, we invite you to start &lt;a href="https://docs.cloud.google.com/network-connectivity/docs/interconnect/concepts/partner-cci-for-aws-overview"&gt;building&lt;/a&gt; your applications across clouds and let us manage your network connectivity infrastructure for you.&lt;/p&gt;&lt;p data-block-key="c4g07"&gt;This collaboration is not restricted to Google Cloud and AWS. We invite other cloud and service providers to offer their customers this streamlined private peering capability with Google Cloud. To learn more, check out the &lt;a href="https://github.com/aws/AWSInterconnect" target="_blank"&gt;open specification&lt;/a&gt;, and contact us at cross-cloud@google.com. 
We are truly excited to grow this ecosystem for the benefit of our joint customers.&lt;/p&gt;&lt;/div&gt;</description><pubDate>Sun, 30 Nov 2025 19:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/networking/extending-cross-cloud-interconnect-to-aws-and-partners/</guid><category>Hybrid &amp; Multicloud</category><category>Partners</category><category>Networking</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Expanding Google Cloud’s Cross-Cloud Network with a groundbreaking AWS collaboration</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/networking/extending-cross-cloud-interconnect-to-aws-and-partners/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Himanshu Mehra</name><title>Director of Product Management</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Judy Issa</name><title>Senior Product Manager</title><department></department><company></company></author></item><item><title>Conquering IP address scarcity: A deep dive into Google Cloud's private NAT</title><link>https://cloud.google.com/blog/products/networking/using-private-nat-for-networks-with-overlapping-ip-spaces/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Running AI workloads in a hybrid fashion — in your data center and in the cloud — requires sophisticated, global networks that unify cloud and on-premises resources. While Google’s Cloud WAN provides the necessary unified network fabric to connect VPCs, data centers, and specialized hardware, this very interconnectedness exposes a critical, foundational challenge: IP address scarcity and overlapping subnets. 
As enterprises unify their private and cloud environments, manually resolving these pervasive address conflicts can be a big operational burden.&lt;/span&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Resolving IPv4 address conflicts has been a longstanding challenge in networking. And now, with a growing number of IP-intensive workloads and applications, customers face the crucial question of &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;how to ensure sufficient IP addresses for their deployments.&lt;/span&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Google Cloud offers various solutions to address private IP address challenges and facilitate communication between non-routable networks, including Private Service Connect (PSC), IPv6 addressing, and network address translation (NAT) appliances. In this post, we focus on private NAT, a feature of the &lt;/span&gt;&lt;a href="https://cloud.google.com/nat"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud NAT&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; service. This managed service simplifies private-to-private communication, allowing networks with overlapping IP spaces to connect without complex routing or managing proprietary NAT infrastructure.&lt;/span&gt;&lt;/p&gt;
&lt;h3 style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Getting to know private NAT&lt;/strong&gt;&lt;/h3&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Private NAT allows your Google Cloud resources to connect to other VPC networks or to on-premises networks with overlapping and/or non-routable subnets, without requiring you to manage any virtual machines or appliances.&lt;/span&gt;&lt;/p&gt;
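&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;As a rough illustration of the problem private NAT addresses, the following Python sketch flags overlapping address ranges among a set of networks. The network names and CIDR ranges are made up for the example, and the code only detects conflicts; it does not call any Cloud NAT API.&lt;/span&gt;&lt;/p&gt;

```python
import ipaddress
from itertools import combinations


def find_overlaps(networks):
    """Return pairs of network names whose CIDR ranges overlap."""
    parsed = {name: ipaddress.ip_network(cidr) for name, cidr in networks.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(parsed.items(), 2)
        if net_a.overlaps(net_b)
    ]


# Hypothetical networks that need to communicate with one another.
networks = {
    "vpc-a": "10.0.0.0/16",
    "vpc-b": "10.0.128.0/20",   # falls inside vpc-a's range
    "on-prem": "192.168.0.0/24",
}

conflicts = find_overlaps(networks)
print(conflicts)  # [('vpc-a', 'vpc-b')]
```

&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Any pair reported here cannot be routed to each other directly, which is the kind of situation where address translation between the two networks becomes necessary.&lt;/span&gt;&lt;/p&gt;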
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Here are some of the key benefits of private NAT:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;A managed service:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; As a fully managed service, private NAT minimizes the operational burden of managing and scaling your own NAT gateways. Google Cloud handles the underlying infrastructure, so you can focus on your applications.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Simplified management:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Private NAT simplifies network architecture by providing a centralized and straightforward way to manage private-to-private communication — across workloads and traffic paths.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;High availability:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Being a distributed service, private NAT offers high availability, VM-to-VM line-rate performance, and resiliency, all without having to over-provision costly, redundant infrastructure.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Scalability:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Private NAT is designed to scale automatically with your needs, supporting a large number of NAT IP addresses and concurrent connections.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_Cloud_NAT_options.max-1000x1000.png"
        
          alt="1 Cloud NAT options"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="btjbt"&gt;Figure: Cloud NAT options&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3 style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Common use cases&lt;/strong&gt;&lt;/h3&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Private NAT provides critical address translation for the most complex hybrid and multi-VPC networking challenges&lt;/span&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Unifying global networks with Network Connectivity Center&lt;/strong&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;For organizations that use Network Connectivity Center to establish a central connectivity hub, private NAT offers the essential mechanism for linking networks that possess overlapping “ non-routable” IP address ranges. This solution facilitates two primary scenarios:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;VPC spoke-to-spoke&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Facilitates seamless private-to-private communication between distinct VPC networks (spokes) with overlapping subnets.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;VPC-to-hybrid-spoke:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Enables connectivity between a cloud VPC and an on-premises network (a hybrid spoke) connected via Cloud Interconnect or Cloud VPN. &lt;/span&gt;&lt;a href="https://cloud.google.com/nat/docs/about-private-nat-for-ncc"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Learn more here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_Private_NAT_with_Network_Connectivity_Ce.max-1000x1000.png"
        
          alt="2 Private NAT with Network Connectivity Center"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="btjbt"&gt;Figure: Private NAT with Network Connectivity Center&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Enabling local hybrid connectivity in shared VPC&lt;/strong&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Organizations with shared VPC architectures can establish connectivity from non-routable or overlapping network subnets to their local Cloud Interconnects or Cloud VPN tunnels. A single private NAT gateway can manage destination routes for all workloads within the VPC.&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;“Thanks to private NAT, we effortlessly connected our Orange on-prem data center with the Masmovil GCP environment, even with IP address overlaps after our joint venture. This was crucial for business continuity, as it allowed us to enable communications without altering our existing environment.” &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;– Pedro Sanz Martínez, Head of Cloud Platform Engineering, MasOrange&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/3_Enabling_local_hybrid_connectivity_using_private_NAT.jpg"
        
          alt="3 Enabling local hybrid connectivity using private NAT"&gt;
        
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="btjbt"&gt;Figure: Enabling local hybrid connectivity using private NAT&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Accommodating Cloud Run and GKE workloads&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Dynamic, IP-intensive workloads such as &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/containers-kubernetes/how-class-e-addresses-solve-for-ip-address-exhaustion-in-gke?e=13802955"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Kubernetes Engine&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (GKE) and &lt;/span&gt;&lt;a href="https://cloud.google.com/run/docs/configuring/networking-best-practices#non-rfc-1918"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Run&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; often use Non-RFC 1918 ranges such as Class E to solve for IPv4 exhaustion. These workloads often need to access resources in an on-premises network or a partner VPC, so the ability for the on-premises network to accept non-RFC 1918 ranges is critical. In most cases, central network teams do not accept non-RFC 1918 address ranges.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can solve this by applying a private NAT configuration to the non-RFC 1918 subnet. With private NAT, all egress traffic from your Cloud Run service or GKE workloads is translated, allowing it to securely communicate with the destination network despite being on non-routable subnets. Learn about how private NAT works with different workloads &lt;/span&gt;&lt;a href="https://cloud.google.com/nat/docs/nat-product-interactions"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3 style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Configuration in action: Example setups&lt;/strong&gt;&lt;/h3&gt;
&lt;p style="text-align: justify;"&gt;&lt;span style="vertical-align: baseline;"&gt;Let's look at how to configure private NAT for one of these use cases using &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gcloud&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; commands.&lt;/span&gt;&lt;/p&gt;
&lt;p style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Example: connecting to a partner network with overlapping IPs&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Scenario:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Your &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;production-vpc&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; contains an application subnet (&lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;app-subnet-prod&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;10.20.0.0/24&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;). You need to connect to a partner's network over Cloud VPN, but the partner also uses the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;10.20.0.0/24&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; range for the resources you need to access.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Solution:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Configure a private NAT gateway to translate traffic from &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;app-subnet-prod&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; before it goes over the VPN tunnel.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;1. &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Create a dedicated subnet for NAT IPs.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This subnet's range is used for translation and must not overlap with the source or destination.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;gcloud compute networks subnets create pnat-subnet-prod \
    --network=production-vpc \
    --range=192.168.1.0/24 \
    --region=us-central1 \
    --purpose=PRIVATE_NAT&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;2. &lt;strong style="vertical-align: baseline;"&gt;Create a Cloud Router&lt;/strong&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;gcloud compute routers create prod-router \
    --network=production-vpc \
    --region=us-central1&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;3. &lt;strong style="vertical-align: baseline;"&gt;Create a private NAT gateway.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This configuration specifies that only traffic from &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;app-subnet-prod&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to local dynamic (&lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;match is_hybrid&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;) destinations should be translated using IPs from &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;pnat-subnet-prod&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; subnet.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;&lt;code&gt;# Create the private NAT gateway for app-subnet-prod.
gcloud compute routers nats create pnat-to-partner \
    --router=prod-router \
    --region=us-central1 \
    --type=PRIVATE \
    --nat-custom-subnet-ip-ranges=app-subnet-prod:ALL

# Add a rule that translates traffic bound for hybrid next hops,
# drawing NAT IPs from pnat-subnet-prod.
gcloud compute routers nats rules create 1 \
    --router=prod-router \
    --region=us-central1 \
    --nat=pnat-to-partner \
    --match='nexthop.is_hybrid' \
    --source-nat-active-ranges=pnat-subnet-prod&lt;/code&gt;&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now, any VM in &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;app-subnet-prod&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; that sends traffic to the partner's overlapping network will have its source IP translated to an address from the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;192.168.1.0/24&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; range, resolving the conflict.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Google Cloud's private NAT elegantly solves the common and complex problem of connecting networks with overlapping IP address spaces. As a fully managed, scalable, and highly available service, it simplifies network architecture, reduces operational overhead, and enables you to build and connect complex hybrid and multi-cloud environments with ease.&lt;/span&gt;&lt;/p&gt;
&lt;h3 style="text-align: justify;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Learn more&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Ready to get started with private NAT? Check out the official &lt;/span&gt;&lt;a href="https://cloud.google.com/nat/docs/private-nat"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;private NAT documentation&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and tutorials to learn more and start building your own solutions today.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 18 Nov 2025 17:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/networking/using-private-nat-for-networks-with-overlapping-ip-spaces/</guid><category>Developers &amp; Practitioners</category><category>Networking</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Conquering IP address scarcity: A deep dive into Google Cloud's private NAT</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/networking/using-private-nat-for-networks-with-overlapping-ip-spaces/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Udit Bhatia</name><title>Product Manager, Google Cloud</title><department></department><company></company></author></item><item><title>Introducing Dhivaru and two new connectivity hubs</title><link>https://cloud.google.com/blog/products/networking/introducing-dhivaru-new-subsea-cable/</link><description>&lt;div class="block-paragraph"&gt;&lt;p data-block-key="l4mlz"&gt;Today, we’re announcing Dhivaru, a new Trans-Indian Ocean subsea cable system that will connect the Maldives, Christmas Island and Oman. 
This investment will build on the &lt;a href="https://cloud.google.com/blog/products/infrastructure/bosun-australia-connect-initiative-for-indo-pacific-connectivity?e=48754805"&gt;Australia Connect&lt;/a&gt; initiative, furthering the reach, reliability, and resilience of digital connectivity across the Indian Ocean.&lt;/p&gt;&lt;p data-block-key="fn7g5"&gt;Reach, reliability and resilience are integral to the success of AI-driven services for our users and customers. Tremendous adoption of groundbreaking services such as Gemini 2.5 Flash Image (aka Nano Banana) and Vertex AI means resilient connectivity has never been more important for our users. The speed of AI adoption is also outpacing anyone’s predictions, and Google is investing to meet this long-term demand.&lt;/p&gt;&lt;p data-block-key="rjoq"&gt;“Dhivaru” is the line that controls the main sail on traditional Maldivian sailing vessels, and signifies the skill, strength, and experience of the early sailors navigating the seas.&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/Dhivaru_Map.max-1000x1000.jpg"
        
          alt="Dhivaru_Map"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph"&gt;&lt;p data-block-key="jklle"&gt;In addition to the cable investment, Google will be investing in creating two new connectivity hubs for the region. The Maldives and Christmas Island are naturally positioned for connectivity hubs to help improve digital connectivity for the region, including Africa, the Middle East, South Asia and Oceania.&lt;/p&gt;&lt;p data-block-key="8ko1"&gt;&lt;i&gt;“Google’s decision to invest in the Maldives is a strong signal of confidence in our country’s stable and open investment environment, and a direct contribution to my vision for a diversified, inclusive, and digitized Maldivian economy. As the world moves rapidly toward an era defined by digital transformation and artificial intelligence, this project reflects how the Maldives is positioning itself at the crossroads of global connectivity — leveraging our strategic geography to create new economic opportunities for our people and to participate meaningfully in the future of the global economy.”&lt;/i&gt; - &lt;b&gt;His Excellency the President of Republic of Maldives, Dr. Mohamed Muizzu&lt;/b&gt;&lt;/p&gt;&lt;p data-block-key="85i7m"&gt;&lt;i&gt;“We are delighted to partner with Google on this landmark initiative to establish a new connectivity hub in the Maldives. This project represents a major step forward in strengthening the nation’s digital infrastructure and enabling the next wave of digital transformation. As a leading digital provider, Ooredoo Maldives continues to expand world-class connectivity and digital services nationwide. This progress opens new opportunities for businesses such as tourism, enabling smarter operations, improved customer experiences and greater global reach. 
We are proud to be powering the next phase of the Digital Maldives."&lt;/i&gt;&lt;b&gt; - Ooredoo Maldives CEO and MD, Khalid Al Hamadi&lt;/b&gt;&lt;/p&gt;&lt;p data-block-key="41d79"&gt;&lt;i&gt;"Dhiraagu is committed to advancing the digital connectivity of the Maldives and empowering our people, communities, and businesses. Over the years, we have made significant investments in building robust subsea cable systems — transforming the digital landscape — connecting the Maldives to the rest of the world and enabling the rollout of high-speed broadband across the nation. We are proud and excited to partner with Google on their expansion of subsea infrastructure and the establishment of a new connectivity hub in Addu City, the southernmost city of the Maldives. This strategic collaboration with one of the world’s largest tech players marks another milestone in strengthening the nation’s presence within the global subsea infrastructure, and further enhances the reliability and resiliency of our digital ecosystem."&lt;/i&gt; -&lt;b&gt; Ismail Rasheed, CEO &amp;amp; MD, DHIRAAGU&lt;/b&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-paragraph"&gt;&lt;h3 data-block-key="jklle"&gt;Connectivity hubs for the Indian Ocean region&lt;/h3&gt;&lt;p data-block-key="dfc6s"&gt;Connectivity hubs are strategic investments designed to future-proof regional connectivity and accelerate the delivery of next-generation services through three core capabilities: Cable switching, content caching, and colocation.&lt;/p&gt;&lt;p data-block-key="6jl"&gt;&lt;b&gt;Cable switching: Delivering seamless resilience&lt;/b&gt;&lt;/p&gt;&lt;p data-block-key="fabcs"&gt;Google carefully selects the locations for our connectivity hubs to minimize the distance data has to travel before it has a chance to ‘switch paths’.. This capability improves resilience, and ensures robust, high-availability connectivity across the region. The hubs also allow automatic re-routing of traffic between multiple cables. If one cable experiences a fault, traffic will automatically select the next best path and continue on its way. This ensures high availability not only for the host country, but minimizes downtime for services and users across the region.&lt;/p&gt;&lt;p data-block-key="agohb"&gt;&lt;b&gt;Content caching: Accelerating digital services&lt;/b&gt;&lt;/p&gt;&lt;p data-block-key="380io"&gt;Low latency is critical for optimal user experience. One of Google’s objectives is to serve content from as close to our users and customers as possible. By caching — storing copies of the most popular content locally — Google can reduce the latency to retrieve or view this content, improving the quality of services.&lt;/p&gt;&lt;p data-block-key="1rk0"&gt;&lt;b&gt;Colocation: Fostering a local ecosystem&lt;/b&gt;&lt;/p&gt;&lt;p data-block-key="am55b"&gt;Connectivity hubs are often in locations where users have limited access to high quality data centers to house their services and IT hardware, such as islands. 
Although these facilities are small compared to a Google data center, Google understands the benefits of shared infrastructure, and is committed to providing rack space to carriers and local companies.&lt;/p&gt;&lt;h3 data-block-key="10og2"&gt;Energy efficiency&lt;/h3&gt;&lt;p data-block-key="acufh"&gt;Subsea cables are very energy efficient. As a result, even when supporting multiple cables, content storage and colocation, a Google connectivity hub requires far less power than a typical data center. These hubs focus primarily on networking and localized storage rather than on the large demands of supporting AI, cloud and other important building blocks of the Internet. Of course, the power required for a connectivity hub can still be significant for some smaller locations, and where it is, Google is exploring using its power demand to accelerate local investment in sustainable energy generation, consistent with its long history of stimulating renewable energy solutions.&lt;/p&gt;&lt;p data-block-key="1achf"&gt;These new connectivity hubs in the Maldives and Christmas Island are ideally situated to deepen the resilience of internet infrastructure in the Indian Ocean Region. The facilities will help power our products, strengthen local economies and bring AI benefits to people and businesses around the world. 
We look forward to announcing future subsea cables and connectivity hubs and further enhancing the Internet’s reach, reliability, and resilience.&lt;/p&gt;&lt;/div&gt;</description><pubDate>Mon, 17 Nov 2025 15:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/networking/introducing-dhivaru-new-subsea-cable/</guid><category>Infrastructure</category><category>Networking</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/Dhivaru_Map.max-600x600.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Introducing Dhivaru and two new connectivity hubs</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/Dhivaru_Map.max-600x600.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/products/networking/introducing-dhivaru-new-subsea-cable/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Bikash Koley</name><title>VP, Google Global Infrastructure, Google Cloud</title><department></department><company></company></author></item><item><title>Google Cloud Networking under the hood: How Protective ReRoute increases resilience</title><link>https://cloud.google.com/blog/products/networking/how-protective-reroute-improves-network-resilience/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud infrastructure reliability is foundational, yet even the most sophisticated global networks can suffer from a critical issue: &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;slow or failed recovery from routing outages.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; In massive, planetary-scale networks like Google's, router failures or complex, hidden conditions can prevent traditional routing protocols from restoring service quickly, or sometimes at all. 
These brief but costly outages — what we call &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;slow convergence&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; or &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;convergence failure &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;— critically disrupt real-time applications with low tolerance to packet loss and, most acutely, today's massive, sensitive AI/ML training jobs, where a brief network hiccup can waste millions of dollars in compute time. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To solve this problem, we pioneered &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Protective ReRoute (PRR)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, a radical shift that moves the responsibility for rapid failure recovery from the centralized network core to the distributed endpoints themselves. Since putting it into production over five years ago, this host-based mechanism has dramatically increased Google’s network's resilience, proving effective in recovering from up to &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;84%&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;sup&gt;1&lt;/sup&gt; of inter-data-center outages that would have been caused by slow convergence events. Google Cloud customers with workloads that are sensitive to packet loss can also enable it in their environments — read on to learn more.   &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;The limits of in-network recovery&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Traditional routing protocols are essential for network operation, but they are often not fast enough to meet the demands of modern, real-time workloads. When a router or link fails, the network must recalculate all affected routes, which is known as &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;reconvergence&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. In a network the size of Google's, this process can be complicated by the scale of the topology, leading to delays that range from many seconds to minutes. For distributed AI training jobs with their wide, fan-out communication patterns, even a few seconds of packet loss can lead to application failure and costly restarts. The problem is a matter of scale: as the network grows, the likelihood of these complex failure scenarios increases.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Protective ReRoute: A host-based solution&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Protective ReRoute is a simple, effective concept: &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;empower the communicating endpoints (the hosts) to detect a failure and intelligently re-steer traffic to a healthy, parallel path.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Instead of waiting for a global network update, PRR capitalizes on the rich path diversity built into our network. The host detects packet loss or high latency on its current path, and then immediately initiates a path change by modifying carefully chosen packet header fields, which tells the network to use an alternate, pre-existing path.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This architecture represents a fundamental shift in network reliability thinking. Traditional networks rely on a combination of parallel and &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;series reliability&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. Serialization of components tends to reduce the reliability of a system; in a large-diameter network with multiple forwarding stages, reliability degrades as the diameter increases. In other words, every forwarding stage affects the whole system. Even if a network stage is designed with parallel reliability, it creates a serial impact on the overall network while the parallel stage reconverges. By adding PRR at the edges, we treat the network as a &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;highly parallel system of paths that appear as a single stage&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, where the overall reliability increases as the number of available paths grows exponentially, effectively circumventing the serialization effects of slow network convergence in a large-diameter network. The following diagram contrasts the system reliability model for a PRR-enabled network with that of a traditional network. Traditional network reliability is in inverse proportion to the number of forwarding stages; with PRR the reliability of the same network is in direct proportion to the number of composite paths, which is exponentially proportional to the network diameter.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1-Protective_ReRoute.max-1000x1000.png"
        
          alt="1-Protective_ReRoute"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
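&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The series-versus-parallel contrast can be made concrete with the textbook reliability formulas. This is an illustrative sketch, not the model used in Google's analysis: it assumes every forwarding stage has the same reliability r, that failures are independent, and that the k alternate paths share no components. The symbols r, n and k and the numbers below are hypothetical.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;\[
R_{\text{series}} = \prod_{i=1}^{n} r_i = r^{n},
\qquad
R_{\text{parallel}} = 1 - \left(1 - R_{\text{series}}\right)^{k}
\]

\[
r = 0.999,\; n = 10:\quad R_{\text{series}} = 0.999^{10} \approx 0.990
\]

\[
k = 4:\quad R_{\text{parallel}} \approx 1 - (0.010)^{4} = 1 - 10^{-8}
\]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Under these assumptions, each added stage lowers the reliability of a single path, while each added independent path multiplies the remaining failure probability by a small factor. Real paths usually share some components, so the actual gain is smaller than this idealized figure.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;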
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;How Protective ReRoute works&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The PRR mechanism has three core functional components:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;End-to-end failure detection:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Communicating hosts continuously monitor path health. On Linux systems, the standard mechanism uses &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;TCP retransmission timeout (RTO)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to signal a potential failure. The time to detect a failure is generally a single-digit multiple of the network's round-trip time (RTT). There are also other methods for end-to-end failure detection that have varying speed and cost.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Packet-header modification at the host:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Once a failure is detected, the transmitting host modifies a packet-header field to influence the forwarding path. To achieve this, Google pioneered and contributed the mechanism that modifies the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;IPv6 flow-label&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; in the Linux kernel (version 4.20+). Crucially, the Google software-defined network (SDN) layer provides protection for &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;IPv4 traffic and non-Linux hosts&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; as well by performing the detection and repathing on the outer headers of the network overlay.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;PRR-aware forwarding:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Routers and switches in the multipath network respect this header modification and forward the packet onto a different, available path that bypasses the failed component.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
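&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The three components above can be sketched as a toy simulation. This is not the Linux kernel or Google SDN implementation: the hash function, the path numbering, and the names &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;ecmp_path&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;send_with_prr&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; are hypothetical. It only illustrates the idea that a host which detects a failure can draw a new 20-bit IPv6 flow label, and that multipath forwarding which hashes on that label will likely place the flow on a different path.&lt;/span&gt;&lt;/p&gt;

```python
import hashlib
import random
from typing import Optional, Set, Tuple

FLOW_LABEL_BITS = 20  # the IPv6 flow label field is 20 bits wide


def ecmp_path(src: str, dst: str, flow_label: int, num_paths: int) -> int:
    """Toy multipath forwarding: hash the addresses and flow label to a path."""
    key = f"{src}|{dst}|{flow_label}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_paths


def send_with_prr(
    src: str,
    dst: str,
    num_paths: int,
    failed_paths: Set[int],
    rng: random.Random,
    max_attempts: int = 16,
) -> Tuple[Optional[int], int]:
    """Return (working path or None, number of repaths that were needed)."""
    flow_label = rng.getrandbits(FLOW_LABEL_BITS)
    for repaths in range(max_attempts):
        path = ecmp_path(src, dst, flow_label, num_paths)
        if path not in failed_paths:
            return path, repaths
        # Stand-in for a retransmission timeout: the host detects the failure
        # end to end and picks a new flow label to influence the path choice.
        flow_label = rng.getrandbits(FLOW_LABEL_BITS)
    return None, max_attempts


if __name__ == "__main__":
    rng = random.Random(7)
    path, repaths = send_with_prr("2001:db8::1", "2001:db8::2", 8, {0, 1, 2}, rng)
    print(f"recovered on path {path} after {repaths} repath(s)")
```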
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Proof of impact&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;PRR is not theoretical; it is a continuously deployed, 24x7 system that protects production traffic worldwide. Its impact is compelling: as noted above, PRR has been shown to reduce network downtime caused by slow convergence and convergence failures by up to &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;84%&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. In the best case, that means more than 8 out of every 10 minutes of downtime that a router failure or slow network-level recovery would otherwise have caused are now avoided by the host. Furthermore, host-initiated recovery is extremely fast, often resolving the problem in a single-digit multiple of the RTT, which is vastly faster than traditional network reconvergence times.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Key use cases for ultra-reliable networking&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The need for PRR is growing, driven by modern application requirements:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;AI/ML training and inference:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Large-scale workloads, particularly those &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;distributed across many accelerators&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; (GPUs/TPUs), are uniquely sensitive to network reliability. PRR provides the ultra-reliable data distribution necessary to keep these high-value compute jobs running without disruption.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Data integrity and storage:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Significant numbers of dropped packets can result in &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;data corruption&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;data loss&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, not just reduced throughput. By reducing the outage window, PRR improves application performance and helps guarantee data integrity.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Real-time applications:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Applications like gaming and services like video conferencing and voice calls are intolerant of even brief connectivity outages. PRR reduces the recovery time for network failures to meet these strict real-time requirements.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Frequent short-lived connections:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Applications that rely on a large number of very frequent short-lived connections can fail when the network is unavailable for even a short time. By reducing the expected outage window, PRR helps these applications reliably complete their required connections.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Activating Protective ReRoute for your applications&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The architectural shift to host-based reliability is an accessible technology for Google Cloud customers. The core mechanism is open and part of the mainline Linux kernel (version 4.20 and later).&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can benefit from PRR in two primary ways:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Hypervisor mode:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; PRR automatically protects traffic running across Google data centers without requiring any guest OS changes. Hypervisor mode provides recovery in single-digit seconds for traffic of moderate fan-out in specific areas of the network.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Guest mode:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; For critical, performance-sensitive applications with high fan-out, in any segment of the network, you can &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;opt into guest-mode PRR&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, which enables the fastest possible recovery time and the greatest control. This is the optimal setting for demanding mission-critical applications, AI/ML jobs, and other latency-sensitive services.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To activate guest-mode PRR for critical applications, follow the guidance in the &lt;/span&gt;&lt;a href="https://cloud.google.com/compute/docs/networking/tcp-optimization-for-network-performance-in-gcp-and-hybrid#use-prr-for-network-resiliency"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and make sure that the following conditions are met:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Your VM runs a modern Linux kernel (4.20+).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Your applications use TCP.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The application traffic uses IPv6. For IPv4 protection, the VM needs to use the gVNIC driver.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
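&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The first prerequisite can be checked from inside a VM with a few lines of Python. This is a minimal sketch that only compares the kernel release string against 4.20; it does not enable PRR or verify the TCP, IPv6, or gVNIC conditions, and the helper names are hypothetical. The linked documentation remains the authoritative guide.&lt;/span&gt;&lt;/p&gt;

```python
import platform
import re
from typing import Tuple

MIN_KERNEL = (4, 20)  # minimum Linux kernel version named in this post


def parse_kernel_release(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_supports_prr(release: str) -> bool:
    """True if the release is at least 4.20 (tuple comparison, not string)."""
    return parse_kernel_release(release) >= MIN_KERNEL


if __name__ == "__main__":
    if platform.system() == "Linux":
        release = platform.release()
        verdict = "meets" if kernel_supports_prr(release) else "is below"
        print(f"kernel {release} {verdict} the 4.20 minimum")
    else:
        print("not a Linux guest; this check applies to Linux VMs only")
```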
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Get started&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The availability of Protective ReRoute has profound implications for a variety of Google and Google Cloud users. &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;For cloud customers with critical workloads:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Evaluate and enable guest-mode PRR for applications that are sensitive to packet loss and that require the fastest recovery time, such as large-scale AI/ML jobs or real-time services.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;For network architects:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Re-evaluate your network reliability architectures. Consider the benefits of designing for rich path diversity and empowering endpoints to intelligently route around failures, shifting your model from series to parallel reliability.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;For the open-source community:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Recognize the power of host-level networking innovations. Contribute to and advocate for similar reliability features across all major operating systems to create a more resilient internet for everyone.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;sup&gt;&lt;em&gt;&lt;span style="vertical-align: baseline;"&gt;1. &lt;a href="https://dl.acm.org/doi/10.1145/3603269.3604867" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;https://dl.acm.org/doi/10.1145/3603269.3604867&lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;/em&gt;&lt;/sup&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Fri, 14 Nov 2025 17:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/networking/how-protective-reroute-improves-network-resilience/</guid><category>Networking</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Google Cloud Networking under the hood: How Protective ReRoute increases resilience</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/networking/how-protective-reroute-improves-network-resilience/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Victor Moreno</name><title>Solutions Product Manager</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Yaogong Wang</name><title>Staff Software Engineer</title><department></department><company></company></author></item><item><title>How Ericsson achieves data integrity and superior governance with Dataplex</title><link>https://cloud.google.com/blog/topics/telecommunications/how-ericsson-achieves-data-integrity-and-superior-governance-with-dataplex/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Data is the engine of modern telecommunications. For Ericsson's Managed Services, which operates a global network of more than 710,000 sites, harnessing this data is not just an advantage, it's essential for business growth and leadership. 
To power the future of its &lt;/span&gt;&lt;a href="https://www.ericsson.com/en/ai/autonomous-networks" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;autonomous network operations&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and deliver on its strategic priorities, Ericsson has been on a transformative data journey with governance at the center of its strategy.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Ericsson moved from foundational practices to a sophisticated, business-enabling data governance framework using &lt;/span&gt;&lt;a href="https://cloud.google.com/dataplex"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud’s Dataplex Universal Catalog&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, turning data from a simple resource into a strategic asset.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;From a new operating model to a new data mindset&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Ericsson’s journey began in 2019 with the launch of the &lt;/span&gt;&lt;a href="https://www.ericsson.com/en/managed-services/ericsson-operation-engine" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Ericsson Operations Engine&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (EOE), a groundbreaking, AI-powered operating model for managing complex, multi-vendor telecom networks. The EOE made one thing clear: to succeed,&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;data had to be at the core of everything.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This realization led Ericsson to develop its first enterprise data strategy, which established the core principles for how data is collected, managed and governed. However, building a strategy is one thing — operationalizing it at scale is another.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To move beyond theory to address real-world challenges, Ericsson needed to:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Build trust:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Provide discoverable, clean, reliable, and well-understood data to the teams deploying analytics, AI, and automation.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Balance defense and offense:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Ensure compliance with contracts and regulations (defensive governance) while empowering teams to innovate and create value from data (offensive governance).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Ensure data integrity: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Ericsson users see data integrity as the core principle for effective data management. Data quality, which is essential for reliable, trustworthy data throughout its lifecycle, is a key quality indicator (KQI) for measuring effectiveness. Any quality deviations must be managed like a high-priority incident with clear Service Level Agreements (SLAs) for restoration and resolution.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To realize this vision, Ericsson sought a platform that could match its ambition for global-scale governance and innovation — and Dataplex Universal Catalog emerged as the ideal choice.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Ericsson made its selection based on four key criteria. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;First, its capabilities aligned perfectly with Ericsson’s requirements for cloud-native transformation, business principles, and a long-term governance vision, underpinned by Ericsson’s strategic partnership with Google Cloud. Second, from a technical standpoint, Dataplex provided a tightly integrated, end-to-end ecosystem as a native Google Cloud solution, translating to faster time-to-market for use cases and reduced integration overhead.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Third, the platform offered a practical operating model that enabled quick learning, adaptation, and self-sufficiency, supporting an agile approach where Ericsson could fail fast and iterate. Finally, because Ericsson was already a Google Cloud customer, Dataplex was a natural extension of its existing environment, with a clear and manageable Total Cost of Ownership (TCO) for extending both storage and compute with governance capabilities.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Putting governance into practice: Key capabilities in action&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With Dataplex Universal Catalog as the governance foundation, Ericsson began implementing the core pillars of its governance program, moving from manual processes to an automated, intelligent data fabric.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;More specifically, Ericsson established a unified business vocabulary within Dataplex. This transformative first step eliminated ambiguity and ensured their teams — from data scientists to data analysts — were speaking the same language. These glossaries also captured tribal knowledge and became the foundation for creating trusted data products.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In addition, Dataplex's catalog is at the heart of the data governance solution, making data discovery simple and intuitive for authorized users. Ericsson uses its tagging capabilities to enrich the data assets with critical metadata, including data classification, ownership, retention policies, and sensitivity labels. Dataplex’s ability to automatically visualize data lineage, down to the column level, is another game-changer. Different data personas can instantly understand a dataset's origin and its downstream impact, dramatically increasing trust and reducing investigation time. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Furthermore, trustworthy AI models are built on high-quality data. For proactive data quality, Ericsson uses Dataplex to run automated quality checks and profiles on its data pipelines. When a quality rule is breached, an alert is automatically triggered, creating an incident in its service management platform to ensure data issues are treated with the urgency they deserve.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These capabilities are all underpinned by Ericsson's Data Operating Model (DOM), a framework that defines the policies, people, processes, and technology needed to translate its data strategy into tangible value. The DOM comprises several facets of working with data:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/image1_nGFHVwm.max-1000x1000.png"
        
          alt="image1"&gt;
        
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;ol style="list-style-type: upper-alpha;"&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Enterprise data architecture:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Managing data flow, enterprise data modeling, and best practices from data collection through consumption&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Technology and tools:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Business glossary; master, reference, and metadata management; data modeling; and data quality management&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Roles and responsibilities:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Roles to manage and govern data (i.e., end-to-end data lifecycle and stewardship)&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Data and model assurance:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Data pipeline monitoring, data observability, and data quality monitoring&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Governance: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Data compliance, risk and security management, operational level agreements, objectives and key results, and audit management&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Processes:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Data governance, data quality, data management, and data consent-related processes&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Looking ahead: The future is integrated and intelligent&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a global technology leader, Ericsson is committed to shaping the future of AI-powered data governance. Technology, especially in the AI space, is evolving at a breathtaking pace, and both data and AI governance practices must keep up.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These developments are guiding Ericsson’s future priorities, which include bridging the gap between data and AI governance, especially with the rise of generative and agentic AI. These plans include evaluating the use of generative AI capabilities in BigQuery and Dataplex to simplify governance, and pursuing solutions that ensure transparency, explainability, and fairness and that manage risk in the deployment of AI models.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In addition to harnessing the power of AI for at-scale governance, Ericsson also plans to adopt governance workflows, glossary-driven data quality policies, at-scale assignment of terms to assets, bulk import and export of glossaries, AI-powered glossary recommendations, and data quality reusability features. Ericsson is also aligning its architecture with data fabric and data mesh principles, empowering teams with self-service access to high-quality, trusted data products.&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; Finally, Ericsson will assess the use of more granular, policy-based access controls to complement existing role-based access, further strengthening its data security, protection, and privacy.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For any organization embarking on a similar path, Ericsson’s experience offers several key lessons:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Governance is a value enabler, not a blocker:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; A modern data governance program is focused on business enablement first, driving value and innovation, to complement policies, rules and risk management.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;It's a journey, not a destination:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Be prepared to fail fast, learn, and adapt. The landscape is constantly changing at breakneck speed.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Focus on business outcomes, not tools:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Technology is a critical enabler, but the conversation is about the business value you’re creating. Simplify the story, speak the language of the business, and unpack the hype.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Culture is everything:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; For governance to be effective, it must be everyone’s responsibility. This requires strong leadership, sponsorship, and a "data-first" mindset embedded throughout the organization.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By partnering with Google Cloud and tapping into the power of Dataplex Universal Catalog, Ericsson is building a data foundation that is not only compliant and secure but agile and intelligent — ready to power the next generation of autonomous networks.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Fri, 07 Nov 2025 17:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/topics/telecommunications/how-ericsson-achieves-data-integrity-and-superior-governance-with-dataplex/</guid><category>Databases</category><category>Networking</category><category>Customers</category><category>Telecommunications</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>How Ericsson achieves data integrity and superior governance with Dataplex</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/topics/telecommunications/how-ericsson-achieves-data-integrity-and-superior-governance-with-dataplex/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>William McCann Murphy</name><title>Head of Data Authority, Ericsson</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Akanksha Bhagwanani</name><title>EMEA Data Analytics Solution Lead, Google Cloud</title><department></department><company></company></author></item><item><title>7 ways networking powers your AI workloads on Google Cloud</title><link>https://cloud.google.com/blog/products/networking/how-google-cloud-networking-supports-your-ai-workloads/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;When we talk about artificial intelligence (AI), we often focus on the models, the powerful TPUs and GPUs, and the massive datasets. 
But behind the scenes, there's an unsung hero making it all possible: &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;networking&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. While it's often abstracted away, networking is the crucial connective tissue that enables your AI workloads to function efficiently, securely, and at scale.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this post, we explore seven key ways networking interacts with your AI workloads on Google Cloud, from accessing public APIs to enabling next-generation, AI-driven network operations.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;#1 - Securely accessing AI APIs&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Many of the powerful AI models available today, like Gemini on Vertex AI, are accessed via public APIs. When you make a call to an endpoint like &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;*-aiplatform.googleapis.com&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, you're dependent on a reliable network connection. Access to these endpoints requires proper authentication. This ensures that only authorized users and applications can access these powerful models, helping to safeguard your data and your AI investments. You can also access these endpoints privately, which we cover in more detail in #5.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;#2 - Exposing models for inference&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Once you've trained or tuned your model, you need to make it &lt;/span&gt;&lt;a href="https://cloud.google.com/vertex-ai/docs/general/deployment"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;available for inference&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. In addition to managed offerings in Google Cloud, you also have the flexibility to deploy your models on infrastructure you control, using specialized &lt;/span&gt;&lt;a href="https://cloud.google.com/compute/docs/gpus#gpu-models"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;VM families with powerful GPUs&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. For example, you can deploy your model on &lt;/span&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs/concepts/kubernetes-engine-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Kubernetes Engine (GKE)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and use the &lt;/span&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs/concepts/about-gke-inference-gateway"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GKE Inference Gateway&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, Cloud Load Balancing, or a ClusterIP to expose it for private or public inference. These networking components act as the entry point for your applications, allowing them to interact with your model deployments seamlessly and reliably.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;#3 - High-speed GPU-to-GPU communication&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AI workloads, especially training, involve moving massive amounts of data between GPUs. Traditional networking, which relies on CPU copy operations, can create bottlenecks. This is where protocols like &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Remote Direct Memory Access (RDMA)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; come in. RDMA bypasses the CPU, allowing for direct memory-to-memory communication between GPUs.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To support this, the underlying network must be lossless and high-performance. Google has built out a &lt;/span&gt;&lt;a href="https://cloud.google.com/compute/docs/gpus/gpu-network-bandwidth#h200-gpus"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;non-blocking rail-aligned network topology&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in its data center architecture to support RDMA communication and node scaling. Several high-performance GPU VM families support &lt;/span&gt;&lt;a href="https://cloud.google.com/vpc/docs/network-profiles#about_network_profiles"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;RDMA over Converged Ethernet (RoCEv2)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, providing the speed and efficiency needed for demanding AI workloads.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;#4 - Data ingestion and storage connectivity&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Your AI models are only as good as the data they're trained on. This data needs to be stored, accessed, and retrieved efficiently. Google Cloud offers a variety of storage options, for example &lt;/span&gt;&lt;a href="https://cloud.google.com/architecture/ai-ml/storage-for-ai-ml#review-storage-options"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Storage&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://cloud.google.com/architecture/ai-ml/storage-for-ai-ml#review-storage-options"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Hyperdisk ML&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://cloud.google.com/architecture/ai-ml/storage-for-ai-ml#review-storage-options"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Managed Lustre&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Networking is what connects your compute resources to your data. Whether you're accessing data directly or over the network, having a high-throughput, low-latency connection to your storage is essential for keeping your AI pipeline running smoothly.&lt;/span&gt;&lt;/p&gt;
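&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To get a rough feel for why storage throughput matters, the sketch below estimates the ideal time needed to stream a dataset over a link. The function name and the example figures (a 10 TB dataset, 10 and 100 Gbit/s links) are illustrative assumptions, not measurements of any Google Cloud storage product, and the estimate ignores latency and protocol overhead.&lt;/span&gt;&lt;/p&gt;

```python
def transfer_seconds(dataset_bytes, throughput_bits_per_second):
    """Ideal time to move a dataset over a link.

    Ignores latency, protocol overhead, and storage-side limits, so real
    transfers will take longer than this lower bound.
    """
    if not throughput_bits_per_second > 0:
        raise ValueError("throughput must be positive")
    # Convert bytes to bits, then divide by the link rate.
    return dataset_bytes * 8 / throughput_bits_per_second


if __name__ == "__main__":
    ten_terabytes = 10 * 10**12  # hypothetical dataset size in bytes
    for gbits in (10, 100):  # hypothetical link speeds in Gbit/s
        seconds = transfer_seconds(ten_terabytes, gbits * 10**9)
        print(f"{gbits} Gbit/s: {seconds:.0f} s")
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Under these assumptions, a tenfold increase in throughput cuts the ideal transfer time by the same factor, which is time your accelerators would otherwise spend waiting for data.&lt;/span&gt;&lt;/p&gt;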
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;#5 - Private connectivity to AI workloads&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Security is paramount, and you often need to ensure that your AI workloads are not exposed to the public internet. Google Cloud provides several ways to achieve private communication to both managed Vertex AI services and your own DIY AI deployments. These include:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/vpc-service-controls/docs/overview"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;VPC Service Controls&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Creates a service perimeter to prevent data exfiltration.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/vpc/docs/private-service-connect"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Private Service Connect&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Allows you to access Google APIs and managed services privately from your VPC. You can use PSC endpoints to connect to your own services or Google services.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/dns/docs/best-practices"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud DNS&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt;:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://cloud.google.com/vpc/docs/configure-private-service-connect-services#configure-dns-manual"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Private DNS zones&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; can be used to resolve internal IP addresses for your AI services.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
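&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a quick sanity check for private name resolution, the following minimal sketch uses only the Python standard library to confirm that a hostname resolves exclusively to private IP addresses. The hostname shown is a hypothetical placeholder, and the helper names are assumptions made for illustration. A passing check suggests, but does not prove, that traffic stays on internal addresses.&lt;/span&gt;&lt;/p&gt;

```python
import ipaddress
import socket


def resolved_addresses(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses a hostname resolves to."""
    results = resolver(hostname, None)
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({entry[4][0] for entry in results})


def resolves_privately(hostname, resolver=socket.getaddrinfo):
    """True only if the hostname resolves and every address is private."""
    addresses = resolved_addresses(hostname, resolver)
    return bool(addresses) and all(
        ipaddress.ip_address(address).is_private for address in addresses
    )


if __name__ == "__main__":
    # Hypothetical internal name; replace with a record from your own zone.
    name = "inference.internal.example"
    try:
        print(name, "private:", resolves_privately(name))
    except socket.gaierror as error:
        print("could not resolve", name, error)
```

&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The resolver is passed in as a parameter so the check can be exercised with canned answers in tests, and run against real DNS from a VM inside your VPC.&lt;/span&gt;&lt;/p&gt;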
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;#6 - Bridging the gap with hybrid cloud connections&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Many enterprises have a hybrid cloud strategy, with sensitive data remaining on-premises. The Cross-Cloud Network allows you to architect your network to provide any-to-any connectivity. With design cases covering &lt;/span&gt;&lt;a href="https://cloud.google.com/architecture/ccn-distributed-apps-design"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;distributed applications&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://services.google.com/fh/files/misc/global_front_end_solution_deep_dive.pdf" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Global front end&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;a href="https://services.google.com/fh/files/misc/cloud_wan_solution_overview.pdf" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud WAN&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, you can securely connect on-premises environments, other clouds, and other VPCs to your AI workloads. This hybrid connectivity lets you use the scalability of Google Cloud's AI services while keeping your data secure.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;#7 - The Future: AI-driven network operations&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The relationship between AI and networking is becoming a two-way street. With &lt;/span&gt;&lt;a href="https://cloud.google.com/gemini/docs/overview"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini for Google Cloud&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, network engineers can now use natural language to design, optimize, and troubleshoot their network architectures. This is the first step towards what we call "agentic networking," where autonomous AI agents can proactively detect, diagnose, and even mitigate network issues. This transforms network engineering from a reactive discipline to a predictive and proactive one, ensuring your network is always optimized for your AI workloads.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=mNjysmJNmlw"
      data-glue-modal-trigger="uni-modal-mNjysmJNmlw-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault_rd7yF6D.max-1000x1000.jpg);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Google&amp;#x27;s global network demo: fast incident response with autonomous network operations&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-mNjysmJNmlw-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="mNjysmJNmlw"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=mNjysmJNmlw"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Learn more&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To learn more about networking and AI on Google Cloud, dive deeper with the following resources:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Documentation: &lt;/span&gt;&lt;a href="https://cloud.google.com/ai-hypercomputer/docs/create/create-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;AI Hypercomputer&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Codelabs: &lt;/span&gt;&lt;a href="https://codelabs.developers.google.com/codelabs/terraform-gemini-cli-gce-psc" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini CLI on GCE with a Private Service Connect endpoint&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;White paper: &lt;/span&gt;&lt;a href="https://cloud.google.com/resources/content/autonomous-network-operations?hl=en"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Leveling up with Autonomous Network Operations&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Want to ask a question, find out more, or share a thought? Please connect with me on &lt;/span&gt;&lt;a href="https://www.linkedin.com/in/ammett/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-related_article_tout"&gt;





&lt;div class="uni-related-article-tout h-c-page"&gt;
  &lt;section class="h-c-grid"&gt;
    &lt;a href="https://cloud.google.com/blog/products/networking/google-global-network-technology-deep-dive/"
       data-analytics='{
                       "event": "page interaction",
                       "category": "article lead",
                       "action": "related article - inline",
                       "label": "article: {slug}"
                     }'
       class="uni-related-article-tout__wrapper h-c-grid__col h-c-grid__col--8 h-c-grid__col-m--6 h-c-grid__col-l--6
        h-c-grid__col--offset-2 h-c-grid__col-m--offset-3 h-c-grid__col-l--offset-3 uni-click-tracker"&gt;
      &lt;div class="uni-related-article-tout__inner-wrapper"&gt;
        &lt;p class="uni-related-article-tout__eyebrow h-c-eyebrow"&gt;Related Article&lt;/p&gt;

        &lt;div class="uni-related-article-tout__content-wrapper"&gt;
          &lt;div class="uni-related-article-tout__image-wrapper"&gt;
            &lt;div class="uni-related-article-tout__image" style="background-image: url('')"&gt;&lt;/div&gt;
          &lt;/div&gt;
          &lt;div class="uni-related-article-tout__content"&gt;
            &lt;h4 class="uni-related-article-tout__header h-has-bottom-margin"&gt;Diving into the technology behind Google&amp;#x27;s AI-era global network&lt;/h4&gt;
            &lt;p class="uni-related-article-tout__body"&gt;Google global network’s technology innovations to meet the demands of the AI era.&lt;/p&gt;
            &lt;div class="cta module-cta h-c-copy  uni-related-article-tout__cta muted"&gt;
              &lt;span class="nowrap"&gt;Read Article
                &lt;svg class="icon h-c-icon" role="presentation"&gt;
                  &lt;use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="#mi-arrow-forward"&gt;&lt;/use&gt;
                &lt;/svg&gt;
              &lt;/span&gt;
            &lt;/div&gt;
          &lt;/div&gt;
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/a&gt;
  &lt;/section&gt;
&lt;/div&gt;

&lt;/div&gt;</description><pubDate>Tue, 04 Nov 2025 17:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/networking/how-google-cloud-networking-supports-your-ai-workloads/</guid><category>Developers &amp; Practitioners</category><category>Networking</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/0-way-ai-hero.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>7 ways networking powers your AI workloads on Google Cloud</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/0-way-ai-hero.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/products/networking/how-google-cloud-networking-supports-your-ai-workloads/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Ammett Williams</name><title>Developer Relations Engineer</title><department></department><company></company></author></item></channel></rss>