<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:media="http://search.yahoo.com/mrss/"><channel><title>Open Source</title><link>https://cloud.google.com/blog/products/open-source/</link><description>Open Source</description><atom:link href="https://cloudblog.withgoogle.com/blog/products/open-source/rss/" rel="self"></atom:link><language>en</language><lastBuildDate>Tue, 24 Mar 2026 09:00:02 +0000</lastBuildDate><image><url>https://cloud.google.com/blog/products/open-source/static/blog/images/google.a51985becaa6.png</url><title>Open Source</title><link>https://cloud.google.com/blog/products/open-source/</link></image><item><title>The open platform for the AI era: GKE, agents, and OSS innovation at KubeCon EU 2026</title><link>https://cloud.google.com/blog/products/containers-kubernetes/gke-and-oss-innovation-at-kubecon-eu-2026/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As the cloud-native community gathers in Amsterdam for KubeCon + CloudNativeCon Europe this week, we’re excited to highlight some of the work we are doing to support both the open-source Kubernetes ecosystem and Google Kubernetes Engine (GKE). From breaking down the walls between cluster operating modes to making Kubernetes the absolute best place to run AI agents and Ray, here’s a look at what we are rolling out.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Autopilot for everyone&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Five years ago, we introduced &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/autopilot-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GKE Autopilot&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, a fully managed GKE experience that dramatically simplified scaling and infrastructure management. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Previously, choosing between GKE Autopilot mode and Standard mode was a "fork in the road" decision made at cluster creation time. If you started with Standard and later wanted to switch to Autopilot, you had to create an entirely new cluster. This created friction for organizations managing mixed clusters, where some workloads required strict node-level control while others needed seamless, hands-off scaling.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Meet the new GKE, where Autopilot is available for every cluster. &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Autopilot compute classes are now available for Standard clusters&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, allowing you to turn on Autopilot at any time, on a per-workload basis. Powered by GKE Autopilot’s &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/containers-kubernetes/container-optimized-compute-delivers-autoscaling-for-autopilot?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Container-Optimized Compute Platform (COCP)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, you can unlock near-real-time, vertically and horizontally scalable compute that provides the exact capacity that you need, when you need it, at the best price and performance.&lt;/span&gt;&lt;/p&gt;
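&lt;p&gt;As a concrete sketch, a workload opts into an Autopilot compute class through a nodeSelector on its Pod template. The example below uses the official Kubernetes Python client; the label key follows GKE's compute-class selector convention, and the class name "autopilot-demo" is an assumption, so substitute a compute class defined for your cluster:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
# Sketch: opting one Deployment into an Autopilot compute class on a
# Standard cluster via a nodeSelector (Kubernetes Python client).
# The compute-class name "autopilot-demo" is an assumption.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config()

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="autopilot-workload"),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels={"app": "demo"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "demo"}),
            spec=client.V1PodSpec(
                # GKE schedules this Pod onto Autopilot-managed capacity.
                node_selector={"cloud.google.com/compute-class": "autopilot-demo"},
                containers=[
                    client.V1Container(
                        name="app",
                        image="us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0",
                        resources=client.V1ResourceRequirements(
                            requests={"cpu": "500m", "memory": "512Mi"}
                        ),
                    )
                ],
            ),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
&lt;/code&gt;&lt;/pre&gt;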
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Furthermore, we are happy to announce that we will open-source&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; GKE Cluster Autoscaler&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, one of the core components driving infrastructure provisioning for our customers. Our goal is to provide a vendor-neutral platform that the OSS community can benefit from and build on top of.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Toward CNCF Kubernetes AI Conformance&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As the industry moves toward AI at massive scale, standardization is paramount. Together with the Kubernetes community last year, we launched the &lt;/span&gt;&lt;a href="https://www.cncf.io/announcements/2025/11/11/cncf-launches-certified-kubernetes-ai-conformance-program-to-standardize-ai-workloads-on-kubernetes/" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;CNCF Kubernetes AI Conformance program&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, which simplifies AI/ML on Kubernetes by establishing a standard for cluster interoperability and portability. We are proud to announce that &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;GKE is certified as an AI-conformant platform&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, so that your models and AI tools can be ported across environments.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Looking ahead to the upcoming v1.36 Kubernetes release, the AI Conformance community is proposing three new requirements to address the evolving needs of AI serving: advanced inference ingress, disaggregated serving, and high-performance networking. Google Cloud is committed to supporting these emerging community standards through GKE Inference Gateway, llm-d, and DRANET.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Model Context Protocol: An agent interface&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To streamline how AI agents interact with Kubernetes, last year, we introduced the open-source GKE &lt;/span&gt;&lt;a href="https://github.com/GoogleCloudPlatform/gke-mcp" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Model Context Protocol (MCP) Server&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, which offers a standardized interface that allows agents to manage, analyze, and monitor workloads, clusters, and resources through specific defined capabilities. By exposing these capabilities, MCP Server makes it easier to integrate various AI clients, including &lt;/span&gt;&lt;a href="https://geminicli.com/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini CLI&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://antigravity.google/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Antigravity&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, promoting more intelligent and automated management of Kubernetes ecosystems.&lt;/span&gt;&lt;/p&gt;
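&lt;p&gt;For a feel of what this looks like from the client side, here is a minimal sketch that connects to an MCP server over stdio with the MCP Python SDK and lists its tools. The local server command is an assumption; see the gke-mcp README for the actual invocation and tool names:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
# Sketch: discovering an MCP server's capabilities with the MCP Python SDK.
# Running "gke-mcp" as a local stdio command is an assumption for illustration.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server = StdioServerParameters(command="gke-mcp")

async def main():
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            # An AI client (e.g., Gemini CLI) would expose these tools to
            # the model and invoke them via session.call_tool(...).
            for tool in tools.tools:
                print(tool.name, "-", tool.description)

asyncio.run(main())
&lt;/code&gt;&lt;/pre&gt;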
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Kubernetes as AI infrastructure&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="https://llm-d.ai/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;llm-d&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is officially a CNCF Sandbox project, which marks a significant step in evolving Kubernetes into state-of-the-art AI infrastructure. Launched in May 2025 as a collaborative effort with industry leaders like Red Hat and NVIDIA, llm-d provides a Kubernetes-native distributed inference framework designed to be hardware-agnostic and vendor-neutral.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The project addresses complex AI orchestration challenges by introducing well-lit paths for inference-aware traffic management, native orchestration for multi-node replicas, and advanced state management for hierarchical KV cache offloading. By bridging the gap between cloud-native orchestration and frontier AI research, llm-d democratizes high-performance AI serving and establishes open, reproducible benchmarks for inference performance across various accelerators. We plan to work with the &lt;/span&gt;&lt;a href="https://github.com/cncf/k8s-ai-conformance" rel="noopener" target="_blank"&gt;&lt;span style="vertical-align: baseline;"&gt;CNCF AI Conformance&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; program on llm-d to help ensure critical capabilities like disaggregated serving are interoperable across the ecosystem&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;. For more on llm-d, check out our blog &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/containers-kubernetes/llm-d-officially-a-cncf-sandbox-project"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;DRA is the new standard for resource management&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Kubernetes was created in a simpler time, when CPU and memory were the only variables, and clouds were seen as infinitely elastic. Today, of course, hardware is specialized and variable. Dynamic Resource Allocation, or &lt;/span&gt;&lt;a href="https://kubernetes.io/docs/concepts/scheduling-eviction/dynamic-resource-allocation/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;DRA&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, is an industry-standard solution for describing unique hardware in a standard format, allowing higher-level workloads and schedulers to optimize resources without access to low-level details about them. Now, we’re proud to announce the open-source release of our DRA driver for TPUs, marking a significant milestone in bringing AI workload portability to the Kubernetes ecosystem. Google and NVIDIA partnered closely on the design and implementation of DRA in OSS Kubernetes in a collaborative push to establish a unified resource management standard. We are proud to coordinate this release with the &lt;/span&gt;&lt;a href="https://blogs.nvidia.com/blog/nvidia-at-kubecon-2026" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;donation of the NVIDIA DRA Driver&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. This is in addition to our DRA driver for networking, &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/allocate-network-resources-dra"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;DRANET&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, which is already available as a managed feature of GKE.&lt;/span&gt;&lt;/p&gt;
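&lt;p&gt;Under DRA, a workload asks for hardware by creating a ResourceClaim that references a device class registered by a driver, and the scheduler matches the claim to a node. A minimal sketch with the Kubernetes Python client follows; the API version and the device class name "tpu.example.com" are assumptions that depend on your cluster version and installed driver:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
# Sketch: requesting a device via Dynamic Resource Allocation by creating
# a ResourceClaim. The API version and device class name are assumptions.
from kubernetes import client, config

config.load_kube_config()

claim = {
    "apiVersion": "resource.k8s.io/v1beta1",
    "kind": "ResourceClaim",
    "metadata": {"name": "tpu-claim"},
    "spec": {
        "devices": {
            # Ask for one device from the class the TPU DRA driver registers.
            "requests": [{"name": "tpu", "deviceClassName": "tpu.example.com"}]
        }
    },
}

client.CustomObjectsApi().create_namespaced_custom_object(
    group="resource.k8s.io",
    version="v1beta1",
    namespace="default",
    plural="resourceclaims",
    body=claim,
)
# A Pod then references the claim in spec.resourceClaims, and the scheduler
# places it on a node whose driver can satisfy the request.
&lt;/code&gt;&lt;/pre&gt;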
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Supporting the agentic wave: Inference and agents&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The agentic AI wave is upon us, and we believe Kubernetes is unequivocally the best platform on which to run these agents. To execute LLM-generated code and interact with AI agents with confidence, you need deep isolation, rapid startup times, and specialized infrastructure.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We are heavily investing in open-source inference work to make this a reality. By leveraging innovations like &lt;/span&gt;&lt;a href="https://github.com/kubernetes-sigs/agent-sandbox" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Kubernetes Agent Sandbox&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for secure, gVisor-backed isolation, and &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/pod-snapshots"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;GKE Pod Snapshots&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, which drastically improve startup latency by restoring workloads from a memory snapshot, we are establishing a standard for agentic AI on Kubernetes and providing high performance and compute efficiency for agents running on GKE.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Ray on Kubernetes: TPUs and better observability&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Ray has become the standard for scaling demanding AI workloads, and we believe Kubernetes is a great place to run it. Until recently, official accelerator support was limited to NVIDIA GPUs. We are excited to announce TPU support in Ray v2.55, fully supported by Anyscale and Google. &lt;/span&gt;&lt;/p&gt;
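&lt;p&gt;In Ray, TPU hosts surface as a schedulable "TPU" resource, so pinning work to TPU capacity looks much like it does for GPUs. A minimal sketch follows; the chip count is illustrative and depends on your TPU topology:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
# Sketch: scheduling a Ray task onto TPU capacity in a Ray cluster
# (for example, a KubeRay cluster on GKE). The chip count is illustrative.
import ray

ray.init()  # connects to the local or configured Ray cluster

@ray.remote(resources={"TPU": 4})
def run_on_tpu():
    # Accelerator code (e.g., JAX) would run here on the TPU host.
    return "ran on a TPU host"

print(ray.get(run_on_tpu.remote()))
&lt;/code&gt;&lt;/pre&gt;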
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Ray on K8s users have historically struggled to debug and optimize performance, because they didn’t have access to historical data about their jobs. To solve this, we are introducing &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;the ability to debug issues after the RayJob has completed or terminated.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The Ray History Server uses KubeRay to set up and persist logs, state, and metrics from live RayJobs and reproduce them in the Ray Dashboard. The Ray History Server (alpha) is available to &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/add-on/ray-on-gke/how-to/enable-ray-history-server"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;try today&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Join us at the booth&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Whether you are scaling up next-gen AI inference, deploying highly isolated agentic workflows, or simply looking to optimize compute capacity across your clusters, we are committed to making Kubernetes and GKE the ultimate platform for your success.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you’re at KubeCon Europe, stop by the Google Cloud booth (#310) to dive deep into these announcements and to discover our &lt;/span&gt;&lt;a href="https://rsvp.withgoogle.com/events/google-cloud-at-kubecon-europe-2026" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;sessions, lightning talks, hands-on labs, and demos &lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;— plus a friendly competition with our text-based adventure game. Here's to the future of Kubernetes!&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 24 Mar 2026 09:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/containers-kubernetes/gke-and-oss-innovation-at-kubecon-eu-2026/</guid><category>GKE</category><category>Open Source</category><category>Containers &amp; Kubernetes</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>The open platform for the AI era: GKE, agents, and OSS innovation at KubeCon EU 2026</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/containers-kubernetes/gke-and-oss-innovation-at-kubecon-eu-2026/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Abdel Sghiouar</name><title>Senior Cloud Developer Advocate</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Allan Naim</name><title>Director of Product Management GKE</title><department></department><company></company></author></item><item><title>Kubernetes as AI Infrastructure: Google Cloud, llm-d, and the CNCF</title><link>https://cloud.google.com/blog/products/containers-kubernetes/llm-d-officially-a-cncf-sandbox-project/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;At Google Cloud, serving the massive-scale needs of large foundation model builders and AI-native companies is at the forefront of our AI infrastructure strategy. As generative AI transitions to mission-critical production environments, these innovators require dynamic, relentlessly efficient infrastructure to overcome complex orchestration challenges and power an agentic future.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span style="vertical-align: baseline;"&gt;To meet this moment, we are thrilled to announce that &lt;/span&gt;&lt;a href="https://llm-d.ai/" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;llm-d&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; has &lt;/span&gt;&lt;a href="https://www.cncf.io/blog/2026/03/24/welcome-llm-d-to-the-cncf-evolving-kubernetes-into-sota-ai-infrastructure/" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;officially&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; been accepted as a Cloud Native Computing Foundation (CNCF) Sandbox project. Google Cloud is proud to be a founding contributor to llm-d alongside Red Hat, IBM Research, CoreWeave, and NVIDIA, uniting around a clear, industry-defining vision: &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;any model, any accelerator, any cloud.&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This contribution underscores Google’s long-standing leadership in open-source innovation. And under the trusted stewardship of the Linux Foundation, we are helping ensure that the future of distributed AI inference is built on open standards rather than walled gardens. This gives foundation model builders the confidence to deploy their models globally without vendor lock-in, while empowering them to run the absolute best, most highly optimized implementations of these open technologies directly on Google Cloud.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;&lt;figure class="article-image--large"&gt;&lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_KwJQrYd.max-1000x1000.png" alt="1"&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Supercharging Kubernetes for inference&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Kubernetes is the undisputed industry standard for orchestration. While it provides a rock-solid foundation, it wasn’t originally built for the highly stateful and dynamic demands of LLM inference. To evolve Kubernetes for this new class of workload, we launched &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/tutorials/serve-with-gke-inference-gateway"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;GKE Inference Gateway&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, which provides native APIs to go far beyond simple load balancing. Under the hood, the gateway leverages the &lt;/span&gt;&lt;a href="https://github.com/kubernetes-sigs/gateway-api-inference-extension/tree/main/docs/proposals/004-endpoint-picker-protocol" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;llm-d Endpoint Picker (EPP)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for scheduling intelligence. By delegating routing decisions to llm-d, the system enforces a multi-objective policy that considers real-time KV-cache hit rates, the number of inflight requests, and instance queue depth to route each request to the most optimal backend for processing.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For foundation model builders operating at massive scale, the real-world impact of this model-aware routing is transformative. Recently, our Vertex AI team &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/containers-kubernetes/how-gke-inference-gateway-improved-latency-for-vertex-ai?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;validated&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; this architecture in production, proving its ability to handle highly unpredictable traffic without relying on fragile custom schedulers. For context-heavy coding tasks using Qwen Coder, Time-to-First-Token (TTFT) latency was slashed by over 35%. When handling bursty, stochastic chat workloads using DeepSeek for research, P95 tail latency improved by 52%, effectively absorbing severe load variance. Crucially, the gateway's routing intelligence doubled Vertex AI's prefix cache hit rate from 35% to 70%, drastically lowering re-computation overhead and cost-per-token.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;&lt;figure class="article-image--large"&gt;&lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_K56j60Q.max-1000x1000.png" alt="2"&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Beyond intelligent routing, orchestrating multi-node AI deployments requires bulletproof underlying primitives, which is why Google leads the development of the Kubernetes &lt;/span&gt;&lt;a href="https://lws.sigs.k8s.io/docs/overview/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;LeaderWorkerSet&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (LWS) API. LWS enables llm-d to orchestrate wide expert parallelism and disaggregate compute-heavy prefill and memory-heavy decode phases into independently scalable pods. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;With its widespread industry adoption, LWS now orchestrates a rapidly growing footprint of production AI workloads, managing massive fleets of TPUs and GPUs at global scale. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Complementing this orchestration, Google recently &lt;/span&gt;&lt;a href="https://vllm.ai/blog/vllm-tpu" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;extended vLLM natively for Cloud TPUs&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Featuring a unified PyTorch and JAX backend alongside innovations like Ragged Paged Attention v3, this integration delivers up to 5x throughput gains over our first release last year. Whether you are scaling on Google Cloud TPUs or NVIDIA GPUs, these advancements help ensure state-of-the-art AI serving remains a highly optimized, accelerator-agnostic capability.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Building next-gen AI infrastructure together&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To build the ultimate AI infrastructure, we must bridge the gap between cloud-native Kubernetes orchestration and frontier AI research. The shift to production-grade gen AI requires an engine built on trust, transparency, and deep collaboration with the AI/ML leaders pushing the boundaries of what is possible.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We are incredibly excited to partner with the Linux Foundation, the CNCF, the PyTorch Foundation, and the rest of the open-source community to build the next generation of AI infrastructure. By establishing "well-lit paths" — proven, replicable blueprints tested end-to-end under realistic load — we are ensuring that high-performance AI thrives as an open, universally accessible ecosystem that empowers innovation without boundaries.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We invite large foundation model builders, AI natives, platform engineers, and AI researchers to join us in shaping the open future of AI inference:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Explore the well-lit paths:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Visit the &lt;/span&gt;&lt;a href="https://llm-d.ai/docs/guide" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;llm-d guides&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to start deploying SOTA inference stacks on your infrastructure today.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Learn more:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Check out the official website at &lt;/span&gt;&lt;a href="https://llm-d.ai" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;https://llm-d.ai/&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Contribute:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Join the community on Slack and get involved in our GitHub repositories at &lt;/span&gt;&lt;a href="https://github.com/llm-d/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;https://github.com/llm-d/&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Join us in celebrating llm-d at the CNCF! We look forward to scaling the engine together.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 24 Mar 2026 09:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/containers-kubernetes/llm-d-officially-a-cncf-sandbox-project/</guid><category>GKE</category><category>AI &amp; Machine Learning</category><category>Open Source</category><category>Containers &amp; Kubernetes</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Kubernetes as AI Infrastructure: Google Cloud, llm-d, and the CNCF</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/containers-kubernetes/llm-d-officially-a-cncf-sandbox-project/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Sean Horgan</name><title>Product Manager</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Abdel Sghiouar</name><title>Senior Cloud Developer Advocate</title><department></department><company></company></author></item><item><title>OTLP everywhere: Cloud Monitoring now supports OpenTelemetry Protocol metrics</title><link>https://cloud.google.com/blog/products/management-tools/otlp-opentelemetry-protocol-for-google-cloud-monitoring-metrics/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As part of our commitment to open standards, Google Cloud is deeply invested in making &lt;/span&gt;&lt;a href="http://opentelemetry.io/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;OpenTelemetry&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; the universal client, data format, and set of standards for telemetry data.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Last year we announced &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/management-tools/opentelemetry-now-in-google-cloud-observability"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;support in Cloud Observability for sending traces using &lt;/span&gt;&lt;/a&gt;&lt;a href="https://opentelemetry.io/docs/specs/otel/protocol/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;OpenTelemetry Protocol&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (OTLP). Today, we’re excited to announce the next step toward our goal of OpenTelemetry everywhere: &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Observability now supports &lt;/strong&gt;&lt;a href="https://docs.cloud.google.com/stackdriver/docs/otlp-metrics/overview"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;OTLP for metrics&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; in Cloud Monitoring!&lt;/strong&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;OTLP for metrics: More than just a new standard&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Using OpenTelemetry and OTLP lets you generate and send metric data to Google Cloud with a completely provider-agnostic pipeline: You can create OTLP metrics using the OpenTelemetry SDK, collect and transform them using the OpenTelemetry collector, and send that data directly to Cloud Monitoring in OpenTelemetry format. &lt;/span&gt;&lt;/p&gt;
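&lt;p&gt;A minimal version of that pipeline with the OpenTelemetry Python SDK might look like the following sketch. The endpoint assumes a local collector on the default OTLP gRPC port; sending directly to Google Cloud instead requires its OTLP endpoint plus authentication, as described in the docs:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
# Sketch: a provider-agnostic OTLP metrics pipeline with the OpenTelemetry
# Python SDK, exporting to a local OpenTelemetry Collector.
from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from opentelemetry.sdk.resources import Resource
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter

exporter = OTLPMetricExporter(endpoint="http://localhost:4317")  # collector
reader = PeriodicExportingMetricReader(exporter, export_interval_millis=60_000)
provider = MeterProvider(
    resource=Resource.create({"service.name": "checkout"}),
    metric_readers=[reader],
)
metrics.set_meter_provider(provider)

meter = metrics.get_meter("example.meter")
requests_counter = meter.create_counter(
    "app.request.count", description="Handled requests"
)
requests_counter.add(1, {"http.response.status_code": "200"})
&lt;/code&gt;&lt;/pre&gt;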
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By default, this data gets stored in the same format as &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/stackdriver/docs/managed-prometheus"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Managed Service for Prometheus&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; data, at the same low price. This data is queryable using the same interfaces available to query any other data in Cloud Monitoring. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Using OTLP also lets you take advantage of several highly requested new features, such as:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;DELTA-type metrics&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Send the amount that a monotonic counter changed between the last export and the current export, instead of tracking all counters in memory and always sending the latest value of the counter. This allows clients to flush memory in between exports, which significantly reduces resource consumption on the client-side and better supports collecting short-lived or infrequently incremented time series.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Exponential (dynamic) histograms&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Classic histograms require you to explicitly set their bucket widths based on the projected data distribution. If that projection doesn’t match the actual data distribution, you can end up with all the observations lumped into a few low buckets, or with the most interesting observations smeared across a “lower than infinity” bucket. &lt;/span&gt;&lt;a href="https://opentelemetry.io/docs/specs/otel/metrics/data-model/#exponentialhistogram" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;OpenTelemetry Exponential Histograms&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; instead dynamically change the bucket boundaries based on the range of values actually seen, so you no longer have to guess and check histogram buckets. Just set it and forget it! (A configuration sketch covering both delta counters and exponential histograms follows this list.)&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Dots and slashes in metric names and dots in label keys&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Cloud Monitoring now has full support for &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/monitoring/promql#promql-cm-query"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;querying URL-style names using PromQL&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Additionally, Cloud Monitoring now supports the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;.&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; character in label keys, which enables support for OpenTelemetry’s &lt;/span&gt;&lt;a href="https://opentelemetry.io/docs/specs/semconv/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;semantic conventions&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Sending metrics directly from the SDK to Cloud Monitoring with no collector&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: For extremely high-volume, high-cardinality metric sources such as Envoy (which reports pod-pod and service-service traffic) or customer-run load balancer processes, it can be prohibitively expensive to have an OpenTelemetry collector in the pipeline. Collectors can get overloaded with excessive volume of metrics, and horizontally or vertically scaling them is a lot of work for developers. With OTLP, you can point metrics exported by the OpenTelemetry SDK directly at Cloud Observability’s Telemetry API for metrics, letting you rely on Google to handle your volume rather than having to run and scale an intermediary process yourself.&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Zero-code auto-instrumentation for metrics and traces&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Use OpenTelemetry to &lt;/span&gt;&lt;a href="https://opentelemetry.io/docs/zero-code/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;automatically instrument compatible workloads&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, which generates standardized traces and Golden Signal metrics without requiring any code, and then send data to Cloud Observability in OTLP format. No longer do you need to mix application and instrumentation code to get Golden Signal metrics — and that’s if you even remember to consistently instrument every RPC in your code.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
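&lt;p&gt;Here is the configuration sketch referenced above, wiring delta temporality and exponential histograms into the same OpenTelemetry Python SDK pipeline; names follow the SDK, and the endpoint again assumes a local collector:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
# Sketch: opting into DELTA temporality for counters and exponential-bucket
# histograms in the OpenTelemetry Python SDK.
from opentelemetry.sdk.metrics import Counter, Histogram, MeterProvider
from opentelemetry.sdk.metrics.export import (
    AggregationTemporality,
    PeriodicExportingMetricReader,
)
from opentelemetry.sdk.metrics.view import (
    ExponentialBucketHistogramAggregation,
    View,
)
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter

# Export monotonic counters as deltas instead of cumulative totals.
exporter = OTLPMetricExporter(
    endpoint="http://localhost:4317",
    preferred_temporality={Counter: AggregationTemporality.DELTA},
)

# Record every histogram instrument with dynamic exponential buckets.
histogram_view = View(
    instrument_type=Histogram,
    aggregation=ExponentialBucketHistogramAggregation(),
)

provider = MeterProvider(
    metric_readers=[PeriodicExportingMetricReader(exporter)],
    views=[histogram_view],
)
# Counters now export deltas; histograms size their buckets dynamically.
&lt;/code&gt;&lt;/pre&gt;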
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Managed OpenTelemetry for Google Kubernetes Engine&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Running an OpenTelemetry collector yourself can be a lot of work, requiring you to manually deploy, configure, and scale collector instances. But for most workloads, all you really need is a simple in-cluster endpoint for receiving and enriching OTLP signals.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;That’s why we’re excited to also announce &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/managed-otel-gke"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Managed OpenTelemetry for GKE&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, a fully managed, “one-click” pipeline for generating and collecting OTLP traces, metrics, and logs on Google Kubernetes Engine. Let Google handle the collector lifecycle, upgrades, and scaling, so you can focus on your application code, not your observability infrastructure.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Managed OpenTelemetry is the first fully managed trace solution for GKE. Tracing is critical for application performance monitoring and powers features like the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/stackdriver/docs/observability/application-topology"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;application topology map&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, a dynamic, actionable view of your application's dependencies. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can also use Managed OpenTelemetry for GKE to automatically configure and instrument workloads that use the OpenTelemetry SDK. With a single Custom Resource, you can get Golden Signals in Cloud Observability for all your OpenTelemetry-enabled applications — including AI agents built with frameworks that support OpenTelemetry such as the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/stackdriver/docs/instrumentation/ai-agent-adk"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Development Kit (ADK)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;How to get started&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="https://docs.cloud.google.com/stackdriver/docs/otlp-metrics/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;OTLP for metrics&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is currently in preview, is open to all customers, and is supported when using OpenTelemetry versions 0.140.0 and higher. To get started with OTLP metrics using the OpenTelemetry SDK or the OpenTelemetry Collector, see the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/stackdriver/docs/otlp-metrics/overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;OTLP metric ingestion documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. When running your own collector, we recommend using the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/stackdriver/docs/instrumentation/google-built-otel"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google-built OpenTelemetry Collector&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; whenever possible.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/managed-otel-gke"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Managed OpenTelemetry for GKE&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is currently in preview, is open to all customers, and is available for GKE cluster versions 1.34.1-gke.2178000 or later and gcloud CLI versions 551.0.0 or later. To get started, see &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/managed-otel-gke"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;the Managed OpenTelemetry for GKE documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Finally, to get started with zero-code auto-instrumentation for Java workloads on GKE using a self-deployed OpenTelemetry Collector, see the &lt;/span&gt;&lt;a href="https://docs.cloud.google.com/stackdriver/docs/instrumentation/otel-zerocode-java-gke"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;zero-code documentation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-related_article_tout"&gt;&lt;div class="uni-related-article-tout h-c-page"&gt;&lt;a href="https://cloud.google.com/blog/products/management-tools/opentelemetry-now-in-google-cloud-observability/" class="uni-related-article-tout__wrapper"&gt;&lt;p class="uni-related-article-tout__eyebrow h-c-eyebrow"&gt;Related Article&lt;/p&gt;&lt;h4 class="uni-related-article-tout__header"&gt;OpenTelemetry Protocol comes to Google Cloud Observability&lt;/h4&gt;&lt;p class="uni-related-article-tout__body"&gt;Google Cloud Observability’s Cloud Trace now supports users sending trace data using OpenTelemetry (OTLP) via telemetry.googleapis.com.&lt;/p&gt;&lt;p class="uni-related-article-tout__cta"&gt;Read Article&lt;/p&gt;&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;</description><pubDate>Mon, 09 Feb 2026 17:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/management-tools/otlp-opentelemetry-protocol-for-google-cloud-monitoring-metrics/</guid><category>GKE</category><category>Open Source</category><category>Management Tools</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_-_hero_image_-_png_uncompressed.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>OTLP everywhere: Cloud Monitoring now supports OpenTelemetry Protocol metrics</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/1_-_hero_image_-_png_uncompressed.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/products/management-tools/otlp-opentelemetry-protocol-for-google-cloud-monitoring-metrics/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Lee Yanco</name><title>Senior Product Manager</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>James Maffey</name><title>Senior Product Manager</title><department></department><company></company></author></item><item><title>How the Max Planck Institute is sharing expert skills through multimodal agents</title><link>https://cloud.google.com/blog/products/ai-machine-learning/planck-institute-research-expert-gen-ai-agent/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Effective monitoring and treatment of complex diseases like &lt;/span&gt;&lt;a href="https://doi.org/10.1016/j.ccell.2025.06.004" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;cancer&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://doi.org/10.15252/msb.20199356" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Alzheimer's disease&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; depends on understanding the underlying biological processes, for which proteins are essential. &lt;/span&gt;&lt;a href="https://doi.org/10.1038/nature19949" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Mass spectrometry-based proteomics&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is a powerful method for studying these proteins in a fast and global manner. Yet the widespread adoption of this technique remains constrained by technical complexity, as mastering these sophisticated analytical instruments and procedures requires specialized training. This creates an expertise bottleneck that slows research progress.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To address this challenge, researchers at the Max Planck Institute of Biochemistry collaborated with Google Cloud to build a &lt;/span&gt;&lt;a href="https://www.biorxiv.org/content/10.1101/2025.10.05.680425v1" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Proteomics Lab Agent&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; that assists scientists with their experiments. This&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; agent &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;simplifies performing complex scientific procedures through personalized AI guidance, making them easier to execute, while automatically documenting the process.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;“A lab’s critical expertise is often tacit knowledge that is rarely documented and lost to academic turnover. This agent addresses that directly, not only by capturing hands-on practice to build an institutional memory, but by systematically detecting experimental errors to enhance reproducibility. Ultimately, this is about empowering our labs to push the frontiers of science faster than ever before,” said Prof. Matthias Mann, a pioneer in mass spectrometry-based proteomics who leads the Department of Proteomics and Signal Transduction at the Max Planck Institute of Biochemistry.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The agent was built using the &lt;/span&gt;&lt;a href="https://google.github.io/adk-docs/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Development Kit (ADK)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, Google Cloud infrastructure, and Gemini models, which offer advanced video and long-context understanding uniquely suited to the needs of advanced research. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;One of the agent's core capabilities is to detect errors and omissions by analyzing a video of a researcher performing lab work and comparing their actions against a reference protocol. This process takes just over two minutes and catches about 74% of procedural errors with high accuracy, although domain-specific knowledge and spatial recognition still need improvement. Our AI-assisted approach is more efficient than the current manual approach, which relies on a researcher's intuition to either spot subtle mistakes during the procedure or, more commonly, to troubleshoot only after an experiment has failed.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;By making it easier to spot mistakes and offering personalized guidance, the agent can reduce troubleshooting time and build towards a future where real-time AI guidance can help prevent errors from happening.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The potential of the Proteomics AI agent goes beyond life sciences, addressing a universal challenge in specialized fields: capturing and transferring the kind of expertise that is learned through hands-on practice, not from manuals. To enable other researchers and organizations to adapt this concept to their own domains, the agentic framework has been made available as an open-source project on &lt;/span&gt;&lt;a href="https://github.com/MannLabs/proteomics_lab_agent" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GitHub&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this post, we will detail the agentic framework of the Proteomics Lab Agent, how it uses multimodal AI to provide personalized laboratory guidance, and the results from its deployment in a real-world research environment.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;&lt;div class="article-module article-video"&gt;&lt;figure&gt;&lt;a class="h-c-video h-c-video--marquee" href="https://youtube.com/watch?v=j_S_-wmJ1j8"&gt;&lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/maxresdefault_04JTJc5.max-1000x1000.jpg" alt="Max Planck AI laboratory agent"&gt;&lt;/a&gt;&lt;figcaption class="article-video__caption"&gt;&lt;h4&gt;Proteomics Lab Agent generates protocols and detects errors&lt;/h4&gt;&lt;/figcaption&gt;&lt;/figure&gt;&lt;/div&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;The challenge: Preserving expert knowledge in a high-turnover environment&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Imagine it’s a Friday evening in the lab. A junior researcher needs to use a sophisticated analytical instrument, a mass spectrometer, but the senior expert who is responsible for it has already left for the weekend. The researcher has to search through lengthy protocols, interpret the instrument’s performance, which depends on multiple factors reflected in diverse metrics, and proceed without guidance. A single misstep could potentially damage the expensive equipment, waste a unique and valuable sample, or compromise the entire study.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Such complexity is a regular hurdle in specialized research fields like mass spectrometry-based proteomics. Scientific progress often depends on complex techniques and instruments that require deep technical expertise. Laboratories face a significant bottleneck in training personnel, documenting procedures, and retaining knowledge, especially with the high rate of academic turnover. When an expert leaves, their accumulated knowledge often leaves with them, forcing the team to partially start over. Collectively, this creates accessibility and reproducibility challenges, which slows down new discoveries.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;A solution: an AI agent for lab guidance&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;a href="https://www.biorxiv.org/content/10.1101/2025.10.05.680425v1" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;proteomics lab agent&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; addresses these challenges by connecting directly to the lab's collective knowledge, from protocols and instrument data to past troubleshooting decisions. With this, it provides researchers with personalized AI guidance for complex procedures across the entire experimental workflow. Examples range from routine wet-lab work, such as pipetting, to the interactions with specialized equipment and software required to operate a mass spectrometer. The agent can also automatically generate detailed protocols from videos of experiments, detect procedural errors, and provide guidance for correction, reducing troubleshooting and documentation time.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;An AI agent architecture for the lab&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The underlying multimodal agentic AI framework uses a main agent that coordinates the work of several specialized sub-agents, as shown in Figure 1. Built with Gemini models and the Agent Development Kit, this main agent acts as an orchestrator. It receives a researcher's query, interprets the request, and delegates the task to the appropriate sub-agent.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;&lt;figure class="article-image--large"&gt;&lt;img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1-Fig1.max-1000x1000.png" alt="1-Fig1"&gt;&lt;figcaption class="article-image__caption"&gt;&lt;p&gt;Figure 1: Architecture of the Proteomics Lab Agent for multimodal guidance.&lt;/p&gt;&lt;/figcaption&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The sub-agents are designed for specific functions and connect to the lab's existing knowledge systems:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Lab Note and Protocol Agents:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; These agents handle video-related tasks. When a researcher provides a video of an experiment, these agents upload it to Google Cloud Storage so that the video’s visual and spoken content can be analyzed. The agent can then check for errors or generate a new protocol.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Lab Knowledge Agent:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This agent connects to the laboratory’s knowledge base (&lt;/span&gt;&lt;a href="https://github.com/sooperset/mcp-atlassian" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;MCP Confluence&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;) to retrieve protocols or save new lab notes, making knowledge accessible to the entire team.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Instrument Agent:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; To provide guidance on using complex analytical instruments, this agent retrieves instrument performance metrics from a self-build MCP server that monitors the lab's mass spectrometers (&lt;/span&gt;&lt;a href="https://github.com/MannLabs/alphakraken" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;MCP AlphaKraken&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Quality Control Memory Agent:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This agent captures all instrument-related decisions and their outcomes in a database (e.g. MCP BigQuery). This creates a searchable history of what has worked in the past and preserves valuable troubleshooting experience.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Together, these agents provide guidance adapted to the current instrument status and the researcher's experience level, while automatically documenting the work as it happens.&lt;/span&gt;&lt;/p&gt;
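&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a rough illustration of this orchestration pattern, the following minimal Python sketch wires a root agent to two sub-agents with the Agent Development Kit. The agent names, model version, and instructions here are our own illustrative assumptions, not the project's published configuration.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Minimal orchestration sketch (assumed names, model, and instructions).
from google.adk.agents import LlmAgent

lab_note_agent = LlmAgent(
    name="lab_note_agent",
    model="gemini-2.5-pro",  # assumed model version
    instruction="Analyze experiment videos and draft or check lab notes.",
)

lab_knowledge_agent = LlmAgent(
    name="lab_knowledge_agent",
    model="gemini-2.5-pro",
    instruction="Retrieve protocols from, and save notes to, the knowledge base.",
    # tools=[...]  # an MCP toolset for Confluence would be attached here
)

# The root agent interprets each researcher query and delegates it.
root_agent = LlmAgent(
    name="proteomics_lab_agent",
    model="gemini-2.5-pro",
    instruction="Route each request to the most suitable sub-agent.",
    sub_agents=[lab_note_agent, lab_knowledge_agent],
)
&lt;/code&gt;&lt;/pre&gt;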
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;A closer look: Catching experimental errors with video analysis&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While generative AI has proven effective for digital tasks in science, from literature analysis to controlling lab robots through code, it has not addressed the critical gap between digital assistance and hands-on laboratory execution. Our work demonstrates how to bridge this divide by automatically generating lab notes and detecting experimental errors from a video.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2-Fig2.max-1000x1000.png"
        
          alt="2-Fig2"&gt;
        
        &lt;/a&gt;
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="6bk09"&gt;Figure 2: Agent workflow for the video-based lab note generation and error detection.&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The process, illustrated in Figure 2, unfolds in several steps:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;A researcher records their experiment and submits the video to the agent with a prompt like, "Generate a lab note from this video and check for mistakes.".&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The main agent delegates the task to the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Lab Note Agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, which uploads the video to Google Cloud Storage and analyzes the actions performed in the video.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The main agent asks the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Lab Knowledge Agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to find the protocol that matches these actions. The Lab Knowledge Agent then retrieves it from the lab's knowledge base, Confluence.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;With both the video analysis and the baseline protocol, the task is passed on to the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Lab Note Agent &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;again, which has the knowledge how to perform a step-by-step comparison of video and protocol. It flags any potential mistakes, such as missed steps, incorrectly performed actions, added steps not in the protocol, or steps completed in the wrong order.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The main agent returns the generated lab notes to the researcher with these potential errors flagged for review. The researcher can accept the notes or make corrections.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Once finalized, the corrected notes are saved back to the Confluence knowledge base via the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Lab Knowledge Agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, preserving a complete and accurate record of the experiment.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
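&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Steps 1 and 2 of this workflow (upload and analysis) can be sketched as a single function: upload the recording to Cloud Storage, then ask Gemini to enumerate the actions it shows. The bucket, project, and model names below are placeholders, not the values used in the published system.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Sketch of video upload and analysis (assumed bucket, project, and model).
from google import genai
from google.genai import types
from google.cloud import storage

def analyze_experiment_video(local_path):
    # 1. Upload the recording to Cloud Storage so the model can reference it.
    bucket = storage.Client().bucket("lab-videos")  # assumed bucket name
    blob = bucket.blob("runs/experiment.mp4")
    blob.upload_from_filename(local_path)

    # 2. Ask Gemini to describe the actions performed in the video.
    client = genai.Client(vertexai=True, project="my-project", location="us-central1")
    response = client.models.generate_content(
        model="gemini-2.5-pro",  # assumed model version
        contents=[
            types.Part.from_uri(
                file_uri="gs://lab-videos/runs/experiment.mp4",
                mime_type="video/mp4",
            ),
            "List every action performed in this video, step by step.",
        ],
    )
    return response.text
&lt;/code&gt;&lt;/pre&gt;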
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Building institutional memory&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To support a lab in building a knowledge base, the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Protocol Agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; can generate lab instructions directly from a video. A researcher can record themselves performing a procedure while explaining the steps aloud. The agent analyzes the video and audio to produce a formatted, publication-ready protocol. We found that providing the model with a diverse set of examples, step-by-step instructions, and relevant background documents produced the best results.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
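&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A prompt of this kind is essentially assembled from three ingredients: example protocols, explicit instructions, and background documents. The sketch below shows one plausible way to combine them; the function and argument names are illustrative, not taken from the project's code.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Illustrative prompt assembly for video-to-protocol generation.
def build_protocol_prompt(example_protocols, background_docs):
    # Few-shot context: finished protocols the model should imitate.
    parts = ["You are writing a publication-ready lab protocol."]
    parts.append("Imitate the structure and tone of these examples:")
    parts.extend(example_protocols)
    # Grounding context: manuals and SOPs relevant to the procedure.
    parts.append("Use these background documents for terminology:")
    parts.extend(background_docs)
    # Step-by-step instruction for the video itself.
    parts.append("Now watch the attached video, listen to the narration, "
                 "and write the protocol step by step, numbering each action.")
    return "\n\n".join(parts)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;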
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/3-Fig3.max-1000x1000.png"
        
          alt="3-Fig3"&gt;
        
        &lt;/a&gt;
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="6bk09"&gt;Figure 3: Agent workflow for guiding instrument operations.&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The agent can also support instrument operations (see Figure 3). A researcher may ask, "Is instrument X ready so that I can measure my samples?". The agent retrieves the latest instrument metrics via the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Instrument Agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; and compares it with past troubleshooting decisions from the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Quality Control Memory Agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. It then provides a recommendation, such as "Yes, the instrument is ready," or "No, calibration is recommended first”. It can even provide the relevant calibration protocol from the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Lab Knowledge Agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. Subsequently, it saves the final researcher's decision and actions with the &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Quality Control Memory Agent&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. With this, every reasoning and its outcome is saved, creating a continuously improving knowledge base for operating specialized equipment and software. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;More technical details are described in our &lt;/span&gt;&lt;a href="https://www.biorxiv.org/content/10.1101/2025.10.05.680425v1" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;full publication&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Real-world impact: Making complex scientific procedures easier&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To measure the AI agent’s value in a real-world setting, we deployed it in our department at the Max Planck Institute of Biochemistry, a group with 40 researchers. We evaluated the agent's performance across three key laboratory functions: detecting procedural errors, generating protocols, and providing personalized guidance.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The results showed strong gains in both speed and quality. Key findings include:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;AI-assisted error detection:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The agent successfully identified 74% of all procedural errors (a metric known as recall) with an overall accuracy of 77% when comparing 28 recorded lab procedures against their reference protocols. While precision (41%) is still a limitation at this early stage, the results are highly promising.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Fast, expert-quality protocols:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; From lab videos, the agent generated standardized, publication-ready protocols in about 2.6 minutes. This was approximately 10 times faster than manual creation and achieved an average quality score of 4.4 out of 5 across 10 diverse protocols.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Personalized, real-time support:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The agent successfully integrated real-time instrument data with past performance decisions to provide researchers with tailored advice on equipment use.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A deeper analysis of the error-detection results revealed specific strengths and areas for improvement. As shown in Figure 4, the system is already effective at recognizing general lab equipment and reading on-screen text. The main limitations were in understanding highly specialized proteomics equipment (27% of such errors went unrecognized) and perceiving fine-grained details, such as the exact placement of pipette tips on a 96-well grid (47% unrecognized) or small text on pipettes (41% unrecognized); see the Appendix of the &lt;/span&gt;&lt;a href="https://www.biorxiv.org/content/10.1101/2025.10.05.680425v1" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;corresponding paper&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. As multimodal models advance, we expect their ability to interpret these details will improve, strengthening this critical safeguard against experimental mistakes.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/4-Fig4.max-1000x1000.png"
        
          alt="4-Fig4"&gt;
        
        &lt;/a&gt;
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="6bk09"&gt;Figure 4: Strengths and current limitations of the Proteomics Lab Agent in a lab.&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our agent already automates documentation and flags errors in recorded videos, but its future potential lies in prevention, not just correction. We envision an interactive assistant that uses speech to prevent mistakes in real-time before they happen. By making this project open source, we invite the community to help build this future.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Scaling for the future&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In conclusion, this framework addresses critical challenges in modern science, from the reproducibility crisis to knowledge retention in high-turnover academic environments. By systematically capturing not just procedural data but also the expert reasoning behind it, the agent builds an institutional memory.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;"This approach helps us capture and share the practical knowledge that is often lost when a researcher leaves the lab", notes Matthias Mann. "This collected experience will not only accelerate the training of new team members but also creates the data foundation we need for future innovations like predictive instrument maintenance for mass spectrometers and automated protocol harmonization within individual labs and across different labs".&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The principles behind the Proteomics Lab Agent are not limited to one field. The concepts outlined in this study are a generalizable solution for any discipline that relies on complex, hands-on procedures, from life sciences to manufacturing.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Dive deeper into the methodology and results by reading our &lt;/span&gt;&lt;a href="https://www.biorxiv.org/content/10.1101/2025.10.05.680425v1" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;full paper&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Explore the code on &lt;/span&gt;&lt;a href="https://github.com/MannLabs/proteomics_specialist" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GitHub&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and adapt the Proteomics Lab Agent for your own research. Follow the work of the Mann Lab at the Max Planck Institute to see what comes next on &lt;/span&gt;&lt;a href="https://www.linkedin.com/company/mann-lab/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;LinkedIn&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://bsky.app/profile/mannlab.bsky.social" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;BlueSky&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, or &lt;/span&gt;&lt;a href="https://x.com/labs_mann?lang=de" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;X&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;sub&gt;&lt;em&gt;&lt;span style="vertical-align: baseline;"&gt;This project was a collaboration between the Max Planck Institute of Biochemistry and Google. The core team included Patricia Skowronek and Matthias Mann from Department of Proteomics and Signal Transduction at the Max Planck Institute for Biochemistry and Anant Nawalgaria from Google. P.S. and M.M. want to thank the entire Mann Lab for their support.&lt;/span&gt;&lt;/em&gt;&lt;/sub&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Fri, 24 Oct 2025 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/ai-machine-learning/planck-institute-research-expert-gen-ai-agent/</guid><category>Public Sector</category><category>Open Source</category><category>Customers</category><category>Google Cloud in Europe</category><category>AI &amp; Machine Learning</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/images/Planck-Institute-Research-AI-Agent-Hero.max-600x600.png" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>How the Max Planck Institute is sharing expert skills through multimodal agents</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/images/Planck-Institute-Research-AI-Agent-Hero.max-600x600.png</image><site_name>Google</site_name><url>https://cloud.google.com/blog/products/ai-machine-learning/planck-institute-research-expert-gen-ai-agent/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Dr. Patricia Skowronek</name><title>Post-doctoral researcher, Max-Planck-Institute for Biochemistry</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Anant Nawalgaria</name><title>Sr. Staff ML Engineer &amp; PM, Google</title><department></department><company></company></author></item><item><title>Powering AI commerce with the new Agent Payments Protocol (AP2)</title><link>https://cloud.google.com/blog/products/ai-machine-learning/announcing-agents-to-payments-ap2-protocol/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today, Google announced the &lt;/span&gt;&lt;a href="http://goo.gle/ap2" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent Payments Protocol&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (AP2), an open protocol developed with leading payments and technology companies to securely initiate and transact agent-led payments across platforms. The protocol can be used as an extension of the &lt;/span&gt;&lt;a href="https://a2a-protocol.org" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Agent2Agent (A2A) protocol&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and Model Context Protocol (MCP). In concert with industry rules and standards, it establishes a payment-agnostic framework for users, merchants, and payments providers to transact with confidence across all types of payment methods.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We’re collaborating with a diverse group of more than 60 organizations to help shape the future of agentic payments, including Adyen, American Express, Ant International, Coinbase, Etsy, Forter, Intuit, JCB, Mastercard, Mysten Labs, Paypal, Revolut, Salesforce, ServiceNow, UnionPay International, Worldpay, and more. &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-video"&gt;



&lt;div class="article-module article-video "&gt;
  &lt;figure&gt;
    &lt;a class="h-c-video h-c-video--marquee"
      href="https://youtube.com/watch?v=yLTp3ic2j5c"
      data-glue-modal-trigger="uni-modal-yLTp3ic2j5c-"
      data-glue-modal-disabled-on-mobile="true"&gt;

      
        

        &lt;div class="article-video__aspect-image"
          style="background-image: url(https://storage.googleapis.com/gweb-cloudblog-publish/images/image1_YoRSHai.max-1000x1000.png);"&gt;
          &lt;span class="h-u-visually-hidden"&gt;Intro to Google Agent Payments Protocol (AP2)&lt;/span&gt;
        &lt;/div&gt;
      
      &lt;svg role="img" class="h-c-video__play h-c-icon h-c-icon--color-white"&gt;
        &lt;use xlink:href="#mi-youtube-icon"&gt;&lt;/use&gt;
      &lt;/svg&gt;
    &lt;/a&gt;

    
  &lt;/figure&gt;
&lt;/div&gt;

&lt;div class="h-c-modal--video"
     data-glue-modal="uni-modal-yLTp3ic2j5c-"
     data-glue-modal-close-label="Close Dialog"&gt;
   &lt;a class="glue-yt-video"
      data-glue-yt-video-autoplay="true"
      data-glue-yt-video-height="99%"
      data-glue-yt-video-vid="yLTp3ic2j5c"
      data-glue-yt-video-width="100%"
      href="https://youtube.com/watch?v=yLTp3ic2j5c"
      ng-cloak&gt;
   &lt;/a&gt;
&lt;/div&gt;

&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Why is a protocol needed?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AI agents are capable of transacting on behalf of users, which creates a need to establish a common foundation to securely authenticate, validate, and convey an agent’s authority to transact. While &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;today’s payment systems generally assume a human is directly clicking "buy" on a trusted surface, the rise of autonomous agents and their ability to initiate a payment breaks this fundamental assumption and raises critical questions that AP2 helps to address, including:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Authorization&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Proving that a user gave an agent the specific authority to make a particular purchase.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Authenticity&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Enabling a merchant to be sure that an agent's request accurately reflects the user's true intent. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Accountability&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: Determining accountability if a fraudulent or incorrect transaction occurs. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AP2 is an open, shared protocol that provides a common language for secure, compliant transactions between agents and merchants, helping to prevent a fragmented ecosystem. It also supports different payment types, from credit and debit cards to stablecoins and real-time bank transfers. This helps ensure a consistent, secure, and scalable experience for users and merchants, while also providing financial institutions with the clarity they need to effectively manage risk.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;How it works: Establishing trust via mandates and verifiable credentials&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AP2 builds trust by using Mandates: tamper-proof, cryptographically signed digital contracts that serve as verifiable proof of a user's instructions. These mandates are signed by verifiable credentials (VCs) and act as the foundational evidence for every transaction.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Mandates address the two primary ways a user will shop with an agent:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Real-time purchases (human present&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;): When you ask an agent, “Find me new white running shoes,” your request is captured in an initial &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Intent Mandate&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. This provides the auditable context for the entire interaction in a transaction process. After the agent presents a cart with the shoes you want, your approval signs a &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Cart Mandate&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. This is a critical step that creates a secure, unchangeable record of the exact items and price, ensuring what you see is what you pay for.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Delegated tasks (human not present)&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: When you delegate a task like, “Buy concert tickets the moment they go on sale,” you sign a detailed &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Intent Mandate&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; upfront. This mandate specifies the rules of engagement—price limits, timing, and other conditions. It serves as verifiable, pre-authorized proof that can allow the agent to automatically generate a &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Cart Mandate&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; on your behalf once your precise conditions are met.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In both scenarios, this chain of evidence culminates in securely linking your payment method to the verified contents of the Cart Mandate. This complete sequence, from intent to cart to payment, creates a non-repudiable audit trail that answers the critical questions of authorization and authenticity, providing a clear foundation for accountability.&lt;/span&gt;&lt;/p&gt;
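&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make the mandate idea concrete, here is an illustrative sketch of a delegated-task Intent Mandate being signed. The field names, identifiers, and key handling are assumptions made for illustration; the authoritative schema and signing rules are defined in the AP2 specification on GitHub.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Illustrative Intent Mandate (field names assumed; see the AP2 spec for
# the real schema).
import json
import time

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

intent_mandate = {
    "type": "IntentMandate",
    "user": "did:example:alice",          # assumed identifier format
    "agent": "ticket-shopping-agent",
    "intent": "Buy concert tickets the moment they go on sale",
    "constraints": {"max_price_usd": 250, "expires_at": time.time() + 86400},
}

# Signing makes the mandate tamper-evident: any change to the payload
# invalidates the signature that merchants and issuers verify.
key = Ed25519PrivateKey.generate()
payload = json.dumps(intent_mandate, sort_keys=True).encode()
signature = key.sign(payload)
&lt;/code&gt;&lt;/pre&gt;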
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Unlocking new commerce experiences&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AP2’s flexible design provides a foundation to support both simple and entirely new commercial models. Let’s consider a few examples below, which all assume Intent Mandates have been signed on behalf of a user: &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Smarter shopping&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: A customer discovers a winter jacket they want is unavailable in a specific color, so they tell their agent: "I really want this jacket in green, and I'm willing to pay up to 20% more for it." The agent then monitors prices and availability and automatically executes a secure purchase the moment that specific variant is found, capturing a high-intent sale that would have otherwise been lost.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Personalized offers&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: A shopper tells their agent they want a new bicycle for an upcoming trip from a specific merchant. Their agent communicates this information—which includes the trip's date—to the merchant, whose own agent can respond by creating a custom, time-sensitive bundle offer that includes the bike, a helmet, and a travel rack at a 15% discount, turning a simple query into a more valuable sale.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Coordinated tasks&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: A user is planning a weekend trip and tells their agent: "Book me a round-trip flight and a hotel in Palm Springs for the first weekend of November, with a total budget of $700." The agent can then interact with both airline and hotel agents, as well as online travel agencies and booking platforms, and once it finds a combination that fits the budget, it can execute both cryptographically-signed bookings simultaneously.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Support for emerging payments systems&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AP2 is designed as a universal protocol, providing security and trust for a variety of payments like stablecoins and cryptocurrencies. To accelerate support for the web3 ecosystem, in collaboration with Coinbase, Ethereum Foundation, MetaMask and other leading organizations, we have extended the core constructs of AP2 and launched the &lt;/span&gt;&lt;a href="https://github.com/google-a2a/a2a-x402" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;A2A x402 extension&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, a production-ready solution for agent-based crypto payments. Extensions like these will help shape the evolution of cryptocurrency integrations within the core AP2 protocol. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;What’s next: A call for collaboration &lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;AP2 provides a trusted foundation to fuel a new era of AI-driven commerce. It establishes the core building blocks for secure transactions, creating clear opportunities for the industry, including networks, issuers, merchants, technology providers, and end users, to innovate on adjacent areas like seamless agent authorization and decentralized identity. We are committed to evolving this protocol in an open, collaborative process, including through standards bodies, and invite the entire payments and technology community to build this future with us.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Many of the partners building A2A agents have extended their support to AP2. This growing ecosystem will continue to make their agents available in our AI Agent Marketplace, including new, transactable experiences enabled by AP2. For example, enterprise companies could use AP2 for B2B applications, such as enabling autonomous procurement of partner-built solutions via Google Cloud Marketplace or the automatic scaling of software licenses based upon real-time needs.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To get started, visit our public &lt;/span&gt;&lt;a href="http://goo.gle/ap2" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GitHub repository&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to review the complete technical specification, documentation, and reference implementations. Moving forward, this repository will be updated regularly with additional reference implementations from Google and innovations from the community to demonstrate the power and scalability of AP2.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/AP2_Partners.jpg"
        
          alt="New Logo AP2"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Support from our ecosystem&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Accenture: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;“Google Cloud's Agent Payments Protocol (AP2) complements the Agent2Agent protocol and Model Context Protocol to provide a unified framework for agents to transact. Innovations like this will enable many of the agentic solutions that reinvent payments for clients – not only for today’s needs, but for the evolving models of future commerce.” – &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Scott Alfieri, Google Business lead at Accenture&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Adobe:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; “Adobe is proud to work with Google to advance secure and authenticated agentic commerce - our role in the &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Agent Payments Protocol&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; (AP2) underscores our commitment to trusted, AI-driven experiences. With Adobe Commerce and AI agents powering customer journeys, we are focused on delivering secure, reliable, and authentic transactions for businesses and consumers."  -&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; Loni Stark, VP of Strategy and Product at Adobe&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Adyen:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; "Agentic commerce is not just about a consumer-facing chatbot, but about the underlying infrastructure that powers it all. Adyen’s collaboration on Google’s Agent Payments Protocol (AP2) is a natural extension of our mission to provide the merchants with the payments building blocks for tomorrow’s commerce. We're excited to help establish a common rulebook that ensures security and interoperability for everyone involved in the payments ecosystem." - &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Ingo Uytdehaage, Co-CEO at Adyen&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Airwallex: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;"Airwallex is thrilled to support Google’s &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Agent Payments Protocol (AP2)&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;. This is a critical step forward in building a secure, interoperable ecosystem for agentic AI payments. This protocol gives businesses and consumers the confidence to delegate tasks to AI agents, aligning with our mission to build the future of finance by empowering businesses globally.” - &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Jacob Dai, Co-Founder &amp;amp; CTO at Airwallex&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;American Express: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;“With the rise of AI-driven commerce, trust and accountability are more important than ever. American Express is excited to contribute to the creation of AP2 as a protocol intended to protect customers and enable participation in the next generation of digital payments.” -&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; Luke Gebb, EVP, Amex Digital Labs, American Express&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Ant International: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;"Ant International is excited to partner with Google on protocol-setting for practical AI applications in agentic commerce to unlock new merchant growth and elevate consumer experience, by leveraging our expertise in alternative payment methods and trusted AI innovations." - &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Jiangming Yang, Chief Innovation Officer at Ant International&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;BHN: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;“As a trusted partner processing billions of transactions globally, BHN is excited to help shape emerging protocols like AP2 that will enable both merchants and consumers to leverage the power of stored value in secure, autonomous commerce, enabled by AI Agents.” - &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Nik Sathe, CPTO at BHN&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;BVNK: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;"Stablecoins provide an obvious solution to the scaling challenges agentic systems are already facing with legacy financial infrastructure. We at BVNK were extremely excited to hear that Google has been working on solving this problem and couldn't wait to contribute" - &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Donald Jackson, CTO at BVNK&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong style="vertical-align: baseline;"&gt;Checkout.com: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;"Agentic commerce is reshaping the checkout moment, and Google’s Agent Payments Protocol (AP2) is a pivotal step forward. At Checkout.com, we’re proud to support open protocols that strengthen trust and give merchants the flexibility to meet their customers where they are, however they want to shop.” –&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; Meron Colbeci, Chief Product Officer, Checkout.com&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Coinbase: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;"x402 and AP2 show that agent-to-agent payments aren’t just an experiment anymore, they’re becoming part of how developers actually build. Bringing x402 into AP2 to power stablecoin payments made sense - it’s a natural playground for agents to start transacting with each other and testing out crypto rails. And it’s exciting to see the idea of agents paying each other resonate with the broader AI community." –&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; Erik Reppel, Head of Engineering at Coinbase Developer Platform&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Crossmint: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;“With Crossmint’s tools, developers can let agents buy anything using both credit cards or stablecoins. Our goal is to unlock instantaneous, global commerce, giving agent builders the greatest flexibility. Our partnership with Google on AP2 represents our commitment that agentic commerce wins everyone’s trust as a secure, reliable, and seamless way to transact. Time to accelerate!” – &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Alfonso Gomez, Co-founder at Crossmint&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Confluent:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; "Confluent is excited to support Google in this effort to build an open, secure, and high-trust payments protocol. Agent Payments Protocol (AP2) aligns perfectly with our vision of a real-time data-driven world, and we believe our expertise in data streaming with Apache Kafka will be critical in creating a resilient and scalable payments ecosystem for the agentic web." – &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Pascal Vantrepote, Partner CTO at Confluent &lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Dell:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; “At Dell Technologies, we’re &lt;/span&gt;&lt;a href="https://www.dell.com/en-us/blog/dell-securing-the-future-of-agentic-payments/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;dedicated to making agentic AI a reality for businesses worldwide&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. The transformative potential of agentic automation hinges on trust, security and standardization, especially for customer-facing eCommerce platforms. By supporting the Agent Payments Protocol (AP2) with Google, we’re laying the groundwork for a future where AI-driven commerce is reliable, accessible, and trusted by all." – &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Satish Iyer, Vice President, Innovation &amp;amp; Ecosystems, Office of the CTO, Dell Technologies&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Deloitte: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;“As Agentic Commerce rapidly emerges as a transformative force, the industry will need robust standards to empower AI agents to transact payments securely and effectively. These standards must address critical areas such as security, identity, frictionless commerce, trust, and privacy, all while providing compatibility with the existing global payments infrastructure. Deloitte is proud to help shape this evolving industry alongside Google, extending the widely adopted A2A protocol to enable agent-driven payments and commerce.” – &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Gopal Srinivasan, Alphabet Google Alliance Global AI &amp;amp; Data Leader at Deloitte Consulting LLP&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;DLocal: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;"Payment agents are no longer an idea, they’re rapidly becoming a reality. In the dynamic emerging markets we serve, payments are fragmented and complex, from cards to local payment methods, to wallets and stablecoin.Agent Payments Protocol (AP2) turns that complexity into a single, interoperable framework, enabling agent-initiated payments that are safe, seamless, and designed to boost merchant conversion while keeping users in control." - &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Pedro Arnt, CEO at DLocal&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Ebanx:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; "Agent Payments Protocol  (AP2) will power the next era of commerce, and to build a safe and secure environment for this is now the most important step. EBANX is proud to be part of this effort with Google." - &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Eduardo de Abreu, Vice President of Product at Ebanx&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Eigen Labs: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;“Google’s new &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Agent Payments Protocol&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; (AP2) is a major step toward a future where AI agents are meaningful economic actors, whether that’s on behalf of humans, organizations or themselves. EigenCloud is proud to partner with Google on this initiative to provide the verifiability infrastructure that ensures these agents are held accountable by any counterparty. Together, we’re helping create a global verifiable economy where agents can coordinate, transact, and prove their actions to humans and to each other.” - &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Sreeram Kannan, Founder &amp;amp; CEO at Eigen Labs &lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Fiuu:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; ""As agentic commerce reshapes payments infrastructure, Fiuu supports open protocols like A2A and AP2 to enable secure, scalable agent-to-agent transactions across multi-channel systems, advancing interoperability, trust, and inclusive payment ecosystems." - Eng Sheng Guan, CEO at Fiuu&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Forter: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;"At Forter we believe in the potential of agents to revolutionize commerce and we are proud to collaborate with Google in creating modern protocols that benefit brands, consumers and AI developers alongside Forter’s Trusted Agentic Commerce Protocol (TACP).” - &lt;strong&gt;Michael Reitblat, CEO at Forter&lt;/strong&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Gr4vy:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; "We are proud to support this new open protocol (AP2). By working together as an industry, we can ensure this next chapter of payments is built on trust, transparency and flexibility." - &lt;strong&gt;John Lunn, CEO and Founder at Gr4vy&lt;/strong&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Gravitee:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; “In the agentic world, secure and trusted transactions demand open protocols. Google’s Open Standard for Agent Payments Protocol (AP2) addresses this need. Gravitee’s Agent Mesh already supports A2A and MCP with a strong focus on security and governance, and we are committed to extending this support so customers in financial services, retail, and beyond can confidently benefit” – &lt;strong&gt;Linus Hakansson, Chief Product Officer at Gravitee &lt;/strong&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Global Fashion Group: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;“Integrating A2A and MCP into the Agent Payments Protocol (AP2) enables a modular, interoperable architecture with versioned contracts, making integration and testing straightforward. Modern payments, engineered for scale—secure, seamless, and built to power global commerce.” - &lt;strong&gt;Quy Tran, Director of Engineering at Global Fashion Group&lt;/strong&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Intuit: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;“Intuit focuses on enabling the financial success of consumers, businesses, and accountants. We are excited to leverage our AI and data capabilities to help develop the open Agent Payments Protocol (AP2) to create better experiences for all. Our technologists will have the ability to use the protocol to deploy AI agents towards autonomous financial workflows as part of our done-for-you experiences for customers.” – &lt;strong&gt;Tapasvi Moturu, Vice President, Software Engineering at Intuit&lt;/strong&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;JCB:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; "JCB champions Google’s &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Agent Payments Protocol&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; (AP2) initiative as the innovative and important protocol that will unlock a new era of payments, and JCB looks forward to contributing to the protocol to benefit our entire ecosystem, including our banking and payment institution partners, cardmembers, and merchants." - &lt;strong&gt;Shinya Kubotera, Executive Officer &amp;amp; Head of Strategic Innovations at JCB co., Ltd&lt;/strong&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;JusPay:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; "The future of payments is inherently open and interoperable - our work with UPI and the development of Hyperswitch, the world's first open-source payments orchestration platform has demonstrated this power. We believe this new protocol (AP2) provides the secure, shared foundation needed to make AI-driven commerce a reality, and we are ready to contribute our expertise to this initiative."  - &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Sheetal Lalwani, Co-founder at Juspay&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;KCP: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;“NHN KCP endorses the &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Agent Payments Protocol&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; (AP2) as a key advancement in the global payments ecosystem and looks forward to collaborating with global partners to help make AI-based payments more reliable, convenient, and widely adopted.” &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;–&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Jae-wook Noh, Executive Managing Director at NHN KCP  &lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Lightspark&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: “Having worked on payments at Google, I’ve seen how open and verified protocols can unlock powerful network effects. Google’s Agent Payments Protocol (AP2) is a big step toward a future where trusted AI agents transact seamlessly on our behalf. At Lightspark, we’re committed to that vision of open, global interoperability." - &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Alberto Martin CPO at Lightspark&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;ManusAI:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; “Google's &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Agent Payments Protocol&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; (AP2) represents a breakthrough solution that finally addresses the fundamental monetization challenges we've long faced in the agent ecosystem—enabling seamless, standardized compensation between AI agents while eliminating the unsustainable resource imbalances that have hindered true multi-agent collaboration.” -&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; Tao Zhang, CTO at Manus&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Mastercard: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;“Mastercard is committed to ongoing, responsible innovation – and we are excited to be collaborating with Google, leading banks, merchants, AI platforms and other industry leaders to help shape the future of agentic commerce. These efforts include critical work with standards bodies such as the FIDO Alliance, where we are advancing verifiable credentials to capture and secure consumers’ intent in this dynamic new context. Together, we’re playing an essential role in securing the payments ecosystem – ensuring that trust and safety remain at the core of every transaction.” –&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; Pablo Fourez, Chief Digital Officer at Mastercard &lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;MetaMask: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;“Blockchains are the natural payment layer for agents, and Ethereum will be the backbone of this. With Agent Payments Protocol (AP2) and x402, MetaMask will deliver maximum interoperability for developers and will enable users to pay agents with full composability and choice—while retaining the security and control of true self-custody” - &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Marco De Rossi, AI Lead at MetaMask&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Mesh: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;“For AI to truly drive commerce, agents need a secure and universal way to handle payments. Google's new Agent Payments Protocol (AP2) is a huge step forward, providing the foundational framework to make this possible. We're proud to support this effort because it unlocks the full potential of agent-led commerce, particularly with programmable assets like crypto. Our technology abstracts away the complexity of the crypto ecosystem, giving agents seamless access to hundreds of wallets and exchanges and supporting over 100 tokens. This ensures payments are not just completed, but are routed through the most efficient paths to guarantee speed and success.” - &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Bam Azizi, CEO and co-Founder at Mesh&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Mysten Labs: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;“Verified agents making purchases on behalf of verified users is the next frontier for AI-powered automation.  Google's Agent Payments Protocol (AP2) combines programmable payments via modern blockchains like Sui with open protocols like A2A and MCP that are enjoying rapid growth. It's the perfect substrate for real-world agentic commerce." - &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Sam Blackshear, Chief Technology Officer and Co-Founder at Mysten Labs, the original contributor to Sui.&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Nexi: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;“We are delighted to partner with Google Cloud on AP2 in order to contribute to shape a fundamental paradigm shift in commerce. As part of our DNA of being European by scale and local by nature, we aim at empowering European merchants to continue to compete on a global scale, while delivering frictionless and personalised online shopping experiences to consumers. Leveraging Agentic AI eCommerce technology from Google Cloud, we will continue to simplify payments for our merchants and partners" – &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Roberto Catanzaro, Chief Business Officer Merchant Solutions at Nexi&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Okta: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;“Extending the A2A protocol into payments is an important step toward building a secure, interoperable foundation for commerce between AI agents. At Auth0, we’re excited to support the Agent Payments Protocol (AP2) and help ensure that future payments between AI agents are both seamless and secure.” – &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Stephen Lee, Vice President, Technical Strategy and Partnerships at Okta&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Payoneer:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; “At Payoneer, we see enormous potential in AI agents to simplify financial workflows for millions of small businesses. By supporting the &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;Agent Payments Protocol (AP2)&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;, we’re ensuring agents can collaborate securely and seamlessly, just as our platform connects SMBs worldwide.” – &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Guy Shalev, Vice President of AI at Payoneer&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;PayPal: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;"AP2 provides the critical foundation for trusted agent payments, giving the ecosystem much needed clarity on how to facilitate trusted transactions. PayPal is fully aligned with this vision and excited to build on it, bringing our commerce expertise to help extend these principles across the entire purchase journey.” - &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Prakhar Mehrotra, SVP and Global Head of AI at PayPal&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;PwC:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; "PwC is committed to fostering innovation with agentic AI that focuses on maintaining trust, safety and privacy for critical tasks like payments and money movement broadly.  We believe the Agent Payments Protocol (AP2) and extension to the Agent2Agent protocol represent a significant leap forward, enhancing safety without compromising information." – &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Scott Likens, Global / US Chief AI Engineering Officer at PwC&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Salesforce&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: "With extensive expertise in powering digital commerce, Salesforce is excited to help &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;businesses harness agentic payments at scale - creating truly frictionless commerce experiences and driving the productivity that is crucial to becoming an Agentic Enterprise today. " – &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Nitin Mangtani, SVP &amp;amp; GM Commerce &amp;amp; Retail Cloud at Salesforce &lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;&lt;strong style="vertical-align: baseline;"&gt;ServiceNow: &lt;/strong&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;“Our partnership with Google is focused on unlocking the full potential of the agentic ecosystem. As an early adopter and launch partner of the Agent2Agent protocol, we’re excited to see autonomous AI Agents now empowered to seamlessly conduct eCommerce transactions. Together, we’re advancing the next generation of sales and procurement workflows—rooted in trust, security, and governance—while setting a new standard for how enterprises scale with agentic AI.” &lt;/span&gt;&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;- &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Jon Sigler, EVP &amp;amp; GM, AI Platform, ServiceNow&lt;/strong&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Shopee: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;"At Shopee, we see immense potential for agents to transform e-commerce, and believe that industry protocols such as Google’s Agent Payments Protocol (AP2) will be critical to enabling this future.” - &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;David Chen, Chief Product Officer at Shopee &lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Worldpay: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;“Worldpay shares Google's vision of an open, interoperable foundation for agentic commerce, built on trust and safety to empower merchants and shoppers. The AP2 protocol represents a meaningful first step in defining how agents, merchants, and payment providers can transact securely, at scale.” – &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Cindy Turner, Chief Product Officer at Worldpay&lt;/strong&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;1password:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; “Open protocols like A2A and AP2 are critical to driving broad adoption of AI while ensuring security and transparency remain foundational. At 1Password, we see support for digital payment credentials as just the beginning. The future is multi-agent, and managing agent access and authorization starts with securing credentials, all while upholding our core security values of privacy, transparency, and trust.” – &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Anand Srinivas, Vice President, Product &amp;amp; AI at 1Password&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Get started by checking out our public &lt;/span&gt;&lt;a href="http://goo.gle/ap2" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GitHub repository&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to see the complete technical specification, documentation, and reference implementations.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 16 Sep 2025 13:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/ai-machine-learning/announcing-agents-to-payments-ap2-protocol/</guid><category>Open Source</category><category>AI &amp; Machine Learning</category><media:content height="540" url="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/AP2_1iaAPko.jpg" width="540"></media:content><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Powering AI commerce with the new Agent Payments Protocol (AP2)</title><description></description><image>https://storage.googleapis.com/gweb-cloudblog-publish/original_images/AP2_1iaAPko.jpg</image><site_name>Google</site_name><url>https://cloud.google.com/blog/products/ai-machine-learning/announcing-agents-to-payments-ap2-protocol/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Stavan Parikh</name><title>VP/GM, Payments, Google</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Rao Surapaneni</name><title>VP/GM, Business Applications Platform, Google Cloud</title><department></department><company></company></author></item><item><title>OpenTelemetry Protocol comes to Google Cloud Observability</title><link>https://cloud.google.com/blog/products/management-tools/opentelemetry-now-in-google-cloud-observability/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;OpenTelemetry Protocol (OTLP) is a data exchange protocol designed to transport telemetry from a source to a destination in a vendor-agnostic fashion. Today, we’re pleased to announce that Cloud Trace, part of &lt;/span&gt;&lt;a href="https://cloud.google.com/stackdriver/docs"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud Observability&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, now supports users &lt;/span&gt;&lt;a href="https://cloud.google.com/trace/docs/migrate-to-otlp-endpoints"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;sending trace data using OTLP&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; via &lt;/span&gt;&lt;a href="http://telemetry.googleapis.com" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;telemetry.googleapis.com&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_69Q6vSM.max-1000x1000.png"
        
          alt="1"&gt;
        
        &lt;/a&gt;
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="37zow"&gt;Fig 1: Both in-process and collector based configurations can use native OTLP exporters to transmit telemetry data&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Using OTLP to send telemetry data to observability tooling with these benefits:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Vendor-agnostic telemetry pipelines: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Use native OTLP exporters from in-process or collectors. This eliminates the need to use vendor-specific exporters in your telemetry pipelines.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Strong telemetry data integrity:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Ensure your telemetry data preserves the OTel data model during transmission and storage and avoid transformations into proprietary formats.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Interoperability with your choice of observability tooling:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Easily send telemetry to one or more observability backends that support native OTLP without any additional &lt;/span&gt;&lt;a href="https://opentelemetry.io/docs/languages/go/exporters/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;OTel exporters&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Reduced client-side complexity and resource usage:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Move your telemetry processing logic such as applying filters to the observability backend, reducing the need for custom rules and thus client-side processing overhead.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Let’s take a quick look at how to use OTLP from Cloud Trace. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Trace and OTLP in action&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="https://cloud.google.com/trace/docs/migrate-to-otlp-endpoints"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Sending trace data using OTLP&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; via &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;telemetry.googleapis.com is now the recommended best practice for both new and existing users — especially for those who expect to send high volumes of trace data.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_N3PyrlX.max-1000x1000.png"
        
          alt="2"&gt;
        
        &lt;/a&gt;
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="37zow"&gt;Fig 2: Trace explore page in Cloud Trace highlighting fields that leverage OpenTelemetry semantic conventions&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The Trace explorer page makes extensive use of OpenTelemetry conventions to offer a rich user experience when filtering and finding traces of interest. For example, &lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The OpenTelemetry convention &lt;/span&gt;&lt;a href="http://service.name" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;service.name&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is used to indicate which services a span is originating from.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The status of the span is indicated by the OpenTelemetry’s &lt;/span&gt;&lt;a href="https://opentelemetry.io/docs/concepts/signals/traces/#span-status" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;span status&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
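&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A hypothetical snippet tying both conventions together; the service and span names here ("checkout-service", "charge-card") are illustrative only:&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.trace import Status, StatusCode

# service.name is the resource attribute the Trace explorer uses to
# attribute spans to a service.
trace.set_tracer_provider(
    TracerProvider(resource=Resource.create({"service.name": "checkout-service"}))
)
tracer = trace.get_tracer(__name__)

with tracer.start_as_current_span("charge-card") as span:
    try:
        raise RuntimeError("card declined")  # simulate a failure
    except RuntimeError as exc:
        span.record_exception(exc)
        # The span status is what the explorer surfaces when filtering by error.
        span.set_status(Status(StatusCode.ERROR, str(exc)))
&lt;/code&gt;&lt;/pre&gt;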
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud Trace’s internal storage system now uses the OpenTelemetry data model natively for organizing and storing your trace data. The new storage system enables much &lt;/span&gt;&lt;a href="https://cloud.google.com/trace/docs/quotas#telemetry-api-limits"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;higher limits&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; when trace data is sent through telemetry.googleapis.com. Key changes include:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Attribute sizes:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Attribute keys can now be up to 512 bytes (from 128 bytes), and values up to 64 KiB (from 256 bytes).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Span details:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Span names can be up to 1024 bytes (from 128 bytes), and spans can have up to 1024 attributes (from 32).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Event and link counts:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Events per span increase to 256 (from 128), and links per span are now 128.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We believe sending your trace data using OTLP will result in an better user experience in the trace explorer UI and Observability Analytics, along with the above storage limit increases.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Google Cloud’s vision for OTLP&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Providing OTLP support for Cloud Trace is just the beginning. Our vision is to leverage OpenTelemetry to generate, collect, and access telemetry across Google Cloud. Our commitment to OpenTelemetry extends across all telemetry types — traces, metrics, and logs — and is a cornerstone of our strategy to simplify telemetry management and foster an open cloud environment.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We understand that in today's complex cloud environments, managing telemetry data across disparate systems, inconsistent data formats, and vast volumes of information can lead to observability gaps and increased operational overhead. We are dedicated to streamlining your telemetry pipeline, starting with focusing on native OTLP ingestion for all telemetry types so you can seamlessly send your data to Google Cloud Observability. This will help foster true vendor neutrality and interoperability, eliminating the need for complex conversions or vendor-specific agents.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Beyond seamless ingestion, we're also building capabilities for managed server-side processing, flexible routing to various destinations, and unified management and control over your telemetry across environments. This will further our observability experience with advanced processing and routing capabilities all in one place.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The introduction of OTLP trace ingestion with telemetry.googleapis.com is a significant first step in this journey. We're continually working to expand our OpenTelemetry support across all telemetry types with additional processing and routing capabilities to provide you with a unified and streamlined observability experience on Google Cloud.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Get started today&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We encourage you to begin using telemetry.googleapis.com for your trace data by following this &lt;/span&gt;&lt;a href="https://cloud.google.com/trace/docs/migrate-to-otlp-endpoints"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;migration guide&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. This new endpoint offers enhanced capabilities, including higher storage limits and an improved user experience within Cloud Trace Explorer and Observability Analytics.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Fri, 12 Sep 2025 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/management-tools/opentelemetry-now-in-google-cloud-observability/</guid><category>Open Source</category><category>Management Tools</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>OpenTelemetry Protocol comes to Google Cloud Observability</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/management-tools/opentelemetry-now-in-google-cloud-observability/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Sujay Solomon</name><title>Product Manager</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Keith Chen</name><title>Product Manager</title><department></department><company></company></author></item><item><title>Automate app deployment and security analysis with new Gemini CLI extensions</title><link>https://cloud.google.com/blog/products/ai-machine-learning/automate-app-deployment-and-security-analysis-with-new-gemini-cli-extensions/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Find and fix security vulnerabilities. Deploy your app to the cloud. All without leaving your command-line. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span style="vertical-align: baseline;"&gt;Today, we’re closing the gap between your terminal and the cloud with a first look at the future of Gemini CLI, delivered through two new extensions: &lt;/span&gt;&lt;a href="https://github.com/google-gemini/gemini-cli-security/tree/main" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;security extension&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://github.com/GoogleCloudPlatform/cloud-run-mcp/?tab=readme-ov-file#use-as-a-gemini-cli-extension" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Run extension&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. These extensions are designed to handle critical parts of your workflows with simple, intuitive commands:&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="vertical-align: baseline;"&gt;1)  &lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;/security:analyze&lt;/strong&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;performs a comprehensive scan right in your local repository, with support for GitHub pull requests coming soon. This makes security a natural part of your development cycle.&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="vertical-align: baseline;"&gt;2)  &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;/deploy&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; deploys your application to Cloud Run, our fully managed serverless platform, in just a few minutes. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These commands are the first expression of a new extensibility framework for Gemini CLI. While we'll be sharing more about the full &lt;/span&gt;&lt;a href="https://github.com/google-gemini/gemini-cli/blob/main/docs/extension.md" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini CLI extension&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; world soon, we couldn't wait to get these capabilities into your hands. Consider this a sneak peak of what’s coming next!&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Security extension: automate security analysis with /security:analyze &lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To help teams address software vulnerabilities early in the development lifecycle, we are launching the &lt;/span&gt;&lt;a href="https://github.com/google-gemini/gemini-cli-security" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini CLI Security extension&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. This new open-source tool automates security analysis, enabling you to proactively catch and fix issues using the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;/security:analyze &lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;command at the terminal or through a soon-coming GitHub Actions integration. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Integrated directly into your local development workflow and CI/CD pipeline, this extension:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Analyzes code changes:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; When triggered, the extension automatically takes the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;git diff&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; of your local changes or pull request.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Identifies vulnerabilities:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Using a specialized prompt and tools, Gemini CLI analyzes the changes for a wide range of potential vulnerabilities, such as hardcoded-secrets, injection vulnerabilities, broken access control, and insecure data handling.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Provides actionable feedback:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Gemini returns a detailed, easy-to-understand report directly in your terminal or as a comment on your pull request. This report doesn't just flag issues; it explains the potential risks and provides concrete suggestions for remediation, helping you fix issues quickly and learn as you go.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;And after the report is generated, you can also ask Gemini CLI to save it to disk or even implement fixes for each issue.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/1_Gemini_CLI_Security_Extension_Terminal_Gif.gif"
        
          alt="1 Gemini CLI Security Extension Terminal Gif"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Getting started with /security:analyze&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Integrating security analysis into your workflow is simple. First, download the Gemini CLI and install the extension &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;(requires Gemini CLI v0.4.0+)&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;gemini extensions install https://github.com/google-gemini/gemini-cli-security&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f6dda3b8880&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Then you can start run your first scan:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Locally:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; After making local changes, simply run &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;/security:analyze &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; in the Gemini CLI.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;In CI/CD (Coming Soon): &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;We're bringing security analysis directly into your CI/CD workflow. Soon,&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;you’ll be able to configure the GitHub Action to automatically review pull requests as they are opened.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This is just the beginning. The team is actively working on further enhancing the extension's capabilities, and we are also inviting the community to contribute to this open source project by reporting bugs, suggesting features, continuously improving security practices and submitting code improvements. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For complete documentation and to contribute, visit the &lt;/span&gt;&lt;a href="https://github.com/google-gemini/gemini-cli-security" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;official GitHub repository&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Cloud Run extension: automate deployment with &lt;/strong&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;/deploy&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The&lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt; &lt;/span&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;/deploy&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; command in Gemini CLI automates the entire deployment pipeline for your web applications. You can now deploy a project directly from your local workspace. Once you issue the command, Gemini returns a public URL for your live application.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;/deploy&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; command automates a full CI/CD pipeline to deploy web applications and cloud services from the command line using the &lt;/span&gt;&lt;a href="https://github.com/GoogleCloudPlatform/cloud-run-mcp/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Run MCP server&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. What used to be a multi-step process of building, containerizing, pushing, and configuring is now a single, intuitive command from within the Gemini CLI.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can access this feature across three different surfaces – in Gemini CLI in the terminal, in VS Code via &lt;/span&gt;&lt;a href="https://codeassist.google/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Gemini Code Assist&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; agent mode, and in Gemini CLI in &lt;/span&gt;&lt;a href="https://cloud.google.com/shell/docs"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Shell&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/2_aA6mg0y.gif"
        
          alt="2"&gt;
        
        &lt;/a&gt;
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="dvesx"&gt;Use /deploy command in Gemini CLI at the terminal to deploy application to Cloud Run&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Get started with /deploy:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For existing Google Cloud users, getting started with &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;/deploy&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; is straightforward in Gemini CLI at the terminal:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;Prerequisites:&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; You'll need the gcloud CLI installed and configured on your machine and have an existing app or use Gemini CLI to create one.&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Step 1: Install the Cloud Run extension&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;/deploy&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; command is enabled through a &lt;/span&gt;&lt;a href="https://github.com/GoogleCloudPlatform/cloud-run-mcp" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Model Context Protocol (MCP) server&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, which is included in the Cloud Run extension.  To install the Cloud Run extension &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;(Requires Gemini CLI v0.4.0+)&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;, run this command:  &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;gemini extensions install https://github.com/GoogleCloudPlatform/cloud-run-mcp&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f6dda3b89d0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p style="padding-left: 40px;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Step 2: Authenticate with Google Cloud&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Ensure your local environment is authenticated to your Google Cloud account by running:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;gcloud auth login\r\ngcloud auth application-default login&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f6dda3b8580&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p style="padding-left: 40px;"&gt;&lt;strong style="vertical-align: baseline;"&gt;Step 3: Deploy your app&lt;br/&gt;&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Navigate to your application's root directory in your terminal and type &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;gemini&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to launch Gemini CLI. Once inside, type &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;/deploy&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; to deploy your app to Cloud Run.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;That's it! In a few moments, Gemini CLI will return a public URL where you can access your newly deployed application. You can also visit the Google Cloud Console to see your new service running in Cloud Run. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Besides Gemini CLI at the terminal, this feature can also be accessed  in VS Code via Gemini Code Assist &lt;/span&gt;&lt;a href="https://cloud.google.com/gemini/docs/codeassist/release-notes"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;agent mode&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, powered by Gemini CLI,  and in Gemini CLI in Cloud Shell, where the authentication step will be automatically handled out of the box.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/3_deploy-agentmode.gif"
        
          alt="3 deploy-agentmode"&gt;
        
        &lt;/a&gt;
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="dvesx"&gt;Use /deploy command to deploy application to Cloud Run in VS Code via Gemini Code Assist agent mode.&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Building a robust extension ecosystem  &lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The Security and Cloud Run extensions are two of the first extensions from Google built on our new framework, which is designed to create a rich and open ecosystem for the Gemini CLI. We are building a platform that will allow any developer to extend and customize the CLI's capabilities, and this is just an early preview of the full platform's potential. We will be sharing a more comprehensive look at our extensions platform soon, including how you can start building and sharing your own.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Try Gemini CLI today, visit the GitHub &lt;/span&gt;&lt;a href="http://github.com/google-gemini/gemini-cli" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 10 Sep 2025 14:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/ai-machine-learning/automate-app-deployment-and-security-analysis-with-new-gemini-cli-extensions/</guid><category>Application Development</category><category>Serverless</category><category>Open Source</category><category>AI &amp; Machine Learning</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Automate app deployment and security analysis with new Gemini CLI extensions</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/ai-machine-learning/automate-app-deployment-and-security-analysis-with-new-gemini-cli-extensions/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Prithpal Bhogill</name><title>Group Product Manager</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Evan Otero</name><title>Senior Product Manager</title><department></department><company></company></author></item><item><title>Build with more flexibility: New open models arrive in the Vertex AI Model Garden</title><link>https://cloud.google.com/blog/products/ai-machine-learning/deepseek-r1-is-available-for-everyone-in-vertex-ai-model-garden/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In our ongoing effort to provide businesses with the flexibility and choice needed to build innovative AI applications, we are expanding the catalog of open models available as Model-as-a-Service (MaaS) offerings in Vertex AI Model Garden. Following the addition of&lt;/span&gt;&lt;a href="https://www.googlecloudcommunity.com/gc/Community-Blogs/Introducing-Llama-4-on-Vertex-AI/ba-p/892578" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt; Llama 4 models&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; earlier this year, we are announcing &lt;/span&gt;&lt;a href="https://console.cloud.google.com/vertex-ai/publishers/deepseek-ai/model-garden/deepseek-r1-0528-maas"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;DeepSeek R1&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is available for everyone through our Model-as-a-Service (MaaS) offering. This expansion reinforces our commitment to an open AI ecosystem, ensuring our customers can access a diverse range of powerful models to find the one best suited for their specific use case.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Deploying and managing today's large-scale models presents operational and financial challenges. For instance, a large model such as DeepSeek R1 can require an infrastructure of eight advanced H200 GPUs to run inference. For many organizations, procuring and managing such resources is a major undertaking that can divert focus from core application development.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Vertex AI’s MaaS offering is designed to remove this complexity. By providing these models as fully managed, serverless APIs, we eliminate the need for customers to provision or manage the underlying infrastructure. This allows your teams to bypass the complexities of GPU management and focus directly on building and innovating. With Vertex AI, you benefit from a secure, enterprise-grade platform with built-in data privacy and compliance, all under a flexible, pay-as-you-go pricing model that scales with your needs.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-aside"&gt;&lt;dl&gt;
    &lt;dt&gt;aside_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;title&amp;#x27;, &amp;#x27;$300 in free credit to try Google Cloud AI and ML&amp;#x27;), (&amp;#x27;body&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f6dce696490&amp;gt;), (&amp;#x27;btn_text&amp;#x27;, &amp;#x27;Start building for free&amp;#x27;), (&amp;#x27;href&amp;#x27;, &amp;#x27;http://console.cloud.google.com/freetrial?redirectPath=/vertex-ai/&amp;#x27;), (&amp;#x27;image&amp;#x27;, None)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Getting started&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Below we provide a step-by-step guide on how you can use open models available on MaaS. We have used DeepSeek R1 on Vertex AI as an example. It can be accessed both via the UI and API.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;1. Enable the DeepSeek API Service&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Navigate to the DeepSeek API Service from the Vertex AI Model Garden and click on the title to open the model card. Then, enable access to the DeepSeek API Service. It may take a few minutes for permissions to propagate after enablement. &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_ypu16Hl.max-1000x1000.png"
        
          alt="1"&gt;
        
        &lt;/a&gt;
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="vj2xu"&gt;DeepSeek API Service from the Vertex AI Model Garden&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;2. Try out the model via the UI&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Navigate to the DeepSeek API Service from the Vertex AI Model Garden and click on the tile to open the model card. You can use the UI in the sidebar to test the service. &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_bWuZIG8.max-1000x1000.png"
        
          alt="2"&gt;
        
        &lt;/a&gt;
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="vj2xu"&gt;DeepSeek API Service with UI sidebar to test the service&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;3. Try out the model via Vertex AI API&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To integrate DeepSeek R1 within your applications, you can use either REST API or OpenAI Python API Client Library. &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Note&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;: For security of your data, DeepSeek MaaS endpoint does not have any outbound internet access. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Get Predictions via the REST API&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You can make API requests via curl from the Cloud Shell or your machine with gcloud credentials configured. Remember to replace the placeholders with this code:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;export PROJECT_ID=&amp;lt;ENTER_PROJECT_ID&amp;gt;\r\nexport REGION_ID=&amp;lt;ENTER_REGION_ID&amp;gt; \r\n\r\ncurl \\\r\n-X POST \\\r\n-H &amp;quot;Authorization: Bearer $(gcloud auth print-access-token)&amp;quot; \\\r\n-H &amp;quot;Content-Type: application/json&amp;quot; \\\r\n&amp;quot;https://${REGION_ID}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${REGION_ID}/endpoints/openapi/chat/completions&amp;quot; \\\r\n-d \&amp;#x27;{\r\n  &amp;quot;model&amp;quot;: &amp;quot;deepseek-ai/deepseek-r1-0528-maas&amp;quot;,\r\n  &amp;quot;max_tokens&amp;quot;: 200,\r\n  &amp;quot;stream&amp;quot;: true,\r\n  &amp;quot;messages&amp;quot;: [\r\n    {\r\n      &amp;quot;role&amp;quot;: &amp;quot;user&amp;quot;,\r\n      &amp;quot;content&amp;quot;: &amp;quot;which is bigger - 9.11 or 9.9&amp;quot;\r\n    }\r\n  ]\r\n}\&amp;#x27;&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f6dda3fe190&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Get Predictions via the OpenAI Python API Client Library &lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Install the OpenAI Python API Library:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;pip install openai&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f6ddabae820&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Initialize the client and configure the endpoint URL. To get the access token to use as an API key, you can read more &lt;/span&gt;&lt;a href="https://cloud.google.com/sdk/gcloud/reference/auth/application-default/print-access-token"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. If run from a local machine, &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;GOOGLE_APPLICATION_CREDENTIALS&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; will authenticate your requests.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;import os\r\nimport openai\r\n\r\nPROJECT_ID = “ENTER_PROJECT_ID”\r\nLOCATION = &amp;quot;us-central1&amp;quot;\r\nMODEL_ID = &amp;quot;deepseek-ai/deepseek-r1-0528-maas&amp;quot;\r\nAPI_KEY = os.environ[&amp;quot;GOOGLE_APPLICATION_CREDENTIALS&amp;quot;] # or add output from gcloud auth print-access-token \r\n\r\ndeepseek_vertex_endpoint_url = (\r\n    f&amp;quot;https://{LOCATION}-aiplatform.googleapis.com/v1beta1/&amp;quot;\r\n    f&amp;quot;projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/openapi&amp;quot;\r\n)\r\n\r\nclient = openai.OpenAI(\r\n    base_url=deepseek_vertex_endpoint_url,\r\n    api_key=API_KEY\r\n)&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f6ddabaea30&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Make completions requests via the client:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;response = client.chat.completions.create(\r\n    model=&amp;quot;deepseek-ai/deepseek-r1-0528-maas&amp;quot;,\r\n    messages=[\r\n        {&amp;quot;role&amp;quot;: &amp;quot;system&amp;quot;, &amp;quot;content&amp;quot;: &amp;quot;You are a helpful assistant&amp;quot;},\r\n        {&amp;quot;role&amp;quot;: &amp;quot;user&amp;quot;, &amp;quot;content&amp;quot;: &amp;quot;How many r\&amp;#x27;s are in strawberry ?&amp;quot;},\r\n    ],\r\n    stream=False,\r\n)\r\n\r\nprint(response.choices[0].message.content)\r\n\r\n# ChatCompletion(&amp;quot;id=&amp;quot;&amp;quot;&amp;quot;,\r\n# &amp;quot;choices=&amp;quot;[\r\n#    &amp;quot;Choice(finish_reason=&amp;quot;&amp;quot;length&amp;quot;,\r\n#    index=0,\r\n#    &amp;quot;logprobs=None&amp;quot;,\r\n#    &amp;quot;message=ChatCompletionMessage(content=&amp;quot;&amp;quot;&amp;lt;think&amp;gt;\\nFirst, the question is: \\&amp;quot;How many r\\\\\&amp;#x27;s are in strawberry?\\&amp;quot; I need to count the number of times the letter \\\\\&amp;#x27;r\\\\\&amp;#x27; appears in the word \\&amp;quot;strawberry\\&amp;quot;.\\n\\nLet me write down the word: S-T-R-A&amp;quot;,\r\n#    &amp;quot;refusal=None&amp;quot;,\r\n#    &amp;quot;role=&amp;quot;&amp;quot;assistant&amp;quot;,\r\n#    &amp;quot;annotations=None&amp;quot;,\r\n#    &amp;quot;audio=None&amp;quot;,\r\n#    &amp;quot;function_call=None&amp;quot;,\r\n#    &amp;quot;tool_calls=None))&amp;quot;\r\n# ],\r\n# created=,\r\n# &amp;quot;model=&amp;quot;&amp;quot;deepseek-ai/deepseek-r1-0528-maas&amp;quot;,\r\n# &amp;quot;object=&amp;quot;&amp;quot;chat.completion&amp;quot;,\r\n# &amp;quot;service_tier=None&amp;quot;,\r\n# &amp;quot;system_fingerprint=&amp;quot;&amp;quot;&amp;quot;,\r\n# usage=CompletionUsage(completion_tokens=50,\r\n# prompt_tokens=18,\r\n# total_tokens=68,\r\n# &amp;quot;completion_tokens_details=None&amp;quot;,\r\n# &amp;quot;prompt_tokens_details=None))&amp;quot;&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;lang-py&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f6dce78dc40&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
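&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Since reasoning models like R1 can emit a long chain of thought before the final answer, you may prefer to stream the response. A minimal streaming variant of the request above (our sketch, not from the official guide) looks like this:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;pre&gt;&lt;code&gt;# Reuses the client and MODEL_ID defined earlier.
stream = client.chat.completions.create(
    model=MODEL_ID,
    messages=[{"role": "user", "content": "which is bigger - 9.11 or 9.9"}],
    stream=True,
)

# Tokens arrive incrementally; print them as they come.
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;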
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;What's next?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Vertex AI Model Garden opens up new possibilities for building applications that require state-of-the-art foundation models. Here are some next steps:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Review documentation guide for DeepSeek R1 MaaS &lt;/span&gt;&lt;a href="http://cloud.google.com/vertex-ai/generative-ai/docs/maas/deepseek"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt; &lt;span style="vertical-align: baseline;"&gt;and Llama MaaS &lt;/span&gt;&lt;a href="https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/llama"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Review pricing &lt;/span&gt;&lt;a href="https://cloud.google.com/vertex-ai/generative-ai/pricing"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; for both models &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Explore the &lt;/span&gt;&lt;a href="https://console.cloud.google.com/vertex-ai/model-garden"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Model Garden&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: Discover other models available as managed services&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Build a proof-of-concept: Start with a small project to understand the model's capabilities&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Join the community: Share your experiences and learn from others in the&lt;/span&gt;&lt;a href="https://www.googlecloudcommunity.com/gc/AI-ML/bd-p/cloud-ai-ml" rel="noopener" style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt; Google Cloud AI Community&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;</description><pubDate>Wed, 16 Jul 2025 21:30:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/ai-machine-learning/deepseek-r1-is-available-for-everyone-in-vertex-ai-model-garden/</guid><category>Open Source</category><category>AI &amp; Machine Learning</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Build with more flexibility: New open models arrive in the Vertex AI Model Garden</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/ai-machine-learning/deepseek-r1-is-available-for-everyone-in-vertex-ai-model-garden/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Ivan Nardini</name><title>Developer Relations Engineer</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Abhishek Bhagwat</name><title>ML Engineer, Applied AI</title><department></department><company></company></author></item><item><title>Introducing the next generation of AI inference, powered by llm-d</title><link>https://cloud.google.com/blog/products/ai-machine-learning/enhancing-vllm-for-distributed-inference-with-llm-d/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As the world transitions from prototyping AI solutions to deploying AI at scale, efficient AI inference is becoming the gating factor. Two years ago, the challenge was the ever-growing size of AI models. Cloud infrastructure providers responded by supporting orders of magnitude more compute and data. Today, agentic AI workflows and reasoning models create highly variable demands and another exponential increase in processing, easily bogging down the inference process and degrading the user experience. Cloud infrastructure has to evolve again.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Open-source inference engines such as vLLM are a key part of the solution. At Google Cloud Next 25 in April, we announced full &lt;/span&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs/tutorials/serve-vllm-tpu"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;vLLM support for Cloud TPUs&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in &lt;/span&gt;&lt;a href="https://cloud.google.com/kubernetes-engine"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Kubernetes Engine&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (GKE), Google Compute Engine, Vertex AI, and Cloud Run. Additionally, given the widespread adoption of Kubernetes for orchestrating inference workloads, we &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/containers-kubernetes/google-bytedance-and-red-hat-improve-ai-on-kubernetes"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;introduced&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; the open-source Gateway API Inference Extension project to add AI-native routing to Kubernetes, and made it available in our &lt;/span&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs/concepts/about-gke-inference-gateway"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;GKE Inference Gateway&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Customers like Samsung and BentoML are seeing great results from these solutions. And later this year, customers will be able to use these solutions with our seventh-generation &lt;/span&gt;&lt;a href="https://blog.google/products/google-cloud/ironwood-tpu-age-of-inference/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Ironwood TPU&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, purpose-built to build and serve reasoning models by scaling to up to 9,216 liquid-cooled chips in a single pod linked with breakthrough Inter-Chip Interconnect (ICI). But, there’s opportunity for even more innovation and value.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-aside"&gt;&lt;dl&gt;
    &lt;dt&gt;aside_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;title&amp;#x27;, &amp;#x27;$300 in free credit to try Google Cloud AI and ML&amp;#x27;), (&amp;#x27;body&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f6ddc2d82b0&amp;gt;), (&amp;#x27;btn_text&amp;#x27;, &amp;#x27;Start building for free&amp;#x27;), (&amp;#x27;href&amp;#x27;, &amp;#x27;http://console.cloud.google.com/freetrial?redirectPath=/vertex-ai/&amp;#x27;), (&amp;#x27;image&amp;#x27;, None)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today, we’re making inference even easier and more cost-effective, by making vLLM fully scalable with Kubernetes-native distributed and disaggregated inference. This new &lt;/span&gt;&lt;a href="http://github.com/llm-d/llm-d/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;project&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is called llm-d. Google Cloud is a founding contributor alongside Red Hat, IBM Research, NVIDIA, and CoreWeave, joined by other industry leaders AMD, Cisco, Hugging Face, Intel, Lambda, and Mistral AI. Google has a long history of founding and contributing to key open-source projects that have shaped the cloud, such as &lt;/span&gt;&lt;a href="http://kubernetes.io" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Kubernetes&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/guide-to-jax-for-pytorch-developers?e=48754805"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;JAX&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;a href="https://istio.io/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Istio&lt;/span&gt;&lt;/a&gt;,&lt;span style="vertical-align: baseline;"&gt; and is committed to being the best platform for AI development. We believe that making llm-d open-source, and community-led, is the best way to make it widely available, so you can run it everywhere and know that a strong community supports it.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;llm-d builds upon vLLM’s highly efficient inference engine, adding Google’s proven technology and extensive experience in securely and cost-effectively serving AI at billion-user scale. &lt;/span&gt;&lt;a href="https://github.com/llm-d/llm-d?tab=readme-ov-file#-architecture" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;llm-d includes three major innovations&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;: First, instead of traditional round-robin load balancing, llm-d &lt;span style="vertical-align: baseline;"&gt;includes a vLLM-aware inference scheduler, which enables routing requests to instances with prefix-cache hits and low load, achieving latency SLOs with fewer hardware resources&lt;/span&gt;. Second, to serve longer requests &lt;span style="vertical-align: baseline;"&gt;with higher throughput and lower latency&lt;/span&gt;, llm-d supports disaggregated serving, which handles the prefill and decode stages of LLM inference with independent instances. Third, llm-d &lt;span style="vertical-align: baseline;"&gt;introduces a multi-tier&lt;/span&gt; &lt;span style="vertical-align: baseline;"&gt;KV cache for intermediate values (prefixes) &lt;/span&gt;to improve response time across different storage tiers and reduce storage costs. llm-d works across frameworks (PyTorch today, JAX later this year), and both GPU and TPU accelerators, to provide choice and flexibility.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
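&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To get a feel for how you might consume an llm-d deployment, here is a minimal, hypothetical Python sketch. It assumes your llm-d gateway exposes the OpenAI-compatible API served by vLLM; the gateway address and model name are placeholders for your own deployment, not values defined by the project.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;# Hypothetical sketch: query an llm-d deployment through its gateway,\r\n# assuming it exposes the OpenAI-compatible API served by vLLM.\r\n# GATEWAY_ADDRESS and MODEL_NAME are placeholders for your deployment.\r\nfrom openai import OpenAI\r\n\r\nclient = OpenAI(base_url=&amp;quot;http://GATEWAY_ADDRESS/v1&amp;quot;, api_key=&amp;quot;unused&amp;quot;)\r\n\r\nresponse = client.chat.completions.create(\r\n    model=&amp;quot;MODEL_NAME&amp;quot;,\r\n    messages=[{&amp;quot;role&amp;quot;: &amp;quot;user&amp;quot;, &amp;quot;content&amp;quot;: &amp;quot;Summarize llm-d in one sentence.&amp;quot;}],\r\n    max_tokens=64,\r\n)\r\nprint(response.choices[0].message.content)&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, None)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;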
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/llm-d_stack_v1.max-1000x1000.jpg"
        
          alt="llm-d stack"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We are excited to partner with the community to help you cost-effectively scale AI in your business. llm-d incorporates state-of-the-art distributed serving technologies into an easily deployed Kubernetes stack. Deploying llm-d on Google Cloud provides low-latency and high-performance inference by leveraging Google Cloud’s vast global network, GKE AI capabilities, and AI Hypercomputer integrations across software and hardware accelerators. Early tests by Google Cloud using llm-d show 2x improvements in time-to-first-token for use cases like code completion, enabling more responsive applications.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Visit the &lt;/span&gt;&lt;a href="https://github.com/llm-d/llm-d" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;llm-d project&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; to learn more, contribute, and get started today.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 20 May 2025 12:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/ai-machine-learning/enhancing-vllm-for-distributed-inference-with-llm-d/</guid><category>AI Hypercomputer</category><category>Compute</category><category>Open Source</category><category>AI &amp; Machine Learning</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Introducing the next generation of AI inference, powered by llm-d</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/ai-machine-learning/enhancing-vllm-for-distributed-inference-with-llm-d/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Mark Lohmeyer</name><title>VP and GM, AI and Computing Infrastructure</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Gabe Monroy</name><title>VP &amp; GM, Cloud Runtimes</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Gabe Monroy</name><title>VP &amp; GM, Cloud Runtimes</title><department></department><company></company></author></item><item><title>How to deploy serverless AI with Gemma 3 on Cloud Run</title><link>https://cloud.google.com/blog/products/ai-machine-learning/serverless-ai-with-gemma-3-on-cloud-run/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;a href="http://blog.google/technology/developers/gemma-3" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Today, we introduced Gemma 3&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, a family of lightweight, open models built with the cutting-edge technology behind Gemini 2.0. The Gemma 3 family of models have been designed for speed and portability, empowering developers to build sophisticated AI applications at scale. Combined with Cloud Run, it has never been easier to deploy your serverless workloads with AI models.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this post, we’ll explore the functionalities of Gemma 3, and how you can run it on &lt;/span&gt;&lt;a href="http://cloud.run" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Run&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Gemma 3: Power and efficiency for Cloud deployments&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Gemma 3 is engineered for exceptional performance with lower memory footprints, making it ideal for cost-effective inference workloads. &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;The world's best single-accelerator model: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Gemma 3 delivers optimal performance for its size, outperforming Llama-405B, DeepSeek-V3, and o3-mini in preliminary human preference evaluations on &lt;/span&gt;&lt;a href="https://goo.gle/Gemma3Report" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;LMArena’s leaderboard&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. This helps you create engaging user experiences with a model that fits on a single GPU or TPU.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Create AI with advanced text and visual reasoning capabilities: &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Easily build applications that analyze images, text and short videos, opening up possibilities for interactive applications.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Handle complex tasks with a large context window:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Gemma 3 offers a 128k-token context window to let your applications process and understand vast amounts of information — even entire novels — enabling more sophisticated AI capabilities.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-aside"&gt;&lt;dl&gt;
    &lt;dt&gt;aside_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;title&amp;#x27;, &amp;#x27;$300 in free credit to try Google Cloud AI and ML&amp;#x27;), (&amp;#x27;body&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f6ddabde280&amp;gt;), (&amp;#x27;btn_text&amp;#x27;, &amp;#x27;Start building for free&amp;#x27;), (&amp;#x27;href&amp;#x27;, &amp;#x27;http://console.cloud.google.com/freetrial?redirectPath=/vertex-ai/&amp;#x27;), (&amp;#x27;image&amp;#x27;, None)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Serverless inference with Gemma 3 and Cloud Run&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Gemma 3 is a great fit for inference workloads on Cloud Run using NVIDIA L4 GPUs. Cloud Run is Google Cloud's fully managed serverless platform, letting developers run container workloads without managing the underlying infrastructure. Models scale to zero when inactive, and scale dynamically with demand. Not only does this optimize costs and performance, but you pay only for what you use. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For example, you could host an LLM on one Cloud Run service and a chat agent on another, enabling independent scaling and management. And with GPU acceleration, a Cloud Run service can be ready with the first AI inference &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/application-development/run-your-ai-inference-applications-on-cloud-run-with-nvidia-gpus"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;results in under 30 seconds, with only 5 seconds to start an instance&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. This rapid deployment ensures that your applications deliver responsive user experiences. We also reduced the GPU price in Cloud Run to ~$0.60/hr. And of course, if your service isn't receiving requests, it will scale down to zero.&lt;/span&gt;&lt;/p&gt;
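&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As an illustration of that two-service pattern, here is a minimal, hypothetical Python sketch of a chat-agent service calling a separate Cloud Run service that hosts Gemma 3 via Ollama. The service URL and model tag are placeholders, and the sketch assumes Ollama’s standard /api/generate endpoint.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;# Hypothetical sketch: one Cloud Run service calling another that\r\n# hosts Gemma 3 via Ollama. SERVICE_URL is a placeholder.\r\nimport requests\r\n\r\nresponse = requests.post(\r\n    &amp;quot;https://SERVICE_URL/api/generate&amp;quot;,\r\n    json={&amp;quot;model&amp;quot;: &amp;quot;gemma3&amp;quot;, &amp;quot;prompt&amp;quot;: &amp;quot;Say hello!&amp;quot;, &amp;quot;stream&amp;quot;: False},\r\n    timeout=120,\r\n)\r\nprint(response.json()[&amp;quot;response&amp;quot;])&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, None)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;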
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Get started today&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Cloud Run and Gemma 3 combine to create a powerful, cost-effective, and scalable solution for deploying advanced AI applications. Gemma 3 is supported by a variety of tools and frameworks, such as &lt;/span&gt;&lt;a href="https://huggingface.co/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Hugging Face Transformers&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://ollama.com/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Ollama&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;a href="https://docs.vllm.ai/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;vLLM&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To get started, visit &lt;/span&gt;&lt;a href="https://cloud.google.com/run/docs/tutorials/gpu-gemma-with-ollama"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;this guide&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; which will show you how to build a service with Gemma 3 on Cloud Run with Ollama.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 12 Mar 2025 07:30:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/ai-machine-learning/serverless-ai-with-gemma-3-on-cloud-run/</guid><category>Open Source</category><category>AI &amp; Machine Learning</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>How to deploy serverless AI with Gemma 3 on Cloud Run</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/ai-machine-learning/serverless-ai-with-gemma-3-on-cloud-run/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>James Ma</name><title>Sr. Product Manager</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Vlad Kolesnikov</name><title>Developer Relations Engineer</title><department></department><company></company></author></item><item><title>Meet Kubernetes History Inspector, a log visualization tool for Kubernetes clusters</title><link>https://cloud.google.com/blog/products/containers-kubernetes/kubernetes-history-inspector-visualizes-cluster-logs/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Kubernetes, the container orchestration platform, is inherently a complex, distributed system. While it provides resilience and scalability, it can also introduce operational complexities, particularly when troubleshooting. Even with Kubernetes' self-healing capabilities, identifying the root cause of an issue often requires deep dives into the logs of various independent components.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;At Google Cloud, our engineers have been directly confronting this Kubernetes troubleshooting challenge for years as we support large-scale, complex deployments. In fact, the Google Cloud Support team has developed deep expertise in diagnosing issues within Kubernetes environments through routinely analyzing a vast number of customer support tickets, diving into user environments, and leveraging our collective knowledge to pinpoint the root causes of problems. To address this pervasive challenge, the team developed an internal tool: the &lt;/span&gt;&lt;a href="https://github.com/GoogleCloudPlatform/khi" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Kubernetes History Inspector (KHI)&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, and today, we’ve released it as open source for the community. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;The Kubernetes troubleshooting challenge&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In Kubernetes, each pod, deployment, service, node, and control-plane component generates its own stream of logs. Effective troubleshooting requires collecting, correlating, and analyzing these disparate log streams. But manually configuring logging for each of these components can be a significant burden, requiring careful attention to detail and a thorough understanding of the Kubernetes ecosystem. Fortunately, managed Kubernetes services such as &lt;/span&gt;&lt;a href="https://cloud.google.com/kubernetes-engine"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Kubernetes Engine&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (GKE) simplify log collection. For example, GKE offers built-in integration with &lt;/span&gt;&lt;a href="https://cloud.google.com/logging"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Logging&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, aggregating logs from all parts of the Kubernetes environment. This centralized repository is a crucial first step.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;However, simply collecting the logs solves only half the problem. The real challenge lies in analyzing them effectively. Many issues you’ll encounter in a Kubernetes deployment are not revealed by a single, obvious error message. Instead, they manifest as a chain of events, requiring a deep understanding of the causal relationships between numerous log entries across multiple components.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Consider the scale: a moderately sized Kubernetes cluster can easily generate gigabytes of log data, comprising tens of thousands of individual entries, within a short timeframe. Manually sifting through this volume of data to identify the root cause of a performance degradation, intermittent failure, or configuration error is, at best, incredibly time-consuming, and at worst, practically impossible for human operators. The signal-to-noise ratio alone makes this incredibly challenging.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-aside"&gt;&lt;dl&gt;
    &lt;dt&gt;aside_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;title&amp;#x27;, &amp;#x27;$300 in free credit to try Google Cloud containers and Kubernetes&amp;#x27;), (&amp;#x27;body&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f6ddbf22130&amp;gt;), (&amp;#x27;btn_text&amp;#x27;, &amp;#x27;Start building for free&amp;#x27;), (&amp;#x27;href&amp;#x27;, &amp;#x27;http://console.cloud.google.com/freetrial?redirectpath=/marketplace/product/google/container.googleapis.com&amp;#x27;), (&amp;#x27;image&amp;#x27;, None)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Introducing the Kubernetes History Inspector&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;KHI is a powerful tool that analyzes logs collected by Cloud Logging, extracts state information for each component, and visualizes it in a chronological timeline. Furthermore, KHI links this timeline back to the raw log data, allowing you to track how each element evolved over time.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The Google Cloud Support team often assists users in critical, time-sensitive situations. A tool that requires lengthy setup or agent installation would be impractical. That's why we packaged KHI as a container image — it requires no prior setup, and is ready to be launched with a single command.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;It's easier to show than to tell. Imagine a scenario where end users are reporting "Connection Timed Out" errors on a service running on your GKE cluster. Launching KHI, you might see something like this:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/1-Launched_aHPBxar.jpg"
        
          alt="1-Launched"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;First, notice the colorful, horizontal rectangles on the left. These represent the state changes of individual components over time, extracted from the logs – the timeline. This timeline provides a macroscopic view of your Kubernetes environment. In contrast, the right side of the interface displays microscopic details: raw logs, manifests, and their historical changes related to the component selected in the timeline. By providing both macroscopic and microscopic perspectives, KHI makes it easy to explore your logs.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now, let's go back to our hypothetical problem. Notice the alternating green and orange sections in the "Ready" row of the timeline:  &lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2-Timeline_rjRvqHn.max-1000x1000.jpg"
        
          alt="2-Timeline"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This indicates that the readiness probe is fluctuating between failure (orange) and success (green). That's a smoking gun! You now know exactly where to focus your troubleshooting efforts.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;KHI also excels at visualizing the relationships between components at any given point in the past. The complex interdependencies within a Kubernetes cluster are presented in a clear, understandable way.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/3-Diagram_N0kjEVd.max-1000x1000.jpg"
        
          alt="3-Diagram"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;What’s next for KHI and Kubernetes troubleshooting&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We've only scratched the surface of what KHI can do. There's a lot more under the hood: how the timeline colors actually work, what those little diamond markers mean, and many other features that can speed up your troubleshooting. To make this available to everyone, we open-sourced KHI.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For detailed specifications, a full explanation of the visual elements, and instructions on how to deploy KHI on your own managed Kubernetes cluster, visit the &lt;/span&gt;&lt;a href="https://github.com/GoogleCloudPlatform/khi" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;KHI GitHub page&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Currently, KHI works only with GKE and Kubernetes clusters on Google Cloud that use Cloud Logging, but we plan to extend it to vanilla open-source Kubernetes setups soon.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;While KHI represents a significant leap forward in Kubernetes log analysis, it's designed to amplify your existing expertise, not replace it. Effective troubleshooting still requires a solid understanding of Kubernetes concepts and your application's architecture. KHI helps you, the engineer, navigate the complexity by providing a powerful map to view your logs to diagnose issues more quickly and efficiently.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;KHI is just the first step in our ongoing commitment to simplifying Kubernetes operations. We're excited to see how the community uses and extends KHI to build a more observable and manageable future for containerized applications. The journey to simplify Kubernetes troubleshooting is ongoing, and we invite you to join us.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Fri, 07 Mar 2025 17:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/containers-kubernetes/kubernetes-history-inspector-visualizes-cluster-logs/</guid><category>Management Tools</category><category>Open Source</category><category>Containers &amp; Kubernetes</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Meet Kubernetes History Inspector, a log visualization tool for Kubernetes clusters</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/containers-kubernetes/kubernetes-history-inspector-visualizes-cluster-logs/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Kakeru Ishii</name><title>Technical Solutions Engineer</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Takeie Torinomi</name><title>Technical Solutions Engineer</title><department></department><company></company></author></item><item><title>Introducing agent evaluation in Vertex AI Gen AI evaluation service</title><link>https://cloud.google.com/blog/products/ai-machine-learning/introducing-agent-evaluation-in-vertex-ai-gen-ai-evaluation-service/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Comprehensive agent evaluation is essential for building the next generation of reliable AI. It's not enough to simply check the outputs; we need to understand the "why" behind an agent's actions – its reasoning, decision-making process, and the path it takes to reach a solution.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;That’s why today, we’re thrilled to announce that agent evaluation in &lt;/span&gt;&lt;a href="https://cloud.google.com/vertex-ai/generative-ai/docs/models/evaluation-overview"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Vertex AI Gen AI evaluation service&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; is now in public preview. This new feature empowers developers to rigorously assess and understand their AI agents. It includes a powerful set of evaluation metrics specifically designed for agents built with different frameworks, and provides native agent inference capabilities to streamline the evaluation process.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this post, we’ll explore how evaluation metrics work and share an example of how you can apply this to your agents.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-aside"&gt;&lt;dl&gt;
    &lt;dt&gt;aside_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;title&amp;#x27;, &amp;#x27;$300 in free credit to try Google Cloud AI and ML&amp;#x27;), (&amp;#x27;body&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f6dd88568e0&amp;gt;), (&amp;#x27;btn_text&amp;#x27;, &amp;#x27;Start building for free&amp;#x27;), (&amp;#x27;href&amp;#x27;, &amp;#x27;http://console.cloud.google.com/freetrial?redirectPath=/vertex-ai/&amp;#x27;), (&amp;#x27;image&amp;#x27;, None)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Evaluate agents using Vertex AI Gen AI evaluation service&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Our evaluation metrics can be grouped into two categories: final response and trajectory evaluation. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Final response&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; asks a simple question: does your agent achieve its goals? You can define custom final response criteria to measure success according to your specific needs. For example, you can assess whether a retail chatbot provides accurate product information or if a research agent summarizes findings effectively, &lt;/span&gt;&lt;a href="https://cloud.google.com/vertex-ai/generative-ai/docs/models/evaluation-overview"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;using appropriate tone and style&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To look below the surface, we offer &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;trajectory evaluation &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;to analyze the agent's decision-making process. Trajectory evaluation is crucial for understanding your agent’s reasoning, identifying potential errors or inefficiencies, and ultimately improving performance. We offer six trajectory evaluation metrics to help you answer these questions:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;1. Exact match:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Requires the AI agent to produce a sequence of actions (a "trajectory") that perfectly mirrors the ideal solution. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;2. In-order match:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; The agent's trajectory needs to include all the necessary actions in the correct order, but it might also include extra, unnecessary steps. Imagine following a recipe correctly but adding a few extra spices along the way.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;3. Any-order match:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; Even more flexible, this metric only cares that the agent's trajectory includes all the necessary actions, regardless of their order. It's like reaching your destination, regardless of the route you take.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;4. Precision:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This metric focuses on the accuracy of the agent's actions. It calculates the proportion of actions in the predicted trajectory that are also present in the reference trajectory. A high precision means the agent is making mostly relevant actions.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;5. Recall:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This metric measures the agent's ability to capture all the essential actions. It calculates the proportion of actions in the reference trajectory that are also present in the predicted trajectory. A high recall means the agent is unlikely to miss crucial steps.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;6. Single-tool use:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This metric checks for the presence of a specific action within the agent's trajectory. It's useful for assessing whether an agent has learned to utilize a particular tool or capability.&lt;/span&gt;&lt;/p&gt;
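&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To make precision and recall concrete, here is a small, illustrative Python sketch of how these two metrics can be computed over tool-call sequences. This is our own simplified illustration, not the service’s implementation: it treats a trajectory as a plain list of tool names and ignores tool arguments.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;# Illustrative only: simplified trajectory precision and recall,\r\n# treating a trajectory as a plain list of tool names.\r\ndef trajectory_precision(predicted, reference):\r\n    # Share of predicted actions that also appear in the reference.\r\n    if not predicted:\r\n        return 0.0\r\n    return sum(1 for action in predicted if action in reference) / len(predicted)\r\n\r\ndef trajectory_recall(predicted, reference):\r\n    # Share of reference actions that the agent actually took.\r\n    if not reference:\r\n        return 0.0\r\n    return sum(1 for action in reference if action in predicted) / len(reference)\r\n\r\npredicted = [&amp;quot;get_product_details&amp;quot;, &amp;quot;get_product_price&amp;quot;, &amp;quot;small_talk&amp;quot;]\r\nreference = [&amp;quot;get_product_details&amp;quot;, &amp;quot;get_product_price&amp;quot;]\r\nprint(trajectory_precision(predicted, reference))  # ~0.67\r\nprint(trajectory_recall(predicted, reference))     # 1.0&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, None)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;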
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Compatibility meets flexibility &lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Vertex AI Gen AI evaluation service supports a variety of agent architectures. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With today’s launch, you can evaluate agents built with Reasoning Engine (LangChain on Vertex AI), the managed runtime for your agentic applications on Vertex AI. We also support agents built with open-source frameworks, including LangChain, LangGraph, and CrewAI, and we plan to support upcoming Google Cloud services for building agents. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;For maximum flexibility, you can evaluate agents using a custom function that processes prompts and returns responses, as sketched below. To make your evaluation experience easier, we offer native agent inference and automatically log all results in Vertex AI Experiments. &lt;/span&gt;&lt;/p&gt;
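&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Such a custom function might look like the following hypothetical sketch; the return keys are our own assumptions chosen to mirror the dataset fields described later in this post, not a prescribed signature.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;# Hypothetical sketch of a custom agent function for evaluation.\r\n# The return keys (&amp;quot;response&amp;quot;, &amp;quot;predicted_trajectory&amp;quot;) are assumed\r\n# here for illustration; adapt them to your setup.\r\ndef run_my_agent(prompt: str) -&amp;gt; dict:\r\n    # Call your agent however you like (local code, an API, etc.).\r\n    trajectory = [{&amp;quot;tool_name&amp;quot;: &amp;quot;get_product_details&amp;quot;, &amp;quot;tool_input&amp;quot;: {&amp;quot;product&amp;quot;: prompt}}]\r\n    answer = f&amp;quot;Here is what I found for {prompt}.&amp;quot;\r\n    return {&amp;quot;response&amp;quot;: answer, &amp;quot;predicted_trajectory&amp;quot;: trajectory}&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, None)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;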
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Agent evaluation in action&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Let's say you have the following LangGraph customer support agent, and you aim to assess both the responses it generates and the sequence of actions (or "trajectory") it undertakes to produce those responses.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/1_graph.jpg"
        
          alt="1_graph"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To assess an agent using Vertex AI Gen AI evaluation service, you start preparing an evaluation dataset. This dataset should ideally contain the following elements:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;User prompt:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This represents the input that the user provides to the agent.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Reference trajectory:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This is the expected sequence of actions that the agent should take to provide the correct response.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Generated trajectory:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This is the actual sequence of actions that the agent took to generate a response to the user prompt.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Response:&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; This is the generated response, given the agent's sequence of actions.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A sample evaluation dataset is shown below.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/3_table.max-1000x1000.png"
        
          alt="3_table"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
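&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;If you prefer to see it as code, the following hypothetical sketch builds a comparable dataset as a pandas DataFrame, reusing the byod_eval_sample_dataset name that appears in the code sample further below. The column names and trajectory fields are illustrative assumptions; check the documentation for the exact schema.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;# Hypothetical sketch: an evaluation dataset as a pandas DataFrame.\r\n# Column names and trajectory fields are illustrative assumptions.\r\nimport pandas as pd\r\n\r\ntool_call = {&amp;quot;tool_name&amp;quot;: &amp;quot;get_product_price&amp;quot;, &amp;quot;tool_input&amp;quot;: {&amp;quot;product&amp;quot;: &amp;quot;smartphone&amp;quot;}}\r\n\r\nbyod_eval_sample_dataset = pd.DataFrame({\r\n    &amp;quot;prompt&amp;quot;: [&amp;quot;How much does the smartphone cost?&amp;quot;],\r\n    &amp;quot;reference_trajectory&amp;quot;: [[tool_call]],\r\n    &amp;quot;predicted_trajectory&amp;quot;: [[tool_call]],\r\n    &amp;quot;response&amp;quot;: [&amp;quot;The smartphone costs $500.&amp;quot;],\r\n})\r\nprint(byod_eval_sample_dataset.head())&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, None)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;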
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;After you gather your evaluation dataset, define the metrics that you want to use to evaluate the agent. For a complete list of metrics and their interpretations, refer to &lt;/span&gt;&lt;a href="https://cloud.google.com/vertex-ai/generative-ai/docs/models/evaluation-agents"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Evaluate Gen AI agents&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Some metrics you can define are listed here:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;response_tool_metrics = [\r\n    &amp;quot;trajectory_exact_match&amp;quot;, &amp;quot;trajectory_in_order_match&amp;quot;, &amp;quot;safety&amp;quot;, response_follows_trajectory_metric\r\n]&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f6ddc29f3d0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Notice that the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;response_follows_trajectory_metric&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; is a custom metric that you can define to evaluate your agent. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Standard text-generation metrics, such as coherence, may not be sufficient when evaluating AI agents that interact with environments, as these metrics primarily focus on text structure. Agent responses should instead be assessed based on their effectiveness within the environment. Vertex AI Gen AI evaluation service allows you to define custom metrics, like &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;response_follows_trajectory_metric&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;, that assess whether the agent's response logically follows from its tool choices. For more information on these metrics, please refer to the official notebook.&lt;/span&gt;&lt;/p&gt;
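&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a sketch of what defining such a metric can look like, the example below uses a pointwise metric with a free-text rubric. The rubric wording and template placeholders here are our own illustrative assumptions; the exact definition lives in the official notebook.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;# Sketch: a custom metric judging whether the response follows the\r\n# trajectory. The rubric text is an illustrative assumption.\r\nfrom vertexai.preview.evaluation.metrics import PointwiseMetric\r\n\r\nresponse_follows_trajectory_metric = PointwiseMetric(\r\n    metric=&amp;quot;response_follows_trajectory&amp;quot;,\r\n    metric_prompt_template=(\r\n        &amp;quot;Evaluate whether the agent response logically follows from &amp;quot;\r\n        &amp;quot;its tool calls. Prompt: {prompt}. Tool calls: &amp;quot;\r\n        &amp;quot;{predicted_trajectory}. Response: {response}. &amp;quot;\r\n        &amp;quot;Rate 0 (does not follow) or 1 (follows) and explain why.&amp;quot;\r\n    ),\r\n)&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, None)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;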
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With your evaluation dataset and metrics defined, you can now run your first agent evaluation job on Vertex AI. Please see the code sample below.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;# Import libraries\r\nimport vertexai\r\nfrom vertexai.preview.evaluation import EvalTask\r\n\r\n# Initiate Vertex AI session\r\nvertexai.init(project=&amp;quot;my-project-id&amp;quot;, location=&amp;quot;my-location&amp;quot;, experiment=&amp;quot;evaluate-langgraph-agent&amp;quot;)\r\n\r\n# Define an EvalTask\r\nresponse_eval_tool_task = EvalTask(\r\n    dataset=byod_eval_sample_dataset,\r\n    metrics=response_tool_metrics,\r\n)\r\n\r\n# Run evaluation\r\nresponse_eval_tool_result = response_eval_tool_task.evaluate(\r\n    experiment_run_name=&amp;quot;response-over-tools&amp;quot;\r\n)&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f6ddc29fc40&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To run the evaluation, initiate an `EvalTask` using the predefined dataset and metrics. Then, run an evaluation job using the evaluate method. Vertex AI Gen AI evaluation tracks the resulting evaluation as an experiment run within &lt;/span&gt;&lt;a href="https://cloud.google.com/vertex-ai/docs/experiments/intro-vertex-ai-experiments"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Vertex AI Experiments&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, the managed experiment tracking service on Vertex AI. The evaluation results can be viewed both within the notebook and the Vertex AI Experiments UI. If you're using Colab Enterprise, you can also view the results in the Experiment side panel as shown below.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_experiment.max-1000x1000.png"
        
          alt="2_experiment"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Vertex AI Gen AI evaluation service offers summary and metrics tables, providing detailed insights into agent performance. This includes individual user input, trajectory results, and aggregate results for all user input and trajectory pairs across all requested metrics.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Access to these granular evaluation results enables you to create meaningful visualizations of agent performance, including bar and radar charts like the one below:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/4_radar.max-1000x1000.png"
        
          alt="4_radar"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Get started today&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Explore the Vertex AI Gen AI evaluation service in public preview and unlock the full potential of your agentic applications.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Documentation&lt;/strong&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/vertex-ai/generative-ai/docs/models/evaluation-agents"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Evaluate gen AI agents &lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Notebooks&lt;/strong&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/evaluation/evaluating_langgraph_agent.ipynb" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Evaluating a LangGraph agent&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/evaluation/evaluating_crewai_agent.ipynb" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Evaluating a CrewAI agent&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/reasoning-engine/evaluating_langchain_agent_reasoning_engine_prebuilt_template.ipynb" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Evaluating LangChain agent on Vertex AI Reasoning Engine&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/reasoning-engine/evaluating_langgraph_agent_reasoning_engine_customized_template.ipynb" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Evaluating LangGraph agent on Vertex AI Reasoning Engine&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/reasoning-engine/evaluating_crewai_agent_reasoning_engine_customized_template.ipynb" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Evaluating CrewAI agent on Vertex AI Reasoning Engine&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;</description><pubDate>Fri, 24 Jan 2025 17:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/ai-machine-learning/introducing-agent-evaluation-in-vertex-ai-gen-ai-evaluation-service/</guid><category>Open Source</category><category>AI &amp; Machine Learning</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Introducing agent evaluation in Vertex AI Gen AI evaluation service</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/ai-machine-learning/introducing-agent-evaluation-in-vertex-ai-gen-ai-evaluation-service/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Irina Sigler</name><title>Product Manager, Cloud AI</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Ivan Nardini</name><title>Developer Relations Engineer</title><department></department><company></company></author></item><item><title>How to deploy Llama 3.2-1B-Instruct model with Google Cloud Run GPU</title><link>https://cloud.google.com/blog/products/ai-machine-learning/how-to-deploy-llama-3-2-1b-instruct-model-with-google-cloud-run/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As open-source large language models (LLMs) become increasingly popular, developers are looking for better ways to access new models and deploy them on &lt;/span&gt;&lt;a href="https://cloud.google.com/run/docs/configuring/services/gpu"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Run GPU&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. That’s why Cloud Run now offers fully managed NVIDIA GPUs, which removes the complexity of driver installations and library configurations. This means you’ll benefit from the same on-demand availability and effortless scalability that you love with Cloud Run's CPU and memory, with the added power of NVIDIA GPUs. When your application is idle, your GPU-equipped instances automatically scale down to zero, optimizing your costs.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this blog post, we'll guide you through deploying the &lt;/span&gt;&lt;a href="https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Meta Llama 3.2 1B Instruct&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; model on Cloud Run. We'll also share best practices to streamline your development process using local model testing with the &lt;/span&gt;&lt;a href="https://huggingface.co/docs/text-generation-inference/en/index" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Text Generation Inference&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (TGI) Docker image, making troubleshooting easy and boosting your productivity.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-aside"&gt;&lt;dl&gt;
    &lt;dt&gt;aside_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;title&amp;#x27;, &amp;#x27;$300 in free credit to try Google Cloud AI and ML&amp;#x27;), (&amp;#x27;body&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f6dd9a4fa00&amp;gt;), (&amp;#x27;btn_text&amp;#x27;, &amp;#x27;Start building for free&amp;#x27;), (&amp;#x27;href&amp;#x27;, &amp;#x27;http://console.cloud.google.com/freetrial?redirectPath=/vertex-ai/&amp;#x27;), (&amp;#x27;image&amp;#x27;, None)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Why Cloud Run with GPU?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;There are four critical reasons developers benefit from deploying open models on Cloud Run with GPU:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Fully managed: No need to worry about drivers, libraries, or infrastructure.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;On-demand scaling: Scale up or down automatically based on demand.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Cost effective: Only pay for what you use, with automatic scaling down to zero when idle.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Performance: NVIDIA GPUs deliver fast, low-latency inference for models like Meta Llama 3.2.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Initial setup&lt;/strong&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;First, create a Hugging Face token. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Second, check that your Hugging Face token has permission to access and download the Llama 3.2 model weights &lt;/span&gt;&lt;a href="https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. Keep your token handy for the next step.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Third, use Google Cloud's Secret Manager to store your Hugging Face token securely. In this example, we will be using Google &lt;/span&gt;&lt;a href="https://cloud.google.com/docs/authentication/gcloud"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;user credentials&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. You may need to authenticate with the gcloud CLI, set a default project ID, enable the necessary APIs, and grant access to &lt;/span&gt;&lt;a href="https://cloud.google.com/secret-manager/docs"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Secret Manager&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;a href="https://cloud.google.com/storage/docs"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Storage&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;# Authenticate the gcloud CLI
gcloud auth login

# Set the default project
gcloud config set project &amp;lt;your_project_id&amp;gt;

# Create a new secret; remember to update &amp;lt;your_huggingface_token&amp;gt;
gcloud secrets create HF_TOKEN --replication-policy="automatic"
echo -n &amp;lt;your_huggingface_token&amp;gt; | gcloud secrets versions add HF_TOKEN --data-file=-

# Retrieve the secret
HF_TOKEN=$(gcloud secrets versions access latest --secret="HF_TOKEN")&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Local debugging&lt;/strong&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Install &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;huggingface_cli&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; python package in your virtual environment.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Run &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;huggingface-cli login&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; to set up a Hugging Face credential.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Use the TGI Docker image to test your model locally. This allows you to iterate and debug your model locally before deploying it to Cloud Run.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
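&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Before running the container, the first two bullets might look like the following (a minimal sketch; the CLI ships with the huggingface_hub package, and the optional download check assumes your token has already been granted access to the gated model):&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;# Install the Hugging Face CLI into your virtual environment
pip install -U "huggingface_hub[cli]"

# Set up a Hugging Face credential (paste the token created earlier when prompted)
huggingface-cli login

# Optional sanity check: confirm the gated model is reachable with your token
huggingface-cli download meta-llama/Llama-3.2-1B-Instruct config.json&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;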
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;export LOCAL_MODEL_DIR=~/.cache/huggingface/hub
export CONTAINER_MODEL_DIR=/root/.cache/huggingface/hub
export LOCAL_PORT=3002

docker run --gpus all -ti --shm-size 1g -p $LOCAL_PORT:8080 \
   -e MODEL_ID=meta-llama/Llama-3.2-1B-Instruct \
   -e NUM_SHARD=1 \
   -e HF_TOKEN=$(gcloud secrets versions access latest --secret="HF_TOKEN") \
   -e MAX_INPUT_LENGTH=500 \
   -e MAX_TOTAL_TOKENS=1000 \
   -e HUGGINGFACE_HUB_CACHE=$CONTAINER_MODEL_DIR \
   -v $LOCAL_MODEL_DIR:$CONTAINER_MODEL_DIR \
us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-text-generation-inference-cu121.2-2.ubuntu2204.py310&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Deployment to Cloud Run&lt;/strong&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Deploy the model to Cloud Run with NVIDIA L4 GPU: (Remember to update &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;SERVICE_NAME&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;).&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;export LOCATION=us-central1
export CONTAINER_URI=us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-text-generation-inference-cu121.2-2.ubuntu2204.py310
export SERVICE_NAME=&amp;lt;your-cloudrun-service-name&amp;gt;

gcloud beta run deploy $SERVICE_NAME \
   --image=$CONTAINER_URI \
   --args="--model-id=meta-llama/Llama-3.2-1B-Instruct,--max-concurrent-requests=1" \
   --port=8080 \
   --cpu=8 \
   --memory=32Gi \
   --no-cpu-throttling \
   --gpu=1 \
   --gpu-type=nvidia-l4 \
   --max-instances=3 \
   --concurrency=64 \
   --region=$LOCATION \
   --no-allow-unauthenticated \
   --set-secrets=HF_TOKEN=HF_TOKEN:latest&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Endpoint testing&lt;/strong&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Test your deployed model using &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;curl&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;This sends a request to your Cloud Run service for a chat completion, demonstrating how to interact with the deployed model.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;URL=https://your-url.us-central1.run.app

curl $URL/v1/chat/completions \
   -X POST \
   -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
   -H 'Content-Type: application/json' \
   -d '{
       "model": "tgi",
       "messages": [
           {
               "role": "system",
               "content": "You are a helpful assistant."
           },
           {
               "role": "user",
               "content": "What is Cloud Run?"
           }
       ],
       "max_tokens": 128
   }'&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
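&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The service responds with an OpenAI-style chat completion object. If you only want the generated text, one convenient option is to filter the response with jq (assuming jq is installed; same URL and auth as above):&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;# Extract just the assistant's reply from the completion JSON
curl -s $URL/v1/chat/completions \
   -X POST \
   -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
   -H 'Content-Type: application/json' \
   -d '{"model": "tgi", "messages": [{"role": "user", "content": "What is Cloud Run?"}], "max_tokens": 128}' \
   | jq -r '.choices[0].message.content'&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;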
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Cold start improvements with Cloud Storage FUSE&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;You’ll notice that it takes more than a minute during a cold start for the response to return. Can we do better? &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We can use Cloud Storage FUSE. Cloud Storage FUSE is an open-source tool that lets you mount Google Cloud Storage buckets as a file system.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;First, you need to download the model files and upload them to the Cloud Storage bucket. (Remember to update &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;GCS_BUCKET&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;).&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;# 1. Download the model
MODEL=meta-llama/Llama-3.2-1B-Instruct
LOCAL_DIR=/mnt/project/google-cloudrun-gpu/gcs_folder/hub/Llama-3.2-1B-Instruct
GCS_BUCKET=gs://&amp;lt;YOUR_BUCKET_WITH_MODEL_WEIGHT&amp;gt;

huggingface-cli download $MODEL --exclude "*.bin" "*.pth" "*.gguf" ".gitattributes" --local-dir $LOCAL_DIR

# 2. Copy to Cloud Storage
gsutil -o GSUtil:parallel_composite_upload_threshold=150M -m cp -e -r $LOCAL_DIR $GCS_BUCKET&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now, we will create a new Cloud Run service using the deployment script as follows. (Remember to update &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;BUCKET_NAME)&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. You may also need to update the &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;network&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;subnet&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; name as well.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;export LOCATION=us-central1
export CONTAINER_URI=us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-text-generation-inference-cu124.2-3.ubuntu2204.py311
export SERVICE_NAME=cloudrun-gpu-fuse-llama32-1b-instruct
export VOLUME_NAME=fuse
export BUCKET_NAME=&amp;lt;YOUR_BUCKET_WITH_MODEL_WEIGHT&amp;gt;
export MOUNT_PATH=/mnt/fuse

gcloud beta run deploy $SERVICE_NAME \
    --image=$CONTAINER_URI \
    --args="--model-id=$MOUNT_PATH/Llama-3.2-1B-Instruct,--max-concurrent-requests=1" \
    --port=8080 \
    --cpu=8 \
    --memory=32Gi \
    --no-cpu-throttling \
    --gpu=1 \
    --gpu-type=nvidia-l4 \
    --max-instances=3 \
    --concurrency=64 \
    --region=$LOCATION \
    --network=default \
    --subnet=default \
    --vpc-egress=all-traffic \
    --no-allow-unauthenticated \
    --update-env-vars=HF_HUB_OFFLINE=1 \
    --add-volume=name=$VOLUME_NAME,type=cloud-storage,bucket=$BUCKET_NAME \
    --add-volume-mount=volume=$VOLUME_NAME,mount-path=$MOUNT_PATH&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
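&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;One rough way to compare the two deployments is to time the first request after each service has scaled to zero (a sketch; SERVICE_URL is a placeholder for whichever service you are testing, and only the first request after idling exercises the cold-start path):&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;SERVICE_URL=https://your-url.us-central1.run.app

# Time a single small completion; repeat once the service has scaled to zero
time curl -s $SERVICE_URL/v1/chat/completions \
   -X POST \
   -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
   -H 'Content-Type: application/json' \
   -d '{"model": "tgi", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 16}' \
   -o /dev/null -w "HTTP %{http_code}, total %{time_total}s\n"&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;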
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Next Steps&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;To learn more about Cloud Run with NVIDIA GPUs and to deploy your own open-source model from Hugging Face, check out these resources below:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/run/docs/configuring/services/gpu"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Cloud Run GPU&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://cloud.google.com/run/docs/configuring/services/gpu-best-practices"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Best practices&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;a href="https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Meta Llama 3.2 1B Instruct&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;</description><pubDate>Thu, 14 Nov 2024 17:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/ai-machine-learning/how-to-deploy-llama-3-2-1b-instruct-model-with-google-cloud-run/</guid><category>Open Source</category><category>AI &amp; Machine Learning</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>How to deploy Llama 3.2-1B-Instruct model with Google Cloud Run GPU</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/ai-machine-learning/how-to-deploy-llama-3-2-1b-instruct-model-with-google-cloud-run/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Wei Yih Yap</name><title>Generative AI Field Solutions Architect</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Sagar Randive</name><title>Product Manager</title><department></department><company></company></author></item><item><title>Real-time data for real-world AI with support for Apache Flink in BigQuery</title><link>https://cloud.google.com/blog/products/data-analytics/introducing-bigquery-engine-for-apache-flink/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today’s organizations aspire to become "by-the-second" businesses, capable of adapting in real time to changes in their supply chain, inventory, customer behavior, and more. They also strive to provide exceptional customer experiences, whether it's through a support interaction or an online checkout process. We believe that real-time intelligence should be accessible to all businesses, regardless of their size or budget and should be integrated into a unified data platform, so that everything works together. Today, we’re taking a big step toward helping businesses realize these aspirations, with &lt;/span&gt;&lt;a href="https://cloud.google.com/products/bigquery-engine-for-apache-flink"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;BigQuery Engine for Apache Flink&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, now in preview. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Introducing BigQuery Engine for Apache Flink: Familiar Flink, now serverless &lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;BigQuery Engine for Apache Flink provides a state-of-the art real-time intelligence platform, empowering customers to:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Use familiar streaming technologies on Google Cloud&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;. BigQuery Engine for Apache Flink makes it easier to lift and shift existing streaming applications relying on open-source Apache Flink to Google Cloud, without rewriting code or relying on third-party services. Combined with &lt;/span&gt;&lt;a href="https://cloud.google.com/products/managed-service-for-apache-kafka"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Managed Service for Apache Kafka&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (now GA), it is easy to migrate and modernize your streaming analytics on Google Cloud.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Reduce operational burden.&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; BigQuery Engine for Apache Flink is entirely serverless, reducing operational burden and allowing customers to focus on what they do best — innovate their businesses.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Bring real-time data to AI. &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;Enterprise developers experimenting with gen AI are looking for a well-integrated and scalable streaming platform that’s based on familiar technologies — Apache Flink and Apache Kafka — and that they can combine with Google’s differentiated AI/ML capabilities in BigQuery.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/FlinkUI.max-1000x1000.jpg"
        
          alt="FlinkUI"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;BigQuery Engine for Apache Flink arrives during a time when Google Cloud customers are leveraging many &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/data-analytics/google-clouds-innovations-for-continuous-real-time-intelligence"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;innovations&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in real-time analytics, including &lt;/span&gt;&lt;a href="https://cloud.google.com/bigquery/docs/continuous-queries-introduction"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;BigQuery continuous queries,&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; which enables customers to analyze incoming data in BigQuery in real time using SQL, and &lt;/span&gt;&lt;a href="https://cloud.google.com/dataflow/docs/guides/job-builder"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Dataflow Job Builder&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, which helps customers define and deploy a streaming pipeline using a visual UI.  &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With BigQuery Engine for Apache Flink, our streaming portfolio now spans SQL-based easy streaming with BigQuery continuous queries, popular open-source Flink and Kafka platforms, and advanced multimodal data streaming with Dataflow, including support for Iceberg. These capabilities are integrated with BigQuery, which connects your data with industry leading AI, including Gemini, Gemma and open models.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;New AI capabilities unlocked when your data is real-time&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As we look ahead, it's clear that generative AI has reignited interest in the potential of data-driven insights and experiences. AI, especially generative AI, is most effective when it has access to the latest context. If you’re a retailer, you can combine historical purchase data with real-time interactions to personalize shopping experiences for your customers. If you’re a financial services company, you can use up-to-the-second transactions to refine your fraud detection model. Real-time data connected to AI means fresh data for training models, real-time user assistance with &lt;/span&gt;&lt;a href="https://github.com/apache/beam/pull/31657" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Retrieval Augmented Generation&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (RAG), and real-time predictions and inferences for your business applications, including &lt;/span&gt;&lt;a href="https://developers.googleblog.com/en/gemma-for-streaming-ml-with-dataflow/" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;integrating small models like Gemma into your streaming pipelines&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.  &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We are taking a platform approach to introduce capabilities across the board so that, no matter what specific streaming architecture you need, or which streaming engine you prefer, you have the ability to leverage real-time data for your gen AI use cases. Features such as Dataflow &lt;/span&gt;&lt;a href="https://cloud.google.com/dataflow/docs/guides/enrichment"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;enrichment transform&lt;/span&gt;&lt;/a&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;s&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;, support for &lt;/span&gt;&lt;a href="https://cloud.google.com/dataflow/docs/notebooks/vertex_ai_text_embeddings"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Vertex AI text-embeddings&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, the &lt;/span&gt;&lt;a href="https://cloud.google.com/dataflow/docs/notebooks/run_inference_vertex_ai"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;RunInference transform&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/databases/distributed-counting-with-bigtable"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;distributed counting&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in Bigtable, and many others make the task of building real-time AI applications easier than ever.   &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We are very excited to get these capabilities into your hands and continue giving you more flexibility and choice when it comes to making your unified data and AI platform operate in real-time data. Learn more about &lt;/span&gt;&lt;a href="https://cloud.google.com/products/bigquery-engine-for-apache-flink/"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;BigQuery Engine for Apache Flink&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and get started using it today in the &lt;/span&gt;&lt;a href="https://console.cloud.google.com/"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud console&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Wed, 09 Oct 2024 08:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/data-analytics/introducing-bigquery-engine-for-apache-flink/</guid><category>Open Source</category><category>Streaming</category><category>Data Analytics</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Real-time data for real-world AI with support for Apache Flink in BigQuery</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/data-analytics/introducing-bigquery-engine-for-apache-flink/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Yuriy Zhovtobryukh</name><title>Senior Product Manager</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Angela Soares</name><title>Senior Product Marketing Manager</title><department></department><company></company></author></item><item><title>Introducing Valkey 8.0 on Memorystore: unmatched performance and fully open-source</title><link>https://cloud.google.com/blog/products/databases/memorystore-launches-valkey-8-0-on-google-cloud/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;strong style="font-style: italic; vertical-align: baseline;"&gt;Editor's note&lt;/strong&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;: Ping Xie is a Valkey maintainer on the Valkey Technical Steering Committee (TSC)&lt;/span&gt;&lt;/p&gt;
&lt;hr/&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Today, we’re thrilled to announce &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Valkey 8.0&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; on &lt;/span&gt;&lt;a href="https://cloud.google.com/memorystore"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Memorystore&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in preview, making Google Cloud the first major cloud platform to offer Valkey 8.0 as a fully managed service. Building upon the launch of &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/databases/announcing-memorystore-for-valkey"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Memorystore for Valkey 7.2&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in August 2024, this further solidifies Google Cloud’s commitment to open source, providing you with the latest and greatest features from the Valkey open-source ecosystem. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Valkey 8.0 on Memorystore is a testament to our commitment to supporting customers such as &lt;/span&gt;&lt;a href="https://www.mlb.com/" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;Major League Baseball (MLB)&lt;/strong&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. As the most historic professional sports league, MLB uses Memorystore to power its real-time analytics, processing vast amounts of data to provide fans with insights and statistics during games.&lt;/span&gt;&lt;/p&gt;
&lt;p style="padding-left: 40px;"&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;"At MLB, we're obsessed with delivering the best possible experience for our fans. Valkey's truly open-source approach to caching is a game-changer, promising the performance and innovation we need to keep fans engaged and connected. We're excited to be part of this community and look forward to Valkey's continued innovation on Memorystore." - &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Rob Engel, Vice President of Software Engineering, Major League Baseball&lt;/strong&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;The Valkey 8.0 release&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Earlier this year, after Redis Inc. changed the license of Redis OSS from the permissive BSD 3-Clause license to a restrictive Source Available License (RSAL), the open-source community rallied to create Valkey &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;(&lt;/span&gt;&lt;a href="https://www.linuxfoundation.org/press/linux-foundation-launches-open-source-valkey-community" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;1&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://www.linuxfoundation.org/press/linux-foundation-launches-open-source-valkey-community" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;2&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;a href="https://www.linuxfoundation.org/press/valkey-welcomes-new-partners-amid-growing-momentum" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;3&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;) &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;— a fully open-source alternative under the BSD 3-clause license. In just a few months, the Valkey community released &lt;/span&gt;&lt;a href="https://www.linuxfoundation.org/press/valkey-8-0" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;the open source Valkey 8.0&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; in GA, showcasing the power of open-source collaboration and unfettered innovation.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Memorystore for Valkey 8.0&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;delivers enhanced performance, improved reliability, and full compatibility with Redis OSS — all as a fully Google managed service.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Improvements to the Valkey performance benchmarks are thanks to newly introduced asynchronous I/O capabilities. The enhanced I/O threading system allows the main thread and I/O threads to operate concurrently, enabling parallel processing of commands and I/O operations, and maximizing throughput by reducing bottlenecks in handling incoming requests. Memorystore for Valkey 8.0 achieves up to a &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;2x Queries Per Second (QPS) at microsecond latency&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; when compared to Memorystore for Redis Cluster, allowing applications to handle higher throughput with similarly sized clusters. This makes Valkey 8.0 a great choice for high-throughput, real-time applications that aim to provide highly responsive user experiences.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Along with the throughput gain, Valkey 8.0 includes other optimizations that further enhance the overall speed of the service:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;SUNION &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;command is optimized for faster set union operations.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;SDIFF &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;and &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;ZUNIONSTORE &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;commands have been refactored for improved execution times.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;DEL &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;command avoids Redundant deletions for expired keys.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;CLUSTER SLOTS&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; responses are cached for better throughput and reduced latency in cluster operations.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;CRC64 &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;performance is improved for large data batches, which is crucial for RDB snapshot and slot migration scenarios.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Valkey 8.0 also brings key-memory efficiency improvements&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, allowing you to store more data without requiring changes to your application. Keys are now embedded directly into the main dictionary, reducing memory overhead while enhancing performance. Additionally, the new per-slot dictionary splits the main dictionary by slot, further reducing the memory overhead by 16 bytes per key-value pair without degrading performance.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Meanwhile, Valkey 8.0 has &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;improved reliability&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; thanks to &lt;/span&gt;&lt;a href="https://cloud.google.com/blog/products/databases/zero-downtime-scaling-in-memorystore-for-redis-cluster"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;several features developed by Google&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; that were subsequently contributed to the project, significantly enhancing cluster resilience and availability: &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Automatic failover&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; for empty shards helps ensure high availability even during the initial scaling stages, allowing new, slotless shards to fail over smoothly. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;strong style="vertical-align: baseline;"&gt;Replicating slot migration states&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; helps ensure that all &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;CLUSTER SETSLOT&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; commands are synchronized across replicas before execution on the primary, reducing the risk of data unavailability during failover events, and enabling new replicas to automatically inherit the correct state. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Additionally, &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;slot migration state recovery&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; ensures that after a failover, the source and target nodes are updated automatically, maintaining accurate routing of requests to the correct primary without operator intervention. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Thanks to these enhancements, Valkey 8.0 clusters are more resilient against failures during slot movement, giving customers peace of mind that their data remains available even during complex scaling operations.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Compatible with Redis OSS 7.2&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Just like Valkey 7.2, Valkey 8.0 maintains &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;full backwards compatibility&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; with Redis OSS 7.2&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;APIs, allowing for a seamless migration from Redis. Popular Redis clients like &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Jedis, redis-py, node-redis&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;go-redis&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; are fully supported so that migrating workloads to Valkey doesn’t require modifications to application code.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This fusion of open-source flexibility and managed service reliability provides you with a balance of control and convenience, making Valkey a great destination for your Redis OSS workloads.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style="vertical-align: baseline;"&gt;Get started with Valkey 8.0 on Memorystore today&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We invite you to get started with &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Valkey 8.0&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; on &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Memorystore&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt; today and experience the above enhancements for yourself. With features such as &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;zero-downtime scaling&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;high availability&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, and &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;RDB snapshot and AOF logging &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;based persistence, Memorystore's Valkey 8.0 provides the performance, reliability, and scalability today’s high demanding workloads deserve.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Get started today by creating a fully managed Valkey Cluster through t&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;he &lt;/strong&gt;&lt;a href="https://console.cloud.google.com/memorystore"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Google Cloud console&lt;/span&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; or gcloud&lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;, and join the growing community that is shaping the future of truly open-source data management.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Thu, 03 Oct 2024 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/databases/memorystore-launches-valkey-8-0-on-google-cloud/</guid><category>Open Source</category><category>Databases</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Introducing Valkey 8.0 on Memorystore: unmatched performance and fully open-source</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/databases/memorystore-launches-valkey-8-0-on-google-cloud/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Ping Xie</name><title>Software Engineer</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Ankit Sud</name><title>Senior Product Manager, Google</title><department></department><company></company></author></item><item><title>What’s new in PostgreSQL 16: New features available in Cloud SQL today</title><link>https://cloud.google.com/blog/products/databases/postgresql-16-now-available-in-cloud-sql/</link><description>&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In an effort to improve usability and facilitate informed decision-making, Cloud SQL customers can now use PostgreSQL 16, which introduces new features for deeper insights into database operations and enhanced usability. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In this blog post we cover some of the highlights of the&lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt; &lt;/strong&gt;&lt;span style="vertical-align: baseline;"&gt;PostgreSQL 16 version, including: &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Improvements in observability&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Performance improvements&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Vacuum efficiency&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Replication improvements&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Let’s take a deeper look at each of these areas.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;strong style="vertical-align: baseline;"&gt;Observability improvements&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Observability is an important aspect of databases, helping operators optimize resource consumption by providing insights into how resources are being utilized. Here are some important observability enhancements introduced in PostgreSQL 16. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;PG_STAT_IO&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;PostgreSQL16 adds a new view &lt;/span&gt;&lt;a href="https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-IO-VIEW" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;pg_stat_io&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; that &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;provides insights into the Input/Output (IO) behavior of a PostgreSQL database. &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;We can use this view to make i&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;nformed decisions to optimize database performance, improve resource utilization and ensure the overall health and scalability of the database system. This view presents the stats for the entire instance. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;What can we infer from this view? &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Like  most other &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;pg_stat_*&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; views, the statistics in the view are cumulative. T&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;o track changes in the pg_stat_io view over a specific time period, record the values at the beginning and end of the workload.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This view tracks the stats mainly by the columns in &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;backend_type&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;, &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;io_context&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;io_object&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;backend_type&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; is a connection process and can be one of client backend, background worker, checkpointer, standalone backend, autovacuum launcher, autovacuum worker. The &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;io_context&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; is classified based on the load as normal, bulk read, bulk write, or vacuum.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The actual stats to be considered for knowing the I/O status of the instance are reads, writes, extends, hits, evictions, and reuses.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We can monitor the shared buffers efficiency by comparing the evictions-to-hits ratio. The buffer hit ratio is considered effective when hits for each context are much higher than evictions. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The bulk reads and bulk writes indicate sequential scans. The evictions, hits and reuses for these indicate the efficiency of ring buffers in this case.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;We can also observe the amount of data read or written as part of the autovacuum or vacuum process. The metric data related to autovacuum are observed by &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;io_context =’ vacuum’&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;backend_type as ‘autovacuum worker’&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;. A vacuum process goes by &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;backend_type&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; as &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;‘standalone backend’&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; with&lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt; io_context &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;as &lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt;‘vacuum’&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Here’s an image of the view:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/image1_h0fYBp8.max-1000x1000.png"
        
          alt="image1"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
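&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a concrete starting point, here is one way to pull vacuum-related I/O out of the view (a sketch via psql; note that in the released PostgreSQL 16 catalog, the fields described above as io_object and io_context appear as the columns object and context):&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;# Snapshot vacuum-related I/O activity from pg_stat_io
psql &amp;lt;&amp;lt;'SQL'
SELECT backend_type, object, context,
       reads, writes, extends, hits, evictions, reuses
FROM pg_stat_io
WHERE context = 'vacuum' OR backend_type = 'autovacuum worker'
ORDER BY backend_type;
SQL&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;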
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Last sequential and index scans on tables and indexes&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The views &lt;/span&gt;&lt;a href="https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-ALL-TABLES-VIEW" rel="noopener" target="_blank"&gt;&lt;code style="text-decoration: underline; vertical-align: baseline;"&gt;pg_stat_*_tables&lt;/code&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; have two new columns&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;code style="vertical-align: baseline;"&gt;last_seq_scan&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code style="vertical-align: baseline;"&gt;last_idx_scan&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Want to know when the last time sequential scan or index scan happened on your tables? Check the newly introduced columns &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;last_seq_scan&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; and &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;last_idx_scan&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; in &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;pg_stat_*_tables&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The timestamp of the last sequential or index scan on a table is indicated in these columns. This can be helpful for identifying any “read query” issues. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Similarly, the column&lt;/span&gt;&lt;span style="font-style: italic; vertical-align: baseline;"&gt; &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;last_idx_scan&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt; has been introduced to &lt;/span&gt;&lt;a href="https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-ALL-INDEXES-VIEW" rel="noopener" target="_blank"&gt;&lt;code style="text-decoration: underline; vertical-align: baseline;"&gt;pg_stat_*_indexes&lt;/code&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;. This column indicates the timestamp last time the index was used. If we were to drop an index, we can make an informed decision based on the value present in this column for the index. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Statistics on the occurrence of tuples moving to a new page for updates&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The views &lt;/span&gt;&lt;a href="https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-ALL-TABLES-VIEW" rel="noopener" target="_blank"&gt;&lt;code style="text-decoration: underline; vertical-align: baseline;"&gt;pg_stat_*_tables&lt;/code&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; now has a new column, &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;n_tup_newpage_upd.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As we perform updates on a table and want to monitor how many of the rows end up in new heap pages, we can now view this in the column &lt;/span&gt;&lt;code style="vertical-align: baseline;"&gt;n_tup_newpage_upd&lt;/code&gt;&lt;span style="vertical-align: baseline;"&gt;. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This can reveal the factors contributing to the table's growth over time. The value in this column also can be used to validate the ‘fillfactor’ set for the table. Especially for updates which are expected to be ‘HOT’, by observing the stats in this column we can establish if the ‘fillfactor’ is optimal or not.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;strong style="vertical-align: baseline;"&gt;Performance improvements&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Performance is always a top priority for databases. Performance improvements are adopted much faster than other enhancements in a major version release. Here are some of the performance improvements in PostgreSQL 16.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Tables with only BRIN index on a table column are considered ‘HOT’&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;With PostgreSQL16, updates to a table with BRIN index are now considered as HOT considering the fillfactor for the table is optimal.’  ‘Fillfactor’ is an important setting for this update to be marked ‘HOT’. This improvement makes vacuuming such a table fast and resource-efficient. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Parallelization of FULL or OUTER joins  &lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;This performance improvement is very beneficial for selects involving very large tables joined by full or outer joins. In PostgreSQL16, this will result in a parallel hash after a parallel seq scan for each table, instead of a merge or hash after a full heap fetch. In our tests, it has shown quite a large improvement compared to PG15.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Example for full outer join&lt;/strong&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&lt;pre&gt;postgres=&amp;gt; explain (analyze, buffers, verbose) select count(*) from object_store s full outer join object_store2 g on (s.project_id=g.project_id);
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=145953.66..145953.67 rows=1 width=8) (actual time=6095.420..6236.950 rows=1 loops=1)
   Output: count(*)
   Buffers: shared hit=74980, temp read=34714 written=35020
   -&amp;gt;  Gather  (cost=145953.44..145953.65 rows=2 width=8) (actual time=6083.804..6236.922 rows=3 loops=1)
         Output: (PARTIAL count(*))
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=74980, temp read=34714 written=35020
         -&amp;gt;  Partial Aggregate  (cost=144953.44..144953.45 rows=1 width=8) (actual time=6068.822..6069.193 rows=1 loops=3)
               Output: PARTIAL count(*)
               Buffers: shared hit=74980, temp read=34714 written=35020
               Worker 0:  actual time=6053.795..6053.802 rows=1 loops=1
                 Buffers: shared hit=23966, temp read=12066 written=11200
               Worker 1:  actual time=6069.306..6069.313 rows=1 loops=1
                 Buffers: shared hit=26385, temp read=10995 written=12292
               -&amp;gt;  Parallel Hash Full Join  (cost=83021.80..140786.80 rows=1666658 width=0) (actual time=3824.778..5852.278 rows=1333333 loops=3)
                     Hash Cond: (g.project_id = s.project_id)
                     Buffers: shared hit=74980, temp read=34714 written=35020
                     Worker 0:  actual time=3857.567..5832.558 rows=1361655 loops=1
                       Buffers: shared hit=23966, temp read=12066 written=11200
                     Worker 1:  actual time=3851.661..5870.054 rows=1244012 loops=1
                       Buffers: shared hit=26385, temp read=10995 written=12292
                     -&amp;gt;  Parallel Seq Scan on public.object_store2 g  (cost=0.00..41652.00 rows=421200 width=16) (actual time=0.029..936.699 rows=1333333 loops=3)
                           Output: g.project_id
                           Buffers: shared hit=37440
                           Worker 0:  actual time=0.026..977.947 rows=1347470 loops=1
                             Buffers: shared hit=12650
                           Worker 1:  actual time=0.043..1017.822 rows=1298124 loops=1
                             Buffers: shared hit=12132
                     -&amp;gt;  Parallel Hash  (cost=54050.57..54050.57 rows=1666658 width=16) (actual time=1456.617..1456.619 rows=1333333 loops=3)
                           Output: s.project_id
                           Buckets: 262144  Batches: 32  Memory Usage: 7968kB
                           Buffers: shared hit=37384, temp written=17236
                           Worker 0:  actual time=1451.741..1451.743 rows=1202466 loops=1
                             Buffers: shared hit=11238, temp written=5200
                           Worker 1:  actual time=1450.062..1450.064 rows=1516637 loops=1
                             Buffers: shared hit=14175, temp written=6524
                           -&amp;gt;  Parallel Seq Scan on public.object_store s  (cost=0.00..54050.57 rows=1666658 width=16) (actual time=0.023..530.669 rows=1333333 loops=3)
                                 Output: s.project_id
                                 Buffers: shared hit=37384
                                 Worker 0:  actual time=0.018..506.262 rows=1202466 loops=1
                                   Buffers: shared hit=11238
                                 Worker 1:  actual time=0.025..578.219 rows=1516637 loops=1
                                   Buffers: shared hit=14175
 Query Identifier: -5913048123863832940
 Planning Time: 0.211 ms
 Execution Time: 6237.051 ms
(47 rows)&lt;/pre&gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Explain (generic_plan)&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Prior to PostgreSQL 16, for parameterized SQLs the value of the parameter has to be passed to to obtain an execution plan. In PostgreSQL 16, with the option (generic_plan) we do not need to provide any additional values to the SQL to get the execution plan. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;Example&lt;/strong&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;quot;db=&amp;gt; CREATE TABLE measurement (\r\n    city_id         int not null,\r\n    logdate         date not null,\r\n    peaktemp        int,\r\n    unitsales       int\r\n) PARTITION BY RANGE (logdate);\r\n\r\nCREATE TABLE measurement_y2006m02 PARTITION OF measurement\r\n    FOR VALUES FROM (&amp;#x27;2006-02-01&amp;#x27;) TO (&amp;#x27;2006-03-01&amp;#x27;);\r\n\r\nCREATE TABLE measurement_y2006m03 PARTITION OF measurement\r\n    FOR VALUES FROM (&amp;#x27;2006-03-01&amp;#x27;) TO (&amp;#x27;2006-04-01&amp;#x27;);\r\n\r\nPrepare statement\r\n\r\ndb=&amp;gt; PREPARE partitioned_selfjoin (int) AS\r\nSELECT *\r\n  FROM measurement a\r\n  JOIN measurement b\r\n    ON a.peaktemp = b.peaktemp\r\n WHERE a.city_id = $1;\r\nPREPARE\r\n\r\nGet execution plan\r\n\r\nPre PostgreSQL 16: Pass a value for the parameter $1 = 10\r\n\r\ndb=&amp;gt; EXPLAIN EXECUTE partitioned_selfjoin(10);\r\n\r\nFor PG - 16\r\n\r\ndb=&amp;gt; explain (generic_plan) SELECT *\r\n  FROM measurement a\r\n  JOIN measurement b\r\n    ON a.peaktemp = b.peaktemp\r\n WHERE a.city_id = $1;&amp;quot;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f6dd445d2b0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;strong style="vertical-align: baseline;"&gt;Vacuum improvements&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Vacuum is a significant part of PostgreSQLMVCC. Vacuum releases space after deleting the dead tuples, minimizing table bloat. This prevents the database from ending up in transaction wrap-around problems. Here are some ways vacuum processes improved in PostgreSQL16.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Improved VACUUM operation performance for large tables &lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;BUFFER_USAGE_LIMIT&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;PostgreSQL 16 introduces a new server variable ‘vacuum_buffer_usage_limit’ to set the ring buffers allocated for VACUUM and ANALYZE operations with a default value of 256K. Setting the ‘BUFFER_USAGE_LIMIT’ option during a VACUUM operation overrides the default value of ‘vacuum_buffer_usage_limit’ and allocates the specified ring buffer size. A larger ‘buffer_usage_limit’ can speed up vacuum operations but may displace buffers used by the main workload from ‘shared_buffers’, which may result in performance degradation. It is often advisable to limit the usage of ring buffers for VACUUM operations using ‘buffer_usage_limit’ when vacuuming very large tables. This option can be used judiciously when approaching Txid wraparound, at which point completing the VACUUM is critical. When ANALYZE is also part of the VACUUM operation, both operations together use the ring buffer size specified in ‘buffer_usage_limit’. A setting of 0 for ‘buffer_usage_limit’ results in disabling the buffer access strategy, which can result in evicting huge numbers of shared buffers, causing performance degradation. The limits for ‘buffer_usage_limit’ are between 128K and 16 GB. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong style="vertical-align: baseline;"&gt;VACUUM to only process TOAST tables&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Now in PostgreSQL 16 we can vacuum only TOAST tables related to a relation. Historically, the option ‘process_toast’ was introduced to turn off vacuuming the TOAST table when set to FALSE. Otherwise, vacuum ran on both the main and TOAST table of a relation. In PostgreSQL 16, based on the requirement, we can either vacuum both the main and TOAST table or just do one of them that belongs to a relation. This allows better control to vacuum either main, TOAST, or both, depending on your need. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Here’s an example of how it can be applied:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;Vacuum only toast table for a relation\r\n\r\npostgres=&amp;gt; vacuum (PROCESS_TOAST TRUE, PROCESS_MAIN FALSE) prodattribbig;\r\n\r\nVacuumdb only toast table for a relation\r\n\r\n$ vacuumdb -h &amp;lt;ipaddress&amp;gt; -U postgres -d testdb -t prodattribbig --no-process-main&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f6ddbe9a670&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;a href="https://www.postgresql.org/docs/current/app-vacuumdb.html" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;vacuumdb&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; option to process schema &lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Vacuumdb now has an option to vacuum or analyze all the tables belonging to a schema in the database. This is a very useful feature when we are targeting tables of only one schema.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;#x27;$ vacuumdb -h &amp;lt;host/ipaddress&amp;gt;  -v -U postgres -d testdb  -n testschema&amp;#x27;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f6ddbe9a3a0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h2&gt;&lt;strong style="vertical-align: baseline;"&gt;Replication improvements&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Replication is an important part of the database high availability feature. In PostgreSQL 16, the community has added several usability features to replication. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Initial table synchronization in logical replication to copy rows in binary format&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In PostgreSQL 16, we can initialize the copy of the rows for logical replication in binary format. This can be much faster, especially with columns that have binary data. Here is an example on how to create a subscription where in the initial data copy is in binary format:&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;&amp;lt;ListValue: [StructValue([(&amp;#x27;code&amp;#x27;, &amp;quot;testdb=&amp;gt;  create subscription  testtab connection &amp;#x27;host=10.101.0.20 port=5432 dbname=testdb user=replication_user password=&amp;lt;&amp;lt;pwd&amp;gt;&amp;gt;&amp;#x27; PUBLICATION testtab  WITH (copy_data=on, binary=true);&amp;quot;), (&amp;#x27;language&amp;#x27;, &amp;#x27;&amp;#x27;), (&amp;#x27;caption&amp;#x27;, &amp;lt;wagtail.rich_text.RichText object at 0x7f6ddbe9a4c0&amp;gt;)])]&amp;gt;&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Improved &lt;/strong&gt;&lt;a href="https://www.postgresql.org/docs/current/logical-replication-architecture.html" rel="noopener" target="_blank"&gt;&lt;strong style="text-decoration: underline; vertical-align: baseline;"&gt;logical replication apply&lt;/strong&gt;&lt;/a&gt;&lt;strong style="vertical-align: baseline;"&gt; without a primary key&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Traditionally, PostgreSQL logical replication relied on full table scans for tables that lacked primary keys, impacting performance. However, with PostgreSQL 16, any available B-tree index on the table is now leveraged, significantly enhancing logical apply efficiency. Index usage statistics are available in the pg_stat_*_indexes view.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Logical decoding on standby&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;In PostgreSQL 16, logical decoding is enabled on the read replica, allowing subscribers to connect to the read replicas instead of the primary db instance. By doing so, the workload is shared between the primary instance and the replica, reducing strain on the former. This offloads the logical replication workload off of the primary instance onto the replica. This represents a huge performance improvement for the primary node, especially with nodes having many logical replication slots. Another advantage is, in case of a promotion of the replica, subscribers are not affected by the change and continue to operate without any hindrance. Be aware that any delay on the read replica will subsequently affect the logical subscriber, unlike before.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;strong style="vertical-align: baseline;"&gt;Try PostgreSQL 16 today&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;It's time to try out PostgreSQL16 on Cloud SQL with improved observability, improved logical replication, vacuuming and much more. Start your PostgreSQL16 journey on Cloud SQL from &lt;/span&gt;&lt;a href="https://console.cloud.google.com/"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;here&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Fri, 07 Jun 2024 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/databases/postgresql-16-now-available-in-cloud-sql/</guid><category>Open Source</category><category>Databases</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>What’s new in PostgreSQL 16: New features available in Cloud SQL today</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/databases/postgresql-16-now-available-in-cloud-sql/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Indu Akkineni</name><title>Database Engineer, Cloud SQL</title><department></department><company></company></author></item><item><title>How to choose a known, trusted supplier for open source software</title><link>https://cloud.google.com/blog/products/identity-security/how-to-choose-a-known-trusted-supplier-for-open-source-software/</link><description>&lt;div class="block-paragraph"&gt;&lt;p data-block-key="fnyba"&gt;Open-source software is used throughout the technology industry to help developers build software tools, apps, and services. While developers building with open-source software can (and often do) benefit greatly from the work of others, they should also conduct appropriate due diligence to protect against &lt;a href="https://www.dni.gov/files/NCSC/documents/supplychain/Software_Supply_Chain_Attacks.pdf" target="_blank"&gt;software supply chain attacks&lt;/a&gt;.&lt;/p&gt;&lt;p data-block-key="cr34o"&gt;With an increasing focus on managing open-source software supply chain risk, both Citi and Google strive to apply more rigor across risk mitigation, especially while choosing known and trusted suppliers where open source components are sourced from.&lt;/p&gt;&lt;h3 data-block-key="8pjtn"&gt;&lt;b&gt;Key open source attack vectors&lt;/b&gt;&lt;/h3&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/Key_open_source_attack_vectors.max-1000x1000.png"
        
          alt="Key open source attack vectors"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The diagram above highlights key open source attack vectors.  We can divide the common software supply chain security attacks into five main types:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Attacks at runtime leveraging vulnerabilities in the code&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Attacks on the repositories, tooling and processes&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Attacks on the integrity of the artifacts as they progress through the pipeline&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Attacks on the primary open source dependencies that customers’ applications leverage&lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Attacks throughout the inherited transitive dependency chain of the open source packages&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Application security experts have seen their work increase and get harder as these attacks have increased in recent years. Open-source components often include and depend on the functionality of other open-source components in order to function. These components can have two types of dependencies: direct and transitive. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Generally, the interactions work like this: The application makes an initial call to a direct dependency. If the direct dependency requires any outside components for it to function, those outside components are the application’s transitive dependencies.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;These types of dependencies are notoriously difficult to remediate. This is because they are not readily accessible to the developer. Their code base resides with their maintainers, rendering the application entirely dependent upon their work. If the maintainer of one of these transitive dependencies releases a fix, the amount of time before it makes its way up the supply chain to impact your direct dependency could be a while. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Thus, the management of vulnerabilities needs to be extended to the full transitive dependency chain as this is where &lt;/span&gt;&lt;a href="https://www.forbes.com/sites/forbestechcouncil/2023/05/26/the-hidden-risk-lurking-in-the-software-supply-chain-transitive-open-source-dependencies/?sh=5e66dc87512f" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;95% of the vulnerabilities&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; are found. Maintaining a regular upgrade and patching process for your software development lifecycle (SDLC) tooling is now a must; as is upgrading the security of both your repositories and processes combined with active security testing of each. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Tamper-evident provenance and signing can increase confidence in the ability to maintain artifact integrity throughout the pipeline. And mapping and understanding the full transitive dependency chain of all external components and depending on only known and trusted providers for these components becomes a required condition. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.cisa.gov/sites/default/files/2023-10/Fact_Sheet_Improving_OSS_in_OT_ICS_508c.pdf" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Recent guidance from CISA&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; and other government agencies supports the focus on appropriately selecting and testing open source software ahead of ingestion from a trusted source. While some organizations load built software artifacts directly from public package repositories, others with a more restrictive security risk appetite will require more stringent security controls requiring the use of curated open-source software providers. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;They may opt to only leverage open-source software they themselves have built from source, although this would be prohibitively expensive for most. But if they chose to use a curated third party, what checks must they look for before delegating that critical authority?&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;There are three main criteria to evaluate a curated OSS vendor:&lt;/span&gt;&lt;/p&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;1. &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;High level of security maturity&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;A trusted supplier must demonstrate a high level of security maturity. Common areas of focus are to examine the security hygiene of the supplier in particular. Look for details of the vulnerability management culture and ability to quickly keep up to date with patching within the organisation. They should also have a well trained team, prepared to quickly address any incidents and a regular penetration testing team, continuously validating the security posture of the organisation. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The trusted supplier should be able to demonstrate the security of their own underlying foundational infrastructure. Check that they:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;Have an up-to-date inventory of their own external dependencies.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;Demonstrate knowledge and control of all ingest points.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;Leverage a single production build service so that they can maintain a singular logical control point.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="vertical-align: baseline;"&gt;Meet best practice standards for managing their infrastructure, including:&lt;/span&gt;
&lt;ul&gt;
&lt;li style="list-style-type: disc; vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Well-designed separation of duties and IAM controls&lt;/span&gt;&lt;/li&gt;
&lt;li style="list-style-type: disc; vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Built-in organizational policy and guardrails to secure a Zero Trust network design&lt;/span&gt;&lt;/li&gt;
&lt;li style="list-style-type: disc; vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Automated and regular patching, with associated evidence&lt;/span&gt;&lt;/li&gt;
&lt;li style="list-style-type: disc; vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Complementary continuous threat detection, logging, and monitoring systems that support these posture controls&lt;/span&gt;&lt;/li&gt;
&lt;li style="list-style-type: disc; vertical-align: baseline;"&gt;&lt;span style="vertical-align: baseline;"&gt;Bonus points if they operate with "everything as code" and with hermetic, reproducible and verifiable builds&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;2. &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;High level of internal SDLC security&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The security of the SDLC used within the trusted supplier must be extremely high, particularly around the control plane of the SDLC and the components that interact with the source code to build the end product. Each system must be heavily secured and vetted to ensure any changes to the software is reviewed, audited, and requires multi-party approvals before progressing to the next stage or deployment. Strong authentication and authorisation policies must be in place to ensure that only highly trusted individuals could ever build, or change the vendor infrastructure. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The SDLC security also needs to extend to the beginning of the ingestion of the source code material into the facility and to any code or functionality used within the control plane of the system itself.&lt;/span&gt;&lt;/p&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;3. &lt;/span&gt;&lt;strong style="vertical-align: baseline;"&gt;Effective insider threat program&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As the trusted supplier is a high value target, there will be the potential for an insider threat as an attack vector.Therefore, the curated vendor would be expected to have an active and effective insider threat program. This personnel vetting approach should also extend to ensuring the location of all staff are within approved proximity and not outsourced. &lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Trust but verify&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;It is also important that the trusted supplier provide supporting evidence and insights. This evidence includes:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Checkable attestations on infrastructure security and processes via third party certifications and/or your own independent audit.  &lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Checkable attestations for the security posture and processes for their SDLC against a standard framework like SLSA or SSDF.   &lt;/span&gt;&lt;/li&gt;
&lt;li role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Cryptographic signatures on the served packages and any associated accompanying metadata so that you can verify source and distribution integrity.&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
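&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;As a minimal sketch of that third check, using the open-source Sigstore cosign tool (the package, signature, and key file names are hypothetical and would come from your supplier):&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-code"&gt;&lt;dl&gt;
    &lt;dt&gt;code_block&lt;/dt&gt;
    &lt;dd&gt;# Hypothetical example: verify a served package against the supplier&amp;#x27;s published public key
$ cosign verify-blob --key supplier-pubkey.pem --signature package.tar.gz.sig package.tar.gz&lt;/dd&gt;
&lt;/dl&gt;&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;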
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The actual relevance and security risk of an issue in a package is the combination of&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;inherent criticality of in isolation, the context it's used in, the environmental conditions in which its deployed, any external compensating controls, and decreased or increased risk in the environment. The figure below shows the interrelationship and interaction between vulnerabilities and threats in the application and those from the underlying infrastructure.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/End_to_end_risk_diagram.max-1000x1000.png"
        
          alt="End to end risk diagram"&gt;
        
        &lt;/a&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph_advanced"&gt;&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;4. Enhanced security and risk metadata that should accompany each served package to increase your understanding and insights to both the inherent component risk of the code or artifact as well as how that risk can change in context of your specific application and environment. Key metadata can include:  &lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Standard SBOM with SCA insights - vulnerabilities, licensing info, fully mapped transitive dependencies and associated vulnerability and licensing risk.  &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;VEX statements for how the inherited vulnerabilities from transitive dependencies affect the primary package being served. &lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li aria-level="1" style="list-style-type: disc; vertical-align: baseline;"&gt;
&lt;p role="presentation"&gt;&lt;span style="vertical-align: baseline;"&gt;Any related threat intelligence specific to the package, use case, or your organization.&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;The ability of the supplier to provide this type of enhanced data reinforces the evidence that they have achieved a high level of security and that the components they serve represent assured and more trustable ingredients you can employ with greater confidence.&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;strong style="vertical-align: baseline;"&gt;Better control and balancing benefits of open source components&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Leveraging open source components is critical to developer velocity, quality and accelerating innovation and execution. Applying these recommendations and requirements can enable you to better control and balance the benefits of using open source components with the potential risk of introducing targetable weak points in your SDLC and ultimately reduce your risk and exposure.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Google Cloud’s &lt;/span&gt;&lt;a href="https://cloud.google.com/security/products/assured-open-source-software"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Assured Open Source Software&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; (Assured OSS) service for Java and Python ecosystems gives any organization that uses open source software the opportunity to leverage the security and experience Google applies to open source dependencies by incorporating the same OSS packages that Google secures and uses into their own developer workflows. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="vertical-align: baseline;"&gt;Learn more about &lt;/span&gt;&lt;a href="https://cloud.google.com/security/products/assured-open-source-software"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;Assured Open Source Software&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt;, enable Assured OSS through our&lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;a href="https://developers.google.com/assured-oss?utm_source=blog&amp;amp;utm_medium=referral#get-started" rel="noopener" target="_blank"&gt;&lt;span style="text-decoration: underline; vertical-align: baseline;"&gt;self-serve onboarding&lt;/span&gt;&lt;/a&gt;&lt;span style="vertical-align: baseline;"&gt; &lt;/span&gt;&lt;span style="vertical-align: baseline;"&gt;form, use the metadata API to list available Python and Java packages and determine which Assured OSS packages you want to use.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 26 Mar 2024 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/identity-security/how-to-choose-a-known-trusted-supplier-for-open-source-software/</guid><category>Open Source</category><category>Partners</category><category>Security &amp; Identity</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>How to choose a known, trusted supplier for open source software</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/identity-security/how-to-choose-a-known-trusted-supplier-for-open-source-software/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Jonathan Meadows</name><title>Managing Director, Citi Tech Fellow, Citi</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Andy Chang</name><title>Group Product Manager, Google Cloud Security</title><department>Security &amp; Privacy</department><company></company></author></item><item><title>A window into protein folding: Lowering the barriers for AlphaFold Inferencing</title><link>https://cloud.google.com/blog/products/ai-machine-learning/alphafold-portal-on-vertex-ai-alphafold-inference-pipeline/</link><description>&lt;div class="block-paragraph"&gt;&lt;p data-block-key="39r49"&gt;The open-source tool &lt;a href="https://github.com/GoogleCloudPlatform/vertex-ai-alphafold-inference-pipeline" target="_blank"&gt;Vertex AI AlphaFold Inference Pipeline&lt;/a&gt; has enabled biotech companies in streamlining protein-folding activities, accelerating their go to market timeline. It addresses key challenges in protein structure prediction by unleashing the power of parallel processing, optimizing compute resources, and scaling to meet high-throughput demands. Furthermore, it ensures reproducibility, lineage analysis, flexibility, adaptability, and seamless integration with upstream and downstream systems – all within &lt;a href="https://cloud.google.com/ai-platform/"&gt;Vertex AI&lt;/a&gt; as the one-stop platform. 
With this tool, researchers can unlock new possibilities, make groundbreaking discoveries faster than ever before, and drive end-to-end efficiency in their biotech drug discovery efforts.&lt;/p&gt;&lt;p data-block-key="3beui"&gt;However, even with Google Cloud's efforts to make the &lt;a href="https://deepmind.google/technologies/alphafold/" target="_blank"&gt;AlphaFold&lt;/a&gt; algorithm more &lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/running-alphafold-on-vertexai"&gt;accessible&lt;/a&gt; to biotech firms, many bioscience organizations still struggle to integrate this technology seamlessly into their researchers' workflows.&lt;/p&gt;&lt;p data-block-key="3at2l"&gt;The biggest challenge is this: scientists who obsess over protein shapes aren't usually coding ninjas or cloud wizards. Asking them to wrestle with complicated setups just to get a glimpse of a protein is like asking a chef to build their own oven before they can cook dinner. It's not the best recipe for success (or tasty results).&lt;/p&gt;&lt;h3 data-block-key="bcmth"&gt;&lt;b&gt;Solution Overview&lt;/b&gt;&lt;/h3&gt;&lt;p data-block-key="a6mko"&gt;To reduce the friction, we are making our &lt;a href="https://github.com/GoogleCloudPlatform/vertex-ai-alphafold-inference-pipeline" target="_blank"&gt;Vertex AI AlphaFold Inference Pipeline&lt;/a&gt; easier to use, including introducing a user-friendly &lt;b&gt;AlphaFold Portal&lt;/b&gt; – think of it like protein modeling for beginners. We empower scientists, irrespective of their prior experience with cloud computing, to derive protein structures with minimal effort. The portal eliminates the need to engage with intricate coding (like Python on a Jupyter notebook), enabling users to focus on protein inference results iterations.&lt;/p&gt;&lt;p data-block-key="24i4q"&gt;The Google Cloud AlphaFold &lt;a href="https://github.com/GoogleCloudPlatform/vertex-ai-alphafold-inference-pipeline" target="_blank"&gt;repository&lt;/a&gt; now includes the option to deploy this serverless portal, which offers a streamlined, secure, and centralized way to manage protein folding experiments. Launch new experiments with a single click, simplifying workflows and saving valuable time.&lt;/p&gt;&lt;h3 data-block-key="bv21u"&gt;&lt;b&gt;Centralized Pipelines&lt;/b&gt;&lt;/h3&gt;&lt;p data-block-key="tomv"&gt;The portal makes researchers' work more efficient in several ways:&lt;/p&gt;&lt;ul&gt;&lt;li data-block-key="966ug"&gt;&lt;b&gt;Centralized access&lt;/b&gt;: Multiple researchers can access the portal through a single web address instead of running their own Jupyter notebook instances or deploying infrastructure on separate projects.&lt;/li&gt;&lt;li data-block-key="3ah29"&gt;&lt;b&gt;Streamlined protein folding&lt;/b&gt;: Researchers can run protein folding pipeline jobs under their usernames and filter simulation results based on other researchers' work. This allows for easy comparison and fine-tuning.&lt;/li&gt;&lt;li data-block-key="8s2pk"&gt;&lt;b&gt;Enhanced collaboration&lt;/b&gt;: Previously, each researcher needed to run their own Jupyter notebook instance to run each protein-folding job. Now, they can collaborate more easily by accessing and comparing simulation results in a centralized location.&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_-_AlphaFold_Dashboard.max-1000x1000.png"
        
          alt="1 - AlphaFold Dashboard"&gt;
        
        &lt;/a&gt;
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="amkoi"&gt;1- AlphaFold Portal Dashboard&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph"&gt;&lt;p data-block-key="39r49"&gt;Consider this dashboard to be the central hub for protein folding endeavors. Users can personalize the display, expertly filter results, and utilize designated link buttons to directly access protein resources. The need to navigate through complex configuration or executions has now been simplified.&lt;/p&gt;&lt;p data-block-key="93bs9"&gt;Are you prepared to engage in protein folding? With just two clicks, your sequence (in FASTA format) will be processed and simulated. The UI will auto select recommendations for the optimal GPU machine configuration based on the type and size of your protein. However, if you are not satisfied with the suggested settings, you have the option to expand the advanced settings and customize them to your desired specifications.&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/original_images/image1_lh3JcSq.png"
        
          alt="2 - New protein folding"&gt;
        
        &lt;/a&gt;
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="amkoi"&gt;2 - New Protein Folding&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph"&gt;&lt;p data-block-key="39r49"&gt;Furthermore, we have integrated a preview function for your protein models. Tapping into an &lt;a href="https://3dmol.csb.pitt.edu/" target="_blank"&gt;open-source visualization tool&lt;/a&gt;, you can now seamlessly explore the intricate molecular structures without leaving the interface.&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/3_-_Protein_structure_visualization.max-1000x1000.png"
        
          alt="3 - Protein structure visualization"&gt;
        
        &lt;/a&gt;
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="amkoi"&gt;3 - Protein structure visualization&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph"&gt;&lt;p data-block-key="39r49"&gt;This tool empowers everyone in your biotech organization to harness the power of protein folding, regardless of their cloud or coding experience. Executing this highly complex and compute intensive workload seamlessly on a streamlined, optimized infrastructure, ensuring efficiency and ease of use.&lt;/p&gt;&lt;h3 data-block-key="phhl"&gt;&lt;b&gt;Getting started&lt;/b&gt;&lt;/h3&gt;&lt;p data-block-key="cn3v"&gt;If you're a Google Cloud newbie, no worries! We recommend checking out the &lt;a href="https://cloud.google.com/docs/get-started"&gt;Getting Started page&lt;/a&gt; to get familiarized with Google Cloud. Then, &lt;a href="https://cloud.google.com/resource-manager/docs/creating-managing-projects#creating_a_project"&gt;create a project&lt;/a&gt; to house all this protein-folding magic.&lt;/p&gt;&lt;p data-block-key="1mtok"&gt;To proceed, follow the instructions provided in the open-source Google Cloud AlphaFold repository, accessible via the &lt;a href="https://github.com/GoogleCloudPlatform/vertex-ai-alphafold-inference-pipeline" target="_blank"&gt;link&lt;/a&gt;. This repository contains convenient, pre-built templates that will assist you in setting up all the necessary components. Kindly note that this part of the process may require some technical expertise. If you encounter any challenges or require guidance, your dedicated GCP representative is readily available to assist you in navigating the complexities of the cloud.&lt;/p&gt;&lt;/div&gt;</description><pubDate>Mon, 11 Mar 2024 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/ai-machine-learning/alphafold-portal-on-vertex-ai-alphafold-inference-pipeline/</guid><category>Healthcare &amp; Life Sciences</category><category>Open Source</category><category>AI &amp; Machine Learning</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>A window into protein folding: Lowering the barriers for AlphaFold Inferencing</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/ai-machine-learning/alphafold-portal-on-vertex-ai-alphafold-inference-pipeline/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Yudy Hendry</name><title>Solutions Architect, Google Cloud</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Alfonso Miranda</name><title>Customer Engineer, Machine Learning</title><department></department><company></company></author></item><item><title>A decade of Kubernetes leadership: why Google Cloud should be your choice for Kubernetes</title><link>https://cloud.google.com/blog/products/containers-kubernetes/why-choose-gke-as-your-kubernetes-service/</link><description>&lt;div class="block-paragraph"&gt;&lt;p data-block-key="f3xdp"&gt;Kubernetes has become a critical part of the modern software development landscape. 
Originally developed by Google, it is now the second largest open source project in history, with &lt;a href="https://k8s.devstats.cncf.io/d/24/overall-project-statistics?orgId=1&amp;amp;var-period_name=Last%20decade&amp;amp;var-repogroup_name=All&amp;amp;var-repo_name=kubernetes%2Fkubernetes" target="_blank"&gt;over 83,000 unique contributors&lt;/a&gt; over the past decade, and is the de facto standard for running containerized applications in production.&lt;/p&gt;&lt;p data-block-key="5to0d"&gt;Kubernetes has also helped to democratize the cloud, making it possible for businesses of all sizes to take advantage of the cloud with the benefits of containerization. A powerful and flexible platform that can run a wide variety of applications, Kubernetes is used by companies of all sizes and powers some of the world's largest and most complex applications. More recently, with the explosion of generative AI and large language models (LLMs), companies are turning to Kubernetes to run and scale complex and compute-intensive machine learning platforms.&lt;/p&gt;&lt;p data-block-key="5b8mc"&gt;The success of Kubernetes is a testament to the power of open-source software. Kubernetes is a radically open, community-first project. Tens of thousands of developers from across the globe contribute to it, enhancing its capabilities and adapting it to new use cases. As a result, Kubernetes continues to evolve at a pace that is only possible through open source.&lt;/p&gt;&lt;h3 data-block-key="1skaa"&gt;&lt;b&gt;Open-sourcing Kubernetes expanded opportunities for an entire industry&lt;/b&gt;&lt;/h3&gt;&lt;p data-block-key="ep8rd"&gt;Kubernetes was born at Google and released as open source in 2014. Its roots trace back to &lt;a href="https://kubernetes.io/blog/2015/04/borg-predecessor-to-kubernetes/" target="_blank"&gt;Google’s internal Borg system&lt;/a&gt; (introduced between 2003 and 2004), which powers everything from Google Search to Maps to YouTube. On average, Google launches more than 4 billion containers a week!&lt;/p&gt;&lt;p data-block-key="c3g5q"&gt;Open-sourcing Kubernetes was a revolutionary move. It spawned the Cloud Native Computing Foundation (CNCF) and fostered a community of contributors and users around the world. As this global community continues to grow, Google’s commitment to Kubernetes is stronger than ever, acting as a steward and providing consistent leadership to ensure its continued growth.&lt;/p&gt;&lt;p data-block-key="4tokm"&gt;Today, Google is the &lt;a href="https://k8s.devstats.cncf.io/d/9/companies-table?orgId=1&amp;amp;var-period_name=Last%20decade&amp;amp;var-metric=contributions" target="_blank"&gt;largest contributor&lt;/a&gt; to Kubernetes with over one million contributions — that’s more than &lt;i&gt;the next four organizations combined&lt;/i&gt;. In addition to investing time and development resources, &lt;a href="https://cloud.google.com/blog/products/containers-kubernetes/google-cloud-credits-support-cncf-work-on-kubernetes"&gt;Google Cloud also donates millions of dollars per year&lt;/a&gt; to support the infrastructure needed to host Kubernetes containers and build and test each release.&lt;/p&gt;&lt;p data-block-key="9hd22"&gt;Looking strictly at cloud providers over the past year, Google Cloud has made three times the number of contributions as the next closest provider:&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_ZxyK4KX.max-1000x1000.jpg"
        
          alt="1"&gt;
        
        &lt;/a&gt;
      
        &lt;figcaption class="article-image__caption "&gt;&lt;p data-block-key="ump78"&gt;Source: &lt;a href="https://k8s.devstats.cncf.io/d/9/companies-table?orgId=1&amp;amp;var-period_name=Last%20year&amp;amp;var-metric=contributions"&gt;Kubernetes Companies Statistics - Past Year&lt;/a&gt;&lt;/p&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph"&gt;&lt;p data-block-key="f3xdp"&gt;Our contributions to and engagements in Kubernetes are far-reaching:&lt;/p&gt;&lt;ul&gt;&lt;li data-block-key="7di7g"&gt;Co-chairing and acting as technical leads for many core Special Interest Groups (SIGs) including API Machinery, Autoscaling, Networking, Scheduling, and Storage.&lt;/li&gt;&lt;li data-block-key="bp9g0"&gt;Identifying and resolving complex problems that impact both the community and Google's customers. For example, Google has invested heavily with the community on improving upgrades and deprecations for all of Kubernetes, which has helped provide a much more stable platform for all customers.&lt;/li&gt;&lt;li data-block-key="18j8n"&gt;Fixing &lt;a href="https://kubernetes.io/docs/reference/issues-security/official-cve-feed/" target="_blank"&gt;over half of the security vulnerabilities&lt;/a&gt; that have been found in Kubernetes. This is a significant contribution to Kubernetes security, and demonstrates Google's commitment to keeping Kubernetes secure for users.&lt;/li&gt;&lt;li data-block-key="3a8i9"&gt;Working closely with Googlers who work on Go to keep &lt;a href="https://kubernetes.io/blog/2023/04/06/keeping-kubernetes-secure-with-updated-go-versions/" target="_blank"&gt;Kubernetes secure with updated Go versions&lt;/a&gt;. The Go team is responsible for developing the Go programming language, which is used to write Kubernetes code. Googlers work closely with the Go team to ensure that Kubernetes is compatible with the latest Go versions, and to fix any security vulnerabilities that are found in Go.&lt;/li&gt;&lt;li data-block-key="1ama0"&gt;Leading the development of &lt;a href="https://kubernetes.io/docs/concepts/security/pod-security-standards/" target="_blank"&gt;Pod Security Standards&lt;/a&gt;, a set of best practices for securing Kubernetes pods. Googlers have been leading the development of these standards, and have published a number of guides and resources to help users secure their Kubernetes pods.&lt;/li&gt;&lt;li data-block-key="9u7hf"&gt;Creating the initial &lt;a href="https://github.com/container-storage-interface/spec/blob/master/spec.md" target="_blank"&gt;Container Storage Interface&lt;/a&gt; (CSI) specification, defining how containers can access storage. Googlers were involved in the early development of CSI, and they helped to create the initial specification. CSI is now widely used by open source and commercial storage vendors.&lt;/li&gt;&lt;li data-block-key="k4sj"&gt;Creating the &lt;a href="https://github.com/google/cel-spec" target="_blank"&gt;Common Expression Language&lt;/a&gt; (CEL) for expressing queries and transformations on structured data. CEL is used in a variety of Kubernetes components, including &lt;a href="https://kubernetes.io/docs/reference/access-authn-authz/validating-admission-policy/" target="_blank"&gt;Validating Admission Policy&lt;/a&gt; and &lt;a href="https://kubernetes.io/docs/reference/access-authn-authz/validating-admission-policy/" target="_blank"&gt;Custom Resource Validation Expressions&lt;/a&gt;. CEL is a powerful and flexible language that has helped to improve the extensibility and usability of Kubernetes.&lt;/li&gt;&lt;/ul&gt;&lt;p data-block-key="1gg3l"&gt;Google's contributions to Kubernetes have been significant and have helped make the platform more robust, scalable, secure, and reliable. 
Moreover, Google continues to push Kubernetes forward into new domains such as &lt;a href="https://thenewstack.io/kubernetes-evolution-from-microservices-to-batch-processing-powerhouse/" target="_blank"&gt;batch processing&lt;/a&gt; and machine learning, with contributions to CNCF such as job queueing with &lt;a href="https://kubernetes.io/blog/2022/10/04/introducing-kueue/" target="_blank"&gt;Kueue&lt;/a&gt; and ML operations and workflows with &lt;a href="https://www.cncf.io/blog/2023/07/25/kubeflow-brings-mlops-to-the-cncf-incubator/" target="_blank"&gt;Kubeflow&lt;/a&gt;. These contributions matter; if the Kubernetes community is thriving, it’s thanks to a core group of individuals and companies actually investing their time in the critical “chopping wood and carrying water” tasks and building new functionality from which everyone can benefit. For Kubernetes to continue to be a great platform for new workloads such as AI/ML, we need more companies who benefit from Kubernetes to do their part and contribute.&lt;/p&gt;&lt;h3 data-block-key="1lgum"&gt;&lt;b&gt;Why customers trust Google Kubernetes Engine for mission-critical workloads&lt;/b&gt;&lt;/h3&gt;&lt;p data-block-key="bp49h"&gt;Google Kubernetes Engine (GKE) is the most scalable and fully automated Kubernetes service available. It is a popular choice for businesses of all sizes and industries, and is used to host some of the world's largest and most complex applications. With GKE, you can be confident that your applications are running on a reliable and scalable platform that is backed by Google Cloud's expertise. GKE now includes multi-cluster and distributed team management, policy enforcement with &lt;a href="https://cloud.google.com/anthos-config-management/docs/concepts/policy-controller"&gt;Policy Controller&lt;/a&gt;, GitOps-based configuration with &lt;a href="https://cloud.google.com/anthos-config-management/docs/config-sync-overview"&gt;Config Sync&lt;/a&gt;, self-service provisioning of your Google Cloud Resources with &lt;a href="https://cloud.google.com/anthos-config-management/docs/concepts/config-controller-overview"&gt;Config Controller&lt;/a&gt;, and a fully managed &lt;a href="https://cloud.google.com/service-mesh/docs/overview#managed_anthos_service_mesh"&gt;Istio-powered service mesh&lt;/a&gt;. All of these new capabilities are integrated with &lt;a href="https://cloud.google.com/anthos/docs/concepts/gke-editions"&gt;GKE Enterprise&lt;/a&gt; and are ideal for customers getting started with Kubernetes or those already deployed globally.&lt;/p&gt;&lt;p data-block-key="e2jf2"&gt;Customers use GKE to run mission-critical applications for a variety of reasons:&lt;/p&gt;&lt;ul&gt;&lt;li data-block-key="8md8d"&gt;Who better to operate and manage your environment than the team that created Kubernetes? 
The entire open source Kubernetes project is built, tested, and distributed on Google Cloud, and we use GKE for several services including &lt;a href="https://cloud.google.com/vertex-ai"&gt;Vertex AI&lt;/a&gt; and &lt;a href="https://deepmind.google/" target="_blank"&gt;DeepMind&lt;/a&gt;.&lt;/li&gt;&lt;li data-block-key="33sai"&gt;GKE is a Leader in the &lt;a href="https://inthecloud.withgoogle.com/gartner-magic-quadrant-report-containers-2023/dl-cd.html" target="_blank"&gt;2023 Gartner Magic Quadrant for Container Management&lt;/a&gt;.&lt;/li&gt;&lt;li data-block-key="3r1ed"&gt;It accelerates and efficiently scales &lt;a href="https://g.co/cloud/gke-aiml" target="_blank"&gt;AI/ML workloads&lt;/a&gt; with &lt;a href="https://cloud.google.com/kubernetes-engine/docs/concepts/timesharing-gpus"&gt;GPU time-sharing&lt;/a&gt; and &lt;a href="https://cloud.google.com/blog/products/compute/how-to-use-cloud-tpus-with-gke"&gt;Cloud TPUs&lt;/a&gt;.&lt;/li&gt;&lt;li data-block-key="7ala4"&gt;GKE offers the first fully-managed, serverless Kubernetes experience with &lt;a href="https://cloud.google.com/kubernetes-engine/docs/concepts/autopilot-overview"&gt;GKE Autopilot&lt;/a&gt;, a hands-off mode of operation that manages the underlying compute infrastructure while providing the full power of the Kubernetes API and being backed by a pod-level SLA and Google’s renowned SRE team.&lt;/li&gt;&lt;li data-block-key="fi0pl"&gt;It scales to meet the needs of even the largest and most demanding applications with unparalleled &lt;a href="https://cloud.google.com/blog/products/containers-kubernetes/google-kubernetes-engine-clusters-can-have-up-to-15000-nodes"&gt;15,000 node clusters&lt;/a&gt;. For instance, &lt;a href="https://www.pgs.com/company/newsroom/news/industry-insights--hpc-in-the-cloud/" target="_blank"&gt;PGS replaced&lt;/a&gt; its Cray with a GKE-based supercomputer capable of 72.02 petaFLOPS.&lt;/li&gt;&lt;li data-block-key="4b3hl"&gt;GKE delivers enterprise-grade security with features such as &lt;a href="https://cloud.google.com/blog/products/identity-security/gke-security-posture-now-generally-available-with-enhanced-features"&gt;GKE Security Posture&lt;/a&gt; to scan for misconfigured workloads and container image vulnerabilities, &lt;a href="https://cloud.google.com/kubernetes-engine/docs/how-to/network-policy"&gt;network policy enforcement&lt;/a&gt; with built-in Kubernetes Network Policy, &lt;a href="https://cloud.google.com/kubernetes-engine/docs/concepts/sandbox-pods"&gt;GKE Sandbox&lt;/a&gt; for isolating untrusted workloads, and &lt;a href="https://cloud.google.com/kubernetes-engine/docs/how-to/confidential-gke-nodes"&gt;Confidential Nodes&lt;/a&gt; for encrypting workload data in use.&lt;/li&gt;&lt;li data-block-key="374rb"&gt;Seamless &lt;a href="https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-upgrades"&gt;automatic upgrades&lt;/a&gt; with fine-grained controls such as &lt;a href="https://cloud.google.com/kubernetes-engine/docs/concepts/node-pool-upgrade-strategies#blue-green-upgrade-strategy"&gt;blue-green upgrades&lt;/a&gt; and &lt;a href="https://cloud.google.com/kubernetes-engine/docs/concepts/maintenance-windows-and-exclusions"&gt;maintenance windows and exclusions&lt;/a&gt;.&lt;/li&gt;&lt;li data-block-key="cb38"&gt;Flexible deployment options to meet business, regulatory and/or compliance needs and requirements. 
These include &lt;a href="https://cloud.google.com/distributed-cloud"&gt;Google Distributed Cloud&lt;/a&gt;, to extend Google Cloud to customer data centers or edge locations with fully managed hardware and software deployment options; multi-cloud deployment to &lt;a href="https://cloud.google.com/anthos/clusters/docs/multi-cloud/aws"&gt;AWS&lt;/a&gt; and &lt;a href="https://cloud.google.com/anthos/clusters/docs/multi-cloud/azure"&gt;Azure&lt;/a&gt;; and the ability to attach and manage any CNCF-compliant Kubernetes cluster.&lt;/li&gt;&lt;li data-block-key="dgsfa"&gt;Google Cloud has expertise in running &lt;a href="https://cloud.google.com/architecture/best-practices-for-running-cost-effective-kubernetes-applications-on-gke"&gt;cost-optimized applications&lt;/a&gt;, including publishing the inaugural &lt;a href="https://cloud.google.com/blog/products/containers-kubernetes/new-report-state-of-kubernetes-cost-optimization"&gt;State of Kubernetes Cost Optimization Report&lt;/a&gt;.&lt;/li&gt;&lt;li data-block-key="ei3ts"&gt;We release new minor versions of GKE approximately 30 days after the release of the corresponding open source version, ensuring that GKE users have access to the latest security patches and features as soon as possible.&lt;/li&gt;&lt;/ul&gt;&lt;p data-block-key="7m6f0"&gt;If you are looking for a scalable, reliable, and fully automated Kubernetes service to run everything from microservices to databases to the most-demanding generative AI workloads, then GKE is the right choice for you.&lt;/p&gt;&lt;h3 data-block-key="b3fd"&gt;&lt;b&gt;Join us at KubeCon + CloudNativeCon North America 2023&lt;/b&gt;&lt;/h3&gt;&lt;p data-block-key="aqf35"&gt;If you plan to be at KubeCon, we’d love to meet with you. You can check out all of our plans &lt;a href="https://inthecloud.withgoogle.com/kubecon-northam-chicago-microsite-23/register.html#home" target="_blank"&gt;here&lt;/a&gt;, but here are a few highlights:&lt;br/&gt;&lt;/p&gt;&lt;ul&gt;&lt;li data-block-key="3oge1"&gt;&lt;a href="https://inthecloud.withgoogle.com/kubecon-northam-chicago-microsite-23/register.html#agenda" target="_blank"&gt;65+ breakout sessions and lightning talks&lt;/a&gt;&lt;/li&gt;&lt;li data-block-key="38reh"&gt;&lt;a href="https://rsvp.withgoogle.com/events/gke-kube-con-na-23" target="_blank"&gt;Google Container Day&lt;/a&gt;&lt;/li&gt;&lt;li data-block-key="6irug"&gt;&lt;a href="https://inthecloud.withgoogle.com/kubecon-northam-chicago-microsite-23/register.html#meeting" target="_blank"&gt;Request 1-1 meeting&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p data-block-key="44fj8"&gt;You can also stop by booth #D2 to see demos, lightning talks or simply meet with our GKE and Kubernetes experts and engineers. 
And if you can’t make it this year, you can check out our &lt;a href="https://cloudonair.withgoogle.com/events/countdown-to-kubecon-with-cloud" target="_blank"&gt;exclusive preview on-demand&lt;/a&gt;.&lt;/p&gt;&lt;/div&gt;</description><pubDate>Thu, 02 Nov 2023 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/containers-kubernetes/why-choose-gke-as-your-kubernetes-service/</guid><category>Open Source</category><category>Application Modernization</category><category>Infrastructure Modernization</category><category>Containers &amp; Kubernetes</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>A decade of Kubernetes leadership: why Google Cloud should be your choice for Kubernetes</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/containers-kubernetes/why-choose-gke-as-your-kubernetes-service/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Drew Bradstock</name><title>Sr. Director of Product Management, Google Kubernetes Engine</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>Gari Singh</name><title>Product Manager</title><department></department><company></company></author></item><item><title>Streamlining ML development with Feast</title><link>https://cloud.google.com/blog/products/databases/how-feast-feature-store-streamlines-ml-development/</link><description>&lt;div class="block-paragraph"&gt;&lt;p&gt;&lt;i&gt;This post is the first in a short series of blog posts about Feast on Google Cloud. In this first blog post, we describe the benefits of using Feast, a popular open source ML feature store, on Google Cloud. In our second blog post, we’ll provide a simple, introductory tutorial for building a product recommendation system with Feast on Google Cloud.&lt;/i&gt;&lt;/p&gt;&lt;p&gt;Data scientists and other ML practitioners are increasingly relying on a new kind of data platform: the ML feature store. This specialized offering can help organizations simplify the management of their ML &lt;a href="https://en.wikipedia.org/wiki/Feature_(machine_learning)" target="_blank"&gt;feature data&lt;/a&gt; and make their ML model development efforts more scalable. Feature stores take on the core tasks of managing the code that organizations use to generate ML features, running this code on unprocessed data, and deploying these features to production in user-facing applications. Feature stores typically integrate with a data warehouse, object storage, and an operational storage system for application serving. &lt;/p&gt;&lt;p&gt;Feature stores can be very valuable for organizations whose ML teams need to reuse the same feature data in multiple ML models for different application use cases. They can be especially valuable when these ML models must be retrained frequently using very recent data to ensure that model predictions remain up-to-date for app users. &lt;/p&gt;&lt;p&gt;For example, let’s consider a movie streaming service that has a dozen different ML models running in production to support use cases like personalized recommendations, search, and email notifications. If we assume that each ML model is owned by a different team, there’s a very high likelihood that each team could benefit from having many of the same ML features (e.g. 
regularly updated vector embeddings that include the most recent movies watched, by user, by title, and by genre) instead of each team building the same features from scratch and taking on the cost of maintaining the same critical infrastructure a dozen different times.&lt;/p&gt;&lt;p&gt;Every organization and ML project has unique requirements, and there are a wide variety of effective ML platforms available to support these different needs. For example, some Google Cloud customers choose Vertex AI Feature Store, a fully managed feature store that provides a centralized repository for organizing, storing, and serving ML features and integrates directly with &lt;a href="https://cloud.google.com/vertex-ai"&gt;Vertex AI&lt;/a&gt;’s broad range of features and capabilities. Alternatively, organizations with more specialized requirements can choose to build a custom ML platform based on the always-on, petabyte-scale capabilities of Google Cloud managed services like &lt;a href="https://cloud.google.com/bigquery"&gt;BigQuery&lt;/a&gt; and &lt;a href="https://cloud.google.com/bigtable"&gt;Cloud Bigtable&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;Then there’s &lt;a href="https://github.com/feast-dev/feast" target="_blank"&gt;Feast&lt;/a&gt;, a popular, customizable open-source &lt;a href="https://feast.dev/blog/what-is-a-feature-store/" target="_blank"&gt;ML feature store&lt;/a&gt; that solves many of the most difficult challenges that keep organizations from effectively scaling their ML development efforts. To support Google Cloud customers who’d like an end-to-end solution for Feast on Google Cloud, &lt;a href="https://www.tecton.ai/" target="_blank"&gt;Tecton&lt;/a&gt;, a contributor to Feast, released &lt;a href="https://docs.feast.dev/reference/online-stores/bigtable" target="_blank"&gt;an open-source integration&lt;/a&gt; for Feast on Bigtable last year, expanding on its existing integrations with BigQuery and Google Kubernetes Engine (GKE) for feature-store use cases.&lt;/p&gt;&lt;p&gt;Feast has been adopted by &lt;a href="https://feast.dev/#key-contributorsblock_60760ba81e2b9" target="_blank"&gt;a wide variety of organizations&lt;/a&gt; in industries including retail, media, travel, and financial services. Among Google Cloud customers, Feast has been adopted at scale in verticals like consumer internet, technology, retail, and gaming. Along the way, customers have unlocked significant ML development velocity and productivity benefits that enhance the value of the applications they deliver to their own customers, partners, and end-users.&lt;/p&gt;&lt;/div&gt;
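&lt;div class="block-paragraph"&gt;&lt;p&gt;To make that reuse story concrete, here is a minimal, illustrative sketch of how a second team might join an existing team’s registered features onto its own training examples with Feast. The repository path, entity, and feature names below are invented for illustration and are not from a real deployment.&lt;/p&gt;&lt;pre&gt;&lt;code&gt;import pandas as pd

from feast import FeatureStore

# Hypothetical feature repository shared across teams.
store = FeatureStore(repo_path="movie_recs/")

# Training examples owned by a different team: one row per (user, timestamp),
# so Feast can perform point-in-time-correct joins against feature history.
entity_df = pd.DataFrame({
    "user_id": [1001, 1002],
    "event_timestamp": pd.to_datetime(["2023-07-01", "2023-07-02"], utc=True),
})

# Reuse another team's registered features without rebuilding any pipeline.
training_df = store.get_historical_features(
    entity_df=entity_df,
    features=[
        "user_watch_stats:favorite_genre",
        "user_watch_stats:watch_minutes_7d",
    ],
).to_df()&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;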
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/1_47IPW10.max-1000x1000.png"
        
          alt="1.png"&gt;
        
        &lt;/a&gt;
      
        &lt;figcaption class="article-image__caption "&gt;&lt;i&gt;Role of a Feature Store in the ML model development lifecycle&lt;/i&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph"&gt;&lt;h3&gt;Why Feast?&lt;/h3&gt;&lt;p&gt;Feast provides a powerful single data access layer that abstracts ML feature storage from feature retrieval, ensuring that ML teams’ models remain portable throughout the model development and deployment process — from training to serving, from batch to real-time, and from one data storage system to another. &lt;/p&gt;&lt;p&gt;Compare this to organizations who opt to build their own, homegrown feature stores. These projects often achieve quick success with focused efforts by small teams, but can quickly run into challenges when they try to scale their ML development efforts to additional teams within their respective organizations. These new teams may learn very quickly that reusing the existing feature store as-is is impractical, and instead – by necessity – decide to “reinvent the wheel” and build their own siloed feature pipelines versions to meet deadlines. As this process repeats itself from team to team, the organization’s ML stack and ML development practices quickly become fragmented, preventing future teams from reusing the ML features, data pipelines, Notebooks, data access controls, or other tooling that already exists. This pattern results in further duplication of development efforts and tooling, causing rapid growth in infrastructure costs, while also adding time-to-market bottlenecks for new models, each of which must be developed from scratch.&lt;/p&gt;&lt;p&gt;Feast addresses these common organization-level ML scaling challenges head-on, enabling customers to achieve far greater leverage from their ML investments by:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Standardizing data workflows, development processes, and tooling&lt;/b&gt; across different teams by integrating directly with the tools and infrastructure that these teams already use for key steps like feature transformation, data storage, monitoring, and modeling&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Accelerating time-to-market&lt;/b&gt; for new ML projects by bootstrapping them with a reusable library of curated, production-ready features for data warehouses such as &lt;b&gt;BigQuery&lt;/b&gt; that are readily discoverable by anyone within the customer organization&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Productionizing features&lt;/b&gt; with centrally-managed, reusable data pipelines and integrating them with a low-latency online storage layer, such as &lt;b&gt;Bigtable&lt;/b&gt;, and the online storage layer’s feature serving endpoints&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Eliminating expensive data inconsistencies&lt;/b&gt; across teams’ data analysis, training, and serving environments, including across BigQuery and Bigtable, improving model point-in-time accuracy and prediction quality, while also avoiding the protracted debugging efforts that would otherwise be necessary to identify the source of these data inconsistencies&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/2_vv6zt39.max-1000x1000.png"
        
          alt="2.png"&gt;
        
        &lt;/a&gt;
      
        &lt;figcaption class="article-image__caption "&gt;&lt;i&gt;ML feature development workflow&lt;/i&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
&lt;div class="block-paragraph"&gt;&lt;h3&gt;Feast’s Bigtable integration&lt;/h3&gt;&lt;p&gt;Feast’s Bigtable integration builds on Feast’s &lt;a href="https://docs.feast.dev/reference/data-sources/bigquery" target="_blank"&gt;existing integration with BigQuery&lt;/a&gt; and provides Google Cloud customers with a more turnkey &lt;b&gt;single data-access layer&lt;/b&gt; on top of BigQuery and Bigtable that streamlines the critical “last mile” of production ML data materialization. With the Feast’s Bigtable integration, data scientists and other ML practitioners can transform and productionize their analytical data in &lt;b&gt;BigQuery&lt;/b&gt; for low-latency training and inference serving on &lt;b&gt;Bigtable&lt;/b&gt; at any scale without having to build or update custom pipelines, so they can realize the value of their efforts in production sooner. &lt;/p&gt;&lt;p&gt;What’s more, Bigtable’s &lt;a href="https://cloud.google.com/bigtable/docs/replication-overview"&gt;highly flexible replication capabilities&lt;/a&gt; now allow ML teams to serve Feast feature data to end-users in up to eight Google Cloud regions at the same time to (a) reduce serving latency and (b) provide automatic request routing to the nearest replica to support Disaster Recovery (DR) requirements.&lt;/p&gt;&lt;h3&gt;The role of feature serving in an ML feature store&lt;/h3&gt;&lt;p&gt;A high-quality feature store typically consists of the following components, as shown in the diagram below.&lt;/p&gt;&lt;/div&gt;
&lt;div class="block-image_full_width"&gt;






  
    &lt;div class="article-module h-c-page"&gt;
      &lt;div class="h-c-grid"&gt;
  

    &lt;figure class="article-image--large
      
      
        h-c-grid__col
        h-c-grid__col--6 h-c-grid__col--offset-3
        
        
      "
      &gt;

      
      
        
        &lt;img
            src="https://storage.googleapis.com/gweb-cloudblog-publish/images/3_U9qW4qk.max-1000x1000.png"
        
          alt="3.png"&gt;
        
        &lt;/a&gt;
      
        &lt;figcaption class="article-image__caption "&gt;&lt;i&gt;Feature store system components&lt;/i&gt;&lt;/figcaption&gt;
      
    &lt;/figure&gt;

  
      &lt;/div&gt;
    &lt;/div&gt;
  




&lt;/div&gt;
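&lt;div class="block-paragraph"&gt;&lt;p&gt;In Feast, the materialization and serving paths in this diagram reduce to a pair of API calls. The sketch below assumes a repository whose &lt;code&gt;feature_store.yaml&lt;/code&gt; is configured with a BigQuery offline store and a Bigtable online store; the repository path and feature names are illustrative.&lt;/p&gt;&lt;pre&gt;&lt;code&gt;from datetime import datetime

from feast import FeatureStore

store = FeatureStore(repo_path="movie_recs/")  # hypothetical repo

# Materialization: copy the latest feature values from the offline
# store (BigQuery) into the low-latency online store (Bigtable).
store.materialize_incremental(end_date=datetime.utcnow())

# Serving: read the same features back at request time for inference.
online_features = store.get_online_features(
    features=[
        "user_watch_stats:favorite_genre",
        "user_watch_stats:watch_minutes_7d",
    ],
    entity_rows=[{"user_id": 1001}],
).to_dict()&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Because both calls go through the same registered feature definitions, the offline training path and this online serving path read consistent values, which is how a feature store helps avoid training-serving skew.&lt;/p&gt;&lt;/div&gt;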
&lt;div class="block-paragraph"&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Storage&lt;/b&gt;: Feature stores persist feature data to support retrieval through feature serving layers. They typically contain an offline storage layer, such as &lt;b&gt;BigQuery&lt;/b&gt; or &lt;b&gt;Cloud Storage&lt;/b&gt; for ML model training as well as to provide ML model transparency and explainability to support customers’ internal ML model governance practices and policies.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Serving&lt;/b&gt;: Feature stores like Feast, through the abstractions they provide, serve feature data to the ML models that app developers integrate with their applications, a step that’s also known as feature materialization. To ensure that developers’ apps can respond quickly to the most up-to-date model predictions (e.g. to provide fresh content recommendations to end-users, show more relevant ads, or to reject fraudulent credit card payment attempts), a high-performance API backed by a low-latency database like &lt;b&gt;Bigtable&lt;/b&gt; is essential.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Registry&lt;/b&gt;: a feature repository that acts as a centralized source of truth for customers’ ML features and contains standardized feature definitions and metadata to enable different teams to reuse existing features for different ML use cases and applications.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Transformation&lt;/b&gt;: ML applications need to incorporate the freshest data into feature values using batch or stream processing frameworks like &lt;b&gt;Spark&lt;/b&gt;, &lt;a href="https://cloud.google.com/dataflow?"&gt;&lt;b&gt;Dataflow&lt;/b&gt;&lt;/a&gt;, or &lt;a href="https://cloud.google.com/pubsub"&gt;&lt;b&gt;Pub/Sub&lt;/b&gt;&lt;/a&gt; so that ML models generate the most timely and relevant predictions for end users. With Feast, these transformations can be configured based upon common feature definitions and similar metadata in a common feature registry &lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Monitoring (not shown)&lt;/b&gt;: operational monitoring, and especially data correctness and data-quality monitoring to detect behavior like &lt;a href="https://developers.google.com/machine-learning/guides/rules-of-ml#training-serving_skew" target="_blank"&gt;training-serving skew&lt;/a&gt; and &lt;a href="https://towardsdatascience.com/model-drift-in-machine-learning-models-8f7e7413b563" target="_blank"&gt;model drift&lt;/a&gt; are essential parts of any machine learning system. 
Feature stores like Feast can calculate correctness and quality metrics on the features they store and serve, communicating the overall health of an ML application and helping determine when intervention is necessary.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;Feast in action on Google Cloud&lt;/h3&gt;&lt;p&gt;Google Cloud customers use many of the following products in combination with Feast:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;BigQuery&lt;/b&gt;: Google Cloud’s fully managed, serverless data warehouse enables scalable analysis over petabytes of data and is a popular choice for offline feature storage, training, and evaluation&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Cloud Bigtable&lt;/b&gt;: Cloud Bigtable is Google Cloud’s fully managed, scalable NoSQL database service for large analytical and operational workloads and is a highly effective solution for online prediction and feature serving&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Dataflow&lt;/b&gt;: Dataflow is Google Cloud’s fully managed streaming analytics service, which minimizes latency, processing time, and cost through autoscaling and batch processing; it extracts, transforms, and loads data to and from data warehouses like BigQuery and databases like Cloud Bigtable to support use cases like ML feature transformation&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Dataproc&lt;/b&gt;: Dataproc is a fully managed and highly scalable service for running Apache Spark and 30+ other open source tools and frameworks. Spark ranks among the most popular batch and stream processing frameworks for ML practitioners.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;b&gt;Pub/Sub&lt;/b&gt;: Pub/Sub is Google Cloud’s asynchronous, scalable messaging service for streaming analytics and data integration pipelines; it ingests and distributes data and can be an excellent fit for on-demand streaming transformations of ML feature data&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Thanks for reading! In the second installment of this series of blog posts, we’ll build a prototype Feast feature store for an ML personalization use case using BigQuery, Cloud Bigtable, and Google Colab.&lt;/p&gt;&lt;p&gt;&lt;b&gt;Learn more&lt;/b&gt;&lt;/p&gt;&lt;p&gt;For more information, please visit the &lt;a href="https://feast.dev/" target="_blank"&gt;Feast website&lt;/a&gt;. As a developer, you can also get started with &lt;code&gt;pip install "feast[gcp]"&lt;/code&gt; and begin using a bootstrapped feature store on Google Cloud with &lt;code&gt;feast init -t gcp&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;For more information about installing Feast for Bigtable, click &lt;a href="https://docs.feast.dev/reference/online-stores/bigtable" target="_blank"&gt;here&lt;/a&gt;. To learn more about how Feast works with BigQuery, see &lt;a href="https://docs.feast.dev/reference/offline-stores/bigquery" target="_blank"&gt;here&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;&lt;b&gt;About Feast&lt;/b&gt;&lt;/p&gt;&lt;p&gt;Feast is a popular open source feature store that reuses organizations’ existing infrastructure to manage and serve machine learning features to real-time models. Feast enables organizations to consistently define, store, and serve ML features and to decouple ML from data infrastructure.&lt;/p&gt;&lt;p&gt;&lt;b&gt;About Tecton&lt;/b&gt;&lt;/p&gt;&lt;p&gt;Tecton is the main open source contributor to Feast. Tecton also offers a new, fully managed feature platform for real-time machine learning on Google Cloud.
For more information about this new platform, please see &lt;a href="https://tecton.ai/blog/integrating-tecton-and-google-cloud-platform" target="_blank"&gt;this announcement&lt;/a&gt;. &lt;/p&gt;&lt;p&gt;&lt;b&gt;About Google Cloud &lt;/b&gt;&lt;/p&gt;&lt;p&gt;Google Cloud accelerates every organization’s ability to digitally transform its business and industry. We deliver enterprise-grade solutions that leverage Google’s cutting-edge technology and tools to help developers build more sustainably. Customers in more than 200 countries and territories turn to Google Cloud as their trusted partner to enable growth and solve their most critical business problems.&lt;/p&gt;&lt;/div&gt;</description><pubDate>Tue, 25 Jul 2023 16:00:00 +0000</pubDate><guid>https://cloud.google.com/blog/products/databases/how-feast-feature-store-streamlines-ml-development/</guid><category>Data Analytics</category><category>AI &amp; Machine Learning</category><category>Open Source</category><category>Databases</category><og xmlns:og="http://ogp.me/ns#"><type>article</type><title>Streamlining ML development with Feast</title><description></description><site_name>Google</site_name><url>https://cloud.google.com/blog/products/databases/how-feast-feature-store-streamlines-ml-development/</url></og><author xmlns:author="http://www.w3.org/2005/Atom"><name>Danny Chiao</name><title>Engineering Lead, Tecton</title><department></department><company></company></author><author xmlns:author="http://www.w3.org/2005/Atom"><name>David Simmons</name><title>Product Manager, Cloud Bigtable</title><department></department><company></company></author></item></channel></rss>