
NVIDIA and Google Cloud: Powering the Next Generation of AI with Agentic and Physical AI

Asked 2026-05-04 20:04:58 Category: Technology

For over a decade, NVIDIA and Google Cloud have worked closely together, building a comprehensive AI platform that covers everything from optimized libraries and frameworks to enterprise-ready cloud services. This deep collaboration has paved the way for developers, startups, and large organizations to move agentic and physical AI from experimental labs into real-world production. From AI agents that handle complex workflows to robots and digital twins operating on factory floors, the partnership has become a cornerstone of modern AI deployment. At Google Cloud Next in Las Vegas, the two companies announced a series of new offerings that push the boundaries further, including next-generation infrastructure, expanded GPU options, and enhanced support for agentic AI workloads.

1. What is the cornerstone of the NVIDIA and Google Cloud partnership?

The partnership is built on a decade-long history of co-engineering a full-stack AI platform that spans every technology layer, from performance-optimized libraries and frameworks to enterprise-grade cloud services. This collaboration provides a seamless foundation for cutting-edge workloads, enabling developers to deploy agentic AI — autonomous systems that manage complex tasks — and physical AI such as robotics and digital twins. By combining Google Cloud's scalable infrastructure with NVIDIA's industry-leading hardware and software, the two companies offer customers the flexibility to train, tune, and serve everything from frontier models to open models, while optimizing for performance, cost, and sustainability.

Source: blogs.nvidia.com

2. What major infrastructure announcement was made at Google Cloud Next 2025?

At Google Cloud Next, the companies announced the upcoming A5X bare-metal instances powered by the NVIDIA Vera Rubin NVL72 rack-scale system. These instances represent a leap in AI infrastructure, delivering up to 10x lower inference cost per token and 10x higher token throughput per megawatt than the previous generation. This performance is achieved through deep co-design across chips, systems, and software. Additionally, A5X will pair NVIDIA ConnectX-9 SuperNICs with Google's next-generation Virgo networking, enabling scaling to up to 80,000 NVIDIA Rubin GPUs in a single-site cluster and up to 960,000 GPUs in a multi-site cluster. This makes it possible for customers to run their largest AI workloads on NVIDIA-optimized infrastructure.
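As a back-of-the-envelope check on the scaling figures above, the quoted cluster limits can be converted into rack counts using the 72-GPU NVL72 rack size. The rack counts below are derived arithmetic, not figures from the announcement:

```python
import math

GPUS_PER_NVL72_RACK = 72     # Vera Rubin NVL72: 72 GPUs per rack (from the announcement)
SINGLE_SITE_GPUS = 80_000    # quoted single-site cluster limit
MULTI_SITE_GPUS = 960_000    # quoted multi-site cluster limit

def racks_needed(total_gpus: int, gpus_per_rack: int = GPUS_PER_NVL72_RACK) -> int:
    """Minimum number of full NVL72 racks to reach a given GPU count."""
    return math.ceil(total_gpus / gpus_per_rack)

print(racks_needed(SINGLE_SITE_GPUS))  # 1112 racks for a single-site cluster
print(racks_needed(MULTI_SITE_GPUS))   # 13334 racks across a multi-site cluster
```

In other words, the multi-site limit implies stitching together on the order of thirteen thousand NVL72 racks over the networking fabric described above.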

3. How does the A5X platform improve AI performance and efficiency?

The A5X platform with Vera Rubin delivers dramatic improvements in both cost and energy efficiency: 10x lower inference cost per token and 10x higher token throughput per megawatt relative to the prior generation. These gains come from close integration of hardware and software, including NVIDIA NVLink and NVLink Switch interconnect technology. The rack-scale design allows scaling from a single NVL72 rack with 72 Rubin GPUs to multiple interconnected racks spanning tens of thousands of GPUs. Across the broader portfolio, customers can also opt for fractional GPU instances, such as one-eighth of a GPU, giving them the flexibility to right-size acceleration for their specific workloads. This comprehensive approach ensures strong performance for both training and inference tasks.
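To make the headline ratios concrete, here is a minimal sketch of what "10x lower cost per token" and "10x higher tokens per megawatt" imply for a hypothetical baseline. The baseline numbers are illustrative assumptions, not figures from the announcement:

```python
COST_IMPROVEMENT = 10.0        # 10x lower inference cost per token (quoted ratio)
THROUGHPUT_IMPROVEMENT = 10.0  # 10x higher token throughput per megawatt (quoted ratio)

# Hypothetical prior-generation baseline, for illustration only.
baseline_cost_per_million_tokens = 2.00     # dollars
baseline_tokens_per_sec_per_mw = 500_000.0  # tokens/s per megawatt

new_cost = baseline_cost_per_million_tokens / COST_IMPROVEMENT
new_throughput = baseline_tokens_per_sec_per_mw * THROUGHPUT_IMPROVEMENT

print(f"${new_cost:.2f} per million tokens")     # $0.20 per million tokens
print(f"{new_throughput:,.0f} tokens/s per MW")  # 5,000,000 tokens/s per MW
```

The two ratios compound in practice: serving the same token volume costs a tenth as much while the same power envelope serves ten times the traffic.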

4. What does the NVIDIA Blackwell portfolio offer on Google Cloud?

Google Cloud's NVIDIA Blackwell portfolio includes a wide range of virtual machine options to suit different needs. These range from A4 VMs with NVIDIA HGX B200 systems to A4X VMs with rack-scale NVIDIA GB200 NVL72 and GB300 NVL72 systems, all the way down to G4 VMs with fractional NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs. This breadth allows customers to choose the right balance of performance and cost — whether they need multiple interconnected NVL72 racks for massive scale, a single rack for moderate workloads, or just a fraction of a GPU for smaller tasks. The portfolio also supports confidential VMs with NVIDIA Blackwell GPUs, providing enhanced security for sensitive AI workloads. All options are designed to integrate seamlessly with Google Cloud's managed AI services.
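One way to think about right-sizing across this portfolio is to pick the smallest GPU fraction whose memory share covers the workload's footprint. The sketch below assumes a 96 GB card (the RTX PRO 6000 Blackwell's advertised memory) and a hypothetical tier list of fractions; the tiers are an illustration, not Google Cloud's actual fractional-instance offering:

```python
FULL_GPU_MEMORY_GB = 96.0         # assumed card memory (RTX PRO 6000 Blackwell)
FRACTIONS = [1/8, 1/4, 1/2, 1.0]  # hypothetical fractional-instance tiers

def smallest_fraction(model_memory_gb: float) -> float:
    """Return the smallest GPU fraction whose memory share fits the workload."""
    for frac in FRACTIONS:
        if frac * FULL_GPU_MEMORY_GB >= model_memory_gb:
            return frac
    raise ValueError("Workload exceeds one GPU; move up the portfolio or shard.")

print(smallest_fraction(10.0))  # 0.125 -> one-eighth of a GPU (12 GB share)
print(smallest_fraction(40.0))  # 0.5   -> half a GPU (48 GB share)
```

The same selection logic extends upward through the portfolio: workloads that overflow a single card move to A4 or A4X instances, and rack-scale NVL72 systems cover the largest jobs.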


5. How does the collaboration support agentic and physical AI use cases?

The NVIDIA and Google Cloud partnership directly enables the development and deployment of agentic AI — autonomous agents that manage complex workflows — and physical AI, including robots and digital twins. Key support comes from integrating NVIDIA's Nemotron open models and the NeMo framework with Google's Gemini Enterprise Agent Platform. This combination allows developers to build, train, and deploy agents that can reason, plan, and execute tasks in real-world environments. Additionally, the infrastructure optimizations — such as the low-latency networking and high-throughput GPU configurations — are critical for real-time robotics and digital twin simulations. By providing both the hardware and software layers, the partnership helps move these AI systems from labs to production environments like factories and logistics centers.
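The "reason, plan, and execute" pattern described above can be sketched as a minimal agent loop. This is a generic illustration in plain Python, not NeMo or Gemini Enterprise API code; the planner is stubbed out where a real agent would call a model (e.g. a Nemotron model), and every function and tool name here is hypothetical:

```python
from typing import Callable

# Hypothetical tool registry: an agent maps planned steps to concrete actions.
TOOLS: dict[str, Callable[[str], str]] = {
    "lookup": lambda arg: f"result-for-{arg}",  # stand-in for a retrieval tool
    "notify": lambda arg: f"sent-{arg}",        # stand-in for a side-effecting tool
}

def plan(goal: str) -> list[tuple[str, str]]:
    """Stub planner: a real agent would invoke an LLM here to decompose the goal."""
    return [("lookup", goal), ("notify", "summary")]

def run_agent(goal: str) -> list[str]:
    """Execute each planned (tool, argument) step, collecting observations."""
    observations = []
    for tool_name, argument in plan(goal):
        observations.append(TOOLS[tool_name](argument))
    return observations

print(run_agent("inventory-levels"))  # ['result-for-inventory-levels', 'sent-summary']
```

The infrastructure points in the answer above matter precisely because each loop iteration is a model inference: low-latency networking and high token throughput directly bound how quickly an agent can plan and act.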

6. What is the significance of the new Gemini preview on Google Distributed Cloud?

The preview of Google Gemini on Google Distributed Cloud running on NVIDIA Blackwell and Blackwell Ultra GPUs marks a significant step in bringing advanced AI to edge and hybrid environments. This offering allows enterprises to run Gemini models on premises or at the edge while leveraging NVIDIA's latest GPU architectures. It provides consistent performance and seamless integration with cloud services, enabling low-latency inference for applications like real-time agentic AI and digital twins. The combination supports critical use cases where data privacy, compliance, or network connectivity require local processing. By extending the partnership to distributed cloud, NVIDIA and Google Cloud give customers more deployment flexibility without sacrificing AI performance or security.