Google Bets the Enterprise on Agents: Inside the Gemini Enterprise Agent Platform and 8th-Gen TPUs

From Models to Agents

For two years the cloud AI pitch was about access to the best model. At Cloud Next 2026, Google signaled that the pitch has changed. The company introduced the Gemini Enterprise Agent Platform as the explicit evolution of Vertex AI, with future Vertex services to be delivered through it, and described it as a comprehensive platform to build, scale, govern, and optimize agents. The framing matters: Google is no longer selling a model endpoint, it is selling the operating system for autonomous software that takes actions on a company's behalf.

The platform is built around four operational pillars, build, scale, govern, and optimize, and a deep bench of tooling that reads like an enterprise software catalog: Agent Studio, an Agent Development Kit, an Agent Runtime, Agent to Agent Orchestration, an Agent Gateway, Agent Identity, an Agent Registry, plus observability, simulation, and evaluation. The breadth is the point. Google is betting that the hard part of enterprise agents is not the model but the lifecycle around it, and that whoever owns that lifecycle owns the customer.

A Deliberately Open Model Layer

Notably, Google did not wall the platform off around its own models. The Gemini Enterprise Agent Platform offers access to more than 200 models, including Gemini 3.1 Pro and, pointedly, Anthropic's Claude variants alongside third party options. That is a calculated concession to how enterprises actually buy. Customers want to route different tasks to different models, and a platform that forced everything through Gemini would lose to one that did not. Google would rather take the platform margin and the data gravity than insist on model lock in.

We read this as Google conceding the model monogamy battle to win the platform war. The same pattern is visible across the industry: Microsoft hosts rivals in Foundry, AWS hosts everyone in Bedrock, and now Google hosts Claude inside its agent platform. The strategic logic is that the model is becoming a swappable component while the orchestration, governance, and data layers are sticky. If Google can make its platform the place where agents are built, governed, and observed, the choice of underlying model becomes a detail it is happy to let customers make.

The Silicon Underneath

The agent story rests on hardware, and here Google flexed. Its eighth generation TPUs come in two flavors. The TPU 8t, aimed at training, delivers 121 exaflops of compute, claims three times the performance of the previous generation and ten times faster storage access, and scales to 9,600 chips. The TPU 8i, tuned for inference, claims 80 percent better performance per dollar, five times lower on chip latency, and ships with 288 gigabytes of high bandwidth memory and 384 megabytes of on chip SRAM, triple the prior generation.

Performance per dollar on inference is the number enterprises should watch. Training grabs headlines, but the recurring cost of running agents in production is inference, and an 80 percent improvement there changes the unit economics of deploying agents at scale. Google's vertical integration, designing its own silicon in partnership with Broadcom, is precisely what lets it make that claim. The hyperscalers that control their accelerators control the cost curve, and that cost curve is what determines whether agentic deployments pencil out for a CFO.

Wiring a Campus Into One Computer

The most ambitious announcement was architectural. Google's new Virgo network fabric implements what it calls a campus as a computer design, connecting 134,000 TPU 8t chips with 47 petabits per second of non blocking bandwidth, four times the bandwidth per TPU and 40 percent lower unloaded fabric latency than before. At that scale the distinction between a cluster and a single machine starts to blur, which is exactly the point for training and serving frontier scale models.

This is where the power bottleneck the whole industry is wrestling with becomes concrete. Wiring 134,000 accelerators into one fabric is as much a power and cooling problem as a networking one, and it is why hyperscaler capital expenditure is projected to exceed 600 billion dollars in 2026. Google complemented Virgo with cross cloud plumbing, including high throughput VMs and a managed Lustre file system at ten terabytes per second, signaling that it wants to be the substrate even for workloads that span AWS and Azure.

Sovereignty as a Feature

Alongside the raw capability, Google leaned hard into control. The company introduced new sovereign controls and client side encryption that let enterprises lock data processing and storage to specific jurisdictions such as the US and EU, with more countries like Germany and India to follow, plus Confidential External Key Management. Partners are amplifying the message: Capgemini expanded its alliance with Google Cloud specifically to deliver end to end sovereign cloud and AI services on top of Vertex AI and Gemini Enterprise.

Sovereignty has graduated from a European compliance checkbox to a primary purchasing criterion, and Google clearly sees it as a wedge against rivals. For regulated enterprises in finance, healthcare, and government, the ability to guarantee that data and inference stay within a legal boundary is increasingly the deciding factor in cloud selection. By baking jurisdictional control directly into the agent platform, Google is trying to remove the objection that has historically slowed AI adoption in the most conservative, and most lucrative, enterprise segments.

Our Read

Cloud Next 2026 was a coherent argument rather than a grab bag of features. Google's thesis is that the enterprise future is agentic, that agents need a full lifecycle platform rather than a model API, that the platform must be open to rival models to win adoption, and that all of it must run on custom silicon cheap enough to make production economics work, inside jurisdictions customers can control. Every announcement laddered up to that thesis, which is more than most vendor keynotes can claim.

The open question is execution and trust. Google has the best argument and arguably the best inference economics, but it remains third in cloud market share and carries a reputation for deprecating products that makes some enterprises wary of betting their agent strategy on its roadmap. The technology shown at Next is genuinely strong. Whether conservative buyers will commit their most important new workloads to Google, rather than to the incumbents they already trust, is the question the next year will answer.

From Models to Agents

A Deliberately Open Model Layer

The Silicon Underneath

Wiring a Campus Into One Computer

Sovereignty as a Feature

Our Read

New York Votes to Freeze AI Datacenter Construction: Governor Hochul Must Now Decide

New York Legislature Passes One-Year Datacenter Moratorium

The AI FinOps Reckoning: Token Costs Coming Due