The world’s most-watched AI infrastructure event just delivered its biggest reveal yet. At NVIDIA GTC 2026 in San Jose, CEO Jensen Huang took the stage to detail the Vera Rubin GPU platform — the company’s next major leap in AI accelerator design.
What Is Vera Rubin?
Named after the pioneering astrophysicist whose galaxy rotation measurements provided key evidence for dark matter, the Vera Rubin platform pairs a new Rubin GPU with NVIDIA’s custom Vera CPU. The combination is designed from the ground up for AI data centres running trillion-parameter models at scale.
Key specs revealed at the keynote:
- TSMC 3 nm process — a full node shrink from Blackwell, reducing power consumption while boosting transistor density
- HBM4 memory — delivering dramatically higher memory bandwidth crucial for inference workloads
- Inference-first architecture — Huang described the chip as built to unlock the “brain” of AI, with significantly enhanced inference throughput per watt
NVIDIA confirmed that initial samples have already shipped to select customers, with general production expected in the second half of 2026.
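To see why "inference throughput per watt" is the headline metric, a quick back-of-envelope helps: at data-centre scale, the electricity cost of serving tokens is roughly proportional to power draw divided by throughput. The sketch below uses entirely hypothetical numbers — none of these figures are published Rubin or Blackwell specs — just to show how doubling throughput at a fixed power budget halves the energy cost per token.

```python
# Hypothetical back-of-envelope: why inference throughput per watt matters.
# All figures here are illustrative assumptions, not published chip specs.

def cost_per_million_tokens(tokens_per_sec: float, watts: float,
                            usd_per_kwh: float = 0.10) -> float:
    """Electricity cost (USD) to generate one million tokens."""
    seconds = 1_000_000 / tokens_per_sec      # time to serve 1M tokens
    kwh = watts * seconds / 3_600_000         # W·s -> kWh
    return kwh * usd_per_kwh

# Assumed numbers: an accelerator serving 10,000 tok/s at 1,000 W,
# versus a successor serving 20,000 tok/s in the same power envelope.
baseline = cost_per_million_tokens(10_000, 1_000)
doubled = cost_per_million_tokens(20_000, 1_000)
print(f"baseline: ${baseline:.4f}/M tokens, 2x throughput: ${doubled:.4f}/M tokens")
```

The absolute numbers are tiny per million tokens, but multiplied across fleets serving trillions of tokens a day, perf-per-watt gains compound into the cost advantage the article describes.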
Why It Matters
The Blackwell generation set the standard for AI training. Vera Rubin shifts the focus: as ever-larger training runs yield diminishing returns for frontier models, inference efficiency becomes the defining competitive advantage.
NVIDIA is betting that whoever controls the inference layer controls the AI economy. Vera Rubin is that bet made in silicon.
GTC 2026 at a Glance
Running March 16–19 in San Jose and streamed globally, GTC 2026 also featured sessions on:
- Physical AI and humanoid robotics pipelines
- Agentic AI infrastructure and multi-agent orchestration
- AI factories — the next evolution of the data centre concept
With compute costs continuing to fall and inference demand accelerating, the Vera Rubin announcement lands at precisely the right moment.
Sources: tomshardware.com, nvidia.com, techrepublic.com