NVIDIA Vera CPU Enters Production With 88 Olympus Cores for AI Factories

Luisa Crawford · Mar 16, 2026 20:23


NVIDIA has pushed its Vera CPU into full production, marking the company's most aggressive move yet into the data center CPU market dominated by AMD and Intel. The chip packs 88 custom Olympus cores and delivers up to 1.2 TB/s of memory bandwidth—specs purpose-built for the emerging agentic AI workloads that are straining traditional server architectures.

Partner systems from Cisco, Dell, HPE, Lenovo, and Supermicro are expected in the second half of 2026.

Why CPUs Matter in the GPU Era

Here's the counterintuitive reality: as AI models get smarter, CPUs become more critical, not less. Reinforcement learning workflows—the kind training models to code, reason, and use tools—generate massive amounts of output on GPUs that must then be compiled, tested, and evaluated on CPU clusters. When those CPU tasks run slowly, expensive GPU cycles sit idle waiting for results.

NVIDIA calls this the "post-training reality." Agentic AI systems spawn thousands of concurrent sandbox environments, each running code interpreters, web browsers, and database queries. Traditional server CPUs weren't designed for this pattern of thousands of single-threaded tasks running simultaneously.
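The workload pattern described above—many independent, single-threaded jobs dispatched at once—can be sketched in a few lines. This is an illustrative Python sketch, not an NVIDIA API; the function names and task counts are hypothetical.

```python
# Hypothetical sketch of the agentic-AI CPU pattern: thousands of
# independent, single-threaded sandbox tasks (compile, test, evaluate)
# run concurrently while GPUs wait on the results.
from concurrent.futures import ThreadPoolExecutor


def run_sandbox_task(task_id: int) -> str:
    # Stand-in for compiling, testing, or evaluating one piece of model output.
    return f"task-{task_id}: ok"


def evaluate_rollouts(num_tasks: int, workers: int) -> list:
    # Each task is single-threaded; throughput comes from running many at once,
    # which is exactly the pattern the article says traditional server CPUs
    # were not designed for.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_sandbox_task, range(num_tasks)))


results = evaluate_rollouts(num_tasks=1000, workers=64)
```

In a real reinforcement learning loop, each task would spin up an isolated sandbox (interpreter, browser, or database session), and per-core memory bandwidth would determine how fast those tasks drain the queue.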

The Olympus Architecture

Vera represents NVIDIA's first fully custom data center CPU core. The Olympus core uses a 10-wide instruction fetch and decode frontend with a neural branch predictor—essentially using AI to optimize AI workloads. It's built on Arm v9.2, maintaining compatibility with existing Arm containers and software.

The standout specification: each core gets up to 14 GB/s of memory bandwidth, roughly 3x what traditional data center CPUs provide per core. NVIDIA claims this enables over 90% sustained memory bandwidth utilization under load, compared to the significant degradation typical in competing architectures.

A new feature called Spatial Multithreading lets operators choose at runtime between maximum single-thread performance and higher thread counts—offering that flexibility while avoiding the tail-latency problems that plague traditional SMT implementations.

Performance Claims and Competitive Positioning

NVIDIA is claiming 1.5x sandbox performance versus AMD EPYC Turin and Intel Xeon 6 Granite Rapids across compilation, scripting, and runtime workloads. The company says this translates to faster reinforcement learning cycles and reduced user wait times in agentic inference.

The Vera CPU Rack configuration—256 liquid-cooled chips in a single rack—claims 4x the capacity and 2x the performance per watt of x86 server racks. NVIDIA positions this as drop-in compatible with existing NVL72 infrastructure, using the same cooling and power planning.
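The rack-level capacity claim can also be sanity-checked from the numbers given (a simple calculation from the article's figures, not an NVIDIA-published core count):

```python
# Back-of-envelope rack capacity from the stated configuration:
# 256 liquid-cooled Vera chips per rack, 88 Olympus cores per chip.
chips_per_rack = 256
cores_per_chip = 88

cores_per_rack = chips_per_rack * cores_per_chip  # 256 * 88 = 22,528 cores
```

Over 22,000 cores in a single rack is the scale at which NVIDIA's "4x capacity" comparison against conventional x86 server racks would be measured.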

Market Implications

For AI infrastructure investors, Vera signals NVIDIA's intent to capture more of the data center stack. The company isn't just selling GPUs anymore—it's selling complete AI factory solutions where CPUs, GPUs, networking, and management software all carry the NVIDIA brand.

The second-generation NVLink-C2C interconnect delivers 1.8 TB/s coherent bandwidth between Vera CPUs and Rubin GPUs, creating tight integration that third-party CPU vendors can't easily match. That's the real competitive moat: not just faster cores, but system-level optimization that locks customers into the NVIDIA ecosystem.

AMD and Intel now face pressure to demonstrate their CPUs can keep pace with agentic workload demands—or risk losing data center share to a competitor that controls both sides of the compute equation.
