

Anthropic Launches Multi-Agent Code Review for Claude Code Enterprise

Rongchai Wang   Mar 09, 2026 21:14


Anthropic released Code Review for Claude Code on March 9, deploying multiple AI agents to analyze pull requests with a depth the company claims catches bugs that quick human scans typically miss. The feature enters research preview for Team and Enterprise customers.

The timing addresses a real bottleneck. Anthropic reports code output per engineer jumped 200% over the past year, straining review capacity. Before Code Review, just 16% of the company's internal PRs received substantive comments. That figure now sits at 54%.

How the System Operates

When developers open a pull request, Code Review spawns a team of agents working in parallel. These agents hunt for bugs independently, cross-verify findings to filter false positives, then rank issues by severity. The output lands as a single overview comment plus inline annotations for specific problems.
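The spawn-verify-rank flow described above can be sketched in miniature. This is a hypothetical illustration, not Anthropic's implementation: the `run_agent` stub, the tuple-based finding format, and the quorum-of-two cross-check are all invented stand-ins, since the company has not published the system's internals.

```python
from concurrent.futures import ThreadPoolExecutor
from collections import Counter

def run_agent(agent_id, diff):
    # Stand-in for a real agent, which would analyze the diff with an LLM.
    # Each agent independently returns (file, line, severity) findings.
    findings = {
        ("auth.py", 42, "critical"),
        ("cache.py", 7, "minor"),
    }
    if agent_id == 2:
        # Simulate one noisy agent producing a false positive.
        findings.add(("utils.py", 99, "minor"))
    return findings

def review(diff, n_agents=3, quorum=2):
    # Agents hunt for bugs in parallel.
    with ThreadPoolExecutor(max_workers=n_agents) as pool:
        results = list(pool.map(lambda i: run_agent(i, diff), range(n_agents)))
    # Cross-verify: keep only findings reported by at least `quorum` agents,
    # filtering out single-agent false positives.
    counts = Counter(f for r in results for f in r)
    verified = [f for f, c in counts.items() if c >= quorum]
    # Rank surviving issues by severity for the overview comment.
    severity_rank = {"critical": 0, "major": 1, "minor": 2}
    return sorted(verified, key=lambda f: severity_rank[f[2]])
```

Running `review("…")` here returns the two findings all agents agree on, critical first, while the lone false positive is filtered out by the quorum check.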

Review depth scales automatically. Large, complex changes get more agents and longer analysis; trivial updates get a quick pass. Average review time runs around 20 minutes, according to Anthropic.
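A scaling policy like the one described might look something like this sketch. The specific thresholds and agent counts are assumptions chosen for illustration; Anthropic has not disclosed its actual scaling rules.

```python
def review_plan(lines_changed):
    # Hypothetical tiers: trivial diffs get a quick pass,
    # large complex changes get more agents and longer analysis.
    if lines_changed < 50:
        return {"agents": 1, "depth": "quick pass"}
    if lines_changed < 1000:
        return {"agents": 3, "depth": "standard"}
    return {"agents": 6, "depth": "deep analysis"}
```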

The agents won't approve PRs—that remains a human decision. But the system aims to ensure reviewers aren't rubber-stamping code they haven't actually examined.

Internal Results Tell the Story

Anthropic's internal testing shows clear patterns. On PRs exceeding 1,000 changed lines, 84% receive findings, with an average of 7.5 issues flagged. Smaller PRs under 50 lines see findings on just 31%, averaging 0.5 issues. Engineers dispute less than 1% of findings as incorrect.

One case stood out: a single-line change to a production service—the kind of diff that typically gets waved through—would have broken authentication entirely. Code Review flagged it as critical before merge. The engineer admitted they wouldn't have caught it manually.

Early access customers report similar catches. On a ZFS encryption refactor in TrueNAS's open-source middleware, the system spotted a pre-existing bug in adjacent code: a type mismatch silently wiping the encryption key cache on every sync. That's the kind of latent issue hiding in code a PR happens to touch, invisible to reviewers scanning changesets.

Pricing and Controls

This isn't cheap. Reviews bill on token usage, averaging $15-25 per PR depending on size and complexity. That's significantly pricier than Anthropic's existing open-source GitHub Action, which remains available for lighter-weight checks.

Admins get spending controls: monthly organization caps, repository-level toggles, and an analytics dashboard tracking review counts, acceptance rates, and costs. Once enabled, reviews trigger automatically on new PRs with no developer configuration required.
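Those controls could be modeled roughly as follows. The class name, fields, and enforcement logic are assumptions for the sake of a concrete example, not Anthropic's actual admin API.

```python
from dataclasses import dataclass, field

@dataclass
class OrgReviewPolicy:
    """Hypothetical model of org-level review controls."""
    monthly_cap_usd: float
    enabled_repos: set = field(default_factory=set)
    spent_usd: float = 0.0

    def should_review(self, repo, estimated_cost):
        # A review runs only if the repo toggle is on and the
        # estimated cost fits under the monthly organization cap.
        within_cap = self.spent_usd + estimated_cost <= self.monthly_cap_usd
        return repo in self.enabled_repos and within_cap

    def record(self, cost):
        # Track actual spend for the analytics dashboard.
        self.spent_usd += cost
```

Under this sketch, a review on a disabled repository or one that would blow past the monthly cap is simply skipped, matching the automatic, zero-developer-configuration behavior the article describes.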

The release follows Claude Code Security's limited preview launch on February 20, which scans codebases for vulnerabilities. Together, these features position Claude Code as increasingly comprehensive infrastructure for enterprise development teams willing to pay for depth over speed.

