

Anthropic Launches Advisor Tool Cutting AI Agent Costs by 85%

Lawrence Jengar   Apr 09, 2026 18:50


Anthropic just dropped a tool that could reshape how developers budget for AI agents. The advisor tool, announced April 9, lets cheaper Claude models tap into Opus-level intelligence only when needed—cutting costs by up to 85% while maintaining competitive performance.

The mechanism is straightforward: Sonnet or Haiku runs your agent end-to-end, handling tool calls and iterations. When it hits a wall, it escalates to Opus for guidance. Opus never touches tools or user output directly—it just advises and hands control back.
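The escalation loop described above can be sketched in a few lines. This is a conceptual illustration only—the class names, the `next_step`/`advise` methods, and the stub models are all hypothetical, not Anthropic's API:

```python
from dataclasses import dataclass

@dataclass
class Step:
    kind: str          # "done", "stuck", or "progress" (illustrative states)
    output: str = ""
    result: str = ""

class StubExecutor:
    """Stand-in for the cheap model: gets stuck once, then finishes after guidance."""
    def next_step(self, state):
        if not state["history"]:
            return Step(kind="stuck")
        return Step(kind="done", result="task complete")

class StubAdvisor:
    """Stand-in for Opus: returns text guidance only, never calls tools."""
    def advise(self, state):
        return "try approach B"

def run_agent(task, executor, advisor, max_advisor_uses=3):
    state = {"task": task, "history": []}
    uses = 0
    while True:
        step = executor.next_step(state)          # executor drives tools and output
        if step.kind == "done":
            return step.result, uses
        if step.kind == "stuck" and uses < max_advisor_uses:
            state["history"].append(advisor.advise(state))  # consult, then hand back
            uses += 1
        else:
            state["history"].append(step.output)
```

The key property the sketch captures: the advisor's output lands in the executor's context as guidance, and control always returns to the cheap model.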

The Numbers That Matter

Anthropic's benchmarks tell an interesting story. Sonnet with an Opus advisor scored 2.7 percentage points higher on SWE-bench Multilingual than Sonnet alone, while actually costing 11.9% less per task. That's better performance for less money—not a tradeoff most developers expect.

Haiku users see even more dramatic shifts. On BrowseComp, Haiku with Opus advisor hit 41.2%—more than double its solo score of 19.7%. Yes, it still trails Sonnet's standalone performance by 29%, but here's the kicker: it costs 85% less per task. For high-volume operations where you're burning through thousands of agent calls daily, that math gets very attractive very fast.

Why This Matters Now

The timing isn't accidental. Anthropic shipped Sonnet 4.6 in mid-February, which already matched Opus-level performance in many tasks. OpenAI countered with GPT-5.4 in early March, unifying their Codex and GPT lines with million-token context. The AI agent space is getting crowded, and cost efficiency is becoming the battleground.

The advisor tool flips the typical orchestration pattern. Instead of a big model delegating to smaller workers, a cheap model drives everything and only escalates when stuck. No decomposition logic, no worker pools—just a single API call with built-in handoffs.

Implementation Details

Developers add one tool declaration to their Messages API request. The advisor_20260301 type routes context to Opus automatically when the executor model decides it needs help. A max_uses parameter caps advisor calls per request, and tokens bill at each model's respective rate.
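Based on the details above, a request body might look like the following. The `advisor_20260301` type and `max_uses` parameter come from the announcement; the model id and the exact field layout beyond that are assumptions, not a confirmed schema:

```python
# Hypothetical Messages API request body with the advisor tool declared
# alongside an existing tool. Field names outside "advisor_20260301" and
# "max_uses" are illustrative assumptions.
request_body = {
    "model": "claude-haiku-4-5",           # hypothetical executor model id
    "max_tokens": 2048,
    "tools": [
        {"type": "advisor_20260301",       # routes context to Opus when the executor is stuck
         "max_uses": 3},                   # caps advisor consultations per request
        {"type": "web_search_20250305"},   # existing tools slot in alongside (illustrative)
    ],
    "messages": [{"role": "user", "content": "Fix the failing test in the repo"}],
}
```

Nothing else about the agent loop changes—the executor decides on its own when to invoke the advisor, up to the `max_uses` cap.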

Since Opus typically generates just 400-700 tokens of guidance per consultation while the executor handles full output at lower rates, overall spend stays well below running Opus end-to-end.
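A back-of-envelope model shows why the blended spend stays low. The per-token rates below are illustrative placeholders, not Anthropic's actual pricing; the ~600-token consultation comes from the 400-700 range cited above:

```python
# Hypothetical output-token rates in $/token (NOT actual Anthropic pricing).
OPUS_RATE = 75.0 / 1_000_000
HAIKU_RATE = 4.0 / 1_000_000

def task_cost(executor_tokens, advisor_tokens, exec_rate, adv_rate):
    # Executor generates the full output at the cheap rate;
    # advisor contributes only a short guidance burst at the expensive rate.
    return executor_tokens * exec_rate + advisor_tokens * adv_rate

advised = task_cost(5_000, 600, HAIKU_RATE, OPUS_RATE)   # Haiku + one Opus consult
opus_only = task_cost(5_000, 0, OPUS_RATE, OPUS_RATE)    # Opus end-to-end

savings = 1 - advised / opus_only   # fraction saved vs. running Opus throughout
```

With these placeholder numbers the advised configuration comes out roughly 80% cheaper than Opus end-to-end, in the same ballpark as the 85% figure Anthropic reports for Haiku.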

The tool slots alongside existing capabilities—web search, code execution, whatever you're already using. No architectural overhaul required.

For teams already running Claude agents at scale, this is a straightforward optimization. For those evaluating Anthropic against OpenAI's expanding lineup, it's another variable in the cost-performance equation that's worth modeling before committing infrastructure.