

OpenAI Agents SDK Gets Sandbox Execution and Model-Native Harness

Caroline Bishop   Apr 17, 2026 17:45


OpenAI has shipped a substantial upgrade to its Agents SDK, adding native sandbox execution and a model-native harness that lets developers build AI agents capable of working across files, running commands, and handling multi-step tasks in controlled environments.

The April 15, 2026 release addresses a persistent pain point for teams moving from prototype to production: the gap between having a capable model and having infrastructure that actually supports how agents need to work.

What's Actually New

The updated SDK introduces two core capabilities. First, a model-native harness with configurable memory, sandbox-aware orchestration, and filesystem tools similar to those powering Codex. Second, native sandbox execution that gives agents a proper workspace—they can read and write files, install dependencies, run code, and use tools without developers cobbling together their own execution layer.

For sandbox providers, OpenAI isn't forcing developers into a single option. Built-in support covers Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, and Vercel. Bring your own sandbox if you prefer.

The SDK also introduces a Manifest abstraction for describing an agent's workspace. Developers can mount local files, define output directories, and pull data from AWS S3, Google Cloud Storage, Azure Blob Storage, or Cloudflare R2. This creates portability: the same workspace definition works from local development through production deployment.
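The article doesn't spell out the Manifest's actual interface, but the idea of a single portable workspace definition can be modeled as a small value object. The class name, field names, and bucket URI below are hypothetical stand-ins, not the SDK's real API:

```python
from dataclasses import dataclass, field

# Hypothetical model of a workspace manifest; the real SDK's Manifest
# may use different names, types, and mount semantics.
@dataclass(frozen=True)
class WorkspaceManifest:
    mounts: dict[str, str] = field(default_factory=dict)     # local path -> sandbox path
    output_dir: str = "/workspace/out"                        # where the agent writes results
    remote_sources: list[str] = field(default_factory=list)   # e.g. s3:// or gs:// URIs

# The same definition could describe a local dev run and a production run:
manifest = WorkspaceManifest(
    mounts={"./data": "/workspace/data"},
    remote_sources=["s3://example-bucket/records/"],  # hypothetical bucket
)
print(manifest.output_dir)
```

Because the object is immutable and purely declarative, the same value can be handed to a local sandbox, a cloud provider, or a CI job without modification, which is the portability claim being made.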

Why the Architecture Matters

OpenAI explicitly designed the SDK assuming prompt-injection and data exfiltration attempts will happen. By separating the harness from compute, credentials stay out of environments where model-generated code executes.
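The harness/compute split described above boils down to a simple rule: credentialed calls happen on the harness side, and only the fetched, scoped data crosses into the environment that runs model-generated code. A minimal sketch of that boundary, with every name and the token value invented for illustration:

```python
# Hypothetical sketch of harness/sandbox separation; none of these
# names come from the SDK itself.
SECRET_TOKEN = "sk-example"  # held by the harness, never forwarded

def fetch_records(token: str) -> list[str]:
    # Stand-in for a credentialed fetch performed by the harness process.
    return ["record-1", "record-2"]

def sandbox_exec(payload: list[str]) -> int:
    # Stand-in for untrusted execution: the sandbox sees the data,
    # not the credential, so exfiltrating the token isn't possible here.
    return len(payload)

data = fetch_records(SECRET_TOKEN)   # credential used only on the harness side
result = sandbox_exec(data)          # sandbox receives scoped data only
print(result)
```

Even if a prompt-injection attack fully controls what runs inside `sandbox_exec`, the token was never in scope there, which is the property the design is after.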

The separation also enables durable execution through snapshotting and rehydration. If a sandbox container fails or expires, the SDK can restore agent state in a fresh container and continue from the last checkpoint. For long-running tasks, that's the difference between a catastrophic failure and a minor hiccup.
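The SDK's snapshot format isn't described in the release, but the checkpoint-and-restore pattern it enables looks roughly like this. `run_step`, the state dict, and the serialization choice are all illustrative stand-ins, not SDK APIs:

```python
import json

def run_step(state: dict) -> dict:
    """One unit of agent work; illustrative only."""
    state = dict(state)
    state["step"] = state.get("step", 0) + 1
    return state

def snapshot(state: dict) -> str:
    # Serialize agent state so it can outlive the container it ran in.
    return json.dumps(state)

def rehydrate(snap: str) -> dict:
    # Restore the state into a fresh container and continue.
    return json.loads(snap)

state = {"task": "summarize records", "step": 0}
state = run_step(state)
snap = snapshot(state)      # checkpoint before the container expires
# ...container fails or expires here...
state = rehydrate(snap)     # fresh container picks up where it left off
state = run_step(state)
print(state["step"])  # 2
```

The key point is that progress survives the container, so a mid-task failure costs one step of rework rather than the whole run.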

Scalability benefits too: agent runs can spin up multiple sandboxes, invoke them only when needed, route subagents to isolated environments, and parallelize work across containers.
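The fan-out pattern behind that claim is easy to sketch. Here a thread pool stands in for a fleet of sandbox containers, and `run_in_sandbox` is a hypothetical placeholder for provisioning one isolated environment per subtask; a real run would dispatch subagents to separate containers instead:

```python
from concurrent.futures import ThreadPoolExecutor

def run_in_sandbox(subtask: str) -> str:
    # Stand-in for routing a subagent to its own isolated sandbox.
    return f"{subtask}: done"

subtasks = ["lint", "test", "build docs"]
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(run_in_sandbox, subtasks))
print(results)
```

Because each subtask runs in its own environment, a failure or a compromise in one doesn't touch the others, and the containers can be spun up lazily rather than held open for the whole run.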

Early Production Results

Oscar Health tested the SDK on clinical records workflows. According to Rachael Burns, Staff Engineer and AI Tech Lead, the update made it "production-viable to automate a critical clinical records workflow that previous approaches couldn't handle reliably enough." The specific improvement: correctly understanding encounter boundaries in complex medical records, not just extracting metadata.

Current Limitations

The new harness and sandbox capabilities launch in Python only. TypeScript support is coming but doesn't have a firm date. Code mode and subagent features are also planned for both languages in future releases.

Pricing follows standard API rates based on tokens and tool use; no separate sandbox fees were mentioned.

OpenAI says it's working to expand sandbox provider integrations and make the SDK plug into more existing developer toolchains. For teams already building agent systems with model-agnostic frameworks, the pitch is clear: closer alignment with how frontier models actually perform best, without sacrificing flexibility on where agents run or how they access sensitive data.

