What is Docker Sandboxes?

Docker Sandboxes (sbx) is a microVM-based runtime for running AI coding agents safely. Instead of letting agents like Claude Code, Codex, or Gemini CLI run as a regular user process on your laptop — with full access to your home directory, SSH keys, AWS credentials, and Docker daemon — sbx runs each agent inside a lightweight virtual machine with its own kernel. Only the workspace directory you explicitly mount is shared. Everything else stays on the host.

The boundary is structural, enforced by the hypervisor — not by a permission dialog, a system prompt, or a policy document the agent has agreed to follow.

What problem does it solve?

When a developer runs an AI coding agent on their laptop without sandboxing, the agent runs as their user. That means it has access to everything they have access to:

  • ~/.aws/credentials — AWS access keys
  • ~/.ssh/id_rsa — SSH private keys
  • .env files — database passwords, API tokens
  • The entire home directory
  • Any running Docker containers on the host
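
To make that exposure concrete, here is a minimal sketch of what any process running as your user can discover. The function name `exposed_secrets` and the `CANDIDATES` list are illustrative assumptions based on the list above. The code only checks readability and never opens the files.

```python
import os
from pathlib import Path

# Candidate paths taken from the list above (illustrative, not exhaustive).
CANDIDATES = [".aws/credentials", ".ssh/id_rsa", ".env"]


def exposed_secrets(home: Path) -> list[str]:
    """Return the candidate paths under `home` that this process can read."""
    found = []
    for rel in CANDIDATES:
        path = home / rel
        # os.access tests the permissions of the current user, which is the
        # same identity an unsandboxed agent runs with.
        if path.is_file() and os.access(path, os.R_OK):
            found.append(rel)
    return found


if __name__ == "__main__":
    for rel in exposed_secrets(Path.home()):
        print(f"readable by any process running as you: ~/{rel}")
```

Run on the host, this reports whichever of those files exist in your home directory. Inside a VM that mounts only the workspace, the same check has no host home directory to look at.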

The agent isn't malicious. But it doesn't need to be for something to go wrong. A confused agent, a malicious MCP server, or a prompt injection attack can cause the agent to exfiltrate credentials or corrupt data. And without an audit trail, you won't know it happened.

Real incidents in 2025 — leaked system prompts, RCE via malicious MCP file swaps, MCP tool poisoning across Anthropic, OpenAI, Zapier, and Cursor — all share the same pattern: agents are trusted, and that trust is being exploited.

The current "solutions" don't hold up:

  • Permission dialogs become noise that people click through, and YOLO mode skips them entirely. They're not security.
  • System prompt instructions are just text. They can be overridden by prompt injection and have no enforcement mechanism.

The Solution

Docker Sandboxes wraps each agent in a microVM with four layers of governance built in:

  • Structural Isolation: The agent runs inside a VM with its own Linux kernel. Host credentials, SSH keys, and the host Docker socket are simply not present inside the VM, so outside the mounted workspace there is nothing to exfiltrate.
  • Credential Proxy Injection: API keys live in your OS keychain. When the agent makes an outbound request, a host-side proxy intercepts it and injects the auth header. The raw key never enters the VM.
  • Network Policy Enforcement: Every outbound connection passes through a policy layer (Open / Balanced / Locked Down). Allowed and blocked attempts are logged in real time.
  • Branch Mode and Parallel Execution: Agents work on isolated Git worktrees so you can review diffs before merging — and run multiple agents on the same repo simultaneously without conflicts.
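
The credential proxy and network policy layers can be pictured as one host-side step: check the destination, log the decision, then attach the key. The sketch below is a toy model and not the sbx implementation. The `HostProxy` class, the `POLICY_ALLOWLIST` table, the example hosts, the meaning given to each policy level, and the `Bearer` header format are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical policy table: the hosts and the meaning of each level are
# assumptions for illustration, not the rules sbx actually ships.
POLICY_ALLOWLIST = {
    "open": None,  # None means "allow every host"
    "balanced": {"api.example.com", "registry.example.com"},
    "locked-down": {"api.example.com"},
}


@dataclass
class HostProxy:
    """Toy host-side proxy: applies network policy, then injects credentials."""

    policy: str
    keychain: dict  # host -> API key; stands in for the OS keychain
    audit_log: list = field(default_factory=list)

    def forward(self, host: str, headers: dict) -> Optional[dict]:
        allowed = POLICY_ALLOWLIST[self.policy]
        if allowed is not None and host not in allowed:
            self.audit_log.append(("blocked", host))
            return None  # the request is dropped before it leaves the host
        self.audit_log.append(("allowed", host))
        outbound = dict(headers)  # copy so the VM-side headers stay untouched
        key = self.keychain.get(host)
        if key is not None:
            # The key is attached on the host side only; the VM never holds it.
            outbound["Authorization"] = f"Bearer {key}"
        return outbound


if __name__ == "__main__":
    proxy = HostProxy("locked-down", {"api.example.com": "sk-host-only"})
    print(proxy.forward("api.example.com", {"Accept": "application/json"}))
    print(proxy.forward("unknown.example.net", {}))
    print(proxy.audit_log)
```

The design point is that the headers the agent builds inside the VM never contain the key. Only the copy made on the host does, and every decision leaves an audit entry whether the request was allowed or blocked.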

This is the infrastructure that makes semi-autonomous and autonomous agent workflows safe to operate.

Key Features

  • microVM Isolation: Each sandbox is a lightweight VM with its own kernel. The boundary is enforced by the hypervisor, not by a policy file.
  • Credential Proxy: API keys stay on the host. Authentication happens at the proxy layer so the agent never sees the raw key.
  • Network Policy: Pick Open, Balanced, or Locked Down at login. Override per-sandbox. Watch every connection in a live audit log.
  • Branch Mode: Agent work lands on its own Git worktree and branch. Review the diff, then merge — or throw it away.
  • Parallel Agents: Run multiple agents on the same codebase at once. Each gets its own worktree. No file-locking, no conflicts.
  • Local Model Support: Run open-source models inside sbx via Docker Model Runner. Zero cloud egress, zero API keys, fully air-gapped workflows.
  • Pluggable Agents: Works with Codex, Claude Code, Gemini CLI, and any agent image you publish.
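
Branch Mode and Parallel Agents both build on Git worktrees, which can be tried with plain `git` outside of sbx. The sketch below shows the idea under stated assumptions: the `agent/<name>` branch scheme, the sibling-directory layout, and the helper names `worktree_command` and `create_worktrees` are hypothetical and not how sbx names things.

```python
import subprocess
from pathlib import Path


def worktree_command(repo: Path, agent: str) -> list[str]:
    """Build the `git worktree add` call that gives `agent` its own branch and checkout."""
    branch = f"agent/{agent}"  # hypothetical branch naming scheme
    checkout = repo.parent / f"{repo.name}-{agent}"  # one sibling directory per agent
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(checkout)]


def create_worktrees(repo: Path, agents: list[str]) -> None:
    """Create one isolated worktree per agent; raises if any git call fails."""
    for agent in agents:
        subprocess.run(worktree_command(repo, agent), check=True)


if __name__ == "__main__":
    # Prints the commands instead of running them, so this is safe to try anywhere.
    for name in ["claude", "codex"]:
        print(" ".join(worktree_command(Path("/work/app"), name)))
```

Because each agent writes to its own directory and branch, two agents never touch the same working files. Reviewing an agent's work is then an ordinary branch diff, and discarding it is a matter of `git worktree remove` followed by deleting the branch.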

Who This Is For

  • Developers Running Coding Agents: Run Claude Code, Codex, or Gemini CLI without giving them access to your SSH keys, AWS credentials, or host Docker daemon.
  • Platform Engineers: Provide a paved path for agent execution across the org. Centralized network policy, centralized secret management, audit logs by default.
  • Security Teams: Get the structural guarantees and the audit trail you need to sign off on agentic workloads — without blocking developer productivity.
  • Enterprises Operating at Scale: Use an architecture that makes 30,000 concurrent governed agent sessions across a workforce a tractable problem rather than an unbounded risk.