Sandboxing agentic coding tools is a networking problem
Allowlisting commands for agentic tools is hard to get right. Sandboxes offer another angle: taking inspiration from Simon Willison's "lethal trifecta", we can reason about a sandbox by asking three questions:
- What untrusted content is the sandbox exposed to?
- How can it communicate externally?
- What sensitive data are we providing to the sandbox?
Anthropic provides several sandboxing tools:
- Claude Code's Sandbox Bash tool using sandbox-exec for OS X users (same technique Chromium uses)
- Claude Code's experimental sandbox runtime
- Devcontainers template applying firewall to allowlisted IPs
Cursor and OpenAI's Codex CLI offer similar features. Custom sandboxes using gVisor or Firecracker VMs apply comparable network isolation principles.
What is the worst a sandbox can do?
A sufficiently sandboxed Claude Code resembles a separate host. Key considerations:
- What network access are you allowing Claude Code?
- What actions can Claude Code perform with available network access and data?
Nearly every Claude Code instance has access to an Anthropic API key. Claude Code also inherits all environment variables from your terminal session and can read files in the directory where you run claude.
While software often requires secrets, development credentials remain sensitive. Dotenv files in your working directory, even if properly listed in .gitignore, are readable by Claude Code, which creates an exfiltration risk.
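To get a rough sense of what a session could read, a short script can list the secret-looking environment variables and the dotenv files visible from the working directory. This is a minimal sketch rather than a complete audit: the name patterns in SECRET_HINTS and both helper functions are assumptions chosen for illustration.

```python
import os
from pathlib import Path

# Hypothetical substrings that often appear in the names of secret-bearing variables.
SECRET_HINTS = ("KEY", "TOKEN", "SECRET", "PASSWORD")


def secret_like_env_vars(env):
    """Return the names of environment variables whose name hints at a secret."""
    return sorted(
        name for name in env
        if any(hint in name.upper() for hint in SECRET_HINTS)
    )


def dotenv_files(directory):
    """Return dotenv-style files (.env, .env.local, ...) found under a directory."""
    root = Path(directory)
    return sorted(
        str(path.relative_to(root))
        for path in root.rglob(".env*")
        if path.is_file()
    )


if __name__ == "__main__":
    print("Secret-like environment variables:", secret_like_env_vars(os.environ))
    print("Dotenv files:", dotenv_files("."))
```

Running it from the directory where you would start claude gives a first inventory of what the sandbox would be handed.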
Unpacking the devcontainer firewall
The devcontainer template includes an init-firewall.sh script permitting connections to:
- registry.npmjs.org (npm packages)
- api.anthropic.com (Anthropic API)
- sentry.io (logging)
- statsig.anthropic.com/statsig.com (feature flagging)
- marketplace.visualstudio.com (VSCode extensions)
- vscode.blob.core.windows.net/update.code.visualstudio.com (blob storage)
- GitHub servers
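The firewall operates at the IP layer using iptables. However, IP-level enforcement only controls where traffic goes, not what a request does, so it doesn't prevent application-layer attacks. With domain fronting, for instance, a request that appears to target an allowed domain can be routed to a different service hosted behind the same CDN. Even an allowlist limited to a handful of HTTP services can enable credential exfiltration, for example by publishing an npm package or creating a GitHub gist that contains the secrets.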
Application-layer inspection becomes necessary for effective restriction.
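The gap between the two layers can be illustrated with a toy comparison. The sketch below is hypothetical: the hosts come from the allowlist above (with api.github.com standing in for "GitHub servers"), while the per-host method sets and both helper functions are assumptions made for illustration, not part of init-firewall.sh.

```python
# A few hosts from the allowlist above, with the methods each one plausibly
# needs. The method sets are assumptions for this sketch.
ALLOWED_METHODS = {
    "registry.npmjs.org": {"GET", "HEAD"},  # download packages only
    "api.anthropic.com": {"POST"},          # model requests
    "api.github.com": {"GET", "HEAD"},      # read-only API access
}


def host_level_allows(host):
    """What an IP or host allowlist can decide: only where traffic goes."""
    return host in ALLOWED_METHODS


def request_level_allows(method, host):
    """What an application-layer check can add: what the request does."""
    return method.upper() in ALLOWED_METHODS.get(host, set())


if __name__ == "__main__":
    # A package download and an upload to the same registry host.
    for method, host in [("GET", "registry.npmjs.org"), ("PUT", "registry.npmjs.org")]:
        print(method, host, host_level_allows(host), request_level_allows(method, host))
```

Both requests pass the host-level check, and only the request-level check tells them apart. A GET can still leak data through its URL, so a method check narrows the channel rather than closing it.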
Using network proxies to prevent secrets exfiltration
Claude Code supports two proxy configurations:
- HTTP_PROXY environment variable (intercepts HTTP traffic from Claude Code process)
- sandbox httpProxyPort (intercepts HTTP proxy traffic from bash commands)
These configurations operate independently.
You can point Claude Code at an HTTP proxy with the following configuration in settings.json:
{
  "env": {
    "HTTP_PROXY": "http://localhost:8080",
    "HTTPS_PROXY": "http://localhost:8080"
  },
  "sandbox": {
    "httpProxyPort": 8080
  }
}
mitmproxy is a great tool to run these HTTP proxies. You could then pass an invalid Anthropic API Key to Claude Code, and write a mitmproxy addon that intercepts requests to api.anthropic.com and updates the X-API-Key header with the actual credentials:
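from mitmproxy import http

# The real key lives only in the proxy process, never in Claude Code's environment.
REAL_API_KEY = "sk-ant-api03-real-key-here"

class InjectApiKey:
    def request(self, flow: http.HTTPFlow) -> None:
        # Overwrite the dummy key on requests bound for the Anthropic API.
        if flow.request.pretty_host == "api.anthropic.com":
            flow.request.headers["x-api-key"] = REAL_API_KEY

addons = [InjectApiKey()]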
You could then run mitmweb with this add-on loaded, so that the real API key stays inside the proxy process:
mitmweb -s inject_api_key.py --set ssl_insecure=true
Afterwards, run claude with the ANTHROPIC_API_KEY environment variable set to an invalid API key:
ANTHROPIC_API_KEY=sk-ant-dummy claude
From the perspective of Claude Code, requests to api.anthropic.com work normally, yet neither Claude Code nor the sandbox ever holds the real credentials.
Note: Claude Code requires OAuth sign-in before checking ANTHROPIC_API_KEY, so obtain the API key first, close the session, then restart with invalid credentials.
This technique extends beyond Anthropic keys: dummy credentials with mitmproxy injection work for any API.
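A generalized version of the add-on above might look like the following sketch. The INJECTION_RULES table, the REAL_* environment variable names, and the InjectCredentials class are hypothetical; the only mitmproxy behavior relied on is that an add-on object with a request method receives each flow, as in the earlier example. The real secrets are read from the proxy's own environment, so they never need to appear in the script.

```python
import os

# Hypothetical mapping: upstream host -> (header name, value template,
# environment variable holding the real secret on the proxy side).
INJECTION_RULES = {
    "api.anthropic.com": ("x-api-key", "{secret}", "REAL_ANTHROPIC_API_KEY"),
    "api.openai.com": ("authorization", "Bearer {secret}", "REAL_OPENAI_API_KEY"),
}


class InjectCredentials:
    def __init__(self, env=None):
        # Allow a plain dict to be passed in for testing; default to os.environ.
        self.env = os.environ if env is None else env

    def request(self, flow):
        rule = INJECTION_RULES.get(flow.request.pretty_host)
        if rule is None:
            return  # Not a host we inject for; leave the request untouched.
        header, template, env_var = rule
        secret = self.env.get(env_var)
        if secret:
            # Replace whatever dummy value the client sent with the real one.
            flow.request.headers[header] = template.format(secret=secret)


addons = [InjectCredentials()]
```

Requests to hosts outside the table pass through unchanged, and a host whose secret is missing from the proxy's environment keeps its dummy value, so the upstream API rejects it instead of the proxy failing silently in some other way.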
Tying a developer's permissions to their Claude Code permissions
Formal enables least privilege for both human and machine identities. Current Anthropic Admin API Keys inherit full user permissions without fine-grained restrictions. API keys generated for Claude Code appear limited but lack clear documentation.
The optimal approach prevents Claude Code from accessing credentials directly. Using Formal Connectors, Resources, and Native Users ensures Claude Code cannot leak API keys. Claude Code makes requests with Formal-specific credentials while the Connector injects actual secrets upstream.
When hostnames and headers are hard to edit: mitmproxy add-ons
For hostnames and headers that are hard to tweak, use mitmproxy add-ons to route the HTTP requests for these domains to the corresponding listener. From Claude Code's perspective, the default hostnames and ports stay exactly the same.
from mitmproxy import http

# Map original hostnames to Formal Connector listeners
REROUTE_MAP = {
    "api.anthropic.com": ("localhost", 4004),
    "api.openai.com": ("localhost", 4005),
    "api.github.com": ("localhost", 4006),
}

class RerouteHosts:
    def request(self, flow: http.HTTPFlow) -> None:
        # Capture the hostname the client asked for before rewriting it.
        host = flow.request.pretty_host
        if host in REROUTE_MAP:
            target_host, target_port = REROUTE_MAP[host]
            flow.request.host = target_host
            flow.request.port = target_port
            # Preserve the original hostname for the listener.
            flow.request.headers["X-Original-Host"] = host

addons = [RerouteHosts()]
You can then pass this add-on via mitmproxy -s reroute_hosts.py.
Applying fine-grained least privilege policies
We could then create a policy similar to the one we created for the local GitHub MCP server use case. For example, allow access only to specific API endpoints:
{
  "type": "allow",
  "description": "Allow Claude Code to access Anthropic completions",
  "condition": {
    "match": {
      "host": "api.anthropic.com",
      "method": "POST",
      "path": "/v1/complete"
    }
  }
}
If we change the path parameter to "/v1/messages" and the type to "block", we can confirm that the policy blocks requests to that endpoint:
{
  "type": "block",
  "description": "Block direct access to messages API",
  "condition": {
    "match": {
      "host": "api.anthropic.com",
      "method": "POST",
      "path": "/v1/messages"
    }
  },
  "action": "deny"
}
We also get visibility into every request being made to the Anthropic API across our organization!
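To make the matching behavior concrete, here is a minimal sketch of how policies shaped like the two above could be evaluated against a request. It is not Formal's policy engine: the evaluate and matches helpers, the first-match-wins ordering, the exact-equality comparison, and the default of blocking unmatched requests are all assumptions made for illustration.

```python
# The two example policies from above, reduced to the fields this sketch uses.
POLICIES = [
    {
        "type": "allow",
        "condition": {"match": {
            "host": "api.anthropic.com", "method": "POST", "path": "/v1/complete",
        }},
    },
    {
        "type": "block",
        "condition": {"match": {
            "host": "api.anthropic.com", "method": "POST", "path": "/v1/messages",
        }},
    },
]


def matches(policy, request):
    """True when every field in the policy's match block equals the request's."""
    wanted = policy["condition"]["match"]
    return all(request.get(field) == value for field, value in wanted.items())


def evaluate(policies, request):
    """Return "allow" or "block"; first matching policy wins, default is block."""
    for policy in policies:
        if matches(policy, request):
            return policy["type"]
    return "block"


if __name__ == "__main__":
    request = {"host": "api.anthropic.com", "method": "POST", "path": "/v1/messages"}
    print(evaluate(POLICIES, request))
```

Defaulting to "block" when nothing matches keeps the sketch in line with least privilege: a request is only let through when a policy explicitly allows it.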
Of course, this technique is not specific to safeguarding Anthropic API keys. Proxies improve two dimensions of the lethal trifecta:
- External communication limitation: Proxies allow or block traffic
- Data access reduction: Credentials can be injected for specific actions without exposing them to the language model's context window