The E2B SDK’s sandbox.files.* surface is the quiet
workhorse of an agent’s loop: write a Python file, read the model’s
output back, list what the last step produced, clean up. On Linux the
upstream envd answers these calls by doing ordinary libc file IO
inside the sandbox — which is fine, but it means every call crosses a
process boundary. We do something cheaper and, we think, clearer:
we serve the whole surface from the host, against the jail’s mounted
ZFS rootfs, with no jexec in the fast path.
The observation
A FreeBSD jail’s root filesystem isn’t hidden from the host. It’s a
regular mountpoint at {jails_root}/e2b-<id>, owned by
root, readable and writeable by anything running as root on the host.
When the jail is a ZFS clone — which it is, always, for us — that
mountpoint is also snapshot-cheap, fsync-coherent, and survives jail
death until explicit zfs destroy.
The gateway runs as root on the host. It already knows the rootfs
path: it created the clone. Asking the SDK to POST a file
write is therefore an invitation to translate one path prefix and
call tokio::fs::write. No subprocess. No pipe plumbing.
No jexec fork latency (~3 ms on a warm jail). The call
completes at the speed of a page write to ARC.
Concretely, the handler in e2b-compat/src/files.rs:
let rootfs = PathBuf::from(format!("{}/e2b-{}", state.jails_root, sandbox_id));
let host_path = rootfs.join(jail_path.trim_start_matches('/'));
tokio::fs::write(&host_path, &body).await?;
The sandbox sees the write land under its own / on the
next read(2). The jail kernel wasn’t involved.
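The prefix translation is simple enough to demonstrate end to end. A minimal sketch, using a temporary directory as a stand-in for the jail rootfs and a hypothetical sandbox id "demo" (std-only; the real handler uses tokio::fs, but the path arithmetic is identical):

```rust
use std::fs;

fn main() {
    // Stand-in for {jails_root}/e2b-<id>; "demo" is a made-up id.
    let rootfs = std::env::temp_dir().join("e2b-demo");
    fs::create_dir_all(rootfs.join("home/user")).unwrap();

    // Gateway-side translation: the jail path "/home/user/out.txt"
    // becomes rootfs/home/user/out.txt on the host.
    let jail_path = "/home/user/out.txt";
    let host_path = rootfs.join(jail_path.trim_start_matches('/'));
    fs::write(&host_path, b"hello from the host").unwrap();

    // Anything resolving the same path under rootfs (as the jail does,
    // via its own /) sees the same bytes on the next read.
    assert_eq!(fs::read(&host_path).unwrap(), b"hello from the host");
    println!("ok");
}
```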
The path-safety story
A request that comes in with path=/../etc/passwd would
join to /jails/e2b-abc/../etc/passwd and, after the
kernel’s own path resolution, land in /jails/etc/passwd
— outside the jail, inside the host. That’s the escape. Three layers
of defence, cheap to combine:
- Lexical reject. Before touching the filesystem, split on "/" and error if any component is literally "..". Also require a leading slash so callers can't smuggle a relative path and have Path::join silently anchor it in the current directory. This catches 99% of attacks and costs nothing.
- Ancestor canonicalize. For writes and mkdirs the leaf doesn't exist yet, so std::fs::canonicalize on the full path would fail. Walk up until you hit an ancestor that does exist, canonicalize that, and check the result starts with the canonicalized rootfs. This catches a symlink in the middle of the path that points at /.
- No follow-through-own-rope. We never fs::create_dir_all at a path whose ancestor check failed. A symlink planted by a previous call, pointing at the host, doesn't help the attacker on subsequent calls because they'd have to write through it, which the ancestor check rejects.
The gate lives in one function, rewrite_path. Every
handler — read, write, list, mkdir, rename, remove, stat — calls
it first and operates only on the returned PathBuf.
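The first two layers can be sketched in a few dozen lines of std-only Rust. This is an illustrative reconstruction, not the gateway's actual function: the name rewrite_path matches the text, but the error type and exact checks here are assumptions.

```rust
use std::path::{Component, Path, PathBuf};

// Sketch of the rewrite_path gate: lexical reject, then ancestor
// canonicalize. Returns the translated host path on success.
fn rewrite_path(rootfs: &Path, jail_path: &str) -> Result<PathBuf, String> {
    // Require an absolute jail path so Path::join can't silently
    // anchor a relative path somewhere unexpected.
    if !jail_path.starts_with('/') {
        return Err("path must be absolute".into());
    }
    // Lexical reject: every component must be a plain name
    // (no "..", no stray ".").
    let rel = Path::new(jail_path.trim_start_matches('/'));
    if rel.components().any(|c| !matches!(c, Component::Normal(_))) {
        return Err("path traversal rejected".into());
    }
    let host_path = rootfs.join(rel);

    // Ancestor canonicalize: the leaf may not exist yet, so walk up
    // to the nearest existing ancestor, resolve symlinks there, and
    // require the result to stay inside the canonicalized rootfs.
    let canon_root = rootfs.canonicalize().map_err(|e| e.to_string())?;
    let mut probe = host_path.as_path();
    loop {
        if let Ok(canon) = probe.canonicalize() {
            if canon.starts_with(&canon_root) {
                return Ok(host_path);
            }
            return Err("resolved outside rootfs".into());
        }
        probe = probe.parent().ok_or("no existing ancestor")?;
    }
}

fn main() {
    let root = std::env::temp_dir().join("rewrite-path-demo");
    std::fs::create_dir_all(&root).unwrap();
    assert!(rewrite_path(&root, "/home/user/run.py").is_ok());
    assert!(rewrite_path(&root, "/../etc/passwd").is_err());
    assert!(rewrite_path(&root, "etc/passwd").is_err());
    println!("ok");
}
```

The third layer is a policy on the callers, not on this function: handlers simply refuse to create directories on any path this gate has rejected.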
What’s out of scope, deliberately
Streaming uploads. The SDK’s multipart-upload path
is supported by our envd only via the octet-stream branch — we
advertise envd_version=0.5.8 precisely to steer the
client toward the simpler raw-body POST. A 2 GiB upload will buffer in
memory. Good enough for notebook-sized state; not the world’s file
storage.
Binary-mode read transforms. The SDK has a
format="stream" option that returns an iterator of byte
chunks. We support it trivially — Axum streams whatever’s in the
response body — but we don’t do range requests, partial downloads,
or resumable transfers. An agent round-trip is a few MB at most.
Per-user permissions. The SDK sends a
username= query parameter. We read it, ignore it, and
let the caller’s root-owned writes land as root-owned files. The
sandbox itself runs as root (E2B default), so this matches upstream
semantics; it is not a promise of unprivileged writes. A real
chown step is a future addition and would require
mapping the name to a uid inside the jail’s /etc/passwd,
which we’d rather do lazily than eagerly.
Filesystem watches. files.watch stays
open. The FreeBSD-correct primitive is kqueue(2) with
EVFILT_VNODE, not inotify, and the port is a small but
real piece of work (queueing, streaming over a WebSocket). Covered
elsewhere in the audit.
The two wire protocols
Newer SDK versions split the surface into two families. Read and
write go through the original REST endpoint,
/files?path=…, with the body being the content. Every
other call — list, mkdir, rename, remove, stat — goes through
Connect-RPC at /filesystem.Filesystem/<Method>,
JSON-encoded. We implement both. The REST handlers are the ones
that matter for performance; the RPC endpoints wrap the same host-fs
calls with a different envelope.
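The split between the two families is visible in the URL alone. A toy classifier, just to make the shapes concrete (the real gateway registers these as Axum routes; the enum and function names here are invented for illustration):

```rust
// Illustrative only: map an incoming request path to the wire family
// described above. Not the gateway's actual router.
#[derive(Debug, PartialEq)]
enum Family<'a> {
    Rest,                // read/write: /files?path=..., raw body
    ConnectRpc(&'a str), // everything else: /filesystem.Filesystem/<Method>
}

fn classify(path: &str) -> Option<Family<'_>> {
    if path == "/files" {
        Some(Family::Rest)
    } else {
        path.strip_prefix("/filesystem.Filesystem/")
            .map(Family::ConnectRpc)
    }
}

fn main() {
    assert_eq!(classify("/files"), Some(Family::Rest));
    assert_eq!(
        classify("/filesystem.Filesystem/Stat"),
        Some(Family::ConnectRpc("Stat"))
    );
    assert_eq!(classify("/healthz"), None);
    println!("ok");
}
```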
The receipt for all of it is examples/08-filesystem.py:
write, read, list, make_dir, rename, remove, and a traversal attempt
that the gate rejects. It runs green against the current gateway on
honor.