1 000 microVMs from one 256 MiB checkpoint
9.1 GiB host RAM for all 1 000 (naive would be ~250 GiB)
17 ms resume p50, vs Cube's 60 ms claim
Coppice gives agent systems a place to run untrusted code on hardware you control. It exposes the E2B-compatible create/files/commands/code surface, a browser demo UI, and an admin dashboard; underneath it uses FreeBSD VNET jails for fast lightweight sessions and bhyve for Linux, Windows, GPU, and full-VM isolation. You get shells, files, browser automation, VS Code, VNC/RDP desktops, durable snapshots, live volumes, signed templates, metrics, webhooks, and optional API-key auth with scopes and tenant quotas from one gateway.
Hosted sandboxes are convenient when somebody else can carry the pager. Coppice is for the cases where the code, data, GPU, network, or compliance boundary has to stay on your metal. The receipts here are intentionally concrete: 17 ms resume from a durable bhyve pool, 1 000 microVMs in 9.1 GiB of RAM, measured NVIDIA passthrough, SDK roundtrips, and a live admin surface that shows what is actually running.
Product surface
| need | Coppice | typical hosted sandbox | why it matters |
|---|---|---|---|
| Run untrusted code | VNET jail or bhyve VM | container / microVM | Pick the boundary per template: fast jail, full guest kernel, Windows, Linux, or GPU VM. |
| Keep state between calls | TTL, pause, snapshot, fork | varies by provider | Agents can idle, resume, branch, and inspect old state without rebuilding a workspace. |
| Developer API | E2B-compatible + CLI + MCP | provider SDK | Existing E2B-style clients keep working; custom clients can use REST/Connect-RPC directly. |
| Interactive UX | Shell, browser, VS Code, VNC, RDP | usually shell/browser | Humans can debug the exact sandbox an agent used, including desktop sessions. |
| Data boundary | self-hosted | hosted control plane | Use it when prompts, source, credentials, or regulated data cannot leave your environment. |
| Fleet features | single-node first | multi-region SaaS | Coppice optimizes local density and receipts; global scheduling and cold object-store mobility remain future work. |
The demo portal
The fastest path from "what is this" to "this is a sandbox" is /ui/ on the gateway. It is a single static page served by e2b-compat, with tabs for shell / browser / VS Code / VNC / RDP, a spawn menu, and a status bar wired to the per-sandbox metrics. Shell tabs get a restty terminal (libghostty-vt on WebGPU, native split panes); browser tabs render the Chrome DevTools inspector attached to a headless chromium in its own jail; VS Code tabs embed code-server pointed at the sandbox's home directory; desktop tabs boot openbox, Firefox, xterm, VNC, and RDP in a desktop template.
ssh -fN -L 3001:127.0.0.1:3000 honor
open http://localhost:3001/ui/
What's in the box
- Per-sandbox VNET: Each sandbox gets a routable 10.78.0.M IP on a dedicated bridge, a per-sandbox pf anchor scoped by source IP, and a one-call air-gap via PUT /sandboxes/:id/network.
- Durable snapshots: POST /sandboxes/:id/snapshots captures a ZFS fork-point; POST /sandboxes with snapshotID clones it into a fresh VNET jail. The Cube README row labelled coming soon; shipped here.
- Persistent volumes: coppice volume create → ZFS dataset + nullfs mount inside the jail at /mnt/<name>. Atomic JSON registry; 403 on unauthorised mount.
- Signed templates: signify(1) over the ZFS snapshot guid, verified before every zfs clone. Prometheus counter on verifications so tampering surfaces as an alert.
- API keys, quotas, audit: Optional X-API-Key / Bearer auth maps requests to tenantID plus read, exec, or admin scopes, then caps active creates with per-tenant quotas and records decisions in /audit/events.
- Streaming commands: sandbox.commands.* over Connect-RPC with an NDJSON alias. Line-streamed stdout/stderr from jexec without buffering the full log.
- Auto-suspend + AutoResume: lifecycle.onTimeout="pause" idles jail-backed sandboxes and bhyve SSH guests with SIGSTOP, and SDK/envd/files/commands activity resumes them automatically.
- Filesystem API: files.write, files.read, files.list, and SDK-compatible watch via host-side snapshot-diff polling. Matches the E2B surface the Python and Node SDKs drive.
- Metrics, traces, logs: Per-sandbox rctl-backed CPU/RAM via /per-sandbox-metrics, OTLP spans via OTEL_EXPORTER_OTLP_ENDPOINT, ring-buffer logs via /per-sandbox-logs, and E2B-shaped lifecycle events/webhooks via /lifecycle-events.
- coppice CLI: One coppice binary for sandbox lifecycle, volumes, templates, signing, and keyring-backed login/whoami. Plus an rc.d service and the E2B SDK drop-in for Python, Node, and Go.
- Desktop UX: desktop template with VNC, RDP, openbox, Firefox, xterm, xclock, xeyes, manual clipboard, resize, and Ctrl-Alt-Del wiring in the React portal.
- VS Code in-browser: templateID=vscode launches code-server inside a VNET jail and proxies HTTP/WebSocket traffic through /vscode-proxy/<id>/.
- nginx BYOI: Cube's cubesandbox-base-nginx shape is measured: a FreeBSD nginx template, in-jail port 80, and <port>-<id>.coppice.lan routing.
- Signed preview URLs: coppiceproxy can require HMAC host tokens for public listener routes and mint expiring URLs behind an admin bearer.
- Linux + Windows guests: Debian 12 bhyve cloud image is a first-class pooled template; Windows Server eval boots through bhyve framebuffer with the same console proxy path.
- GPU passthrough: NVIDIA RTX passthrough into a Debian bhyve guest is measured on honor with ppt(4), sidecar-owned bhyve slots, and a GPU_OK receipt.
- Agent integrations: OpenAI Agents SDK, OpenAI code-interpreter shape, mini-RL, Codex CLI, and MCP all have receipts against the same gateway.
- Git checkout API: POST /sandboxes/:id/git/clone gives SDKs and demos a first-class repo checkout route with path hygiene and a receipt.
- Admin dashboard: Static Astro dashboard for feature status, live gateway probes, template/pool state, machine actions, metrics, and receipt coverage through the local gateway tunnel.
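As a rough illustration of the snapshot-and-fork flow and the optional API-key header listed above, the sketch below builds (but does not send) the corresponding HTTP requests. The routes, the snapshotID field, the templateID name, and the X-API-Key header come from this page; the base URL (the tunnel port from the demo portal), the JSON body layout, the empty snapshot body, and the helper names are assumptions made for illustration, not the gateway's documented contract.

```python
import json
from urllib.request import Request

# Assumption: the gateway is reached through the SSH tunnel on port 3001.
BASE_URL = "http://localhost:3001"


def build_request(method, path, body=None, api_key=None):
    """Build (but do not send) a JSON request for the gateway."""
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        headers["X-API-Key"] = api_key  # optional auth header named on this page
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return Request(BASE_URL + path, data=data, headers=headers, method=method)


def create_sandbox(template_id, api_key=None):
    # Assumption: templateID is passed in the JSON body, as in E2B-style APIs.
    return build_request("POST", "/sandboxes", {"templateID": template_id}, api_key)


def snapshot_sandbox(sandbox_id, api_key=None):
    # Assumption: the snapshot route accepts an empty JSON object.
    return build_request("POST", f"/sandboxes/{sandbox_id}/snapshots", {}, api_key)


def fork_from_snapshot(snapshot_id, api_key=None):
    # POST /sandboxes with snapshotID clones the snapshot into a fresh sandbox.
    return build_request("POST", "/sandboxes", {"snapshotID": snapshot_id}, api_key)
```

Passing any of the returned objects to urllib.request.urlopen would send it; the response shape is not described on this page, so parsing is left out.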
What we measured
Cold start — bhyve, four configurations at cc=1
Chart · bhyve resume latency, four tiers, cc=1 — log scale
▸ reproduce · mise run bench:bhyve-full · mise run bench:bhyve-durable-pool · mise run bench:bhyve-durable-prewarm-pool · mise run bench:bhyve-prewarm-pool · methodology
Density — N microVMs from one checkpoint, with the vmm-vnode patch
| concurrent VMs | host Δ | per-VM | notes |
|---|---|---|---|
| 8 × 256 MiB | 103 MiB | 12.9 MiB | Shared template lives in the page cache once; per-VM is bhyve process state. |
| 50 × 256 MiB | 939 MiB | 18 MiB | Matches the N=50 figure in /appendix/ksm-equivalent. |
| 200 × 256 MiB | 5 366 MiB | 26 MiB | Load average climbing; 16 threads saturating. |
| 400 × 256 MiB | 6 492 MiB | 16 MiB | Per-VM cost dropping as fixed overhead amortizes. |
| 1 000 × 256 MiB | 9 117 MiB | 9 MiB | Naive un-shared cost: ~250 GiB. Fits in the RAM of a laptop. |
Method: a bhyvectl --suspend checkpoint; N copies of bhyve started with -o snapshot.vnode_restore=true; settled; host-wide memory delta sampled. See benchmarks/rigs/bhyve-fanout-rss.sh and the annotated patch.
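The per-VM and naive columns in the density table are simple arithmetic on the measured host delta. A minimal sketch, assuming per-VM cost is just host delta divided by VM count and the naive cost is N full 256 MiB guests with no sharing (the function names are illustrative):

```python
MIB_PER_GIB = 1024
GUEST_MIB = 256  # guest memory size of the shared checkpoint

# (concurrent VMs, host memory delta in MiB), copied from the density table
MEASUREMENTS = [(8, 103), (50, 939), (200, 5366), (400, 6492), (1000, 9117)]


def per_vm_mib(n_vms, host_delta_mib):
    """Average host memory attributable to each VM, in MiB."""
    return host_delta_mib / n_vms


def naive_gib(n_vms, guest_mib=GUEST_MIB):
    """Memory needed if every VM held a private copy of guest RAM, in GiB."""
    return n_vms * guest_mib / MIB_PER_GIB


for n, delta in MEASUREMENTS:
    print(f"{n:>5} VMs: {per_vm_mib(n, delta):5.1f} MiB/VM, "
          f"naive {naive_gib(n):6.1f} GiB")
```

Unrounded, the N=50 and N=200 rows come out near 18.8 and 26.8 MiB per VM, so the table's 18 and 26 look truncated rather than rounded.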
Network isolation — Coppice on pf, vs Cube on eBPF
| metric | Coppice / pf | Cube / eBPF (typical) | notes |
|---|---|---|---|
| sandbox↔sandbox p50 RTT | 7 µs | ~5–10 µs | netperf TCP_RR, 1-byte req/resp. |
| sandbox↔sandbox p99 RTT | 8 µs | ~10–15 µs | stddev 0.51 µs; no GC pauses, no qdisc lottery. |
| TCP throughput, 1 stream | 14.6 Gbit/s | ~15–20 Gbit/s | iperf3 intra-host. Memory-bandwidth limited, not pf. |
| Policy update, 1 000 IPs | 4 ms | Cilium bulk: similar | One pfctl -T replace -f call, atomic. Effective ~250 k ops/sec. |
| Enforcement latency | next packet | next packet | pf tables are O(1) kernel lookups; no stale-ruleset window. |
Setup: coppicenet0 bridge + dedicated pf anchor with deny-by-default + explicit allow. Full methodology at /appendix/ebpf-to-pf; broader ecosystem picture at /appendix/ebpf-on-freebsd.
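The policy-update row describes swapping a whole pf table in one pfctl -T replace -f call. The sketch below shows one way a caller might prepare that call: it validates the addresses, renders the one-address-per-line file that pfctl reads for tables, and assembles the argument list without running it. The anchor name, table name, and file path are hypothetical; only the pfctl flags themselves are standard.

```python
import ipaddress


def render_table_file(addresses):
    """Validate each address and render one per line for pfctl -f."""
    lines = [str(ipaddress.ip_address(a)) for a in addresses]
    return "\n".join(lines) + "\n"


def replace_command(anchor, table, path):
    """Argument list for an atomic table swap inside a pf anchor."""
    return ["pfctl", "-a", anchor, "-t", table, "-T", "replace", "-f", path]


def ops_per_second(n_addresses, elapsed_ms):
    """Effective update rate when n addresses land in one call."""
    return n_addresses * 1000 / elapsed_ms


if __name__ == "__main__":
    # Hypothetical anchor and table names, for illustration only.
    print(render_table_file(["10.78.0.5", "10.78.0.6"]), end="")
    print(" ".join(replace_command("coppice/sb-42", "allow_out", "/tmp/allow.txt")))
    print(ops_per_second(1000, 4))
```

With the table's figures, 1 000 addresses in 4 ms works out to the quoted ~250 k ops/sec.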
Feature audit
CubeSandbox advertises roughly 59 capabilities across its README, docs, and examples directory. Of those, 56 are measured closed on Coppice with receipts, 1 is partial (shape there, plumbing pending), and 2 are genuinely open. Row-by-row at /appendix/cubesandbox-feature-audit.
Where Coppice fits
Fly Sprites, E2B, Modal, and Cloudflare's sandbox offerings are all legitimate hosted answers to the durable-sandbox question, and if you want somebody else to carry the pager they're the right call. Coppice isn't trying to beat their time-to-production or their global resume story — Sprites in particular pays for cross-region mobility with an object-store persistence layer, which is the right trade for a hosted fleet. Coppice trades that mobility for local ZFS and a 17 ms resume on a single box. It's the substrate play: run it where you already have the metal, the colo, the air-gap, or the compliance boundary that rules hosted out.
Provenance
All CubeSandbox claims resolve against a pinned commit (CubeSandbox/CubeAPI GitHub mirror, tag Apache-2.0-2025-09). All FreeBSD numbers come from JSON in benchmarks/results/; the scripts that produced them are in benchmarks/rigs/. Measurements are on honor (AMD Ryzen 9 5900HX, 32 GB DDR4, single NVMe, FreeBSD 15.0-RELEASE). Kernel patches live in patches/ with apply instructions, upstream-readiness audit, and inline annotated walk-throughs.