Coppice

Self-hosted agent sandboxes with jails, microVMs, desktops, GPUs, durable snapshots, and E2B-compatible APIs.

1 000 concurrent bhyve microVMs from one 256 MiB checkpoint
9.1 GiB host RAM for all 1 000 (naive would be ~250 GiB)
17 ms resume p50, durable pool (beats Cube's 60 ms claim)

Coppice gives agent systems a place to run untrusted code on hardware you control. It exposes the E2B-compatible create/files/commands/code surface, a browser demo UI, and an admin dashboard; underneath it uses FreeBSD VNET jails for fast lightweight sessions and bhyve for Linux, Windows, GPU, and full-VM isolation. You get shells, files, browser automation, VS Code, VNC/RDP desktops, durable snapshots, live volumes, signed templates, metrics, webhooks, and optional API-key auth with scopes and tenant quotas from one gateway.

Hosted sandboxes are convenient when somebody else can carry the pager. Coppice is for the cases where the code, data, GPU, network, or compliance boundary has to stay on your metal. The receipts here are intentionally concrete: 17 ms resume from a durable bhyve pool, 1 000 microVMs in 9.1 GiB of RAM, measured NVIDIA passthrough, SDK roundtrips, and a live admin surface that shows what is actually running.

Product surface

need | Coppice | typical hosted sandbox | why it matters
Run untrusted code | VNET jail or bhyve VM | container / microVM | Pick the boundary per template: fast jail, full guest kernel, Windows, Linux, or GPU VM.
Keep state between calls | TTL, pause, snapshot, fork | varies by provider | Agents can idle, resume, branch, and inspect old state without rebuilding a workspace.
Developer API | E2B-compatible + CLI + MCP | provider SDK | Existing E2B-style clients keep working; custom clients can use REST/Connect-RPC directly.
Interactive UX | Shell, browser, VS Code, VNC, RDP | usually shell/browser | Humans can debug the exact sandbox an agent used, including desktop sessions.
Data boundary | self-hosted | hosted control plane | Use it when prompts, source, credentials, or regulated data cannot leave your environment.
Fleet features | single-node first | multi-region SaaS | Coppice optimizes local density and receipts; global scheduling and cold object-store mobility remain future work.

The demo portal

The fastest path from "what is this" to "this is a sandbox" is /ui/ on the gateway: a single static page served by e2b-compat, with tabs for shell / browser / VS Code / VNC / RDP, a spawn menu, and a status bar wired to the per-sandbox metrics. Shell tabs get a restty terminal (libghostty-vt on WebGPU, native split panes); browser tabs render the Chrome DevTools inspector attached to a headless Chromium in its own jail; VS Code tabs embed code-server pointed at the sandbox's home directory; desktop tabs boot openbox, Firefox, xterm, VNC, and RDP in a desktop template.

ssh -fN -L 3001:127.0.0.1:3000 honor
open http://localhost:3001/ui/
The default portal has no auth because it is a tunnelled-localhost surface. Configure API keys when exposing the gateway beyond that boundary. Product manual: /manual/. Operator view: /admin/.
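Once the tunnel is up, a quick reachability check can confirm that the three gateway paths named above respond. The following is a minimal sketch using only the Python standard library. The helper names `gateway_urls` and `probe` are illustrative and not part of Coppice, and port 3001 is simply the local port forwarded by the ssh command above.

```python
import urllib.error
import urllib.request

# Paths named in this section: demo portal, product manual, operator view.
GATEWAY_PATHS = ("/ui/", "/manual/", "/admin/")


def gateway_urls(local_port: int = 3001) -> list[str]:
    """Build the tunnelled-localhost URL for each gateway path."""
    return [f"http://localhost:{local_port}{path}" for path in GATEWAY_PATHS]


def probe(url: str, timeout: float = 3.0) -> str:
    """Return the HTTP status as text, or a short error description."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return str(resp.status)
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 401 once auth is on).
        return str(exc.code)
    except (urllib.error.URLError, OSError) as exc:
        # Tunnel down, or nothing listening on the forwarded port.
        return f"unreachable ({exc})"


if __name__ == "__main__":
    for url in gateway_urls():
        print(url, "->", probe(url))
```

If API keys are configured, an unauthenticated probe would presumably report an error status rather than success; how the gateway answers in that case is not described here.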

What's in the box

What we measured

Cold start — bhyve, four configurations at cc=1

Chart · bhyve resume latency, four tiers, cc=1 — log scale

[Chart values, mean ms on a log axis: full guest 3.9 s · durable pool 271 ms · durable + prewarm 17 ms · pre-warm pool 10 ms]
bhyve-durable-prewarm-pool — the production shape — sits at 17 ms, ahead of the 60 ms Cube claim with real on-disk durability. bhyve-full (full cold boot) is the upper bound; bhyve-prewarm-pool is the in-memory-only lower bound. Dashed line = Tencent's advertised 60 ms pool-hit.

Reproduce: mise run bench:bhyve-full · mise run bench:bhyve-durable-pool · mise run bench:bhyve-durable-prewarm-pool · mise run bench:bhyve-prewarm-pool · methodology
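The headline cards quote a p50 while the chart plots a mean, and the two can diverge when a few slow resumes skew the distribution. Below is a minimal sketch of how both could be derived from raw samples, assuming the latencies are available as a plain list of milliseconds. The sample values are made up for illustration and are not Coppice measurements; the actual receipt format in benchmarks/results/ is not shown here.

```python
import math
import statistics


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of the data at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical resume latencies in milliseconds, with one slow outlier.
latencies_ms = [15.0, 16.0, 17.0, 17.0, 18.0, 19.0, 40.0]

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
mean = statistics.fmean(latencies_ms)

print(f"p50={p50:.1f} ms  p99={p99:.1f} ms  mean={mean:.1f} ms")
```

With these made-up samples the single outlier pulls the mean above the median, which is why it helps to state which statistic a headline number refers to.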

Density — N microVMs from one checkpoint, with the vmm-vnode patch

concurrent VMs | host Δ | per-VM | notes
8 × 256 MiB | 103 MiB | 12.9 MiB | Shared template lives in the page cache once; per-VM is bhyve process state.
50 × 256 MiB | 939 MiB | 18 MiB | Matches the N=50 figure in /appendix/ksm-equivalent.
200 × 256 MiB | 5 366 MiB | 26 MiB | Load average climbing; 16 threads saturating.
400 × 256 MiB | 6 492 MiB | 16 MiB | Per-VM cost dropping as fixed overhead amortizes.
1 000 × 256 MiB | 9 117 MiB | 9 MiB | Naive un-shared cost: ~250 GiB. Fits in the RAM of a laptop.
Every row: one bhyvectl --suspend checkpoint; N copies of bhyve started with -o snapshot.vnode_restore=true; settled; host-wide memory delta sampled. See benchmarks/rigs/bhyve-fanout-rss.sh and the annotated patch.
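The per-VM column and the "naive" figure follow from simple division, which can be checked directly. This sketch copies the (VM count, host Δ) pairs from the table and recomputes the per-VM cost, the un-shared cost, and the resulting sharing factor; the function names are illustrative.

```python
# (vm_count, host_delta_mib) pairs copied from the density table.
ROWS = [(8, 103), (50, 939), (200, 5366), (400, 6492), (1000, 9117)]
GUEST_MIB = 256  # configured guest memory per microVM


def per_vm_mib(vm_count: int, host_delta_mib: int) -> float:
    """Host memory delta attributed to each VM."""
    return host_delta_mib / vm_count


def naive_gib(vm_count: int, guest_mib: int = GUEST_MIB) -> float:
    """Memory needed if every VM held a private copy of its guest RAM."""
    return vm_count * guest_mib / 1024


def sharing_factor(vm_count: int, host_delta_mib: int, guest_mib: int = GUEST_MIB) -> float:
    """How many times smaller the measured delta is than the naive cost."""
    return vm_count * guest_mib / host_delta_mib


for n, delta in ROWS:
    print(
        f"N={n:>4}  per-VM={per_vm_mib(n, delta):6.2f} MiB  "
        f"naive={naive_gib(n):7.2f} GiB  sharing={sharing_factor(n, delta):5.1f}x"
    )
```

For the 1 000-VM row this gives a naive cost of 250 GiB and a sharing factor of roughly 28x. The exact quotients for N=50 and N=200 come out slightly above the table's 18 and 26 MiB, so the per-VM column appears to be truncated rather than rounded.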

Network isolation — Coppice on pf, vs Cube on eBPF

metric | Coppice / pf | Cube / eBPF (typical) | notes
sandbox↔sandbox p50 RTT | 7 µs | ~5–10 µs | netperf TCP_RR, 1-byte req/resp.
sandbox↔sandbox p99 RTT | 8 µs | ~10–15 µs | stddev 0.51 µs; no GC pauses, no qdisc lottery.
TCP throughput, 1 stream | 14.6 Gbit/s | ~15–20 Gbit/s | iperf3 intra-host. Memory-bandwidth limited, not pf.
Policy update, 1 000 IPs | 4 ms | Cilium bulk: similar | One pfctl -T replace -f call, atomic. Effective ~250 k ops/sec.
Enforcement latency | next packet | next packet | pf tables are O(1) kernel lookups; no stale-ruleset window.
Two VNET jails on a dedicated coppicenet0 bridge + dedicated pf anchor with deny-by-default + explicit allow. Full methodology at /appendix/ebpf-to-pf; broader ecosystem picture at /appendix/ebpf-on-freebsd.
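The policy-update row describes swapping a pf table's contents in a single call. The sketch below shows how a controller might prepare that call. It only renders the table file text and the pfctl argument list and does not execute anything. The anchor name `coppice` and table name `sandbox_allow` are hypothetical, since the source does not give the real names.

```python
import ipaddress


def table_file_text(addresses: list[str]) -> str:
    """Validate addresses or CIDRs and render one entry per line for a pf table file."""
    lines = []
    for addr in addresses:
        # ip_network raises ValueError on malformed input, so bad entries
        # are rejected before pfctl ever sees the file.
        net = ipaddress.ip_network(addr, strict=False)
        lines.append(str(net))
    return "\n".join(lines) + "\n"


def replace_command(anchor: str, table: str, path: str) -> list[str]:
    """Argument list for replacing the whole table from a file in one pfctl call."""
    return ["pfctl", "-a", anchor, "-t", table, "-T", "replace", "-f", path]


def effective_ops_per_sec(entries: int, elapsed_ms: float) -> float:
    """Entries applied per second for a bulk update of the given duration."""
    return entries * 1000 / elapsed_ms


if __name__ == "__main__":
    print(table_file_text(["10.0.0.5", "10.0.1.0/24"]), end="")
    # Hypothetical anchor, table, and file path.
    print(" ".join(replace_command("coppice", "sandbox_allow", "/tmp/allow.txt")))
    print(effective_ops_per_sec(1000, 4))
```

The last helper reproduces the table's arithmetic: 1 000 entries in 4 ms works out to 250 000 entries per second.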

Feature audit

CubeSandbox advertises roughly 59 capabilities across its README, docs, and examples directory. Of those, 56 are measured closed on Coppice with receipts, 1 is partial (shape there, plumbing pending), and 2 are genuinely open. Row-by-row at /appendix/cubesandbox-feature-audit.

Where Coppice fits

Fly Sprites, E2B, Modal, and Cloudflare's sandbox offerings are all legitimate hosted answers to the durable-sandbox question, and if you want somebody else to carry the pager they're the right call. Coppice isn't trying to beat their time-to-production or their global resume story — Sprites in particular pays for cross-region mobility with an object-store persistence layer, which is the right trade for a hosted fleet. Coppice trades that mobility for local ZFS and a 17 ms resume on a single box. It's the substrate play: run it where you already have the metal, the colo, the air-gap, or the compliance boundary that rules hosted out.

Where to go next

  1. Tunnel into the demo portal and spawn a shell, browser, or VS Code tab.
  2. Read the one-page number table for every headline benchmark with JSON receipts.
  3. Open the admin dashboard for an operator view of live gateway probes, receipt coverage, and remaining parity work.
  4. Walk the feature audit — 56/59 closed today.
  5. Or start with the essay sequence: Anatomy · Claims · Jails · bhyve · Port sketch · Caveats.

Provenance

All CubeSandbox claims resolve against a pinned commit (CubeSandbox/CubeAPI GitHub mirror, tag Apache-2.0-2025-09). All FreeBSD numbers come from JSON in benchmarks/results/; the scripts that produced them are in benchmarks/rigs/. Measurements are on honor — AMD Ryzen 9 5900HX, 32 GB DDR4, single NVMe, FreeBSD 15.0-RELEASE. Kernel patches live in patches/ with apply instructions, upstream-readiness audit, and inline annotated walk-throughs.