Headline stats: 1 000 microVMs from one 256 MiB checkpoint (naive would be ~250 GiB); cold start beats Cube's 60 ms claim.
Tencent's CubeSandbox advertises sub-60 ms cold start, <5 MB per-instance memory, and thousands of sandboxes per node. Coppice is what that architecture looks like if you rebuild it on FreeBSD — bhyve for the VMM, jails for the shim-less fast path, a Rust/Axum gateway for E2B SDK compatibility, and a ~350-line vmm kernel patch that closes the last density gap by backing guest RAM with the checkpoint file itself.
Every number on this site traces back to a JSON file in
benchmarks/results/ and a script in
benchmarks/rigs/. Measurements are on the honor system — AMD
Ryzen 9 5900HX, 32 GB DDR4, single NVMe, FreeBSD
15.0-RELEASE. A laptop. We matched or beat Tencent's published numbers
under a stricter clock definition, on hardware an order of
magnitude smaller than whatever Cube's benchmark box was.
What we measured
Cold start — bhyve, four configurations at cc=1
Chart · bhyve resume latency, four tiers, cc=1 — log scale
▸ reproduce · mise run bench:bhyve-full · mise run bench:bhyve-durable-pool · mise run bench:bhyve-durable-prewarm-pool · mise run bench:bhyve-prewarm-pool · methodology
Density — N microVMs from one checkpoint, with the vmm-vnode patch
| concurrent VMs | host Δ | per-VM | notes |
|---|---|---|---|
| 8 × 256 MiB | 103 MiB | 12.9 MiB | Shared template lives in the page cache once; per-VM is bhyve process state. |
| 50 × 256 MiB | 939 MiB | 18 MiB | Matches the N=50 figure in /appendix/ksm-equivalent. |
| 200 × 256 MiB | 5 366 MiB | 26 MiB | Load average climbing; 16 threads saturating. |
| 400 × 256 MiB | 6 492 MiB | 16 MiB | Per-VM cost dropping as fixed overhead amortizes. |
| 1 000 × 256 MiB | 9 117 MiB | 9 MiB | Naive un-shared cost: ~250 GiB. Fits in the RAM of a laptop. |
Method: checkpoint one VM with bhyvectl --suspend; start N copies
of bhyve with -o snapshot.vnode_restore=true; let them
settle; sample the host-wide memory delta. See
benchmarks/rigs/bhyve-fanout-rss.sh
and the annotated patch.
Network isolation — cubenet on pf
| metric | cubenet / pf | Cube / eBPF (typical) | notes |
|---|---|---|---|
| sandbox↔sandbox p50 RTT | 7 µs | ~5–10 µs | netperf TCP_RR, 1-byte req/resp. |
| sandbox↔sandbox p99 RTT | 8 µs | ~10–15 µs | stddev 0.51 µs; no GC pauses, no qdisc lottery. |
| TCP throughput, 1 stream | 14.6 Gbit/s | ~15–20 Gbit/s | iperf3 intra-host. Memory-bandwidth limited, not pf. |
| Policy update, 1 000 IPs | 4 ms | Cilium bulk: similar | One pfctl -T replace -f call, atomic. Effective ~250 k ops/sec. |
| Enforcement latency | next packet | next packet | pf tables are O(1) kernel lookups; no stale-ruleset window. |
Setup: cubenet0 bridge +
dedicated pf anchor, deny-by-default with explicit allow rules.
Lab scripts:
lab-setup-freebsd.sh
+ run-net-bench.sh.
Full comparison + methodology at /appendix/ebpf-to-pf;
broader ecosystem picture at /appendix/ebpf-on-freebsd.
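The deny-by-default shape and the one-call atomic table swap look roughly like the fragment below. This is an illustrative sketch, not the repo's actual ruleset — the table name sb_allow and the file paths are hypothetical; only cubenet0 and the pfctl -T replace mechanism come from the measurements above.

```pf
# pf.conf sketch for the sandbox bridge (names are illustrative).

# Allowed peers live in one table; membership is the whole policy.
table <sb_allow> persist

# Default deny on the bridge; only table members pass.
block drop on cubenet0 all
pass quick on cubenet0 from <sb_allow> to <sb_allow> keep state

# Policy update is a single atomic call — the "4 ms for 1 000 IPs" row:
#   pfctl -t sb_allow -T replace -f /var/db/cubenet/allow.txt
# The next packet sees the new table; there is no stale-ruleset window.
```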
The rest of the CubeSandbox feature set
Table · CubeSandbox vs measured FreeBSD alternatives
| | Docker | Traditional VM | CubeSandbox | FreeBSD Jail (zfs-clone) | FreeBSD bhyve (durable-prewarm) |
|---|---|---|---|---|---|
| Isolation model | Shared kernel (namespaces) | Dedicated kernel | Dedicated kernel + eBPF net † | Shared kernel (jail) | Dedicated kernel |
| Cold start (cc=1) | ~200ms † | seconds † | ~60ms † | ~122ms ● | ~17ms ● |
| p95 cold start (cc=10) | — | — | — | 239ms ● | 54ms ● |
| Durability across host reboot | N/A (no pool) | N/A | yes (CH snapshot-restore) † | yes (jails are cheap to respawn) | yes (bhyvectl --suspend to disk) ● |
| Memory / instance | low (shared) | high (full OS) | <5MB (claim) † | ~2MB ● | ~9MB (with vmm-vnode patch) ● |
| Density on 32GB host | hundreds ≈ | dozens ≈ | thousands (claim) † | thousands (~2MB/jail) ≈ | 1000+ (measured, 256MB each) ● |
| Network isolation | namespaces + iptables | N/A | eBPF (CubeNet / XDP) | VNET + pf (7 µs p50 RTT) ● | same (shared cubenet0 bridge) ● |
| E2B SDK drop-in | — | — | 9/17 endpoints ● | yes (our e2b-compat, SDK-verified) ● | same path (shared gateway) ≈ |
- Jails path (shared kernel, no guest boot): jail-zfs-clone at cc=1 = 122 ms mean, p95 cc=50 = 361 ms, per-jail idle RSS = 2 MB. ~2× Tencent's number under a stricter clock — we measure jail exec.start return; they measure HTTP round-trip without guest-readiness (CubeAPI/benchmark/runner.go:25-88).
- E2B SDK drop-in: e2b-compat/ (Rust/Axum) passes 10/10 of the Python SDK's calls (create, list, get, pause/resume, timeout, snapshots, metrics, kill). The envd-compat endpoint on :49999 streams run_code NDJSON via jexec python3; print(1+1) returns 2. See /appendix/e2b-compat + /appendix/run-code-protocol.
- The KSM gap — closed. FreeBSD doesn't ship a kernel-samepage-merging equivalent, and without one a naive 50-entry hot pool costs 12.5 GB. Our vmm-memseg-vnode patch takes a different path: back guest RAM directly with the checkpoint file's vnode and layer an anonymous shadow over it for CoW. Same outcome as KSM for the "many guests, one image" case, in ~350 LoC across kernel + libvmmapi + bhyve instead of a 3 000+ LoC KSM port. Full numbers above.
Read order
Provenance
All CubeSandbox claims resolve against a pinned commit
(CubeSandbox/CubeAPI GitHub mirror, tag
Apache-2.0-2025-09). All FreeBSD numbers come from JSON
in benchmarks/results/; the scripts that produced them
are in benchmarks/rigs/. Kernel patches live in
patches/ with apply instructions, upstream-readiness
audit, and
inline annotated walk-throughs.
Reading notes live under notes/ (not published).