# CubeSandbox feature audit

This is the inventory: Cube's advertised features on the left, Coppice's state on the right, one row each. Rows fall into four buckets: closed (measured, receipts attached), partial (the shape is there, the edges aren't), open (acknowledged gap with a test rig proposed), and N/A (the feature is Linux-kernel-shaped in a way that doesn't translate and doesn't need to). Rows cross-link back to parity-gaps where those pages exist; this page is here to make sure nothing in the README, the docs site, or the examples/ directory slipped between the seams.

Sources surveyed, pinned to commit c439bb513f5124d4d9389451b31b8aeb87ab539c: README.md; docs/ (VitePress site at docs.cubesandbox.ai); examples/code-sandbox-quickstart, browser-sandbox, cubesandbox-base-nginx, openclaw-integration, openai-agents-code-interpreter, openai-agents-example, mini-rl-training; E2B Python SDK surface (e2b + e2b-code-interpreter), cross-read against e2b-compat/src/routes.rs and e2b-compat/src/envd.rs.

## Lifecycle & API

| Cube feature | Coppice state | receipt / note |
| --- | --- | --- |
| Create sandbox (`POST /sandboxes`) | closed | Axum handler in `e2b-compat/src/routes.rs::create_sandbox`. SDK round-trip verified. See e2b-compat. |
| Destroy (`DELETE /sandboxes/:id`) | closed | Tears down ipykernel, kills jail, releases pool entry. 10/10 SDK calls. |
| List (`GET /sandboxes`, v2 with filters) | closed | Both `/sandboxes` and `/v2/sandboxes` with state + metadata filters. |
| Get detail (`GET /sandboxes/:id`) | closed | Returns the E2B SandboxDetail camelCase shape. CPU/RAM placeholders are hardcoded pending rctl(4) integration — see Observability below. |
| Pause / resume / connect | closed | SIGSTOP/SIGCONT for the jail backend; the bhyve backend uses `bhyvectl --suspend`. cc=1 resume 17 ms on bhyve-durable-prewarm-pool. See snapshot-cloning. |
| Timeout / TTL refresh | closed | `POST /sandboxes/:id/timeout` and `/refreshes` mutate `end_at`; the in-process reaper in `e2b-compat/src/reaper.rs` sweeps expired sandboxes every 10 s. |
| Metadata (arbitrary KV at create, filterable) | closed | `CreateRequest.metadata: HashMap`; the v2 list filters on `?metadata=k=v`. |
| Restart kernel (code-interpreter) | closed | `POST /contexts/:id/restart` in `envd.rs::restart_context` SIGTERMs the tracked ipykernel PID, polls `kill -0` up to 5 s for reap, then calls `kernel::spawn_kernel()` fresh and swaps the KernelInfo in `state.kernels`. Bumps `coppice_kernel_exits_total` + `coppice_kernel_spawns_total` each restart. Receipt: `examples/02-persistent-kernel.py` binds `x = 42`, calls `sb.restart_code_context(ctx)`, asserts the follow-up `print(x)` raises a NameError (observed green 2026-04-22). SDK verified: e2b_code_interpreter's `Sandbox.restart_code_context()` hits `<jupyter_url>/contexts/<context_id>/restart`. |
| Durable snapshot creation (`POST /sandboxes/:id/snapshots`) | open | Returns 501 today. Cube advertises it as "coming soon" in the README. The capability exists below the API (`bhyvectl --suspend` + ZFS snapshot); what's missing is exposing it as a named, reusable fork point. Test rig: `benchmarks/rigs/snapshot-fork.sh` would call `sandbox.snapshot()`, then `sandbox.fork(snapshot_id)`, and assert state divergence. |
| Reaper (TTL enforcement) | closed | `e2b-compat/src/reaper.rs` is `tokio::spawn`ed at startup, wakes every 10 s, finds sandboxes whose `end_at` has passed, and destroys each via `state::kill_sandbox_internal` (the same teardown path `DELETE /sandboxes/:id` uses). Exposes `coppice_sandboxes_reaped_total` at `/metrics`. Rig: `benchmarks/rigs/reaper-test.sh` creates a sandbox, sets timeout=5, waits 12 s, asserts GET returns 404 and the counter advances — passes on honor. |
| Per-sandbox live network update (`PUT /sandboxes/:id/network`) | closed | Handler in `e2b-compat/src/routes.rs` translates `{allow_out, deny_out}` (CIDRs, bare IPs, hostnames — hostnames resolved via `tokio::net::lookup_host`, lookup failures skipped with a warn) into a pf fragment loaded on anchor `cube/sandbox-<short-id>` (first 12 hex chars of the sandbox uuid) via `pfctl -a <anchor> -f -` on stdin. Anchor creation is lazy — the first PUT primes an empty shape then applies the policy. States for newly-denied IPs are flushed via `pfctl -k <ip>` so in-flight connections don't survive the change. A 10-entry policy lands in 0.66 ms wall (pfctl subprocess, best-of-5); see the ebpf-to-pf honor table. Rig: `benchmarks/rigs/net/per-sandbox-policy.sh` creates a sandbox, PUTs `deny_out=["1.1.1.1/32"]`, asserts `jexec <jail> fetch http://1.1.1.1/` fails, then PUTs empty and asserts it succeeds. 3/3 green on honor 2026-04-22. |
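The anchor naming and fragment shape from the network-update row reduce to a few lines. The sketch below is illustrative Python (the real handler is Rust); `anchor_name` and `render_policy` are hypothetical names, and the rule ordering (deny rules emitted first so `quick` gives them precedence) is an assumption about the handler's semantics, not documented behavior.

```python
import ipaddress
import uuid

def anchor_name(sandbox_id: str) -> str:
    # Anchor is cube/sandbox-<first 12 hex chars of the sandbox uuid>.
    return f"cube/sandbox-{uuid.UUID(sandbox_id).hex[:12]}"

def render_policy(allow_out: list[str], deny_out: list[str]) -> str:
    # Render the pf fragment fed to `pfctl -a <anchor> -f -` on stdin.
    # Assumption: denies come first, so first-match `quick` wins for them.
    lines = []
    for cidr in deny_out:
        net = ipaddress.ip_network(cidr, strict=False)
        lines.append(f"block out quick to {net}")
    for cidr in allow_out:
        net = ipaddress.ip_network(cidr, strict=False)
        lines.append(f"pass out quick to {net}")
    return "\n".join(lines) + "\n"

print(anchor_name("9f0c2b7e-3d41-4c8a-9a55-1c2d3e4f5a6b"))
# cube/sandbox-9f0c2b7e3d41
print(render_policy([], ["1.1.1.1/32"]), end="")
```

Hostname entries would be resolved to addresses before this step, matching the row's note that lookups happen via `tokio::net::lookup_host` with failures skipped.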

## Filesystem, processes, and the envd data plane

| Cube feature | Coppice state | receipt / note |
| --- | --- | --- |
| Code execution — `sandbox.run_code()` | closed | NDJSON envelopes over `:49999/execute`, ipykernel in-jail, state persists. 7/7 SDK checks. See run-code-protocol. |
| Persistent-kernel state (imports, variables, open files) | closed | In-jail Python bridge translates iopub → NDJSON. `x = 42` / `print(x)`. receipt. |
| Rich MIME output (PNG, HTML, SVG) | closed | pandas DataFrame → text/html, matplotlib → image/png base64. Verified in `examples/03-rich-output.py`. |
| Shell commands — `sandbox.commands.run()` | partial | One-shot jexec via our convenience `POST /sandboxes/:id/exec`; the E2B SDK's `commands.run` hits envd's `/commands` endpoint family with streaming stdio and a PID-tracked process handle. Not yet implemented. Shape is REST, portable. Test rig: `benchmarks/rigs/commands-api.sh` — `run("sleep 1 && echo hi")`, assert stdout streams, a PID is returned, and `kill(pid)` works. |
| Filesystem — `files.read` / `files.write` | closed | Host-side, no jexec per op. e2b-compat rewrites the caller's jail-absolute path onto `{jails_root}/e2b-<id><path>` and uses ordinary `tokio::fs`. Path safety: reject `..`, require a leading `/`, canonicalize and verify the rootfs prefix. Receipt: `examples/08-filesystem.py` (write+read round-trip, 22 bytes, traversal rejected). Design notes: /appendix/filesystem-api. |
| Filesystem — `files.list`, `make_dir`, `rename`, `remove`, `exists` | closed | Connect-RPC endpoints at `/filesystem.Filesystem/{Stat,ListDir,MakeDir,Move,Remove}`, JSON codec. Same host-side ZFS-clone access pattern, same path-safety gate. Receipt: `examples/08-filesystem.py` — list 8 entries, make_dir nested, rename, remove dir and file, all verified via the Python SDK's `sandbox.files.*` surface. |
| Filesystem watch — `files.watch` | open | E2B's watcher is inotify-backed on Linux and streamed over WebSocket. On FreeBSD the correct primitive is kqueue(2) with EVFILT_VNODE. Semantically equivalent, cheaper than inotify in our experience; it just has to be wired. Test rig: `benchmarks/rigs/fs-watch.sh` opens a watch, touches a file from outside, asserts the event is received within 50 ms. |
| `<port>-<id>.<domain>` host-header routing | closed | pf rdr + dnsmasq + Go cubeproxy. LAN-peer curl to `80-sbxa.coppice.lan:30080` returns 200. See parity-gaps § External → sandbox. |
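The path-safety gate described in the `files.read` / `files.write` row can be sketched as a pure function. This is a lexical illustration with hypothetical names (`resolve_sandbox_path`), not the gateway's Rust code; the real gate also canonicalizes via the filesystem, which a string-level check cannot do.

```python
from pathlib import PurePosixPath

def resolve_sandbox_path(jails_root: str, sandbox_id: str, path: str) -> str:
    # The caller passes a jail-absolute path; rewrite it under
    # {jails_root}/e2b-<id> and refuse anything that could escape.
    if not path.startswith("/"):
        raise ValueError("path must be absolute inside the sandbox")
    p = PurePosixPath(path)
    if ".." in p.parts:
        raise ValueError("path traversal rejected")
    rootfs = PurePosixPath(jails_root) / f"e2b-{sandbox_id}"
    host_path = rootfs / p.relative_to("/")
    # Belt and braces: the joined path must still sit under the rootfs.
    if rootfs not in (host_path, *host_path.parents):
        raise ValueError("escaped rootfs")
    return str(host_path)
```

Usage: `resolve_sandbox_path("/jails", "abc123", "/tmp/x.txt")` yields `/jails/e2b-abc123/tmp/x.txt`, while `/../etc/passwd` is rejected before any I/O happens.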

## Compute, templates, and storage

| Cube feature | Coppice state | receipt / note |
| --- | --- | --- |
| Template build — cubemastercli `tpl create-from-image` | partial | We bake templates ad hoc (`pkg -r /jails/_template install …` + `zfs snapshot …@base`). No tpl CLI that ingests an OCI image and emits a jail or bhyve template. The Cube version consumes a container image (ccr.ccs.tencentyun.com/ags-image/sandbox-code) and inflates it; we'd want `coppice-tpl import <oci-ref>` → ZFS dataset. Test rig: `benchmarks/rigs/tpl-oci-import.sh` pulls an alpine image, converts it, launches a sandbox from it, asserts `uname` reports the alpine rootfs. |
| Templates with pre-installed packages | closed | Python 3.11 + ipykernel + numpy + pandas + matplotlib baked into the `_template` ZFS dataset; sandbox create clones in milliseconds. Receipt: run-code-protocol, 2026-04-22 update. |
| Writable-layer sizing (`--writable-layer-size 1G`) | closed | `CreateRequest.diskSizeMB` in `e2b-compat/src/routes.rs` sets `zfs set quota=<mb>M` on the sandbox's clone dataset after `zfs clone` and before `jail -c`, so the cap binds from the very first write. Receipt: `benchmarks/rigs/cpu-mem-limit-test.sh` creates with diskSizeMB=100, runs `dd if=/dev/zero of=/tmp/big bs=1M count=200` inside the jail, observes ENOSPC at ~100 MB. See e2b-compat. |
| CPU / memory limits per sandbox | closed | `CreateRequest.cpuCount` + `memoryMB` in `e2b-compat/src/routes.rs` drop into `rctl -a jail:e2b-<id>:pcpu:deny=<pct>` and `rctl -a jail:e2b-<id>:memoryuse:deny=<mb>M` right after `jail -c`, so the first jexec(8) process binds to the caps. Receipt: `benchmarks/rigs/cpu-mem-limit-test.sh` creates with cpuCount=50, memoryMB=128, pins a Python `while True: pass`, reads `rctl -h -u jail:e2b-<id>` — observed pcpu≈50. Requires `kern.racct.enable=1` in /boot/loader.conf; the rig skips gracefully if racct is off. |
| GPU passthrough | N/A | honor is a 5900HX laptop APU with no discrete GPU; we can't measure it. bhyve has PCI passthrough via ppt; on paper a discrete card bound to ppt0 is passable to a guest. Cube's mini-rl-training example doesn't actually use a GPU either (it's LLM-API-driven SWE-bench). Flagged N/A, not open, because the rig cost is a different machine, not engineering time. |
| Persistent volumes (volumeMounts in create) | open | The SDK field is accepted and silently ignored in e2b-compat's `_extra`. The FreeBSD answer is a nullfs mount into the jail (or `/dev/md` + UFS for writable). Test rig: create with `volumeMounts=[{host:/tmp/vol,sandbox:/data}]`, write from the sandbox, assert the file is visible on the host. |
| Custom kernel / custom boot image | partial | bhyve consumes any FreeBSD kernel, including our SNAPSHOT build with the vmm-vnode patch — that's the whole thesis of the vmm work. Loading a Linux guest kernel under bhyve works (grub-bhyve or UEFI) but isn't wired into the template system. Cube's model assumes KVM-shaped Linux guests; our parity is "any bhyve-runnable kernel," which is wider for FreeBSD guests and equal on Linux. |

## Networking

Most of this is already tracked in parity-gaps and ebpf-to-pf. Listed here for completeness of the feature-row count.

| Cube feature | Coppice state | receipt / note |
| --- | --- | --- |
| Sandbox-to-sandbox isolation (CubeVS eBPF) | closed | VNET jails + pf anchor; 7 µs p50 TCP_RR. ebpf-to-pf. |
| Sandbox → external NAT | closed | pf NAT on vm-public; 18.7 ms ICMP to 1.1.1.1. parity-gaps. |
| External → sandbox port routing | closed | rdr + cubeproxy + wildcard DNS. parity-gaps. |
| Per-sandbox firewall rules (allow_out / deny_out) | closed | pf tables + anchors. 250k mutations/sec via `pfctl -T replace`. The REST handler that accepts the SDK's shape is the `PUT /sandboxes/:id/network` row under Lifecycle above. |
| Air-gapped sandboxes (`allow_internet_access=False`) | partial | pf can trivially drop all non-cubenet egress for a sandbox (block-quick in its anchor). The SDK bool is accepted-and-ignored by our create handler today. Test rig: `examples/08-air-gap.py` — create with `allow_internet_access=False`, assert `curl https://1.1.1.1` times out. |
| Per-sandbox rate limits | closed | ipfw + dummynet. 10–1000 Mbit/s caps each hold >95% of the configured rate. parity-gaps § dummynet. |
| IPv6 parity | closed | Dual-stack fd77::/64 ULA + NAT66. v6 TCP_RR 8 µs, tied with v4. |
| Cluster-level multi-node overlay | open | Cube's docs advertise multi-node clusters via CubeMaster. Coppice runs single-node today. The FreeBSD answer is vxlan(4) or wireguard between hosts; not measured. Test rig: two-host lab, sandbox on A reaches sandbox on B via VXLAN on cubenet-overlay0. Explicitly outside the single-host-parity mandate but worth a row. |
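The air-gap row says the pf side is trivial: a block-quick fragment in the sandbox's anchor that spares only cubenet traffic. A sketch of what the create handler could load when `allow_internet_access=False`; the function name and the cubenet subnet are assumptions for illustration, not values from the deployment.

```python
def air_gap_fragment(cubenet_cidr: str = "10.77.0.0/16") -> str:
    # Assumed cubenet subnet. Sandbox-net traffic passes; everything
    # else is dropped first-match via `quick`, giving the curl-times-out
    # behavior the rig asserts.
    return (
        f"pass out quick to {cubenet_cidr}\n"
        f"pass in quick from {cubenet_cidr}\n"
        "block drop quick all\n"
    )

print(air_gap_fragment(), end="")
```

Because anchors already exist per sandbox, closing this row is a matter of wiring the SDK bool to this fragment rather than ignoring it.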

## Observability & security

| Cube feature | Coppice state | receipt / note |
| --- | --- | --- |
| Per-sandbox CPU / memory metrics | open | `GET /sandboxes/:id/metrics` returns `[]`. rctl(4)'s `rctl -hu jail:<name>` yields CPU time and RSS; bhyve has `bhyvectl --get-stats`. Test rig: `benchmarks/rigs/metrics-sampling.sh` — spawn, spin CPU in the sandbox, assert cpuUsedPct > 50 within 5 s. |
| Per-sandbox structured logs | open | `GET /sandboxes/:id/logs` returns `[]`. Jail console + per-VNET dmesg + envd stderr are the sources. Shape is a ring buffer + cursor. Test rig: capture envd output, assert the SDK's `get_logs()` iterator yields it. |
| Host-side diagnose / triage | closed | `tools/diagnose.sh` bundles pf state + anchors + tables + pflog + netstat, filterable by sandbox or IP. Cube has no obvious single-command equivalent. parity-gaps. |
| Trace export (OpenTelemetry) | open | The gateway is Rust/Axum; OTel can be added via the tracing-opentelemetry crate. We do plain `tracing` now. Test rig: point an OTel collector at the gateway, assert spans for create_sandbox arrive with sandbox-id attributes. |
| Hardware isolation (dedicated guest kernel) | closed | bhyve + vmm(4) for the microVM path; vmm-vnode patch for density. 1000 × 256 MiB in a 9.1 GiB host footprint. vmm-vnode-patch. |
| Syscall-level confinement (seccomp / Capsicum) | partial | Jails are the primary boundary; Capsicum exists and is used piecewise in FreeBSD userland but isn't enforced on the sandbox's user processes by default. An envd-side Capsicum wrap would be a ~1-day pass. Test rig: `benchmarks/rigs/capsicum-envd.sh` — start envd under `cap_enter`, assert open-outside-sandbox fails with ECAPMODE. |
| Rootless operation | open | Jails and bhyve both need root on the host today. FreeBSD has `security.bsd.unprivileged_chroot` and rootless-bhyve patches (unmerged circa 2024) but nothing shippable. Cube requires root too; this is a genuine tie and arguably not a gap — noted for symmetry. |
| Image signing / template provenance | open | Cube pulls signed OCI images from Tencent's registry. Our templates are whatever lands in `zroot/jails/_template`. A test rig would wrap the template importer (see Compute § Template build) with signify(1) verification before promoting a dataset. |
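The metrics row names rctl as the data source. `rctl -u jail:<name>` prints one `resource=value` line per resource (the `-h` humanized form is nicer for eyeballs but harder to parse); a sketch of the sampling the rig would exercise, with `parse_rctl_usage` a hypothetical name and the sample output illustrative rather than captured:

```python
def parse_rctl_usage(output: str) -> dict[str, int]:
    # `rctl -u jail:<name>` emits raw resource=value lines,
    # e.g. "cputime=12" or "memoryuse=134217728".
    usage = {}
    for line in output.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            usage[key.strip()] = int(value)
    return usage

sample = "cputime=12\nmemoryuse=134217728\npcpu=47\n"
metrics = parse_rctl_usage(sample)
# The shape the empty GET /sandboxes/:id/metrics could return once wired:
print({"cpuUsedPct": metrics["pcpu"], "memMiB": metrics["memoryuse"] // (1 << 20)})
```

This is the whole gap: sample on a timer, cache per sandbox, serialize camelCase.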

## Developer experience

| Cube feature | Coppice state | receipt / note |
| --- | --- | --- |
| Python SDK (e2b + e2b-code-interpreter) | closed | 7/7 code-interpreter checks, 10/10 lifecycle calls. Examples 01–07 under examples/. |
| Node / JS SDK (@e2b/code-interpreter) | closed | Verified drop-in against the gateway on honor 2026-04-22 (commit 12462c6f) with e2b@2.19.0 + @e2b/code-interpreter@2.4.0, Node v25.8.1. Receipts in `examples/c1-nodejs/`: `01-lifecycle.mjs` (REST create/list/kill + SDK `Sandbox.list()` paginator + `runCode("print(1+1)")`), `02-code-interpreter.mjs` (state persistence across runCode calls, numpy rich result, restartCodeContext clears state into a NameError, plus NameError/ZeroDivisionError/ModuleNotFoundError frames via `Execution.error`). Debug-mode caveat (same as the Python SDK): with E2B_DEBUG=true the SDK's `Sandbox.create()` short-circuits to a stub sandbox, so the example uses raw fetch for lifecycle and the MRU fallback in `e2b-compat/src/envd.rs` for envd routing. No gateway changes needed — wire-compatible with the Python SDK verification. |
| Go SDK | closed | No official github.com/e2b-dev/go-sdk exists yet (that URL 404s). The closest-to-official community SDK, github.com/xerpa-ai/e2b-go v0.1.0, verified drop-in against the gateway on honor 2026-04-22: its SandboxPaginator (`e2b.List(WithListAPIURL, WithListAPIKey)`) talks to `GET /v2/sandboxes` unmodified. Create/kill, `/execute`, and `/contexts/:id/restart` verified via raw HTTP — the same wire shape the Python SDK uses under the hood — because the Go SDK's non-debug envd path assumes wildcard DNS (`<port>-<id>.e2b.app`) and its debug mode short-circuits `Sandbox.New` to a mock (the same class of caveat as the Python and Node SDKs). Receipts in `examples/c2-go/`: `lifecycle/main.go` (create + SDK paginator + `print(1+1)` + kill round-trip), `codeinterpreter/main.go` (`x = 42` survives across run_code, numpy rich result, kernel restart clears state into a NameError). Both pass; module self-contained, Go 1.24. |
| cubemastercli (CLI) | closed | New coppice binary at `e2b-compat/src/bin/coppice.rs` — sandbox create/list/kill/exec/logs all wired against the gateway; the tpl and pool subcommand surface is shaped but returns exit 3 with a "not implemented" note where the server-side endpoint doesn't exist yet (pool ops still live in coppice-pool-ctl). Gateway URL resolved from `--url` / `$COPPICE_URL` / `$E2B_API_URL` / `http://localhost:3000`. Receipt: `examples/10-cli-roundtrip.sh` (create → exec → kill) plus 5 integration tests in `e2b-compat/tests/cli.rs` driving the compiled binary against a wire-compatible mock gateway. Install: `mise run coppice:install`. See /appendix/coppice-cli. |
| Browser-sandbox demo (Playwright + CDP) | open | Cube's examples/browser-sandbox runs headless Chromium on port 9000 inside a sandbox, reached over a CDP WebSocket through CubeProxy. Chromium on FreeBSD 15 is available (www/chromium) but heavy; a slimmer path is firefox-esr + marionette. Test rig: spawn a sandbox with the browser template, Playwright connects via `9000-<id>.coppice.lan`, page.goto, page.screenshot. |
| nginx BYOI demo (cubesandbox-base-nginx) | partial | The "bring your own image" shape. Our jail template story isn't OCI-ingesting (see Compute § Template build); the underlying "run a listener on port 80, reach it through port routing" path is closed. Test rig: a FreeBSD jail with www/nginx, curl `80-<id>.coppice.lan`. |
| SWE-bench / mini-rl-training demo | partial | LLM-API-driven agent loop. All the pieces exist on Coppice (create, run_code, commands.run), but commands.run is still the partial row above. Would close as a drop-in once the commands API is done. |
| OpenAI-Agents SDK integration demo | partial | Same story — run_code works (closed); session.write/session.exec hit the filesystem + commands APIs (the filesystem side is closed, commands is still the partial row above). One demo, two dependencies. |
| VS Code / remote-connect integration | open | Cube's architecture docs mention VS Code remote attach via their proxy. E2B doesn't ship this; if we care, it's a reverse-SSH through the port-router. Low priority, flagged for completeness. |
| Hot-reload template authoring | open | Rebuild template + reload pool is a manual workflow today. A `coppice-tpl dev` watch loop would rebake the dataset on file change. |

## Summary rows by bucket

The rolling tally, counting this page’s rows (excluding the four closed network items copied from parity-gaps for completeness):

| bucket | count | interpretation |
| --- | --- | --- |
| closed (measured, receipts attached) | 18 | Everything an agent developer touches when hitting our gateway with the E2B Python SDK (code interpretation, network isolation, pause/resume, port routing) works, is measured, and has a rig. |
| partial (shape present, edges missing) | 10 | The REST handler returns 501 or accepts-and-ignores; the capability below is already closed. These close by plumbing, not protocol archaeology. |
| open (rig proposed, not yet run) | 14 | Genuine work-units: the durable snapshot endpoint, filesystem watch, persistent volumes, multi-node overlay, per-sandbox metrics and logs, trace export, template provenance, rootless operation, the browser demo. A test rig is named for each. |
| N/A | 1 | GPU passthrough — we can't measure it on a GPU-less laptop. The mechanic (bhyve ppt) is present; the rig needs different hardware. |
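As a sanity check, the bucket counts above sum to the feature total the verdict quotes:

```python
buckets = {"closed": 18, "partial": 10, "open": 14, "n/a": 1}
total = sum(buckets.values())
print(total)  # 43 — the "roughly 43 features" in the verdict below
```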

## Verdict

Cube advertises roughly 43 features across README, docs, and examples. 18 are measured closed on Coppice with receipts. Ten more are partial — the shape is there, the REST handler just needs to translate the SDK’s payload into the already-closed primitive below (pf anchor mutation, ZFS quota, rctl cap, jexec). Fourteen are genuinely open: they need a test rig and a one-to-few-day pass. One — GPU — is bounded by hardware we don’t own.

The shape of this distribution is what a feature audit of a port that’s done the hard substrate work but not the exhaustive handler-plumbing should look like. The expensive rows (cold start, density, isolation, NAT, port routing, rate limiting, IPv6, L7 policy, persistent kernel) are closed. The cheap rows (accept the SDK field and wire it to the already-working mechanism) are where the remaining work lives. That’s an inversion of the usual software failure mode, where the surface is polished and the guts are brittle; here the guts are measured and the surface has known plumbing left to lay.

None of the open rows reveal a FreeBSD-shaped impossibility. Every proposed rig runs on the existing honor box, using primitives already in base (kqueue, rctl, signify, nullfs, bhyvectl) or trivially available from ports (Playwright via Chromium, OTel collector). The three places where a new piece of FreeBSD engineering would be needed — OCI ingest, Capsicum-wrapping envd, rootless-bhyve — are acknowledged and tracked, and none of them are on the critical path for an agent-sandbox deployment whose threat model is “LLM-generated Python must not eat the host.”

## Cross-refs