The chain is simple. A mise run bench:<config> invocation
SCPs benchmark shell scripts to $HONOR_HOST, runs them over
SSH under sudo, captures TSV samples, and wraps them in a
typed BenchmarkRun JSON file in
benchmarks/results/. The Chart component reads those JSON
files at build time. Reproducibility is one shell command per chart.
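The contract between the rigs and the site can be sketched in a few lines. This is an illustrative loader only — the field names come from summarize.py in this document, but the Zod schema in src/data/benchmarks.ts is canonical, and the example values below are made up:

```python
import json

# Top-level keys summarize.py writes into each results JSON file.
REQUIRED = {'config', 'host', 'metric', 'concurrency', 'samples',
            'summary', 'scriptPath', 'runAt'}

def load_benchmark_run(text):
    """Parse a BenchmarkRun JSON blob and check its shape (sketch)."""
    run = json.loads(text)
    missing = REQUIRED - run.keys()
    if missing:
        raise ValueError(f"not a BenchmarkRun: missing {sorted(missing)}")
    return run

# A minimal, made-up example of the shape a rig run produces.
example = json.dumps({
    'config': 'jail-raw', 'host': {'hostname': 'honor'},
    'metric': 'cold-start-ms', 'concurrency': 1,
    'samples': [48.0, 52.0], 'summary': {'p50': 52, 'n': 2},
    'scriptPath': 'benchmarks/rigs/jail-raw.sh',
    'runAt': '2025-01-01T00:00:00Z',
})
run = load_benchmark_run(example)
```

The Chart component does the equivalent check at build time via Zod; a result file that drops a key fails the build rather than rendering a half-empty chart.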
Host comparison: honor vs. Tencent's reference
Tencent's disclosed host
From the repo at pinned commit c439bb5, the whole
catalog of hardware disclosure for the published
<60ms / p95=90ms / p99=137ms figures amounts to the word
"bare-metal". README.md:94: "Cold start
benchmarked on bare-metal. 60ms at single concurrency; under 50
concurrent creations, avg 67ms, P95 90ms, P99 137ms — consistently
sub-150ms." That sentence is not elaborated anywhere in
README.md, README_zh.md,
docs/**, any closed/open GitHub issue, or the v0.1.0
release notes. The only external data point — an aibase.com summary
of the Tencent blog — mentions a "96-core physical server"
in the density context ("2000+ sandboxes on one machine"),
not the cold-start measurements. CPU vendor/model/clock, RAM
size/type, storage medium, host kernel, guest vmlinux,
and guest rootfs byte size are never published.
More consequential: the in-tree benchmark at
CubeAPI/benchmark/runner.go:25-88 times a single HTTP
POST /sandboxes round trip — clock starts right before
client.Do(req), stops when the response headers
return. The published 60ms is create-request latency from
an HTTP client, not guest userspace readiness. It includes
CubeAPI parsing, CubeMaster scheduling, Cubelet snapshot-clone +
VMM fork, CubeVS network-agent plumbing, and the API's response
write. It does not wait for a guest-side
/ready probe or an exec check. The number is also
heavily assisted by "resource pool pre-provisioning and
snapshot cloning" (README.md:73) — the instance
isn't booting a kernel; it's cloning a pre-warmed VMM snapshot.
Apples-to-apples "time-to-boot" is not what this
number measures.
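The two clock definitions can be made concrete with stub operations. This is illustrative only — create() and ready() below are placeholders standing in for "HTTP create call returns" and "guest userspace answers a readiness probe", not CubeSandbox's actual API:

```python
import time

def clock_create_only(create):
    # CubeSandbox-style clock: start before the create call, stop when
    # it returns (response headers back). Guest readiness not observed.
    t0 = time.monotonic()
    create()
    return (time.monotonic() - t0) * 1000.0

def clock_create_until_ready(create, ready, poll_s=0.005, timeout_s=5.0):
    # Stricter clock (what the jail rigs approximate): stop only once an
    # in-guest readiness check succeeds. Always >= the create-only clock.
    t0 = time.monotonic()
    create()
    while time.monotonic() - t0 < timeout_s:
        if ready():
            return (time.monotonic() - t0) * 1000.0
        time.sleep(poll_s)
    raise TimeoutError("guest never became ready")

# Stub sandbox: acks the create immediately, becomes ready 30ms later.
t_created = None
def create():
    global t_created
    t_created = time.monotonic()
def ready():
    return time.monotonic() - t_created >= 0.030

a = clock_create_only(create)
b = clock_create_until_ready(create, ready)
```

Under the same workload, the second clock is strictly larger; which one a headline number uses is exactly the ambiguity in the published 60ms.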
honor
From ssh honor 'sysctl -n hw.model hw.ncpu hw.physmem; zpool list zroot':
- CPU: AMD Ryzen 9 5900HX (Zen 3, mobile APU with integrated Radeon graphics), 16 logical CPUs (8C/16T). Base 3.3 GHz / boost 4.6 GHz — a laptop-class part.
- RAM: 32 GB (DDR4-3200 SO-DIMM, non-ECC).
- Storage: single-vdev ZFS pool zroot on nda0p4.eli — a GELI-encrypted NVMe partition, 888 G total. Template lives on the same dataset.
- Host OS: FreeBSD 15.0-RELEASE-p4, amd64.
- Virtualization primitive: jails (OS-level isolation, shared host kernel). No guest kernel. Template is base.txz extracted → 374 MB on ZFS.
What we can conclude
Three overlapping caveats make any direct comparison apples-to-kumquats:
- Different isolation primitive. CubeSandbox = KVM microVM with a dedicated guest kernel. honor = jails sharing the host kernel. Jails skip the entire guest-kernel / VMM / virtio plumbing that CubeSandbox's 60ms includes. Any "we're faster" finding on honor is partly measuring "jails have less to do," not "our software is better."
- Different hardware class. Tencent benchmarks on unspecified bare-metal (the only external source mentions a server with 96 cores). honor is a laptop-class Ryzen 9 5900HX with non-ECC DDR4 and a single consumer NVMe behind GELI + ZFS. Expect honor to win on single-thread latency (Zen 3 at 4.6 GHz beats most server cores' boost) and lose on concurrency past ~16 jobs, memory bandwidth, and sustained I/O.
- Different clock definition. CubeSandbox's 60ms is HTTP create-call latency against a pre-warmed snapshot pool. The jail rigs time jail -c through exec.start="/bin/echo ready" returning inside the jail — a stricter clock that includes actual in-jail exec. Our number with the same definition applied to CubeSandbox would be larger than 60ms.
The ethical comparison isn't a head-to-head table. It's "here's what honor does under our clearly-defined methodology" next to "here's what Tencent claims under their under-specified methodology" — which is what /freebsd-jails and /claims do, with this caveat linked prominently.
References: README.md:94, README.md:142,
README.md:73, CubeAPI/benchmark/runner.go:25-88,
deploy/guest-image/Dockerfile:1,
deploy/one-click/build-vm-assets.sh:219-355,
deploy/one-click/assets/kernel-artifacts/README.md.
Tasks
The canonical task list is .mise.toml at the repo root.
Install mise, then from the project
directory:
# Single config, end-to-end (sync rigs, capture host info, run all concurrencies + RSS):
mise run bench:jail-raw
mise run bench:jail-vnet-pf
mise run bench:jail-zfs-clone
# All implemented configs:
mise run bench:all-jails
Each bench:* task depends on bench:sync (scp the
rigs to honor:/tmp/bench-rigs/) and bench:host-info
(capture kernel/release/CPU/RAM so the summary JSON embeds a full host
record). The internal bench:_run helper iterates over
concurrencies 1/10/50 for cold-start and one RSS pass at cc=32.
.mise.toml 316 lines mise tasks
# .mise.toml — task runner for this research notebook.
#
# Invoke with: `mise run <task>` or `mise run <task-a> <task-b>`.
# Reproducibility is the point — if a number is on the site, the mise
# task that produced it is in this file, and the underlying shell rig
# is in benchmarks/rigs/. The site reads both at build time.
[tools]
node = "22"
pnpm = "10"
python = "3.12"
[env]
CUBESANDBOX_COMMIT = "c439bb513f5124d4d9389451b31b8aeb87ab539c"
HONOR_HOST = "honor"
HONOR_RIG_DIR = "/tmp/bench-rigs"
# Shallow defaults for first-pass signal. Bump these (200/100/50) once
# we like the shape. Jail creates aren't cheap — cc=1 at ~1s each.
BENCH_ITERS_CC1 = "30"
BENCH_ITERS_CC10 = "30"
BENCH_ITERS_CC50 = "50"
# ──────────────────────────── site ─────────────────────────────
[tasks.dev]
description = "Run the Astro dev server"
run = "pnpm dev"
[tasks.build]
description = "Build the static site"
run = "pnpm build"
[tasks.test]
description = "Run component + schema tests"
run = "pnpm test"
[tasks.check]
description = "Run astro check"
run = "pnpm check"
[tasks.links]
description = "Check built site for broken links"
run = ["pnpm build", "pnpm links"]
# ──────────────────────────── research ──────────────────────────
[tasks."research:clone"]
description = "Clone CubeSandbox at the pinned commit into /tmp/cubesandbox-research"
run = '''
set -eu
mkdir -p /tmp/cubesandbox-research
cd /tmp/cubesandbox-research
[ -d CubeSandbox/.git ] || git clone https://github.com/TencentCloud/CubeSandbox.git
cd CubeSandbox
git fetch origin main
git checkout "$CUBESANDBOX_COMMIT"
echo "Pinned at $(git rev-parse HEAD)"
'''
# ──────────────────────────── benchmarks ────────────────────────
[tasks."bench:sync"]
description = "Sync benchmarks/rigs to $HONOR_HOST:$HONOR_RIG_DIR"
run = 'scp -r benchmarks/rigs "$HONOR_HOST:$HONOR_RIG_DIR"'
[tasks."bench:setup-pf"]
description = "Apply the benchmark pf ruleset on $HONOR_HOST with a dead-man switch that auto-disables pf after DMS_TIMEOUT seconds if we lose control."
depends = ["bench:sync"]
run = '''
set -eu
: "${DMS_TIMEOUT:=60}"
ssh "$HONOR_HOST" "sudo sh -c 'DMS_TIMEOUT=$DMS_TIMEOUT sh $HONOR_RIG_DIR/setup-pf.sh $HONOR_RIG_DIR/pf.bench.conf'"
# Independent verification from the dev machine — if this fails, the
# dead-man is running and pf will self-disable within DMS_TIMEOUT.
sleep 2
ssh "$HONOR_HOST" 'echo "SSH still alive: $(hostname)"'
'''
[tasks."bench:host-info"]
description = "Capture $HONOR_HOST's kernel/release/CPU/RAM into /tmp/honor-host.json"
run = '''
ssh "$HONOR_HOST" 'python3 - <<PY
import json, platform, subprocess
def s(k): return subprocess.run(["sysctl","-n",k], capture_output=True, text=True).stdout.strip()
print(json.dumps({
"hostname": platform.node(), "kernel": platform.system(),
"release": platform.release(), "cpuModel": s("hw.model"),
"cpuCount": int(s("hw.ncpu") or 0),
"memGB": round(int(s("hw.physmem") or 0) / (1024**3), 2),
}))
PY' > /tmp/honor-host.json
cat /tmp/honor-host.json
'''
# Internal helper: execute one config × (cc1/10/50 cold-start + idle RSS).
# Takes $CONFIG as env var. RSS concurrency defaults to 32 (jails); set
# $RSS_CC for configs where 32 is impractical (bhyve VMs at 512MB each
# would reserve 16GB — we use cc=8 there). Cold-start iteration counts
# per concurrency may also be overridden via $ITERS_CC1/10/50 for configs
# where a boot takes seconds, not milliseconds.
[tasks."bench:_run"]
description = "Internal: run one config's full sweep (cold-start cc1/10/50 + idle RSS)"
hide = true
run = '''
set -eu
: "${CONFIG:?must set CONFIG=jail-raw|jail-vnet-pf|jail-zfs-clone|bhyve-*}"
: "${RSS_CC:=32}"
: "${ITERS_CC1:=$BENCH_ITERS_CC1}"
: "${ITERS_CC10:=$BENCH_ITERS_CC10}"
: "${ITERS_CC50:=$BENCH_ITERS_CC50}"
: "${RIG_ENV:=}"
mkdir -p benchmarks/results
HOST_JSON=$(cat /tmp/honor-host.json)
for CC in 1 10 50; do
case $CC in 1) ITERS=$ITERS_CC1;; 10) ITERS=$ITERS_CC10;; 50) ITERS=$ITERS_CC50;; esac
echo "▸ $CONFIG cold-start @ cc=$CC (iters=$ITERS)"
ssh "$HONOR_HOST" "cd $HONOR_RIG_DIR && sudo sh -c '$RIG_ENV sh $CONFIG.sh $CC $ITERS'" > "/tmp/$CONFIG-cc$CC.tsv"
python3 benchmarks/rigs/summarize.py \
--config "$CONFIG" --metric cold-start-ms --concurrency "$CC" \
--script "benchmarks/rigs/$CONFIG.sh" --input "/tmp/$CONFIG-cc$CC.tsv" \
--output "benchmarks/results/${CONFIG}_cold-start-ms_cc${CC}.json" \
--host-info-json "$HOST_JSON"
done
echo "▸ $CONFIG idle RSS @ cc=$RSS_CC"
ssh "$HONOR_HOST" "cd $HONOR_RIG_DIR && sudo sh $CONFIG-rss.sh" > "/tmp/$CONFIG-rss.tsv"
python3 benchmarks/rigs/summarize.py \
--config "$CONFIG" --metric rss-kb-idle-1s --concurrency "$RSS_CC" \
--script "benchmarks/rigs/$CONFIG-rss.sh" --input "/tmp/$CONFIG-rss.tsv" \
--output "benchmarks/results/${CONFIG}_rss-kb-idle-1s_cc${RSS_CC}.json" \
--host-info-json "$HOST_JSON"
'''
[tasks."bench:jail-raw"]
description = "jail-raw: cold-start @ cc=1,10,50 + idle RSS @ cc=32"
depends = ["bench:sync", "bench:host-info"]
env = { CONFIG = "jail-raw" }
run = "mise run bench:_run"
[tasks."bench:jail-vnet-pf"]
description = "jail-vnet-pf (VNET + pf egress filter): cold-start + idle RSS"
depends = ["bench:sync", "bench:host-info"]
env = { CONFIG = "jail-vnet-pf" }
run = "mise run bench:_run"
[tasks."bench:jail-zfs-clone"]
description = "jail-zfs-clone (per-jail ZFS clone rootfs): cold-start + idle RSS"
depends = ["bench:sync", "bench:host-info"]
env = { CONFIG = "jail-zfs-clone" }
run = "mise run bench:_run"
[tasks."bench:all-jails"]
description = "All three jail configurations"
depends = ["bench:jail-raw", "bench:jail-vnet-pf", "bench:jail-zfs-clone"]
# bhyve tasks. Each config follows the same pattern as the jail tasks:
# bench:_run runs cold-start at cc=1/10/50 plus an idle-RSS sweep.
# bhyve VMs dominate the runtime budget — cc=50 cold-start can take
# minutes at full guest memory. See the rig scripts for iter counts.
[tasks."bench:bhyve-full"]
description = "bhyve-full (full FreeBSD 15 GENERIC guest per iter): cold-start + idle RSS"
depends = ["bench:sync", "bench:host-info"]
env = { CONFIG = "bhyve-full", RSS_CC = "8", ITERS_CC1 = "10", ITERS_CC10 = "10", ITERS_CC50 = "50", RIG_ENV = "TIMEOUT_SEC=180" }
run = "mise run bench:_run"
[tasks."bench:bhyve-minimal"]
description = "bhyve-minimal (MINIMAL-BHYVE kernel — STUB: guest rc.conf not yet trimmed)"
depends = ["bench:sync", "bench:host-info"]
env = { CONFIG = "bhyve-minimal", RSS_CC = "8", ITERS_CC1 = "10", ITERS_CC10 = "10", ITERS_CC50 = "50", RIG_ENV = "TIMEOUT_SEC=60" }
run = "mise run bench:_run"
[tasks."bench:bhyve-prewarm-pool"]
description = "bhyve-prewarm-pool (SIGSTOP/SIGCONT proxy for CH snapshot-restore): cold-start + idle RSS"
depends = ["bench:sync", "bench:host-info"]
env = { CONFIG = "bhyve-prewarm-pool", RSS_CC = "8", ITERS_CC1 = "30", ITERS_CC10 = "30", ITERS_CC50 = "50", RIG_ENV = "POOL_SIZE=50 BOOT_TIMEOUT=240" }
run = "mise run bench:_run"
[tasks."bench:bhyve-durable-pool-setup"]
description = "One-time pool-setup: boot N VMs, bhyvectl --suspend each to /vms/pool/ (requires SNAPSHOT kernel)"
depends = ["bench:sync"]
run = 'ssh "$HONOR_HOST" "sudo sh $HONOR_RIG_DIR/bhyve-durable-pool-setup.sh"'
[tasks."bench:bhyve-durable-pool"]
description = "bhyve-durable-pool: resume from on-disk bhyvectl --suspend checkpoint (SNAPSHOT kernel)"
depends = ["bench:sync", "bench:host-info"]
env = { CONFIG = "bhyve-durable-pool" }
run = "mise run bench:_run"
[tasks."bench:all"]
description = "Run every bench config that's implemented"
depends = ["bench:all-jails"]
[tasks."bench:clean-remote"]
description = "Remove /tmp/bench-rigs and bench tempfiles on $HONOR_HOST"
run = 'ssh "$HONOR_HOST" "rm -rf $HONOR_RIG_DIR /tmp/*.tsv /tmp/bhyve-*.log" || true'
# ─────────────────────────── e2b-compat ─────────────────────────
[tasks."e2b:build"]
description = "Build the e2b-compat binary (cargo build --release)"
run = "cargo build --release --manifest-path e2b-compat/Cargo.toml"
[tasks."e2b:check"]
description = "cargo check the e2b-compat crate"
run = "cargo check --manifest-path e2b-compat/Cargo.toml"
[tasks."e2b:serve"]
description = "Run e2b-compat locally (requires root or sudoers for zfs/jail/jls/ps/kill/jexec)"
run = '''
cargo run --release --manifest-path e2b-compat/Cargo.toml -- \
--listen 127.0.0.1:3000 \
--zfs-pool zroot/jails \
--template-snapshot zroot/jails/_template@base \
--jails-root /jails
'''
[tasks."e2b:sync-honor"]
description = "rsync e2b-compat sources to $HONOR_HOST:/tmp/e2b-compat-src/"
run = '''
ssh "$HONOR_HOST" 'mkdir -p /tmp/e2b-compat-src'
rsync -az --delete --exclude=target/ --exclude=Cargo.lock e2b-compat/ "$HONOR_HOST:/tmp/e2b-compat-src/"
'''
[tasks."e2b:build-honor"]
description = "Build the e2b-compat binary on $HONOR_HOST (FreeBSD-native)"
depends = ["e2b:sync-honor"]
run = 'ssh "$HONOR_HOST" "cd /tmp/e2b-compat-src && cargo build --release"'
[tasks."e2b:serve-honor"]
description = "Run e2b-compat on $HONOR_HOST as root, listen 127.0.0.1:3000"
depends = ["e2b:build-honor"]
run = 'ssh "$HONOR_HOST" "sudo /tmp/e2b-compat-src/target/release/e2b-compat --listen 127.0.0.1:3000 --zfs-pool zroot/jails --template-snapshot zroot/jails/_template@base --jails-root /jails"'
[tasks."e2b:smoke"]
description = "Hit the running e2b-compat with the example smoke-test script"
run = 'sh e2b-compat/examples/smoke-test.sh "${E2B_COMPAT_URL:-http://honor:3000}"'
# ──────────────────────────── demos ────────────────────────────
[tasks."demo:notebook"]
description = "Execute examples/notebook-demo.ipynb against the Coppice gateway"
run = '''
set -eu
HONOR="${HONOR_HOST:-honor}"
# Gateway binds to 127.0.0.1 on honor. Tunnel API + envd through SSH
# so the local SDK reaches them as localhost:3000 / localhost:49999
# — which is what E2B_DEBUG=true hardcodes anyway.
echo "opening SSH tunnel: $HONOR → localhost 3000 + 49999"
ssh -fN -L 3000:127.0.0.1:3000 -L 49999:127.0.0.1:49999 "$HONOR"
ssh_pid=$(pgrep -f "ssh -fN -L 3000:127.0.0.1:3000.*$HONOR" | head -1 || true)
cleanup() { [ -n "${ssh_pid:-}" ] && kill "$ssh_pid" 2>/dev/null || true; }
trap cleanup EXIT
export E2B_API_URL="http://localhost:3000"
export E2B_DEBUG="true"
export E2B_API_KEY="${E2B_API_KEY:-local}"
echo "gateway: $E2B_API_URL (tunneled from $HONOR)"
# uv run with inline deps — no pip, no venv to manage.
uv run --with jupyter --with nbclient --with e2b-code-interpreter \
jupyter nbconvert \
--to notebook \
--execute examples/notebook-demo.ipynb \
--output notebook-demo.executed.ipynb \
--ExecutePreprocessor.timeout=120
echo
echo "executed: examples/notebook-demo.executed.ipynb"
echo "render to HTML: mise run demo:notebook:html"
echo "open live: mise run demo:notebook:view"
'''
[tasks."demo:notebook:view"]
description = "Open the notebook live in nbclassic, tunnel + deps wired up"
run = '''
set -eu
HONOR="${HONOR_HOST:-honor}"
# Same tunnel as demo:notebook — keeps localhost:3000/:49999 pointed at
# the gateway on honor while nbclassic is running.
ssh -fN -L 3000:127.0.0.1:3000 -L 49999:127.0.0.1:49999 "$HONOR"
ssh_pid=$(pgrep -f "ssh -fN -L 3000:127.0.0.1:3000.*$HONOR" | head -1 || true)
cleanup() { [ -n "${ssh_pid:-}" ] && kill "$ssh_pid" 2>/dev/null || true; }
trap cleanup EXIT
export E2B_API_URL="http://localhost:3000"
export E2B_DEBUG="true"
export E2B_API_KEY="${E2B_API_KEY:-local}"
echo "gateway: $E2B_API_URL (tunneled from $HONOR)"
# Run both the server and the in-notebook kernel under the same uv
# environment so `from e2b_code_interpreter import Sandbox` resolves.
# The source ipynb is the live version — open that, not the
# .executed one (nbconvert-rendered outputs confuse readers trying
# to re-run cells).
exec uv run \
--with nbclassic \
--with e2b-code-interpreter \
--with matplotlib --with pandas --with numpy \
jupyter nbclassic examples/notebook-demo.ipynb
'''
[tasks."demo:notebook:html"]
description = "Render the executed notebook to a static HTML page"
depends = ["demo:notebook"]
run = '''
uv run --with jupyter --with nbconvert \
jupyter nbconvert \
--to html \
examples/notebook-demo.executed.ipynb \
--output notebook-demo.html
echo "rendered: examples/notebook-demo.html"
'''
Driver helpers
common.sh is sourced by every per-config rig. It provides
timestamp_ms (via Python for portability — FreeBSD
date lacks %N), a run_concurrent
wrapper, and a host_info_json probe.
benchmarks/rigs/common.sh 41 lines bash
#!/bin/sh
# Shared helpers for all rigs. Sourced, not run directly.
set -eu

timestamp_ms() {
    # FreeBSD date supports %N only through gdate; use python for portability.
    python3 -c 'import time; print(int(time.time()*1000))'
}

host_info_json() {
    python3 - <<'PY'
import json, platform, subprocess
def sysctl(k):
    return subprocess.run(['sysctl','-n',k], capture_output=True, text=True).stdout.strip()
print(json.dumps({
    'hostname': platform.node(),
    'kernel': platform.system(),
    'release': platform.release(),
    'cpuModel': sysctl('hw.model'),
    'cpuCount': int(sysctl('hw.ncpu') or 0),
    'memGB': round(int(sysctl('hw.physmem') or 0) / (1024**3), 2),
}))
PY
}

run_concurrent() {
    # Usage: run_concurrent N CMD...
    # Runs CMD N times in parallel, prints per-iteration elapsed_ms TSV.
    # The loop index (0..N-1) is appended as an extra argument so each
    # worker can form a unique name even though $$ is shared by subshells.
    # _rc_n/_rc_j are intentionally prefixed to avoid clobbering the
    # caller's loop variable (sh functions share scope with their callers).
    _rc_n=$1; shift
    _rc_j=0
    while [ $_rc_j -lt $_rc_n ]; do
        ( s=$(timestamp_ms); "$@" "$_rc_j" >/dev/null 2>&1; e=$(timestamp_ms); printf "%d\t%d\n" "$_rc_j" "$((e - s))" ) &
        _rc_j=$((_rc_j + 1))
    done
    wait
}
summarize.py reads the TSV a rig emits and writes a
BenchmarkRun JSON file validated by the Zod schema in
src/data/benchmarks.ts.
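One detail worth pinning down before reading the script: the percentile convention. summarize.py picks by index (nearest-rank, clamped to the last sample), which for small sample counts can differ from an interpolated percentile. A toy restatement on made-up latency samples:

```python
def pct(sorted_samples, p):
    # Index-based nearest-rank pick, clamped to the last sample. Same
    # convention as summarize.py; not an interpolated percentile.
    ss = sorted_samples
    return ss[min(len(ss) - 1, int(len(ss) * p / 100))]

# Ten made-up cold-start samples (ms), sorted.
samples = sorted([52, 48, 61, 55, 58, 49, 66, 70, 90, 137])
# n=10: p95 and p99 both resolve to index 9, i.e. the max sample.
```

With n=30 or 50 iterations per sweep, p99 is effectively "the worst sample or two" — one reason the iteration counts in .mise.toml carry a "bump these" note.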
benchmarks/rigs/summarize.py 65 lines python
#!/usr/bin/env python3
"""Wrap raw TSV samples into the BenchmarkRun JSON schema."""
import argparse, json, statistics, datetime, sys, subprocess

def host_info():
    r = subprocess.run(['python3', '-c', '''
import json, platform, subprocess
def s(k): return subprocess.run(["sysctl","-n",k], capture_output=True, text=True).stdout.strip()
print(json.dumps({
    "hostname": platform.node(),
    "kernel": platform.system(),
    "release": platform.release(),
    "cpuModel": s("hw.model") or "unknown",
    "cpuCount": int(s("hw.ncpu") or 0),
    "memGB": round(int(s("hw.physmem") or 0) / (1024**3), 2),
}))
'''], capture_output=True, text=True)
    return json.loads(r.stdout)

def summarize(samples):
    ss = sorted(samples)
    def pct(p): return ss[min(len(ss) - 1, int(len(ss) * p / 100))]
    return {
        'mean': statistics.mean(ss),
        'p50': pct(50), 'p95': pct(95), 'p99': pct(99),
        'min': min(ss), 'max': max(ss), 'n': len(ss),
    }

def main():
    ap = argparse.ArgumentParser()
    ap.add_argument('--config', required=True)
    ap.add_argument('--metric', required=True)
    ap.add_argument('--concurrency', type=int, required=True)
    ap.add_argument('--script', required=True)
    ap.add_argument('--input', required=True)
    ap.add_argument('--output', required=True)
    ap.add_argument('--host-info-json', help='Pre-collected host_info_json; if omitted, shells out to collect')
    args = ap.parse_args()
    samples = []
    with open(args.input) as f:
        for line in f:
            line = line.strip()
            if not line: continue
            parts = line.split('\t')
            samples.append(float(parts[-1]))
    host = json.loads(args.host_info_json) if args.host_info_json else host_info()
    out = {
        'config': args.config,
        'host': host,
        'metric': args.metric,
        'concurrency': args.concurrency,
        'samples': samples,
        'summary': summarize(samples),
        'scriptPath': args.script,
        'runAt': datetime.datetime.utcnow().isoformat() + 'Z',
    }
    with open(args.output, 'w') as f:
        json.dump(out, f, indent=2)

if __name__ == '__main__':
    main()
Jail configurations
Jail — raw jail-raw
Plain jail, shared rootfs (cp -R from template), no VNET.
▸ reproduce · mise run bench:jail-raw
benchmarks/rigs/jail-raw.sh 46 lines cold-start rig
#!/bin/sh
# jail-raw.sh — plain jail with cp -R rootfs, no VNET, no pf.
# Usage: jail-raw.sh <concurrency> <total-iterations>
#
# Runs EXACTLY <total-iterations> jail create/destroy cycles, dispatched
# across a <concurrency>-sized worker pool. Emits one TSV line per
# iteration: <global-index>\t<elapsed-ms>.
set -eu
. "$(dirname "$0")/common.sh"
CONC=${1:-1}
ITERS=${2:-200}
TEMPLATE=${TEMPLATE:-/jails/_template}
[ -d "$TEMPLATE" ] || { echo "missing $TEMPLATE" >&2; exit 2; }
create_one() {
id="bench-$$-$1"
path="/jails/$id"
cp -R "$TEMPLATE" "$path"
jail -c name="$id" path="$path" host.hostname="$id" ip4=inherit persist \
exec.start="/bin/echo ready" \
>/dev/null
jail -r "$id" >/dev/null 2>&1 || true
rm -rf "$path"
}
i=0
while [ $i -lt "$ITERS" ]; do
if [ "$CONC" -gt 1 ]; then
batch_end=$(( i + CONC ))
[ "$batch_end" -gt "$ITERS" ] && batch_end=$ITERS
j=$i
while [ "$j" -lt "$batch_end" ]; do
( s=$(timestamp_ms); create_one "$j" >/dev/null 2>&1; e=$(timestamp_ms); printf "%d\t%d\n" "$j" "$((e - s))" ) &
j=$(( j + 1 ))
done
wait
i=$batch_end
else
s=$(timestamp_ms); create_one "$i"; e=$(timestamp_ms)
printf "%d\t%d\n" "$i" "$((e - s))"
i=$(( i + 1 ))
fi
done
benchmarks/rigs/jail-raw-rss.sh 38 lines idle-RSS rig
#!/bin/sh
# Start 32 jails, wait 1s, sum RSS of their init processes.
set -eu
. "$(dirname "$0")/common.sh"
TEMPLATE=${TEMPLATE:-/jails/_template}
N=32
i=0
while [ $i -lt $N ]; do
id="bench-rss-$$-$i"; path="/jails/$id"
cp -R "$TEMPLATE" "$path"
jail -c name="$id" path="$path" host.hostname="$id" ip4=inherit persist \
exec.start="/usr/sbin/daemon -f /bin/sleep 60" >/dev/null
i=$((i + 1))
done
sleep 1
i=0
while [ $i -lt $N ]; do
id="bench-rss-$$-$i"
jid=$(jls -j "$id" jid 2>/dev/null || echo "")
if [ -n "$jid" ]; then
rss=$(ps -J "$jid" -o rss= | awk '{s+=$1} END {print s}')
printf "%d\t%s\n" "$i" "$rss"
fi
i=$((i + 1))
done
# teardown
i=0
while [ $i -lt $N ]; do
id="bench-rss-$$-$i"; path="/jails/$id"
jail -r "$id" >/dev/null 2>&1 || true
rm -rf "$path" 2>/dev/null || true
i=$((i + 1))
done
Jail — VNET + pf jail-vnet-pf
Jail with VNET (per-jail epair network stack) + pf egress filter.
▸ reproduce · mise run bench:jail-vnet-pf
benchmarks/rigs/jail-vnet-pf.sh 50 lines cold-start rig
#!/bin/sh
# jail-vnet-pf.sh — jail with VNET (epair) + active pf.bench.conf.
# Usage: jail-vnet-pf.sh <concurrency> <total-iterations>
#
# Requires: if_epair loaded, pf active with safe ruleset (mise run bench:setup-pf).
set -eu
. "$(dirname "$0")/common.sh"
CONC=${1:-1}
ITERS=${2:-200}
TEMPLATE=${TEMPLATE:-/jails/_template}
[ -d "$TEMPLATE" ] || { echo "missing $TEMPLATE" >&2; exit 2; }
create_one() {
id="benchvp-$$-$1"
path="/jails/$id"
cp -R "$TEMPLATE" "$path"
epair_a=$(ifconfig epair create)
epair_b=$(echo "$epair_a" | sed 's/a$/b/')
oct=$(( $1 % 250 + 2 ))
jail -c name="$id" path="$path" host.hostname="$id" vnet persist \
vnet.interface="$epair_b" \
exec.prestart="ifconfig $epair_b inet 10.88.$oct.2/24 up" \
exec.start="/bin/echo ready" \
>/dev/null
jail -r "$id" >/dev/null 2>&1 || true
ifconfig "$epair_a" destroy 2>/dev/null || true
rm -rf "$path"
}
i=0
while [ $i -lt "$ITERS" ]; do
if [ "$CONC" -gt 1 ]; then
batch_end=$(( i + CONC ))
[ "$batch_end" -gt "$ITERS" ] && batch_end=$ITERS
j=$i
while [ "$j" -lt "$batch_end" ]; do
( s=$(timestamp_ms); create_one "$j" >/dev/null 2>&1; e=$(timestamp_ms); printf "%d\t%d\n" "$j" "$((e - s))" ) &
j=$(( j + 1 ))
done
wait
i=$batch_end
else
s=$(timestamp_ms); create_one "$i"; e=$(timestamp_ms)
printf "%d\t%d\n" "$i" "$((e - s))"
i=$(( i + 1 ))
fi
done
benchmarks/rigs/jail-vnet-pf-rss.sh 45 lines idle-RSS rig
#!/bin/sh
# Start 32 VNET jails, wait 1s, sum RSS of their init processes.
set -eu
. "$(dirname "$0")/common.sh"
TEMPLATE=${TEMPLATE:-/jails/_template}
N=32
i=0
while [ $i -lt $N ]; do
id="benchvp-rss-$$-$i"; path="/jails/$id"
cp -R "$TEMPLATE" "$path"
epair=$(ifconfig epair create)
epb=$(echo "$epair" | sed 's/a$/b/')
jail -c name="$id" path="$path" host.hostname="$id" vnet persist \
vnet.interface="$epb" \
exec.prestart="ifconfig $epb inet 10.88.$(( $i % 250 )).2/24 up" \
exec.start="/usr/sbin/daemon -f /bin/sleep 60" >/dev/null
i=$((i + 1))
done
sleep 1
i=0
while [ $i -lt $N ]; do
id="benchvp-rss-$$-$i"
jid=$(jls -j "$id" jid 2>/dev/null || echo "")
if [ -n "$jid" ]; then
rss=$(ps -J "$jid" -o rss= | awk '{s+=$1} END {print s}')
printf "%d\t%s\n" "$i" "$rss"
fi
i=$((i + 1))
done
# teardown
i=0
while [ $i -lt $N ]; do
id="benchvp-rss-$$-$i"; path="/jails/$id"
jail -r "$id" >/dev/null 2>&1 || true
# destroy epair interfaces (they were moved to jails)
ifconfig "epair${i}a" destroy 2>/dev/null || true
rm -rf "$path" 2>/dev/null || true
i=$((i + 1))
done
Jail — ZFS clone jail-zfs-clone
Jail with per-instance rootfs via ZFS clone of the template snapshot.
▸ reproduce · mise run bench:jail-zfs-clone
benchmarks/rigs/jail-zfs-clone.sh 47 lines cold-start rig
#!/bin/sh
# jail-zfs-clone.sh — per-iteration rootfs via ZFS clone of the template
# snapshot. No VNET, no pf.
# Usage: jail-zfs-clone.sh <concurrency> <total-iterations>
#
# Requires: zroot/jails/_template@base snapshot.
set -eu
. "$(dirname "$0")/common.sh"
CONC=${1:-1}
ITERS=${2:-200}
POOL=${POOL:-zroot/jails}
SNAP=${SNAP:-${POOL}/_template@base}
zfs list "$SNAP" >/dev/null 2>&1 || { echo "missing snapshot $SNAP" >&2; exit 2; }
create_one() {
id="benchzfs-$$-$1"
path="/jails/$id"
zfs clone "$SNAP" "$POOL/$id"
jail -c name="$id" path="$path" host.hostname="$id" ip4=inherit persist \
exec.start="/bin/echo ready" \
>/dev/null
jail -r "$id" >/dev/null 2>&1 || true
zfs destroy "$POOL/$id" 2>/dev/null || true
}
i=0
while [ $i -lt "$ITERS" ]; do
if [ "$CONC" -gt 1 ]; then
batch_end=$(( i + CONC ))
[ "$batch_end" -gt "$ITERS" ] && batch_end=$ITERS
j=$i
while [ "$j" -lt "$batch_end" ]; do
( s=$(timestamp_ms); create_one "$j" >/dev/null 2>&1; e=$(timestamp_ms); printf "%d\t%d\n" "$j" "$((e - s))" ) &
j=$(( j + 1 ))
done
wait
i=$batch_end
else
s=$(timestamp_ms); create_one "$i"; e=$(timestamp_ms)
printf "%d\t%d\n" "$i" "$((e - s))"
i=$(( i + 1 ))
fi
done
benchmarks/rigs/jail-zfs-clone-rss.sh 38 lines idle-RSS rig
#!/bin/sh
# Start 32 ZFS-clone jails, wait 1s, sum RSS of their init processes.
set -eu
. "$(dirname "$0")/common.sh"
POOL=zroot/jails; SNAP=${POOL}/_template@base
N=32
i=0
while [ $i -lt $N ]; do
id="benchzfs-rss-$$-$i"; path="/jails/$id"
zfs clone "$SNAP" "$POOL/$id"
jail -c name="$id" path="$path" host.hostname="$id" ip4=inherit persist \
exec.start="/usr/sbin/daemon -f /bin/sleep 60" >/dev/null
i=$((i + 1))
done
sleep 1
i=0
while [ $i -lt $N ]; do
id="benchzfs-rss-$$-$i"
jid=$(jls -j "$id" jid 2>/dev/null || echo "")
if [ -n "$jid" ]; then
rss=$(ps -J "$jid" -o rss= | awk '{s+=$1} END {print s}')
printf "%d\t%s\n" "$i" "$rss"
fi
i=$((i + 1))
done
# teardown
i=0
while [ $i -lt $N ]; do
id="benchzfs-rss-$$-$i"
jail -r "$id" >/dev/null 2>&1 || true
zfs destroy "$POOL/$id" 2>/dev/null || true
i=$((i + 1))
done
Jail — VNET + pf + ZFS clone jail-vnet-zfs-clone
The fair VNET + pf comparison: ZFS-clone rootfs + per-jail VNET stack + active pf egress filter.
▸ reproduce · mise run bench:jail-vnet-zfs-clone
benchmarks/rigs/jail-vnet-zfs-clone.sh 56 lines cold-start rig
#!/bin/sh
# jail-vnet-zfs-clone.sh — jail with VNET (epair) + active pf.bench.conf,
# but per-iteration rootfs via ZFS clone of the template snapshot. This
# is the fair comparison for "network-isolated jail with dynamic egress
# policy", since jail-vnet-pf uses cp -R and is dominated by rootfs cost.
# Usage: jail-vnet-zfs-clone.sh <concurrency> <total-iterations>
#
# Requires: if_epair loaded; zroot/jails/_template@base snapshot;
# pf active with safe ruleset (`mise run bench:setup-pf`).
set -eu
. "$(dirname "$0")/common.sh"
CONC=${1:-1}
ITERS=${2:-200}
POOL=${POOL:-zroot/jails}
SNAP=${SNAP:-${POOL}/_template@base}
zfs list "$SNAP" >/dev/null 2>&1 || { echo "missing snapshot $SNAP" >&2; exit 2; }
create_one() {
id="benchvpz-$$-$1"
path="/jails/$id"
zfs clone "$SNAP" "$POOL/$id"
epair_a=$(ifconfig epair create)
epair_b=$(echo "$epair_a" | sed 's/a$/b/')
oct=$(( $1 % 250 + 2 ))
jail -c name="$id" path="$path" host.hostname="$id" vnet persist \
vnet.interface="$epair_b" \
exec.prestart="ifconfig $epair_b inet 10.88.$oct.2/24 up" \
exec.start="/bin/echo ready" \
>/dev/null
jail -r "$id" >/dev/null 2>&1 || true
ifconfig "$epair_a" destroy 2>/dev/null || true
zfs destroy "$POOL/$id" 2>/dev/null || true
}
i=0
while [ $i -lt "$ITERS" ]; do
if [ "$CONC" -gt 1 ]; then
batch_end=$(( i + CONC ))
[ "$batch_end" -gt "$ITERS" ] && batch_end=$ITERS
j=$i
while [ "$j" -lt "$batch_end" ]; do
( s=$(timestamp_ms); create_one "$j" >/dev/null 2>&1; e=$(timestamp_ms); printf "%d\t%d\n" "$j" "$((e - s))" ) &
j=$(( j + 1 ))
done
wait
i=$batch_end
else
s=$(timestamp_ms); create_one "$i"; e=$(timestamp_ms)
printf "%d\t%d\n" "$i" "$((e - s))"
i=$(( i + 1 ))
fi
done
benchmarks/rigs/jail-vnet-zfs-clone-rss.sh 46 lines idle-RSS rig
#!/bin/sh
# Start 32 VNET+ZFS-clone jails, wait 1s, sum RSS of their init processes.
set -eu
. "$(dirname "$0")/common.sh"
POOL=${POOL:-zroot/jails}
SNAP=${SNAP:-${POOL}/_template@base}
N=32
i=0
while [ $i -lt $N ]; do
id="benchvpz-rss-$$-$i"; path="/jails/$id"
zfs clone "$SNAP" "$POOL/$id"
epair_a=$(ifconfig epair create)
epair_b=$(echo "$epair_a" | sed 's/a$/b/')
jail -c name="$id" path="$path" host.hostname="$id" vnet persist \
vnet.interface="$epair_b" \
exec.prestart="ifconfig $epair_b inet 10.88.$(( i + 2 )).2/24 up" \
exec.start="/usr/sbin/daemon -f /bin/sleep 60" >/dev/null
i=$((i + 1))
done
sleep 1
i=0
while [ $i -lt $N ]; do
id="benchvpz-rss-$$-$i"
jid=$(jls -j "$id" jid 2>/dev/null || echo "")
if [ -n "$jid" ]; then
rss=$(ps -J "$jid" -o rss= | awk '{s+=$1} END {print s}')
printf "%d\t%s\n" "$i" "$rss"
fi
i=$((i + 1))
done
# teardown
for d in /jails/benchvpz-rss-$$-*; do
[ -d "$d" ] || continue
name=$(basename "$d")
jail -r "$name" 2>/dev/null || true
zfs list "$POOL/$name" >/dev/null 2>&1 && zfs destroy -f "$POOL/$name" 2>/dev/null || rm -rf "$d" 2>/dev/null
done
for ep in $(ifconfig -l | tr ' ' '\n' | grep '^epair'); do
ifconfig "$ep" destroy 2>/dev/null || true
done bhyve configurations
All bhyve configs share a FreeBSD 15 VM image fetched once:
ssh honor 'mkdir -p /tmp/bhyve-images
cd /tmp/bhyve-images && fetch https://download.freebsd.org/releases/VM-IMAGES/15.0-RELEASE/amd64/Latest/FreeBSD-15.0-RELEASE-amd64-ufs.raw.xz && xz -d FreeBSD-15.0-RELEASE-amd64-ufs.raw.xz'
Durable bhyve configs (bhyve-durable-pool
and bhyve-durable-prewarm-pool) additionally require a
host kernel compiled with options BHYVE_SNAPSHOT, plus a
bhyve + bhyvectl userspace built with WITH_BHYVE_SNAPSHOT=YES.
The option is not in GENERIC on FreeBSD 15.0-RELEASE, so we built it from
source. The full kernel reproduction:
# 1. source (if not already installed)
sudo fetch -o /tmp/src.txz https://download.freebsd.org/releases/amd64/15.0-RELEASE/src.txz
sudo tar -C /usr/src -xf /tmp/src.txz # roughly 250 MB
# 2. author SNAPSHOT config (GENERIC + options BHYVE_SNAPSHOT)
sudo tee /usr/src/sys/amd64/conf/SNAPSHOT > /dev/null <<EOF
include GENERIC
ident SNAPSHOT
options BHYVE_SNAPSHOT
EOF
# 3. kernel build — ~5 min on a Ryzen 9 5900HX with -j16
sudo make -C /usr/src -j16 buildkernel KERNCONF=SNAPSHOT
# 4. SAFETY NET: create a ZFS boot environment snapshot before swapping kernels
sudo bectl create pre-snapshot-kernel-$(date +%Y-%m-%d)
# 5. install kernel (current kernel moves to /boot/kernel.old/)
sudo make -C /usr/src DESTDIR=/ installkernel KERNCONF=SNAPSHOT
# 6. reboot
sudo shutdown -r now
# 7. post-reboot: rebuild bhyve + bhyvectl userspace with the option
sudo make -C /usr/src/usr.sbin/bhyvectl WITH_BHYVE_SNAPSHOT=YES MK_BHYVE_SNAPSHOT=yes all install
sudo make -C /usr/src/usr.sbin/bhyve WITH_BHYVE_SNAPSHOT=YES MK_BHYVE_SNAPSHOT=yes all install
# 8. verify
bhyvectl 2>&1 | grep -E '\-\-suspend|\-\-checkpoint' # should now appear
sysctl kern.ident # SNAPSHOT
The bectl create line is the recovery safety net for the
kernel swap. If the new kernel fails to boot, press 8 at
the loader menu (Boot Environments), pick the pre-kernel BE, and press
Enter. The SNAPSHOT swap booted clean in our case, so we never
invoked the rollback. (The earlier pf lockout described below was
recovered via physical console + pfctl -d, before the
BE-safety-net pattern landed in this workflow.)
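If several dated safety-net BEs accumulate, picking the newest one to activate can be scripted. A sketch, assuming the tab-separated `bectl list -H` output format and the pre-snapshot-kernel-YYYY-MM-DD naming convention from step 4; pick_newest_pre_snapshot_be is our helper name, not a bectl feature:

```shell
# pick_newest_pre_snapshot_be: read `bectl list -H` output (tab-separated,
# BE name in column 1) on stdin and print the newest boot environment named
# pre-snapshot-kernel-YYYY-MM-DD. ISO dates sort correctly as plain strings.
pick_newest_pre_snapshot_be() {
    awk -F'\t' '$1 ~ /^pre-snapshot-kernel-/ { print $1 }' | sort | tail -n 1
}
```

From a rescue shell this pairs with bectl activate: run `bectl list -H | pick_newest_pre_snapshot_be`, then `bectl activate` the printed name and reboot.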
Legacy caveat — the following notes from an earlier pass still apply
to the rigs that remain stubs: for bhyve-full and
bhyve-prewarm-pool, a FreeBSD VM image must be fetched from
freebsd.org; for bhyve-minimal, a custom MINIMAL kernel needs to be
built from /usr/src. The mise tasks exist as placeholders; see
.mise.toml.
- bhyve — full guest (bhyve-full): full FreeBSD guest booted per iteration (GENERIC kernel, baseline). mise run bench:bhyve-full (returns exit 1 with TODO message)
- bhyve — minimal (bhyve-minimal): stripped MINIMAL kernel + tiny initramfs — the apples-to-apples microVM config. mise run bench:bhyve-minimal (returns exit 1 with TODO message)
- bhyve — pre-warm pool (bhyve-prewarm-pool): pre-booted + SIGSTOP-paused VMs; "create" == SIGCONT. Proxy for Cloud Hypervisor snapshot-restore. mise run bench:bhyve-prewarm-pool (returns exit 1 with TODO message)
- bhyve — durable pool (bhyve-durable-pool): resume from an on-disk bhyvectl --suspend checkpoint (requires the SNAPSHOT kernel). Survives reboot; the real analog to Cube durable snapshots. mise run bench:bhyve-durable-pool (returns exit 1 with TODO message)
- bhyve — durable + prewarm (bhyve-durable-prewarm-pool): two-tier pool — on-disk checkpoints as cold tier, N prewarmed-and-SIGSTOP'd VMs as hot tier. The actual Cube analog. mise run bench:bhyve-durable-prewarm-pool (returns exit 1 with TODO message)
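The pre-warm pool's park/unpark mechanic needs nothing bhyve-specific to demonstrate. A minimal sketch with /bin/sleep standing in for a booted bhyve process (the real rig would SIGSTOP the bhyve PID after guest boot; none of this is committed code):

```shell
# Park a pool member with SIGSTOP, then "create" it with SIGCONT.
sleep 60 &
pid=$!
kill -STOP "$pid"          # park: the process stops consuming CPU
sleep 0.2                  # let the state change land
parked=$(ps -o state= -p "$pid")
kill -CONT "$pid"          # "create": resume is near-instant
sleep 0.2
running=$(ps -o state= -p "$pid")
kill "$pid" 2>/dev/null    # teardown
echo "parked=$parked running=$running"
```

The interesting number for the rig is the SIGCONT-to-exec-ready gap, which is why this config is the proxy for snapshot-restore style "creation".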
pf lockout safety
On first run, the jail-vnet-pf setup phase locked
honor out of SSH for ~35 minutes at
2026-04-21T22:54. The ruleset was block out all / pass
out on lo0 all with no explicit SSH pass rules; applying it with
pfctl -e dropped the outbound half of the active SSH session
and the host went dark until it was physically recovered.
The fix is layered. The committed ruleset at benchmarks/rigs/pf.bench.conf has:

- set skip on lo0 — pf never filters loopback, period.
- pass in quick proto tcp to port 22 keep state and pass out quick proto tcp from port 22 keep state — management SSH is permitted in both directions regardless of any later block rule.
- pass quick proto icmp — ICMP stays open for quick sanity probes.
- block out all — default-deny egress, which is the actual experimental rule we're measuring the cost of.
But even a correct ruleset has a bootstrap problem: if you typo the
next version of it, you still lock yourself out. (pfctl -nf
parse-checks a ruleset without loading it, but that only catches
syntax errors, not semantically wrong rules.) The committed
setup-pf.sh wraps pfctl -f + pfctl -e in a daemon(8)-spawned dead-man
switch. If the script doesn't reach its final kill line
within DMS_TIMEOUT seconds (default 60), pfctl -d
fires from the dead-man child and pf disables itself. The mise task
bench:setup-pf drives this end-to-end and performs an
independent SSH-reachability check from the dev machine post-apply;
if that check fails, the dead-man handles the cleanup.
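The dead-man shape generalizes beyond pf. A stripped-down sketch of the same pattern, with a touch-file standing in for pfctl -d so it can run anywhere (the pidfile and daemon(8) details of the committed setup-pf.sh are elided here):

```shell
# Dead-man switch: the revert action fires after TIMEOUT seconds unless
# the main path reaches its cancel line first.
TIMEOUT=2
FIRED=/tmp/deadman-fired.$$
rm -f "$FIRED"
( sleep "$TIMEOUT"; touch "$FIRED" ) &   # stand-in for `pfctl -d`
dms=$!
# ... risky operation goes here (rule reload, kernel tweak, ...) ...
kill "$dms" 2>/dev/null                  # reached the end: cancel the revert
wait "$dms" 2>/dev/null || true
[ -f "$FIRED" ] && echo "deadman fired" || echo "deadman cancelled"
```

Killing the subshell is enough: the orphaned sleep keeps running harmlessly, but the shell that would have executed the revert is gone, which is exactly how cancelling the setup-pf.sh dead-man works.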
benchmarks/rigs/pf.bench.conf (31 lines, pf ruleset):

# pf.bench.conf — ruleset applied on honor for the jail-vnet-pf rig.
#
# IMPORTANT: designed to NEVER lock out the SSH control plane. The
# earlier version of this ruleset (`block out all` with no management
# pass rules) wedged honor at 2026-04-21T22:54 because the outbound
# half of the active SSH session got dropped. Do not reintroduce that
# shape. Both `pass in` and `pass out` for TCP/22 are explicit here,
# and `set skip on lo0` guarantees loopback is never filtered.
#
# Behavior: default-deny egress on the external NIC, except the SSH
# management traffic and ICMP. This matches the posture CubeVS
# enforces per-sandbox — block all except explicitly allowed.
# Loopback — never filter.
set skip on lo0
# SSH management plane, both directions. Keep-state required so once a
# session is established, reply traffic is authorized from the state
# entry rather than needing its own rule.
pass in quick proto tcp to port 22 keep state
pass out quick proto tcp from port 22 keep state
# ICMP — lets ping-based health checks keep working.
pass quick proto icmp
pass quick proto icmp6
# Default egress: block everything else. This is the part we are
# actually measuring — the cost of per-packet pf filtering on jail
# egress traffic.
block out all

benchmarks/rigs/setup-pf.sh (52 lines, dead-man-guarded applier):

#!/bin/sh
# setup-pf.sh — apply pf.bench.conf with a dead-man's-switch that auto-
# reverts if SSH isn't confirmed alive within DMS_TIMEOUT seconds.
#
# Usage (as root, typically via `mise run bench:setup-pf`):
# sh setup-pf.sh /path/to/pf.bench.conf
#
# Safety model:
# 1. Spawn a detached child via daemon(8) that sleeps DMS_TIMEOUT
# seconds and then calls `pfctl -d`. Its PID is written to a
# pidfile so we can cancel it.
# 2. Apply the ruleset with pfctl -f and enable pf.
# 3. If we reach the end of the script without the shell dying, cancel
# the dead-man.
#
# If SSH goes away during (2) — e.g. because the rules were wrong — the
# dead-man fires, pf is disabled, and the host becomes reachable again.
set -eu
RULES=${1:?usage: setup-pf.sh /path/to/pf.bench.conf}
DMS_TIMEOUT=${DMS_TIMEOUT:-60}
DMS_PIDFILE=/tmp/bench-pf-deadman.pid
if [ ! -f "$RULES" ]; then
echo "setup-pf: rules file not found: $RULES" >&2
exit 2
fi
# Ensure pf kernel module is loaded.
kldload pf 2>/dev/null || true
# Fire the dead-man via daemon(8) so it survives our shell exiting.
# daemon -f detaches, -p writes the child's pid.
daemon -f -p "$DMS_PIDFILE" /bin/sh -c "sleep $DMS_TIMEOUT; pfctl -d; logger 'bench-safe: pf disabled by deadman after ${DMS_TIMEOUT}s'"
sleep 0.2 # let daemon write the pidfile
# Apply ruleset and enable pf. `pfctl -f` preserves the existing state
# table, so an active SSH session is kept alive across reload.
pfctl -f "$RULES"
pfctl -e 2>/dev/null || true
pfctl -s info | head -1
# Cancel the dead-man — we're alive and rules applied cleanly.
if [ -f "$DMS_PIDFILE" ]; then
DMS_PID=$(cat "$DMS_PIDFILE")
kill -TERM "$DMS_PID" 2>/dev/null || true
rm -f "$DMS_PIDFILE"
fi
echo "setup-pf: bench ruleset active; deadman cancelled"

Known caveats
- Host contention. honor is not dedicated to these benchmarks; other workloads on the host contribute variance. Re-run for statistical confidence; don't trust single passes.
- Page-cache warmup. First iterations of each rig touch cold disk. We intentionally don't discard the first N samples — all samples are in the raw TSV and the summary percentiles reflect them. If you want a warm-only number, summarize over a trimmed slice.
- "Sandbox delivery" definition. The clock starts when the rig invokes the VM/jail create and stops when exec "echo ready" returns successfully. Pool-warming, rootfs provisioning, and network setup are all inside that window.
- Concurrency scaling. run_concurrent spawns N background shells and waits. The per-iteration elapsed time is measured end-to-end in each shell, so at high concurrency you are measuring best-effort parallel throughput, not serial latency.
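A warm-only summary can be computed from the raw TSV without re-running anything. A sketch, assuming a two-column TSV of iteration index and elapsed time (the column layout is illustrative; adjust to the rig's actual schema), trimming the first N warmup samples and printing nearest-rank percentiles. trimmed_pcts is our helper name, not part of the committed rigs:

```shell
# trimmed_pcts N — read "<iter>\t<elapsed>" TSV on stdin, drop the first
# N warmup samples, then print nearest-rank p50/p95/p99 of the elapsed column.
trimmed_pcts() {
    trim=${1:-5}
    tail -n +"$((trim + 1))" | cut -f2 | sort -n | awk '
        { v[NR] = $1 }
        # nearest-rank index: ceil(q * N), clamped to at least 1
        function idx(q,  i) { i = int(q * NR); if (i < q * NR) i++; if (i < 1) i = 1; return i }
        END { printf "p50=%s p95=%s p99=%s\n", v[idx(0.50)], v[idx(0.95)], v[idx(0.99)] }'
}
```

Usage against a results file: `trimmed_pcts 5 < benchmarks/results/some-run.tsv`.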