Before talking numbers, a concession and a framing. Jails are not a FreeBSD equivalent of a microVM. They are a FreeBSD equivalent of namespaces. They share the host kernel. If that’s disqualifying for your threat model, go to bhyve. If you’re comparing against Docker for workloads you’d otherwise run on a shared-kernel system anyway, jails are the fair benchmark — and they are remarkably, almost embarrassingly fast.
What jails give you
- VNET — per-jail network stack with its own interfaces, routing table, pf ruleset, and socket buffers. The only real difference vs. Linux network namespaces is that it’s in-tree, was-there-first, and has fewer edge-case quirks.
- RCTL — per-jail resource accounting and limits (CPU, memory, disk I/O, processes). Not as rich as cgroups v2 on metrics, but the controls that matter are there.
- ZFS integration — per-jail rootfs via clones of a template snapshot. Create-cost is a metadata operation.
- Capsicum — not jails exactly, but FreeBSD's capability framework composes cleanly with jails to produce workloads that can't even express the syscalls they'd need to escape.
- pf — host-wide filter with jail-aware anchors, so per-jail egress policy sits naturally with the isolation model.
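The pieces above meet in jail.conf. A minimal sketch — the jail name, paths, and epair interface are illustrative placeholders, not the bench rig's actual config:

```
# /etc/jail.conf — illustrative sketch, not the bench rig's config
agent42 {
    # Rootfs: a ZFS clone of the template snapshot, mounted here before start
    path = "/jails/agent42";

    # VNET: the jail gets its own network stack; the host side of the
    # epair is where pf egress policy attaches
    vnet;
    vnet.interface = "epair42b";

    exec.start = "/bin/sh /etc/rc";
    exec.stop  = "/bin/sh /etc/rc.shutdown";
    mount.devfs;
}
```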
For a whole class of agent workloads — run LLM-generated Python in a captured-syscalls sandbox that can’t reach the internet except through a pf-policed egress allow-list, with a ZFS-cloned rootfs so each agent is isolated at the filesystem level — jails are the right tool and they are not a compromise. They’re just what FreeBSD has built for decades to do this.
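The pf side of that egress allow-list is small. A sketch of a per-jail anchor — the epair name and destination address are placeholders:

```
# /etc/pf.conf fragment — interface and address are placeholders
anchor "jail/agent42" on epair42a {
    # allow-listed API endpoint only
    pass out quick proto tcp to 198.51.100.10 port 443
    # default-deny everything else the jail tries to send
    block out quick all
}
```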
What jails don’t give you
- A separate kernel. A kernel-surface vulnerability (parsing bugs in filesystem drivers, CPU microarchitectural attacks like Spectre variants that require SMT co-residency, any privilege-escalation path from user-to-kernel) potentially reaches every jail on the host.
- Fake CPU topology per sandbox. You can't present 2 cores to one jail and 8 to another that both look like dedicated hardware to the guest — they're all running on the same host kernel, with visibility scoped by /proc-style facades (and FreeBSD has fewer of those by design).
- Per-jail kernel tuning. sysctls are host-global or jail-filtered; there's no per-jail sysctl(8) surface for anything the kernel actually uses.
- Syscall filtering in depth. Capsicum caps what a process can do; it doesn't filter which syscalls it attempts. The closest equivalent to Linux seccomp-bpf on FreeBSD is Capsicum + MAC policies, which is capability-based rather than filter-based — stronger in some ways, less ergonomic in others.
The shared-kernel framing is what’s load-bearing. For LLM-generated code under an adversarial threat model, the question is whether a Python runtime can exercise enough kernel surface to find an escape. The pragmatic answer in 2026 is usually “in theory yes, in practice no, but we don’t know what we don’t know.” For every threat model that demands better than that, microVMs are the answer.
The numbers
All measurements on honor (FreeBSD 15.0-RELEASE-p4, amd64).
Methodology at /appendix/bench-rig. "Cold start" = wall-clock from `jail -c` invocation to a canned `echo ready` returning inside the jail.
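The measurement loop is easy to sketch. This harness is a portable stand-in, not the /appendix/bench-rig code: it times an arbitrary command in wall-clock milliseconds, and on the bench host the command under test would be the `jail -c` invocation itself.

```shell
#!/bin/sh
# Sketch of the cold-start timer (stand-in, not the bench-rig harness).
# Assumes a date(1) that supports %N nanoseconds (GNU coreutils; stock
# FreeBSD date would need a different clock source).
measure_ms() {
    start=$(date +%s%N)
    "$@" >/dev/null 2>&1
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# On the bench host this would be, e.g.:
#   measure_ms jail -c name=t path=/jails/t command=/bin/sh -c 'echo ready'
# Portable stand-in so the harness itself can be exercised anywhere:
measure_ms sleep 0.2
```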
Single-sandbox cold start
Chart · Cold start (mean, concurrency=1)
▸ reproduce · `mise run bench:jail-raw` · `mise run bench:jail-vnet-pf` · `mise run bench:jail-zfs-clone` · `mise run bench:jail-vnet-zfs-clone` · methodology
The three jail configurations differ in where they spend time:
- `jail-raw` copies the template rootfs with `cp -R` on create; no VNET. This is the lower bound — the jail lifecycle itself, minimum rootfs provisioning, minimum network setup (`ip4=inherit`, no VNET stack instantiation).
- `jail-vnet-pf` creates an epair(4), adds it to the jail's VNET, and pf filters egress. The cost is one `ifconfig epair create`, one jail setup with VNET, and the kernel's per-jail network-stack init.
- `jail-zfs-clone` swaps the `cp -R` for a ZFS clone of a template snapshot. Should be faster than raw for large templates, less interesting for small ones.
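The clone trick itself is three commands. A sketch — dataset names are illustrative, not the bench rig's layout:

```
# One-time: template dataset + snapshot
zfs snapshot zroot/jails/template@base

# Per-jail create: a clone is a metadata-only operation,
# constant-cost regardless of template size
zfs clone zroot/jails/template@base zroot/jails/agent42

# Per-jail teardown
zfs destroy zroot/jails/agent42
```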
Tail latency under concurrency
Chart · Cold start percentiles (concurrency=50)
▸ reproduce · `mise run bench:jail-raw` · `mise run bench:jail-vnet-pf` · `mise run bench:jail-zfs-clone` · `mise run bench:jail-vnet-zfs-clone` · methodology
At concurrency 50, serialization sources become visible — the kernel’s per-jail setup locks, ZFS transaction group commits under concurrent clones, pf ruleset reloads if policy changes during the burst. The percentiles tell the story of where the tail lives.
Idle memory overhead
Chart · Idle RSS
▸ reproduce · `mise run bench:jail-raw` · `mise run bench:jail-vnet-pf` · `mise run bench:jail-zfs-clone` · `mise run bench:jail-vnet-zfs-clone` · methodology
Per-jail RSS at idle, measured across 32 concurrent jails with a simple `sleep 60` running in each. This is the closest apples-to-apples to Tencent's "<5 MB overhead per instance" for microVMs — except jails share the host kernel, so the base cost is lower still. There is no "guest kernel overhead" to amortize.
Reading the results
Honor: FreeBSD 15.0-RELEASE-p4, template size 374 MB, 30 samples at cc=1/10, 50 samples at cc=50, fresh bench agent state each run. Numbers rounded:
| config | cc=1 mean | cc=10 mean | cc=50 mean | cc=50 p95 | idle RSS |
|---|---|---|---|---|---|
| jail-raw | 1230 ms | 3790 ms | — | — | 2.0 MB |
| jail-vnet-pf (cp -R) | 2010 ms | 5520 ms | 24 750 ms | 30 435 ms | 2.0 MB |
| jail-zfs-clone | 122 ms | 216 ms | 302 ms | 361 ms | 2.0 MB |
| jail-vnet-zfs-clone | 345 ms | 3490 ms | 3460 ms | 4730 ms | 2.1 MB |
The spread is an order of magnitude between jail-raw and jail-zfs-clone: the rootfs strategy dominates everything else. A 374 MB `cp -R` burns ~1.1 seconds by itself; everything else on the jail-raw critical path is in the noise.
- `jail-zfs-clone` at cc=1 is about 2× slower than Tencent's advertised 60 ms CubeSandbox pool-hit. That's a very different measurement — we're doing a real per-jail filesystem clone + jail create; Cube is doing memory-snapshot resume from a pre-warmed pool — and the numbers are in the same ballpark. A FreeBSD analog to CubeSandbox's pool would be jail-zfs-clone's rootfs trick plus a pool of paused jails, which we haven't rigged for jails specifically; an even lower number is plausible. (The bhyve analog of this paused-pool pattern did get rigged — see /essays/freebsd-bhyve for the two-tier `bhyve-durable-prewarm-pool` at 17 ms.)
- `jail-zfs-clone` at cc=50 p95 is 361 ms against CubeSandbox's 90 ms. At this concurrency, ZFS's transaction-group commit cadence becomes visible; parallel clones serialize on the same txg barrier.
- `jail-vnet-pf` is not a useful apples-to-apples with the others, because it uses `cp -R` for the rootfs on top of the VNET setup. At cc=50 the mean is ~25 seconds — that's almost entirely 50-parallel-cp-of-374 MB disk thrash, not a network-setup cost.
- `jail-vnet-zfs-clone` is the fair VNET + pf number. At cc=1 it costs ~345 ms — ~223 ms of VNET + pf tax on top of jail-zfs-clone (epair create + VNET stack instantiation + pf ruleset evaluation). Under concurrency that tax blows up badly: cc=10 lands at ~3.5 seconds, cc=50 at ~3.5 seconds with p95 4.7 seconds. Something in the epair + VNET migration path serializes hard under concurrent creates — likely the kernel's per-jail netinet initialization sharing a lock. This is the most interesting number on the page: it says a jail-hosted CubeNet-equivalent can be cheap at idle concurrency but stalls under burst creation, which is exactly the pattern an agent sandbox service has to survive.
- Memory overhead: 2 MB per idle jail. Below Tencent's advertised <5 MB, but not for the same reason: jails share the host kernel, so there's no guest-kernel cost to amortize. Cube's 5 MB includes (a sliver of) guest kernel text/rodata; our 2 MB doesn't because there isn't one.
When jails are the right answer
- The workload is your own code behaving badly, not adversarial.
- Density matters more than isolation ceiling — you’re packing many similar sandboxes on one host.
- Operational simplicity matters — no guest to manage, no kernel builds, no snapshot tooling.
- Your threat model accepts that the kernel is shared, and you mitigate at the pf / Capsicum / MAC layer rather than the hypervisor layer.
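On that mitigation stack, RCTL carries the resource side. A sketch of rctl(8) rules — the jail name and every limit value here are placeholders, not recommendations:

```
# rctl.conf-style rules (also settable live via rctl -a); values are placeholders
jail:agent42:vmemoryuse:deny=1g       # hard cap on virtual memory
jail:agent42:maxproc:deny=64          # process-count ceiling
jail:agent42:pcpu:deny=200            # roughly two cores' worth of CPU
jail:agent42:readbps:throttle=50m     # disk read bandwidth throttle
```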
When they aren’t
- Untrusted code with a large adversarial budget.
- Compliance stances that require “dedicated guest kernel” as a literal bullet point.
- Workloads that need to tune kernel parameters per sandbox.
For those, the answer is /essays/freebsd-bhyve.