Live metrics

A deliberately small operational panel. e2b-compat exposes /metrics in Prometheus text format; tools/metrics-scraper.sh polls it and appends one JSON line per sample to data/metrics/YYYY-MM-DD.jsonl. This page reads whatever’s sitting in that directory at build time and plots it. No DB, no dashboard server — just JSONL and a build step.
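The scrape-and-append step can be sketched as follows. This is not the contents of tools/metrics-scraper.sh; it is a minimal illustration in Python that assumes samples carry no trailing timestamp and that a JSONL record looks like {"ts": ..., "metrics": {...}}. That record shape and both function names are hypothetical.

```python
import json


def parse_prometheus_text(text):
    """Parse Prometheus text exposition into {series: value}.

    Labeled series keep their label set verbatim in the key,
    e.g. 'name{sandbox="abc"}'.
    """
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '# HELP' / '# TYPE' comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumption: no trailing timestamp, so the value is the last token.
        key, _, raw = line.rpartition(" ")
        values[key.strip()] = float(raw)
    return values


def to_jsonl_line(values, ts):
    # Hypothetical record shape; the real scraper's keys may differ.
    return json.dumps({"ts": ts, "metrics": values}, sort_keys=True)


body = """\
# TYPE coppice_sandboxes_alive gauge
coppice_sandboxes_alive 3
# TYPE coppice_execute_requests_total counter
coppice_execute_requests_total 42
"""
print(to_jsonl_line(parse_prometheus_text(body), ts=1700000000.0))
```

Appending one such line per poll to the day's file is all the storage the page needs.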

Alive state

sandboxes + kernels currently in the registry
No samples under data/metrics/. Run mise run metrics:watch while the gateway is busy, then rebuild.

Sandbox lifecycle

create / kill / pause / resume — rate per second
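A per-second rate is derived from a counter by differencing consecutive samples. The sketch below shows one way to do that; it assumes each sample is a dict with a "ts" field (Unix seconds) and a "metrics" mapping, which is a hypothetical shape, and the reset handling is a common convention rather than a description of what this page's build step does.

```python
def counter_rates(samples, name):
    """Return [(ts, rate_per_second)] for consecutive samples of one counter."""
    rates = []
    for prev, cur in zip(samples, samples[1:]):
        dt = cur["ts"] - prev["ts"]
        if dt <= 0:
            continue  # out-of-order or duplicate sample; skip it
        delta = cur["metrics"][name] - prev["metrics"][name]
        if delta < 0:
            # Counter went backwards, most likely a process restart.
            # Treat the new value as the increase since the reset.
            delta = cur["metrics"][name]
        rates.append((cur["ts"], delta / dt))
    return rates


samples = [
    {"ts": 0, "metrics": {"coppice_sandboxes_created_total": 0}},
    {"ts": 30, "metrics": {"coppice_sandboxes_created_total": 60}},
    {"ts": 60, "metrics": {"coppice_sandboxes_created_total": 90}},
]
print(counter_rates(samples, "coppice_sandboxes_created_total"))
```

With a 30 s scrape interval, each plotted point is the average rate over the preceding 30 s, so short bursts are smoothed out.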

Execution path

/execute requests + errors — rate per second

Per-sandbox CPU

One line per live sandbox, each labeled with the first 12 hex chars of its id. Source is coppice_sandbox_cpu_percent from rctl(8); 100 = one full core. See per-sandbox metrics for the scraper/label shape.
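To make the labeling and units concrete, here is a rough sketch of turning exposition lines into chart series. The label name sandbox and the function name are assumptions for illustration; the actual label shape is whatever the per-sandbox metrics page documents.

```python
import re

# Assumed line shape: coppice_sandbox_cpu_percent{sandbox="<id>"} <value>
CPU_LINE = re.compile(r'^coppice_sandbox_cpu_percent\{sandbox="([^"]+)"\}\s+(\S+)$')


def cpu_series(text):
    """Map a 12-hex-char series label to CPU usage in cores."""
    series = {}
    for line in text.splitlines():
        match = CPU_LINE.match(line.strip())
        if not match:
            continue
        sandbox_id, percent = match.group(1), float(match.group(2))
        # Keep only hex digits (drops any hyphens), then take the first 12.
        label = re.sub(r"[^0-9a-fA-F]", "", sandbox_id)[:12]
        # 100 percent corresponds to one full core.
        series[label] = percent / 100.0
    return series


text = 'coppice_sandbox_cpu_percent{sandbox="0123456789abcdef0123"} 150\n'
print(cpu_series(text))
```

A sandbox reporting 150 is therefore using one and a half cores.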

coppice_sandbox_cpu_percent — per-sandbox

How to feed this page

# On your laptop, with honor reachable and the gateway running:
mise run metrics:scrape      # one-shot append to data/metrics/TODAY.jsonl
mise run metrics:watch       # 30 s loop; Ctrl-C to stop

# Run some load so the page has something to show:
mise run demo:notebook       # or run example/07-pool-fanout.sh

# Rebuild the site to see the new samples:
pnpm build

data/metrics/ is git-ignored by default. If you want samples to land on the deployed site, drop the data/metrics/ line from .gitignore and commit them; each sample is ~300 bytes.
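For a sense of what committing samples costs, a quick back-of-the-envelope calculation, assuming metrics:watch runs around the clock at its 30 s interval and every sample is about 300 bytes:

```python
SAMPLE_BYTES = 300   # approximate size of one JSONL sample
INTERVAL_S = 30      # metrics:watch loop period in seconds

samples_per_day = 24 * 60 * 60 // INTERVAL_S
bytes_per_day = samples_per_day * SAMPLE_BYTES

print(samples_per_day, "samples/day")
print(bytes_per_day / 1000, "kB/day")
```

That works out to 2880 samples and roughly 864 kB per day of continuous scraping; shorter sessions cost proportionally less.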

What’s tracked

| metric | type | source |
| --- | --- | --- |
| coppice_uptime_seconds | gauge | process clock |
| coppice_sandboxes_alive | gauge | in-memory registry size |
| coppice_kernels_alive | gauge | ipykernel PID table size |
| coppice_sandboxes_created_total | counter | POST /sandboxes on success |
| coppice_sandboxes_killed_total | counter | DELETE /sandboxes/:id on success |
| coppice_sandboxes_paused_total | counter | POST /sandboxes/:id/pause |
| coppice_sandboxes_resumed_total | counter | POST /sandboxes/:id/resume |
| coppice_execute_requests_total | counter | POST /execute, counted at the end |
| coppice_execute_errors_total | counter | /execute that emitted an ExecEnv::Error frame |
| coppice_kernel_spawns_total | counter | ipykernel_launcher spawn succeeded |
| coppice_kernel_exits_total | counter | SIGTERM or crash reaped |
| coppice_sandbox_create_ns_sum | counter | ns spent in create, pair with _created_total |
| coppice_execute_ns_sum | counter | ns spent in /execute, pair with _requests_total |

No histograms yet. If the running mean from ns_sum / requests_total stops being precise enough, we'll swap in the metrics crate, but the present shape is enough to show whether the system is doing work.
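The mean latency implied by a _ns_sum counter and its paired count can be computed two ways: over the process lifetime, or over the window between two samples. The sketch below shows both; the function names and the flat dict shape of a sample are hypothetical.

```python
def lifetime_mean_ms(ns_sum, count):
    """Mean duration in milliseconds since process start, or None if no events."""
    if count == 0:
        return None
    return ns_sum / count / 1e6


def windowed_mean_ms(prev, cur, sum_key, count_key):
    """Mean duration in milliseconds for events between two samples."""
    d_sum = cur[sum_key] - prev[sum_key]
    d_count = cur[count_key] - prev[count_key]
    if d_count <= 0 or d_sum < 0:
        return None  # no new events, or a counter reset between samples
    return d_sum / d_count / 1e6


prev = {"coppice_execute_ns_sum": 6_000_000, "coppice_execute_requests_total": 3}
cur = {"coppice_execute_ns_sum": 26_000_000, "coppice_execute_requests_total": 5}

print(lifetime_mean_ms(cur["coppice_execute_ns_sum"], cur["coppice_execute_requests_total"]))
print(windowed_mean_ms(prev, cur, "coppice_execute_ns_sum", "coppice_execute_requests_total"))
```

The windowed form reacts to recent slowdowns, whereas the lifetime mean flattens as the count grows. Neither says anything about tail latency, which is the gap histograms would fill.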