Trace export (OpenTelemetry)

The Coppice gateway already speaks tracing — the de-facto Rust crate for structured span + event logging. Making those spans visible to Jaeger / Tempo / Grafana is a matter of bolting one exporter layer onto the existing subscriber and sprinkling #[tracing::instrument] on the handler entry points. No rewrite, no separate metrics pipeline, no per-span allocation tax when the exporter is off.

The attachment

In e2b-compat/src/main.rs, the subscriber composition used to be one call to fmt().with_env_filter(…).init(). It is now:

let tracer = init_otel_tracer()?;
match tracer {
    Some(t) => tracing_subscriber::registry()
        .with(env_filter)
        .with(fmt_layer)
        .with(tracing_opentelemetry::layer().with_tracer(t))
        .init(),
    None => tracing_subscriber::registry()
        .with(env_filter)
        .with(fmt_layer)
        .init(),
}

init_otel_tracer returns Some(tracer) iff OTEL_EXPORTER_OTLP_ENDPOINT is set in the environment. Unset, the function returns None on its first std::env::var call — no tonic client, no background batch task, no network I/O. Behaviour is byte-for-byte what it was before the B1 patch.

Set, it builds an OTLP/gRPC exporter pointed at the endpoint (http://collector:4317 in production, http://localhost:4317 against a local container for smoke-testing), attaches a resource with service.name=$OTEL_SERVICE_NAME (default e2b-compat), and installs the resulting Tracer as the global OTel provider so a graceful shutdown flushes pending batches before exit.

The span set

The handlers that produce a span:

| Span | Site | Attributes |
| --- | --- | --- |
| sandbox.create | routes::create_sandbox | template, cpu_count, memory_mb, sandbox_id (recorded after UUID gen) |
| sandbox.kill | routes::kill_sandbox | sandbox_id |
| sandbox.pause / sandbox.resume | routes::pause_sandbox, resume_sandbox | sandbox_id |
| sandbox.execute | envd::execute | code_len, language, sandbox_id |
| kernel.spawn | kernel::spawn_kernel | sandbox_id |
| backend.create | FreeBSDJailBackend::create / create_with_limits | sandbox_id, template, cpu_count, memory_mb, writable_layer_mb |
| backend.kill_internal | state::kill_sandbox_internal | sandbox_id |
| files.read / files.write / files.list / files.make_dir / files.rename / files.remove | files.rs handlers | path, byte count (for writes) |
| reaper.sweep | reaper::sweep (per 10-s tick) | scanned, reaped |

Spans nest naturally because tracing propagates the current span across .await points (instrumented futures re-enter their span on every poll). A POST /sandboxes produces one outer sandbox.create that parents one backend.create and one kernel.spawn; the collector renders them as a waterfall.

Request payloads (the POST body for /execute, the bytes going to /files) are deliberately not captured — high-cardinality, often sensitive, and not useful for latency debugging. We record their lengths instead. A separate COPPICE_TRACE_VERBOSE=1 flag could relax this later; we don’t expose one yet.

Enabling

OTEL_EXPORTER_OTLP_ENDPOINT=http://collector:4317 \
  OTEL_SERVICE_NAME=e2b-compat \
  ./target/release/e2b-compat

On the collector side, tools/otel/collector.yaml is a minimal config that opens the OTLP receiver and pipes spans to a debug exporter (stdout) — zero external deps, useful for smoke-testing that the pipeline is wired. Uncomment the otlp/jaeger exporter and point it at a jaegertracing/all-in-one container for a UI. See tools/otel/README.md for the copy-paste sequence.
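For orientation, a collector config of that shape might look like the following sketch (not the verbatim contents of tools/otel/collector.yaml):

```yaml
receivers:
  otlp:
    protocols:
      grpc: {}           # listens on :4317 by default

exporters:
  debug:                 # prints every received span to the collector's stdout
    verbosity: detailed

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [debug]
```

Swapping the debug exporter for an otlp exporter aimed at a jaegertracing/all-in-one container is what gets you the UI.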

Example transcript

Running benchmarks/rigs/otel-smoke.sh against a local collector with the debug exporter prints one block per exported span:

Span #0
    Trace ID       : bc9d1467866e5b7ace9442125eaffd49
    Parent ID      : (root)
    ID             : 6a3bcc3b20c11a0b
    Name           : sandbox.create
    Kind           : Internal
    Start time     : 2026-04-23 00:32:11.501 UTC
    End time       : 2026-04-23 00:32:11.512 UTC
    Attributes:
         -> service.name: Str(e2b-compat-smoke)
         -> sandbox_id: Str(785b4d23af564452a3b6c636f41af452)
         -> template: Str(python)
         -> cpu_count: Str(None)
         -> memory_mb: Str(None)

Span #1
    Trace ID       : bc9d1467866e5b7ace9442125eaffd49
    Parent ID      : 6a3bcc3b20c11a0b
    Name           : backend.create
    Attributes:
         -> sandbox_id: Str(785b4d23af564452a3b6c636f41af452)
         -> template: Str(python)

In Jaeger the same trace renders as a two-level waterfall: sandbox.create at the root, backend.create and kernel.spawn as children. The latency cost of each stage is immediate.

Out of scope (deliberately)

Metrics. Per-sandbox CPU / memory gauges are B2’s territory — they ship as a Prometheus text endpoint at /metrics and an rctl(8)-based sampler, not as OTLP metrics. We could fan out to OTLP metrics later but there’s no compelling reason; every host that runs a collector also runs Prometheus scraping.

Log aggregation. tracing events fire inside every span but we don’t forward them as OTLP logs. The gateway logs to stderr and rc.d/coppice_gateway pipes that to /var/log/coppice.log. A future tracing-opentelemetry log layer is a one-liner if the operator prefers Loki / Tempo.

Propagation from the SDK. The E2B Python/Node SDKs don’t emit W3C-TraceContext headers today. A trace starts at the gateway, not at the caller. Bridging requires an SDK patch that isn’t in scope for a compat-shim project.

Receipt

benchmarks/rigs/otel-smoke.sh starts a collector (docker/podman if present; falls back to scraping gateway stderr otherwise), drives one create + one kill, and asserts a sandbox.create span reaches the collector. Rig is FreeBSD-optional — on a dev host without ZFS, the sandbox create fails at the backend but the span still fires and exports, which is what the rig asserts.