The REST surface is the skeleton. The nervous system — the thing that
makes sandbox.run_code("print('hi')") feel like a live
Jupyter cell — is a WebSocket upgrade that proxies a Jupyter-kernel
messaging protocol. Our /appendix/e2b-compat
page only names it. This page lays it out, because it’s the biggest
single piece of work in a real E2B port and the site under-counts it.
Where the clients actually connect
Two SDKs, two surfaces:
- e2b (generic sandbox) — REST for lifecycle, WebSocket(s) for the RPC surface (process API, filesystem API, terminal API). Bundles its own framing on top of WebSocket, structured as bidirectional RPC envelopes.
- e2b-code-interpreter — REST to create the sandbox, then a Jupyter kernel WebSocket for code execution, matching the Jupyter messaging protocol byte-for-byte. When the agent calls sandbox.run_code(code), under the hood it opens a WS to a path like /sandboxes/<id>/jupyter/api/kernels/<kernel_id>/channels and speaks the same protocol any JupyterLab tab speaks.
The e2b-code-interpreter path is the one most agent
developers consume — it’s what maps directly to ChatGPT’s “Code
Interpreter” UX.
What the Jupyter messaging protocol is
The canonical reference is jupyter-client.readthedocs.io/en/latest/messaging.html. Over a WebSocket, each frame is a JSON envelope:
{
"header": { "msg_id": "...", "msg_type": "execute_request",
"session": "...", "username": "...", "version": "5.3",
"date": "2026-04-22T...Z" },
"parent_header": {},
"metadata": {},
"content": { "code": "print('hi')", "silent": false, "store_history": true,
"user_expressions": {}, "allow_stdin": false,
"stop_on_error": true },
"buffers": []
}
The protocol is strongly asynchronous. Client sends one
execute_request; server emits a stream of reply messages
tagged with parent_header.msg_id referring back to the
request. The canonical sequence for print('hi'); 1+1:
- status — execution_state: busy
- execute_input — echoes the code so the client can display it
- stream — name: stdout, text: "hi\n"
- execute_result — the value of the last expression as a MIME bundle (e.g. {"text/plain": "2"}) — the client picks the richest renderer it supports
- execute_reply — status: ok, execution_count: N
- status — execution_state: idle
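The request/reply correlation can be sketched in a few lines of Python. This is illustrative, not the SDK's internals: execute_request builds an envelope like the one above, and replies_for keeps only the frames whose parent_header.msg_id points back at it.

```python
import json
import uuid
from datetime import datetime, timezone

def execute_request(code):
    """Build a minimal execute_request envelope (fields per the spec page)."""
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,
            "msg_type": "execute_request",
            "session": uuid.uuid4().hex,
            "username": "agent",
            "version": "5.3",
            "date": datetime.now(timezone.utc).isoformat(),
        },
        "parent_header": {},
        "metadata": {},
        "content": {"code": code, "silent": False, "store_history": True,
                    "user_expressions": {}, "allow_stdin": False,
                    "stop_on_error": True},
        "buffers": [],
    }

def replies_for(request, frames):
    """Keep only the reply frames tagged with our request's msg_id."""
    want = request["header"]["msg_id"]
    msgs = [json.loads(f) for f in frames]
    return [m for m in msgs if m.get("parent_header", {}).get("msg_id") == want]
```

Any client multiplexing several in-flight cells over one socket does exactly this filtering; nothing else in the protocol orders replies for you.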
For a cell that produces an image (matplotlib), a DataFrame (pandas), or a LaTeX expression, the stream adds:
- display_data — MIME bundle, e.g. {"image/png": "<base64>", "text/plain": "<Figure 1>"}
- error — traceback, e.g. {"ename": "TypeError", "evalue": "...", "traceback": ["Traceback (most recent call last):", "..."]}
The MIME bundle is where the richness lives. A competent client picks
image/png over text/plain, renders text/html for a DataFrame as a
real HTML table, and falls back to text/plain for everything else.
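A minimal renderer pick, under the assumption of a client that supports PNG, SVG, HTML, LaTeX, and plain text (the preference order is ours, not mandated by the protocol):

```python
# Preference order is an assumption of this sketch, not part of the spec.
MIME_PREFERENCE = ["image/png", "image/svg+xml", "text/html",
                   "text/latex", "text/plain"]

def pick_renderer(bundle):
    """Return (mime_type, data) for the richest representation we support,
    falling back to whatever the bundle happens to contain."""
    for mime in MIME_PREFERENCE:
        if mime in bundle:
            return mime, bundle[mime]
    return next(iter(bundle.items()), (None, None))
```

So a matplotlib cell that ships both image/png and text/plain renders as an image, and a pandas cell that ships text/html renders as a table, with text/plain as the universal fallback.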
Why it’s structured this way
Because it’s IPython. ipykernel is a thin framing layer
on top of a Python interpreter that exposes the REPL state over ZMQ
sockets; Jupyter’s server front-ends that ZMQ with WebSockets for
browser clients. e2b-code-interpreter is — at the
SDK/client level — a Jupyter-Kernel client. E2B runs ipykernel inside
the sandbox.
What a real FreeBSD backend has to do
Four pieces, in order of difficulty:
1. Run ipykernel in the jail
The sandbox template must include python3, ipython,
and ipykernel — plus whatever packages the agent workflow
needs (numpy / pandas / matplotlib for the common case). Concretely,
a FreeBSD jail template that’s a peer to E2B’s default Python
templates requires:
- FreeBSD 15 base
- lang/python311 (or 312) from ports/pkg
- devel/py-ipykernel, math/py-numpy, math/py-pandas, graphics/py-matplotlib, science/py-scikit-learn, …
- An rc.d/ipykernel service or an exec.start that launches jupyter-kernel bound to a known port (or stdio over a vsock equivalent)
Rough template size: 500 MB to 1 GB after trim. Still ZFS-cloneable in milliseconds so per-sandbox provisioning stays cheap.
2. Manage the kernel lifecycle
Starting and stopping Jupyter kernels. Kernels can die — agent code
might call os._exit(0), deadlock, or get OOM-killed. The
gateway has to:
- Spawn a kernel on sandbox create (or lazily on first run_code)
- Track the kernel’s PID / connection file
- Kill and restart on request (sandbox.restart_kernel())
- Surface kernel crashes back to the client as synthetic status: dead messages
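In Python, a lifecycle tracker with that contract might look like the following sketch. The gateway is Rust, so class and field names here are hypothetical, and DEMO_ARGV stands in for the real in-jail ipykernel launch command:

```python
import signal
import subprocess
import sys

# Stand-in for the real kernel launch (hypothetical; the actual command
# would jexec ipykernel inside the sandbox jail).
DEMO_ARGV = [sys.executable, "-c", "import time; time.sleep(30)"]

class KernelManager:
    """Hypothetical lifecycle tracker: one kernel process per sandbox id."""

    def __init__(self):
        self.kernels = {}  # sandbox_id -> subprocess.Popen

    def ensure(self, sandbox_id, argv=DEMO_ARGV):
        # Spawn lazily on first use ("lazily on first run_code").
        proc = self.kernels.get(sandbox_id)
        if proc is None or proc.poll() is not None:
            proc = subprocess.Popen(argv)
            self.kernels[sandbox_id] = proc
        return proc

    def kill(self, sandbox_id):
        proc = self.kernels.pop(sandbox_id, None)
        if proc is not None and proc.poll() is None:
            proc.send_signal(signal.SIGTERM)
            proc.wait(timeout=5)

    def restart(self, sandbox_id, argv=DEMO_ARGV):
        # sandbox.restart_kernel() maps to kill + respawn.
        self.kill(sandbox_id)
        return self.ensure(sandbox_id, argv)

    def status(self, sandbox_id):
        # Synthesize the "status: dead" message for crashed/killed kernels.
        proc = self.kernels.get(sandbox_id)
        alive = proc is not None and proc.poll() is None
        return {"msg_type": "status",
                "content": {"execution_state": "idle" if alive else "dead"}}
```

The important property is that status is derived from proc.poll() at read time, so a kernel that died between calls is reported dead without any watcher thread.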
3. Proxy Jupyter WebSocket ↔ ZMQ
The kernel itself speaks ZMQ over a connection.json file
with five sockets (shell, iopub, stdin, control, heartbeat). Jupyter
Server projects these onto a single WebSocket with a channel
field on each message. The gateway does the protocol translation.
Rust has decent ZMQ support via the zeromq crate. The code
is mechanical: for each sandbox.run_code(), open a
WebSocket, attach the five ZMQ sockets to the in-jail kernel, forward
messages bidirectionally with the channel-name translation layer.
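The channel-name translation itself is tiny. A sketch of the tagging in both directions, assuming envelopes already parsed to dicts (the channel names come from the connection file; the function names are ours):

```python
# The five kernel channels the connection file names. Jupyter Server
# multiplexes them onto one WebSocket by adding a "channel" field.
ZMQ_CHANNELS = ("shell", "iopub", "stdin", "control", "hb")

def zmq_to_ws(channel, envelope):
    """Tag an envelope coming off one ZMQ socket for the shared WebSocket."""
    assert channel in ZMQ_CHANNELS
    tagged = dict(envelope)
    tagged["channel"] = channel
    return tagged

def ws_to_zmq(message):
    """Route a WebSocket message back to the named ZMQ socket."""
    message = dict(message)
    channel = message.pop("channel")
    assert channel in ZMQ_CHANNELS
    return channel, message
```

Everything else the proxy does (HMAC signing, multipart framing) lives below this layer; the channel tag is the only thing the WebSocket side adds.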
4. Filesystem + process APIs
Separate WebSocket or REST endpoints (E2B uses both):
- sandbox.files.write(path, content) — stream-friendly bulk upload
- sandbox.files.read(path) — bulk download
- sandbox.files.watch(path) — inotify equivalent (kqueue(2) on FreeBSD)
- sandbox.commands.run("cmd") — spawn a non-kernel process, stream its stdio over WS
On FreeBSD, kqueue covers the filesystem-watch case.
commands.run is a jexec call with WS as
the transport.
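A sketch of the commands.run streaming shape, with a generic argv standing in for the jexec invocation and illustrative envelope field names (not E2B's wire format):

```python
import json
import subprocess
import sys

def run_streaming(argv):
    """Spawn a process and yield its stdout line-by-line as JSON envelopes,
    the shape a WebSocket transport could forward frame-by-frame.
    (On the real backend argv would be a jexec invocation; the envelope
    field names here are illustrative.)"""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, text=True)
    for line in proc.stdout:
        yield json.dumps({"stream": "stdout", "line": line})
    proc.wait()
    yield json.dumps({"exit": proc.returncode})

# Example: stream two lines from a trivial child process.
frames = list(run_streaming([sys.executable, "-c", "print('a'); print('b')"]))
```

The generator shape matters: stdout frames go out as they arrive, and the exit code is a final frame rather than an HTTP status, so the client sees partial output from long-running commands.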
What we actually observed running the SDK against our MVP
On 2026-04-22 we pip-installed e2b-code-interpreter on honor and pointed it at our e2b-compat server. The transcript:
[1] Sandbox.create(template="default", timeout=30)
OK: id=5dca34bc59fe436d816deac361577b8c
[2] run_code("print(1+1)")
FAILED ConnectError: [Errno 8] Name does not resolve
[3] kill
OK
Create + list + kill via the real SDK works. That’s a meaningful drop-in proof: the E2B Python Sandbox.create call reaches our axum server, our server creates a ZFS clone + jail, our response deserializes correctly, and the SDK constructs a Sandbox object it’s happy with.
run_code fails at DNS resolution because the SDK routes its envd traffic by sandbox-scoped hostname. From e2b.connection_config.ConnectionConfig.get_host:
def get_host(self, sandbox_id, sandbox_domain, port):
if self.debug:
return f"localhost:{port}"
return f"{port}-{sandbox_id}.{sandbox_domain}"
For e2b-code-interpreter, port = JUPYTER_PORT = 49999, so a run_code call tries to open an HTTP connection to e.g.
49999-5dca34bc....honor/execute. That’s the same host-header-based routing scheme CubeProxy (nginx+lua) implements in CubeSandbox. We don’t have that proxy, so the hostname doesn’t resolve.
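The Host-header parse the missing proxy needs is a one-line regex. A sketch, with the hostname grammar taken from get_host above (the function name and return shape are ours):

```python
import re

# Production routing scheme from get_host: <port>-<sandbox_id>.<domain>
HOST_RE = re.compile(
    r"^(?P<port>\d+)-(?P<sandbox_id>[0-9a-f]+)\.(?P<domain>.+?)(?::\d+)?$")

def route(host):
    """Extract (sandbox_id, port) from a Host header; None if no match."""
    m = HOST_RE.match(host)
    if m is None:
        return None
    return m.group("sandbox_id"), int(m.group("port"))
```

The proxy then dials the named sandbox's envd on the extracted port; an nginx or CubeProxy variant does the same parse in config rather than code.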
The gap is concrete and small. To make run_code actually work we need three pieces, in order:
1. envd (or an equivalent) running inside the jail, listening on 49999 for POST /execute + POST /contexts + their friends. Inside e2b-code-interpreter, run_code is a plain HTTP call against envd’s /execute endpoint with the code in the body and stdout/stderr streamed back. It is not a Jupyter-ZMQ-over-WebSocket call from the client side — E2B’s cloud hides that behind a REST façade. (This is a pleasant surprise; ZMQ multiplexing would be harder.)
2. A reverse proxy that routes <port>-<sandbox_id>.<domain> to the right jail’s envd on port <port>. This is ~30 lines of nginx config, or any Go/Rust HTTP router that parses the Host header. Cube does this in CubeProxy.
3. DNS setup. Either a wildcard *.sandboxes.local → proxy host, or we have agents configure E2B_DOMAIN to something the client can resolve and set E2B_DEBUG=1 to make the SDK use localhost:49999 directly (fine for single-box experiments).
What’s inside envd: Looking at the E2B open-source runtime (e2b-dev/infra’s envd/), it’s a Go binary that embeds an IPython-Jupyter kernel, exposes /execute / /contexts as streaming HTTP endpoints, and handles filesystem + process APIs on sibling ports. So the “run_code” path on the E2B side is: client → REST /execute on envd → envd internally drives an IPython kernel → streams text/event-stream results back. The client parses SSE frames for stdout, stderr, and the final result bundle.
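A minimal text/event-stream splitter shows the frame shape such a client consumes; the event names in the test are illustrative, not envd's exact vocabulary:

```python
def parse_sse(payload):
    """Split a text/event-stream payload into (event, data) pairs.
    Minimal sketch: ignores id:/retry: fields and comment lines."""
    events = []
    for block in payload.split("\n\n"):
        event, data_lines = "message", []
        for line in block.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data_lines.append(line[len("data:"):].strip())
        if data_lines:
            events.append((event, "\n".join(data_lines)))
    return events
```

Each blank-line-delimited block is one frame, so stdout can stream incrementally while the final result bundle arrives as its own frame at the end.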
What a FreeBSD port owes: A FreeBSD-flavored envd equivalent. In scope:
- Python 3 + IPython kernel in the jail template (packaging work, not code).
- A small HTTP server inside the jail that proxies POST /execute to the kernel (likely writable in ~300 lines of Rust or Go).
- A host-side reverse proxy keyed by Host-header subdomain.
None of this is hard; it’s just plumbing. The earlier “weeks of ZMQ multiplexer work” framing in the run_code section below assumed WebSocket + ZMQ, which is a deeper stack. The reality — envd exposes REST, client consumes SSE — is cheaper to port.
Update (2026-04-22): run_code works end-to-end
After the first pass of this appendix we built an envd-compat endpoint directly into e2b-compat — a second Axum listener on port 49999 that handles POST /execute, POST /contexts, etc., and streams the NDJSON protocol the SDK consumes. With E2B_DEBUG=true the SDK resolves its Jupyter URL to localhost:49999 and the pipe goes straight to our handler.
Transcript (2026-04-22, honor):
sandbox: 4dae9f7f532c4fc7b23bdff8500b5f47
[1] hello world (python)
stdout: ['hello, world\n']
[2] arithmetic
stdout: ['0\n', '1\n', '4\n', '9\n', '16\n']
[3] stderr
stdout: ['and some stdout\n']
stderr: ['an error line\n']
[4] NameError → SDK error
stderr: ['Traceback (most recent call last):\n',
' File "<string>", line 1, in <module>\n',
"NameError: name 'undefined_variable' is not defined\n"]
error: NonZeroExit: exit code 1
[5] shell
stdout: ['bash\n', 'DAEMON\n', 'FILESYSTEMS\n', 'LOGIN\n']
The first pass covers everything a stateless agent-code-interpreter demo needs: stdout streaming, stderr streaming, error surfacing with traceback, language switching. We installed Python 3.11 into the jail template (pkg -c /jails/_template install -y python311 + symlink /usr/local/bin/python3 → python3.11 + refresh the @base snapshot) so that ZFS clones of the template have Python already baked in.
Update (2026-04-22, later): persistent ipykernel, state across calls
The remaining gap in the first pass was that every run_code spawned a fresh jexec -l <jail> /usr/local/bin/python3 -u -c <code> subprocess. Variables, imports, open files — nothing survived across calls. That’s not a code-interpreter backend; it’s a one-shot.
Closing the gap was mostly plumbing, not protocol archaeology:
- Bake ipykernel into the template. pkg -r /jails/_template install -y py311-ipykernel py311-pyzmq py311-numpy py311-matplotlib py311-pandas + zfs snapshot zroot/jails/_template@base. All bindings clean out of the box on FreeBSD 15 — no pip-in-venv fallback needed. (pkg -c wants PROC_NO_NEW_PRIVS; we used pkg -r <rootdir> instead, which doesn’t require a chroot.)
- Spawn the kernel on sandbox create. jexec <jail> /usr/local/bin/python3 -m ipykernel_launcher -f /tmp/connection.json. Store its host-side PID in a HashMap<sandbox_id, KernelInfo> on AppState. On DELETE /sandboxes/:id we kill -TERM the PID before reaping the jail.
- Bridge to the kernel on /execute. An in-jail Python script (/usr/local/libexec/e2b-kernel-bridge.py) is spawned per request. It uses jupyter_client.BlockingKernelClient to load /tmp/connection.json, sends an execute_request on shell, and translates iopub messages into the envd NDJSON envelope format directly — stream → stdout/stderr, execute_result/display_data → result with MIME keys (text, html, png, svg, json, …), error → error with name/value/traceback.
Why a Python bridge instead of a Rust ZMQ client: on FreeBSD the zeromq crate + libzmq combo is passable but noisy to build; the Python jupyter_client library already does exactly the framing we need (HMAC signature over the wire, multipart ZMQ envelopes, msgpack-or-JSON payloads, …). The bridge is ~60 LoC. The gateway calls it with code on stdin and reads NDJSON on stdout — the same stream shape it already forwarded to the SDK.
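The translation at the heart of such a bridge can be sketched as a pure function over iopub message dicts. The envelope keys here are illustrative, not the exact envd wire format; the mapping follows the one described above:

```python
import json

def iopub_to_ndjson(msg):
    """Translate one iopub message into an NDJSON line for the gateway.
    Mapping (per the bridge description): stream -> stdout/stderr,
    execute_result/display_data -> result with MIME keys, error ->
    name/value/traceback. Key names are illustrative."""
    msg_type = msg["msg_type"]
    content = msg["content"]
    if msg_type == "stream":
        # content["name"] is "stdout" or "stderr".
        return json.dumps({content["name"]: content["text"]})
    if msg_type in ("execute_result", "display_data"):
        data = content["data"]
        result = {"text": data.get("text/plain")}
        if "text/html" in data:
            result["html"] = data["text/html"]
        if "image/png" in data:
            result["png"] = data["image/png"]
        return json.dumps({"result": result})
    if msg_type == "error":
        return json.dumps({"error": {"name": content["ename"],
                                     "value": content["evalue"],
                                     "traceback": content["traceback"]}})
    # Other iopub types (status, execute_input, …) are not forwarded
    # in this sketch.
    return None
```

Because each output is one JSON object per line, the gateway can forward the bridge's stdout verbatim without buffering a whole execution.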
Transcript (2026-04-22, honor, new kernel path):
sandbox: 5b71ba55c87e4826a366035c9db5d37d
[1] run_code("x = 42")
stdout: []
results: []
[2] run_code("print(x)") # same kernel — state persists
stdout: ['42\n']
[3] run_code("import numpy as np; np.array([1,2,3])")
results: [Result(text='array([1, 2, 3])', is_main_result=True)]
[4] run_code("import pandas as pd; \
from IPython.display import display; \
display(pd.DataFrame({'a':[1,2], 'b':[3,4]}))")
results: [Result(text=' a b\\n0 1 3\\n1 2 4',
html='<div>...<table class="dataframe">...')]
[5] run_code("matplotlib plot → display(Image(png))")
results: [Result(png='iVBORw0KGgoAAAANSUhEUgAA...' # PNG magic, 20 KB base64
is_main_result=False)]
[6] run_code("undefined_variable")
error: ExecutionError(
name='NameError',
value="name 'undefined_variable' is not defined",
traceback='...NameError: name \\'undefined_variable\\' is not defined')
All six cases above run against the same ipykernel inside the sandbox. The state-set/state-read pair in [1]/[2] is the cheapest possible proof of a long-lived kernel. [3] shows the MIME bundle coming through with text/plain. [4] adds text/html from display(pandas.DataFrame). [5] is the real payoff: a matplotlib figure arrives as image/png base64 and verifies (raw[:8] == b"\x89PNG\r\n\x1a\n"). [6] exercises the error envelope so the SDK can raise ExecutionError with a proper .name / .value / .traceback.
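The PNG verification from [5] is just a signature check on the decoded base64 value of the MIME bundle:

```python
import base64

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"

def is_png_b64(b64):
    """Decode a base64 MIME-bundle value and check the 8-byte PNG
    signature, the same raw[:8] check the transcript above performs."""
    raw = base64.b64decode(b64)
    return raw[:8] == PNG_MAGIC
```

A cheap but real proof: base64 that decodes to the PNG signature can only have come from an actual encoded image, not from a repr that happened to start with "iVBOR".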
The rig lives at benchmarks/rigs/jupyter-e2e.sh; it drives the official e2b-code-interpreter Python SDK against a local e2b-compat instance (see the file for env-var knobs). Seven checks, seven passes.
Minimal reproduction scripts for agent developers: examples/02-persistent-kernel.py and friends under examples/ — 30-line SDK-against-our-gateway demos covering hello, persistent state, rich output, error handling, network isolation, diagnose, and N-way fanout.
Where the remaining gaps are
With the current e2b-compat:
- State across run_code: no persistent kernel. Closed 2026-04-22. ipykernel is spawned inside the jail at sandbox create, an in-jail Python bridge translates iopub to NDJSON, and the /execute handler forwards the bridge’s stdout verbatim. State, imports, and open files persist across calls for the life of the sandbox.
- Rich result MIME bundles (matplotlib → PNG, pandas → HTML): needs an actual Jupyter kernel; comes for free with ipykernel. Closed 2026-04-22. Both paths verified end-to-end through the Python SDK.
- Per-sandbox routing under the production URL scheme (<port>-<id>.<domain>): the listener parses Host for the sandbox id, but we haven’t stood up wildcard DNS. Easy enough locally (/etc/hosts entries or dnsmasq), but not yet done.
- Filesystem API (sandbox.files.*): envd has /filesystem/* endpoints that we haven’t implemented. REST, portable, ~1 day of work.
- Commands API (sandbox.commands.*): streaming subprocess over HTTP. Also REST; partially overlaps with our /exec.
The shape of the port is now fully proven. Everything that remains is packaging or incremental feature work, not protocol archaeology.
Secondary hatches in e2b-compat
Alongside the /execute endpoint on :49999
there are two older, simpler shims that predate the envd-compat work
and that we kept for smoke testing:
- POST /sandboxes/:id/exec — one-shot jexec, not streaming, not stateful. The bring-up smoke tests use this.
- GET /sandboxes/:id/ws — a WebSocket that spawns /bin/sh in the jail and proxies stdin/stdout as plain text frames. The format is our own {"type": "stdout", "text": "…"}; it is not e2b-code-interpreter-compatible. Useful for debugging the plumbing; a deprecated path for real clients. See e2b-compat/src/ws.rs.
Production clients use the /execute endpoint described
in the Update section above. The E2B Python SDK round-trip passes
through that path exclusively.
Why the site’s port-sketch is still honest without this
Because the REST surface — the thing that makes
Sandbox.create → run_code → close routing work — is
what our site’s “drop-in” claim references. The REST surface
is portable; we’ve verified it. The WebSocket Jupyter protocol is the
next tier of work, not a fatal obstacle. It’s just a week of Rust and
a week of jail templating.