Add server registry and SSE execute endpoint #8592
Pull request overview
Adds an experimental, hidden marimo session CLI intended for programmatic interaction with running marimo servers by introducing a local session registry plus new server endpoints for session discovery and synchronous scratchpad execution.
Changes:
- Implement a PID-cleaned session registry in the XDG state dir and register/deregister entries via server lifespan.
- Add new HTTP endpoints for session enumeration (`/api/sessions`) and synchronous scratchpad execution (`/api/kernel/scratchpad/execute`).
- Extract scratchpad execution utilities for shared use by the MCP code server and the new endpoint; add unit/CLI tests.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| tests/_server/test_session_registry.py | Adds tests for session registry read/write, cleanup, and helpers. |
| tests/_server/test_scratchpad.py | Adds tests for scratchpad result extraction behavior. |
| tests/_cli/session/test_session_cli.py | Adds tests for marimo session list/exec CLI behavior. |
| marimo/_server/start.py | Registers the new session_registry lifespan. |
| marimo/_server/session_registry.py | Introduces session registry entry model + writer/reader and PID liveness checks. |
| marimo/_server/scratchpad.py | Adds shared scratchpad listener + result extraction utilities. |
| marimo/_server/api/lifespans.py | Writes/removes registry entry on server start/shutdown. |
| marimo/_server/api/endpoints/health.py | Adds /api/sessions endpoint for session discovery. |
| marimo/_server/api/endpoints/execution.py | Adds synchronous scratchpad execution endpoint. |
| marimo/_mcp/code_server/main.py | Switches MCP server to shared scratchpad utilities. |
| marimo/_cli/session/list_cmd.py | Implements marimo session list [--json]. |
| marimo/_cli/session/exec_cmd.py | Implements marimo session exec ... with discovery + HTTP calls. |
| marimo/_cli/session/__init__.py | Adds hidden session click group and subcommand wiring. |
| marimo/_cli/cli.py | Registers the hidden session command group in the main CLI. |
| _cli/exec_cmd.py | Adds an additional standalone exec command implementation (duplicate functionality). |
marimo/_mcp/code_server/main.py
Outdated

      # Per-session locks to prevent overlapping scratchpad executions
      session_locks: dict[str, asyncio.Lock] = {}
    - listener = _ScratchCellListener()
    + listener = ScratchCellListener()

The MCP server holds a single ScratchCellListener() instance and attaches it to whatever session is being executed. Because different sessions can execute concurrently (locks are per-session), the same listener can be attached to multiple sessions at once, which makes its waiter signaling ambiguous and can unblock the wrong execution. Instantiate a listener per execution (or per session) rather than sharing one global instance.

    listener = ScratchCellListener()
    session.attach_extension(listener)

    try:
        done = listener.wait_for(session_id)
        session.put_control_request(
            ExecuteScratchpadCommand(code=body.code),
            from_consumer_id=None,
        )
        await asyncio.wait_for(done.wait(), timeout=EXECUTION_TIMEOUT)
        # FIXME: stdout/stderr are flushed every 10ms by the buffered
        # writer thread. Wait 50ms so trailing console output arrives
        # before we read cell_notifications.
        await asyncio.sleep(0.05)
    except asyncio.TimeoutError:
        return JSONResponse(
            content={
                "success": False,
                "error": f"Execution timed out after {EXECUTION_TIMEOUT}s",
            }
        )
    finally:
        session.detach_extension(listener)

    result = extract_result(session)
    return JSONResponse(content=asdict(result))

/scratchpad/execute is a synchronous endpoint but does not use any per-session lock to prevent overlapping scratchpad executions. Two concurrent requests against the same session can both unblock on the first idle notification and return the wrong output (the MCP implementation uses a per-session asyncio.Lock to prevent this). Add a per-session execution lock (e.g., stored on the session or session manager) around put_control_request + wait + extract_result.
Suggested change:

    # Ensure scratchpad executions for a given session are serialized.
    lock: asyncio.Lock = getattr(session, "_scratchpad_exec_lock", None)
    if lock is None:
        lock = asyncio.Lock()
        setattr(session, "_scratchpad_exec_lock", lock)
    async with lock:
        listener = ScratchCellListener()
        session.attach_extension(listener)
        try:
            done = listener.wait_for(session_id)
            session.put_control_request(
                ExecuteScratchpadCommand(code=body.code),
                from_consumer_id=None,
            )
            await asyncio.wait_for(done.wait(), timeout=EXECUTION_TIMEOUT)
            # FIXME: stdout/stderr are flushed every 10ms by the buffered
            # writer thread. Wait 50ms so trailing console output arrives
            # before we read cell_notifications.
            await asyncio.sleep(0.05)
        except asyncio.TimeoutError:
            return JSONResponse(
                content={
                    "success": False,
                    "error": f"Execution timed out after {EXECUTION_TIMEOUT}s",
                }
            )
        finally:
            session.detach_extension(listener)
        result = extract_result(session)
        return JSONResponse(content=asdict(result))

The experimental session CLI was designed for AI agents, so `marimo agent` is a better home. The list command moves under a `sessions` subgroup (`marimo agent sessions list`) while exec stays at the top level (`marimo agent exec`).

This also addresses review feedback from PR #8592:

- `resp.json()` calls in the exec command now catch `JSONDecodeError` instead of crashing.
- `SessionRegistryWriter.register` uses `os.replace` instead of `os.rename` for Windows compatibility and closes the fd in a `finally` block to prevent leaks on write failure.
- `ScratchCellListener` now unsubscribes from the event bus on detach to prevent listener leaks.
- The MCP code server creates a fresh listener per execution instead of sharing one globally.
- The scratchpad execute endpoint returns 504 on timeout instead of 200.
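The `os.replace` plus `finally` pattern described in this commit can be sketched as follows. This is a minimal illustration, not the actual body of `SessionRegistryWriter.register`; the function name `write_entry_atomically` and the temp-file naming are assumptions made for the example.

```python
import json
import os
import tempfile
from pathlib import Path


def write_entry_atomically(path: Path, entry: dict) -> None:
    """Write `entry` as JSON to `path` without exposing a partial file.

    Hypothetical helper for illustration only.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp file next to the target so os.replace stays on one
    # filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        try:
            # The payload is small, so a single os.write call is assumed
            # to write it in full.
            os.write(fd, json.dumps(entry).encode("utf-8"))
        finally:
            # Close the descriptor even if the write raised.
            os.close(fd)
        # os.replace overwrites an existing destination on POSIX and
        # Windows; os.rename raises on Windows if the target exists.
        os.replace(tmp_name, path)
    except BaseException:
        # Best-effort removal of the temp file on failure.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

Writing to a sibling temp file first means a reader of the registry directory should only ever see a complete entry or no entry at all.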
Hidden CLI for programmatic access to running notebooks, aimed at MCP servers, AI agents, and editor extensions.

```sh
marimo session list [--json]
marimo session exec -c "print(df.head())"
marimo session exec --port 2718 --id <session-id> -c "code"
```

Servers register themselves on startup in the XDG state directory (`~/.local/state/marimo/sessions/`) with PID-based stale entry cleanup. The CLI reads this registry to discover targets, then talks to two new HTTP endpoints: `/api/sessions` and `/api/kernel/scratchpad/execute`.

The scratchpad execution logic was extracted from the MCP code server into `marimo/_server/scratchpad.py` so both MCP and the new endpoint share it.
The `/scratchpad/execute` HTTP endpoint had no concurrency guard, so two concurrent requests against the same session could both unblock on the first idle notification and return the wrong output. The MCP code server already solved this with a local `dict[str, asyncio.Lock]`, but that only protected MCP callers and leaked locks for dead sessions. This moves the lock onto the `Session` object itself as `scratchpad_lock`, so both the HTTP endpoint and MCP server share a single per-session `asyncio.Lock`. This eliminates the race and removes the need for MCP to maintain its own lock dictionary.
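To illustrate why a lock stored on the session serializes callers regardless of entry point, here is a small self-contained sketch. `FakeSession` and `execute` are hypothetical stand-ins, not marimo's `Session` class or endpoint code; only the `scratchpad_lock` attribute name comes from the commit message.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeSession:
    """Hypothetical stand-in for a session that carries its own lock."""

    scratchpad_lock: asyncio.Lock = field(default_factory=asyncio.Lock)
    log: list = field(default_factory=list)


async def execute(session: FakeSession, code: str) -> str:
    # Every caller (HTTP or MCP in the real system) takes the same lock,
    # so a second execution cannot start until the first has finished.
    async with session.scratchpad_lock:
        session.log.append(("start", code))
        await asyncio.sleep(0.01)  # stands in for waiting on the kernel
        session.log.append(("end", code))
        return code


async def main() -> list:
    session = FakeSession()  # created inside the running event loop
    await asyncio.gather(execute(session, "a"), execute(session, "b"))
    return session.log
```

Because the lock lives and dies with the session object, there is no separate dictionary to clean up when a session goes away.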
The session registry now includes `python_exe` so that external tools can discover which Python interpreter a running marimo server uses. This lets agents invoke `<python_exe> -m marimo` to ensure they hit the correct marimo installation without relying on PATH. The `marimo agent` CLI subcommand (exec, sessions list) has been removed. Session discovery and code execution are better handled by lightweight scripts that read the registry JSON files on disk and talk to the HTTP API directly, avoiding a hard dependency on having the right marimo binary on PATH in the first place.
The previous API required callers to manually pair `attach_extension`
and `detach_extension` in try/finally blocks. Every caller followed the
exact same pattern, and forgetting the finally would leak extensions. A
context manager eliminates this class of bug entirely.
```python
with session.scoped(listener):
# listener is attached for the duration of this block
...
```
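One plausible way to build such a context manager is with `contextlib.contextmanager`. The sketch below uses a hypothetical `MiniSession` class rather than marimo's real session type, and only the method names `attach_extension`, `detach_extension`, and `scoped` are taken from the text above.

```python
from contextlib import contextmanager
from typing import Iterator, List


class MiniSession:
    """Hypothetical, simplified session that tracks attached extensions."""

    def __init__(self) -> None:
        self.extensions: List[object] = []

    def attach_extension(self, ext: object) -> None:
        self.extensions.append(ext)

    def detach_extension(self, ext: object) -> None:
        self.extensions.remove(ext)

    @contextmanager
    def scoped(self, ext: object) -> Iterator[None]:
        # Attach on entry; the finally clause guarantees detach on exit,
        # including when the body raises.
        self.attach_extension(ext)
        try:
            yield
        finally:
            self.detach_extension(ext)
```

With this shape, a caller cannot forget the cleanup step, because the `with` statement owns it.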
The new `POST /scratchpad/execute` endpoint is an internal API not intended for public documentation. This adds `include_in_schema` support to `router.post()` (matching the existing pattern on `get()` and `delete()`) and uses it to exclude the endpoint from the generated OpenAPI spec.
Running marimo instances now write a small JSON file to `~/.local/state/marimo/servers/` at startup so external tools can discover them without importing marimo or hitting an HTTP endpoint. Registration is opt-in: only servers started with `--no-token` and `--no-skew-protection` are registered, since those are the ones that have explicitly relaxed local access. The registry entry is kept minimal (host, port, base_url, pid, version) with no secrets or session data, and is cleaned up on shutdown or via stale PID detection.

A new `/api/sessions` endpoint lists the active sessions within a server, and a `/api/kernel/scratchpad/execute` endpoint provides synchronous scratchpad execution over HTTP. Both are guarded behind the same no-auth/no-skew-protection requirement and excluded from the OpenAPI schema.

The scratchpad listener and result extraction logic, previously inlined in the MCP code server, is extracted to a shared `_server/scratchpad.py` module so both the MCP tool and the new HTTP endpoint can reuse it. Session extensions now use a `session.scoped()` context manager instead of manual attach/detach, and each session carries a `scratchpad_lock` to serialize concurrent executions.
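An external tool could discover servers along these lines. This is a hedged sketch: it assumes one `*.json` file per server in the registry directory and a `pid` field in each entry (the field name is from the description above, the file naming is an assumption), and the liveness probe is POSIX-only.

```python
import json
import os
from pathlib import Path
from typing import List


def pid_alive(pid: int) -> bool:
    """POSIX-only probe: signal 0 checks for existence without killing."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def discover_servers(registry_dir: Path) -> List[dict]:
    """Return registry entries whose recorded PID is still running."""
    entries = []
    for path in sorted(registry_dir.glob("*.json")):
        try:
            entry = json.loads(path.read_text())
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or partially written files
        if not isinstance(entry, dict):
            continue
        pid = entry.get("pid")
        if isinstance(pid, int) and pid_alive(pid):
            entries.append(entry)
    return entries
```

A caller would pass the registry directory, for example `Path.home() / ".local/state/marimo/servers"`, and then use each entry's `host` and `port` to reach the HTTP API.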
The `/api/kernel/execute` endpoint (formerly `/scratchpad/execute`) now
streams results as server-sent events instead of returning a single JSON
response. This fixes stdout/stderr being silently dropped and makes
errors appear as clean plain text instead of browser-targeted HTML.
The old endpoint waited for execution to finish, then read the final
state. This had a known race with the buffered console writer (which
flushes every 10ms) that was patched with a 50ms sleep. Streaming
eliminates the race by yielding console events as they arrive. A 50ms
drain after the idle sentinel handles any trailing output from the
buffer, matching the pattern already used by the MCP code path.
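The stream-then-drain idea can be sketched with an `asyncio.Queue`. This is an illustration under assumptions, not the endpoint's code: the `DONE` sentinel, the queue, and the function names are hypothetical stand-ins for the idle notification and the console event source.

```python
import asyncio
from typing import AsyncIterator

DONE = object()  # hypothetical sentinel standing in for the idle notification


async def stream_events(
    queue: asyncio.Queue, drain_seconds: float = 0.05
) -> AsyncIterator[str]:
    # Phase 1: forward events as they arrive until the sentinel shows up.
    while True:
        item = await queue.get()
        if item is DONE:
            break
        yield item
    # Phase 2: keep reading for a short window so output still sitting in
    # a buffer when the sentinel arrived is not lost.
    loop = asyncio.get_running_loop()
    deadline = loop.time() + drain_seconds
    while True:
        remaining = deadline - loop.time()
        if remaining <= 0:
            break
        try:
            item = await asyncio.wait_for(queue.get(), timeout=remaining)
        except asyncio.TimeoutError:
            break
        yield item


async def demo() -> list:
    queue: asyncio.Queue = asyncio.Queue()

    async def producer() -> None:
        await queue.put("hello\n")
        await queue.put(DONE)
        await asyncio.sleep(0.01)  # simulates a late buffer flush
        await queue.put("late\n")

    task = asyncio.create_task(producer())
    seen = [item async for item in stream_events(queue, drain_seconds=0.2)]
    await task
    return seen
```

In the demo, the item enqueued after the sentinel is still delivered because it lands inside the drain window.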
Two additional fixes are included. First, compile errors in
`run_scratchpad` now broadcast an idle status so the listener receives
the done sentinel instead of timing out after 30s. Second, a
`plain_text_traceback` flag on `ExecuteScratchpadCommand` lets the
endpoint opt into plain-text tracebacks via a context var in
`write_traceback`, so API consumers see standard Python tracebacks
rather than Pygments-highlighted HTML.
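As a rough sketch of how a context variable can switch traceback formatting per request, consider the following. The names `plain_text` and `format_traceback` are hypothetical, and the HTML branch is a simple `<pre>` wrapper standing in for the Pygments-highlighted output; it is not marimo's `write_traceback`.

```python
import contextvars
import html
import traceback

# Hypothetical flag; defaults to the browser-oriented HTML rendering.
plain_text = contextvars.ContextVar("plain_text_traceback", default=False)


def format_traceback(exc: BaseException) -> str:
    text = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    if plain_text.get():
        return text  # standard Python traceback for API consumers
    # Stand-in for the highlighted HTML that a browser would receive.
    return f"<pre>{html.escape(text)}</pre>"


def run_with_plain_text(exc: BaseException) -> str:
    # Set the flag only for this call and restore the previous value after.
    token = plain_text.set(True)
    try:
        return format_traceback(exc)
    finally:
        plain_text.reset(token)
```

A context variable keeps the setting scoped to the execution that asked for it, so other callers keep the default HTML rendering.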
The SSE protocol is:
```
event: stdout
data: {"data": "hello\n"}

event: stderr
data: {"data": "Traceback ...\nValueError: boom\n"}

event: done
data: {"success": true, "output": {"mimetype": "...", "data": "..."}}
```
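A client could consume this stream with a small parser like the one below. It is a sketch that assumes standard SSE framing (events separated by a blank line) and JSON payloads in each `data:` field; `parse_sse` is a hypothetical helper, and flushing a trailing event without a final blank line is a deliberate leniency rather than spec behavior.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, decoded_json) pairs from SSE text lines."""
    event = "message"  # the SSE default when no event field is given
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield event, json.loads("\n".join(data_parts))
            event, data_parts = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            if value.startswith(" "):
                value = value[1:]  # drop the single optional leading space
            data_parts.append(value)
    # Lenient: emit a final event even if the closing blank line is missing.
    if data_parts:
        yield event, json.loads("\n".join(data_parts))
```

A caller would typically print `stdout` and `stderr` payloads as they arrive and stop once a `done` event is seen.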
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
🚀 Development release published. You may be able to view the changes at https://marimo.app?v=0.20.5-dev62
Running marimo instances now write a small JSON file to `~/.local/state/marimo/servers/` at startup so external tools (like shell scripts) can discover them without importing marimo or hitting an HTTP endpoint. Registration is opt-in: only servers started with `--no-token` and `--no-skew-protection` are registered, since those are the ones that have explicitly relaxed local access. The registry entry is kept minimal (just `host`, `port`, `base_url`, `pid`, `version`), with no secrets or session data.