Add server registry and SSE execute endpoint #8592

Merged
manzt merged 11 commits into main from manzt/agent-cli
Mar 12, 2026

@manzt manzt (Collaborator) commented Mar 5, 2026

Running marimo instances now write a small JSON file to ~/.local/state/marimo/servers/ at startup so external tools (like shell scripts) can discover them without importing marimo or hitting an HTTP endpoint. Registration is opt-in: only servers started with --no-token and --no-skew-protection are registered, since those are the ones that have explicitly relaxed local access. The registry entry is kept minimal — just host, port, base_url, pid, version — with no secrets or session data.
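As a rough illustration of the discovery step, an external tool needs only the standard library. The sketch below assumes one `*.json` file per server in the registry directory and that each file holds a JSON object with the fields named above; the file naming and the `read_registry` helper are assumptions for illustration, not marimo's actual layout or API.

```python
import json
from pathlib import Path

# Default location taken from the PR description (XDG state dir on Linux).
REGISTRY_DIR = Path.home() / ".local" / "state" / "marimo" / "servers"


def read_registry(directory: Path = REGISTRY_DIR) -> list[dict]:
    """Return the parsed registry entries found in `directory`.

    Assumes one JSON object per `*.json` file (hypothetical layout).
    Missing directories and malformed files are skipped, not raised.
    """
    entries: list[dict] = []
    if not directory.is_dir():
        return entries
    for path in sorted(directory.glob("*.json")):
        try:
            entry = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # partially written or corrupt entry
        if isinstance(entry, dict):
            entries.append(entry)
    return entries
```

A caller would then pick a server by inspecting fields such as `entry["base_url"]` or `entry["port"]`.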

Copilot AI review requested due to automatic review settings March 5, 2026 23:29
Copilot AI (Contributor) left a comment

Pull request overview

Adds an experimental, hidden marimo session CLI intended for programmatic interaction with running marimo servers by introducing a local session registry plus new server endpoints for session discovery and synchronous scratchpad execution.

Changes:

  • Implement a PID-cleaned session registry in the XDG state dir and register/deregister entries via server lifespan.
  • Add new HTTP endpoints for session enumeration (/api/sessions) and synchronous scratchpad execution (/api/kernel/scratchpad/execute).
  • Extract scratchpad execution utilities for shared use by the MCP code server and the new endpoint; add unit/CLI tests.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 12 comments.

Show a summary per file
| File | Description |
| --- | --- |
| `tests/_server/test_session_registry.py` | Adds tests for session registry read/write, cleanup, and helpers. |
| `tests/_server/test_scratchpad.py` | Adds tests for scratchpad result extraction behavior. |
| `tests/_cli/session/test_session_cli.py` | Adds tests for marimo session list/exec CLI behavior. |
| `marimo/_server/start.py` | Registers the new session_registry lifespan. |
| `marimo/_server/session_registry.py` | Introduces session registry entry model + writer/reader and PID liveness checks. |
| `marimo/_server/scratchpad.py` | Adds shared scratchpad listener + result extraction utilities. |
| `marimo/_server/api/lifespans.py` | Writes/removes registry entry on server start/shutdown. |
| `marimo/_server/api/endpoints/health.py` | Adds /api/sessions endpoint for session discovery. |
| `marimo/_server/api/endpoints/execution.py` | Adds synchronous scratchpad execution endpoint. |
| `marimo/_mcp/code_server/main.py` | Switches MCP server to shared scratchpad utilities. |
| `marimo/_cli/session/list_cmd.py` | Implements marimo session list [--json]. |
| `marimo/_cli/session/exec_cmd.py` | Implements marimo session exec ... with discovery + HTTP calls. |
| `marimo/_cli/session/__init__.py` | Adds hidden session click group and subcommand wiring. |
| `marimo/_cli/cli.py` | Registers the hidden session command group in the main CLI. |
| `_cli/exec_cmd.py` | Adds an additional standalone exec command implementation (duplicate functionality). |

  # Per-session locks to prevent overlapping scratchpad executions
  session_locks: dict[str, asyncio.Lock] = {}
- listener = _ScratchCellListener()
+ listener = ScratchCellListener()

Copilot AI Mar 5, 2026

The MCP server holds a single ScratchCellListener() instance and attaches it to whatever session is being executed. Because different sessions can execute concurrently (locks are per-session), the same listener can be attached to multiple sessions at once, which makes its waiter signaling ambiguous and can unblock the wrong execution. Instantiate a listener per execution (or per session) rather than sharing one global instance.

Suggested change
- listener = ScratchCellListener()

Comment on lines +276 to +303
listener = ScratchCellListener()
session.attach_extension(listener)

try:
    done = listener.wait_for(session_id)
    session.put_control_request(
        ExecuteScratchpadCommand(code=body.code),
        from_consumer_id=None,
    )
    await asyncio.wait_for(done.wait(), timeout=EXECUTION_TIMEOUT)
    # FIXME: stdout/stderr are flushed every 10ms by the buffered
    # writer thread. Wait 50ms so trailing console output arrives
    # before we read cell_notifications.
    await asyncio.sleep(0.05)
except asyncio.TimeoutError:
    return JSONResponse(
        content={
            "success": False,
            "error": f"Execution timed out after {EXECUTION_TIMEOUT}s",
        }
    )
finally:
    session.detach_extension(listener)

result = extract_result(session)
return JSONResponse(content=asdict(result))

Copilot AI Mar 5, 2026

/scratchpad/execute is a synchronous endpoint but does not use any per-session lock to prevent overlapping scratchpad executions. Two concurrent requests against the same session can both unblock on the first idle notification and return the wrong output (the MCP implementation uses a per-session asyncio.Lock to prevent this). Add a per-session execution lock (e.g., stored on the session or session manager) around put_control_request + wait + extract_result.

Suggested change
- listener = ScratchCellListener()
- session.attach_extension(listener)
- try:
-     done = listener.wait_for(session_id)
-     session.put_control_request(
-         ExecuteScratchpadCommand(code=body.code),
-         from_consumer_id=None,
-     )
-     await asyncio.wait_for(done.wait(), timeout=EXECUTION_TIMEOUT)
-     # FIXME: stdout/stderr are flushed every 10ms by the buffered
-     # writer thread. Wait 50ms so trailing console output arrives
-     # before we read cell_notifications.
-     await asyncio.sleep(0.05)
- except asyncio.TimeoutError:
-     return JSONResponse(
-         content={
-             "success": False,
-             "error": f"Execution timed out after {EXECUTION_TIMEOUT}s",
-         }
-     )
- finally:
-     session.detach_extension(listener)
- result = extract_result(session)
- return JSONResponse(content=asdict(result))
+ # Ensure scratchpad executions for a given session are serialized.
+ lock: asyncio.Lock = getattr(session, "_scratchpad_exec_lock", None)
+ if lock is None:
+     lock = asyncio.Lock()
+     setattr(session, "_scratchpad_exec_lock", lock)
+ async with lock:
+     listener = ScratchCellListener()
+     session.attach_extension(listener)
+     try:
+         done = listener.wait_for(session_id)
+         session.put_control_request(
+             ExecuteScratchpadCommand(code=body.code),
+             from_consumer_id=None,
+         )
+         await asyncio.wait_for(done.wait(), timeout=EXECUTION_TIMEOUT)
+         # FIXME: stdout/stderr are flushed every 10ms by the buffered
+         # writer thread. Wait 50ms so trailing console output arrives
+         # before we read cell_notifications.
+         await asyncio.sleep(0.05)
+     except asyncio.TimeoutError:
+         return JSONResponse(
+             content={
+                 "success": False,
+                 "error": f"Execution timed out after {EXECUTION_TIMEOUT}s",
+             }
+         )
+     finally:
+         session.detach_extension(listener)
+     result = extract_result(session)
+     return JSONResponse(content=asdict(result))

@manzt manzt added the enhancement New feature or request label Mar 5, 2026
@manzt manzt marked this pull request as draft March 5, 2026 23:40
manzt added a commit that referenced this pull request Mar 6, 2026
The experimental session CLI was designed for AI agents, so `marimo
agent` is a better home. The list command moves under a `sessions`
subgroup (`marimo agent sessions list`) while exec stays at the top
level (`marimo agent exec`).

This also addresses review feedback from PR #8592: `resp.json()` calls
in the exec command now catch `JSONDecodeError` instead of crashing;
`SessionRegistryWriter.register` uses `os.replace` instead of
`os.rename` for Windows compatibility and closes the fd in a `finally`
block to prevent leaks on write failure; `ScratchCellListener` now
unsubscribes from the event bus on detach to prevent listener leaks; the
MCP code server creates a fresh listener per execution instead of
sharing one globally; and the scratchpad execute endpoint returns 504 on
timeout instead of 200.
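
The `os.replace` and fd-handling fix described above follows a common atomic-write pattern. A minimal sketch of that pattern is shown here; `write_json_atomically` is a hypothetical helper written for illustration and is not the body of `SessionRegistryWriter.register`.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomically(path: Path, payload: dict) -> None:
    """Write `payload` so readers never observe a half-written file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives next to the target so the final rename stays on
    # one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        try:
            view = memoryview(json.dumps(payload).encode("utf-8"))
            while len(view) > 0:
                written = os.write(fd, view)
                view = view[written:]
        finally:
            # Close the descriptor even if the write raised.
            os.close(fd)
        # Unlike os.rename, os.replace overwrites an existing destination
        # on Windows as well as POSIX.
        os.replace(tmp_name, path)
    except BaseException:
        # Best-effort cleanup of the temp file before re-raising.
        try:
            os.unlink(tmp_name)
        except OSError:
            pass
        raise
```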
@manzt manzt changed the title from "Add experimental marimo session CLI" to "Add experimental marimo agent CLI" Mar 6, 2026
manzt added a commit that referenced this pull request Mar 6, 2026
@manzt manzt force-pushed the manzt/agent-cli branch from bee36ad to 1c78d7c Compare March 6, 2026 01:20
@manzt manzt changed the title from "Add experimental marimo agent CLI" to "Add session registry and synchronous scratchpad endpoint" Mar 6, 2026
manzt added a commit that referenced this pull request Mar 9, 2026
@manzt manzt force-pushed the manzt/agent-cli branch from 7d30d92 to 6d34015 Compare March 9, 2026 21:36
@manzt manzt marked this pull request as ready for review March 10, 2026 00:37
@manzt manzt force-pushed the manzt/agent-cli branch from 2f79126 to ee118c9 Compare March 10, 2026 00:43
manzt added a commit that referenced this pull request Mar 10, 2026
@manzt manzt force-pushed the manzt/agent-cli branch from ee118c9 to 21f156b Compare March 10, 2026 00:43
manzt and others added 11 commits March 12, 2026 17:16
Hidden CLI for programmatic access to running notebooks, aimed at MCP
servers, AI agents, and editor extensions.

```sh
marimo session list [--json]
marimo session exec -c "print(df.head())"
marimo session exec --port 2718 --id <session-id> -c "code"
```

Servers register themselves on startup in the XDG state directory
(`~/.local/state/marimo/sessions/`) with PID-based stale entry cleanup.
The CLI reads this registry to discover targets, then talks to two new
HTTP endpoints: `/api/sessions` and `/api/kernel/scratchpad/execute`.

The scratchpad execution logic was extracted from the MCP code server
into `marimo/_server/scratchpad.py` so both MCP and the new endpoint
share it.

The experimental session CLI was designed for AI agents, so `marimo
agent` is a better home. The list command moves under a `sessions`
subgroup (`marimo agent sessions list`) while exec stays at the top
level (`marimo agent exec`).

This also addresses review feedback from PR #8592: `resp.json()` calls
in the exec command now catch `JSONDecodeError` instead of crashing;
`SessionRegistryWriter.register` uses `os.replace` instead of
`os.rename` for Windows compatibility and closes the fd in a `finally`
block to prevent leaks on write failure; `ScratchCellListener` now
unsubscribes from the event bus on detach to prevent listener leaks; the
MCP code server creates a fresh listener per execution instead of
sharing one globally; and the scratchpad execute endpoint returns 504 on
timeout instead of 200.

The `/scratchpad/execute` HTTP endpoint had no concurrency guard, so two
concurrent requests against the same session could both unblock on the
first idle notification and return the wrong output. The MCP code server
already solved this with a local `dict[str, asyncio.Lock]`, but that
only protected MCP callers and leaked locks for dead sessions.

This moves the lock onto the `Session` object itself as
`scratchpad_lock`, so both the HTTP endpoint and MCP server share a
single per-session `asyncio.Lock`. This eliminates the race and removes
the need for MCP to maintain its own lock dictionary.
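
To make the serialization concrete, here is a small sketch of two callers sharing one per-session lock. The `Session` class is a hypothetical stand-in that carries only the `scratchpad_lock` attribute named above; the `execute` coroutine and the sleep are placeholders for the real execution path.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical stand-in for a session; only the lock matters here."""

    scratchpad_lock: asyncio.Lock = field(default_factory=asyncio.Lock)


async def execute(session: Session, code: str, log: list[str]) -> None:
    # Every caller (HTTP endpoint or MCP tool) takes the same lock, so
    # executions against one session cannot interleave.
    async with session.scratchpad_lock:
        log.append(f"start {code}")
        await asyncio.sleep(0.01)  # placeholder for running the code
        log.append(f"end {code}")


async def main() -> list[str]:
    session = Session()
    log: list[str] = []
    await asyncio.gather(
        execute(session, "a", log),
        execute(session, "b", log),
    )
    return log
```

Because the lock lives on the session object, it is discarded together with the session, which is what removes the leaked entries of the old lock dictionary.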
The session registry now includes `python_exe` so that external tools
can discover which Python interpreter a running marimo server uses. This
lets agents invoke `<python_exe> -m marimo` to ensure they hit the
correct marimo installation without relying on PATH.

The `marimo agent` CLI subcommand (exec, sessions list) has been
removed. Session discovery and code execution are better handled by
lightweight scripts that read the registry JSON files on disk and talk
to the HTTP API directly, avoiding a hard dependency on having the right
marimo binary on PATH in the first place.
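
As a sketch of how an agent might use the new field, the helper below builds an argv from a registry entry. `marimo_command` is a hypothetical name, and the fallback to the current interpreter for entries without `python_exe` is an assumption, not documented behavior.

```python
import sys


def marimo_command(entry: dict, *args: str) -> list[str]:
    """Build an argv that runs marimo with the server's own interpreter."""
    # Assumption: fall back to the current interpreter when the entry
    # carries no `python_exe` value.
    python_exe = entry.get("python_exe") or sys.executable
    return [python_exe, "-m", "marimo", *args]
```

The resulting list can be passed to `subprocess.run`, for example `subprocess.run(marimo_command(entry, "--version"), check=True)`.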
The previous API required callers to manually pair `attach_extension`
and `detach_extension` in try/finally blocks. Every caller followed the
exact same pattern, and forgetting the finally would leak extensions. A
context manager eliminates this class of bug entirely.

```python
with session.scoped(listener):
    # listener is attached for the duration of this block
    ...
```
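
One plausible shape for such a context manager, sketched with `contextlib.contextmanager`. The `Session` class here is a hypothetical minimal model that only tracks attached extensions; the real method may differ in signature and behavior.

```python
from contextlib import contextmanager
from typing import Iterator


class Session:
    """Hypothetical minimal session that tracks attached extensions."""

    def __init__(self) -> None:
        self.extensions: list[object] = []

    def attach_extension(self, extension: object) -> None:
        self.extensions.append(extension)

    def detach_extension(self, extension: object) -> None:
        self.extensions.remove(extension)

    @contextmanager
    def scoped(self, extension: object) -> Iterator[object]:
        # Attach on entry and always detach on exit, even when the body
        # of the `with` block raises.
        self.attach_extension(extension)
        try:
            yield extension
        finally:
            self.detach_extension(extension)
```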
The new `POST /scratchpad/execute` endpoint is an internal API not
intended for public documentation. This adds `include_in_schema` support
to `router.post()` (matching the existing pattern on `get()` and
`delete()`) and uses it to exclude the endpoint from the generated
OpenAPI spec.

Running marimo instances now write a small JSON file to
`~/.local/state/marimo/servers/` at startup so external tools can
discover them without importing marimo or hitting an HTTP endpoint.
Registration is opt-in: only servers started with `--no-token` and
`--no-skew-protection` are registered, since those are the ones that
have explicitly relaxed local access. The registry entry is kept
minimal (host, port, base_url, pid, version) with no secrets or
session data, and is cleaned up on shutdown or via stale PID detection.
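
Stale PID detection is commonly done by sending signal 0 to the process. The sketch below shows that technique on POSIX; it is not necessarily how marimo implements the check, and `pid_is_alive` and `live_entries` are hypothetical helper names. It must not be used on Windows, where `os.kill` with an arbitrary signal terminates the target process.

```python
import os


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with `pid` exists (POSIX only)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without signalling
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def live_entries(entries: list[dict]) -> list[dict]:
    """Keep only registry entries whose recorded pid is still running."""
    return [
        entry
        for entry in entries
        if isinstance(entry.get("pid"), int) and pid_is_alive(entry["pid"])
    ]
```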

A new `/api/sessions` endpoint lists the active sessions within a
server, and a `/api/kernel/scratchpad/execute` endpoint provides
synchronous scratchpad execution over HTTP. Both are guarded behind
the same no-auth/no-skew-protection requirement and excluded from the
OpenAPI schema.

The scratchpad listener and result extraction logic, previously
inlined in the MCP code server, is extracted to a shared
`_server/scratchpad.py` module so both the MCP tool and the new HTTP
endpoint can reuse it. Session extensions now use a `session.scoped()`
context manager instead of manual attach/detach, and each session
carries a `scratchpad_lock` to serialize concurrent executions.

The `/api/kernel/execute` endpoint (formerly `/scratchpad/execute`) now
streams results as server-sent events instead of returning a single JSON
response. This fixes stdout/stderr being silently dropped and makes
errors appear as clean plain text instead of browser-targeted HTML.

The old endpoint waited for execution to finish, then read the final
state. This had a known race with the buffered console writer (which
flushes every 10ms) that was patched with a 50ms sleep. Streaming
eliminates the race by yielding console events as they arrive. A 50ms
drain after the idle sentinel handles any trailing output from the
buffer, matching the pattern already used by the MCP code path.
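
The stream-then-drain behavior can be sketched with an `asyncio.Queue`. This is an illustrative model only: the `DONE` sentinel, the queue, and `stream_until_done` are assumptions standing in for the listener and idle notification, not the endpoint's actual code.

```python
import asyncio
from typing import AsyncIterator

DONE = object()  # hypothetical sentinel standing in for the idle notification


async def stream_until_done(
    queue: asyncio.Queue, drain_seconds: float = 0.05
) -> AsyncIterator[object]:
    """Yield events as they arrive, then drain briefly after DONE."""
    while True:
        item = await queue.get()
        if item is DONE:
            break
        yield item
    # Trailing console output may still be in flight from the buffered
    # writer, so keep reading for a short, bounded window.
    loop = asyncio.get_running_loop()
    deadline = loop.time() + drain_seconds
    while True:
        remaining = deadline - loop.time()
        if remaining <= 0:
            break
        try:
            item = await asyncio.wait_for(queue.get(), timeout=remaining)
        except asyncio.TimeoutError:
            break
        if item is not DONE:
            yield item
```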

Two additional fixes: compile errors in `run_scratchpad` now broadcast
an idle status so the listener receives the done sentinel instead of
timing out after 30s. And a `plain_text_traceback` flag on
`ExecuteScratchpadCommand` lets the endpoint opt into plain-text
tracebacks via a context var in `write_traceback`, so API consumers see
standard Python tracebacks rather than Pygments-highlighted HTML.
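
A context variable is a natural fit for this kind of per-request formatting switch. The sketch below shows the general idea; the variable name, `format_traceback`, and `run_with_plain_text` are hypothetical, and `html.escape` stands in for the Pygments highlighting that the real `write_traceback` performs.

```python
import contextvars
import html
import traceback

# Hypothetical context variable; defaults to the browser-oriented output.
_plain_text_traceback = contextvars.ContextVar(
    "plain_text_traceback", default=False
)


def format_traceback(exc: BaseException) -> str:
    """Format `exc` as plain text or as a simple HTML stand-in."""
    text = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    if _plain_text_traceback.get():
        return text
    return f"<pre>{html.escape(text)}</pre>"


def run_with_plain_text(func):
    """Call `func` with plain-text tracebacks enabled for this context."""
    token = _plain_text_traceback.set(True)
    try:
        return func()
    finally:
        _plain_text_traceback.reset(token)
```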

The SSE protocol is:
 

```
    event: stdout
    data: {"data": "hello\n"}

    event: stderr
    data: {"data": "Traceback ...\nValueError: boom\n"}

    event: done
    data: {"success": true, "output": {"mimetype": "...", "data": "..."}}
```
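
A client can consume this stream with a small line-based parser. The sketch below handles only the `event:` and `data:` fields shown above and assumes every `data` payload is JSON; `parse_sse` is a hypothetical helper, and how the lines are obtained from the HTTP response is left to the caller.

```python
import json
from typing import Iterable, Iterator


def _field_value(line: str, prefix: str) -> str:
    """Return the field value, dropping one optional leading space."""
    value = line[len(prefix):]
    return value[1:] if value.startswith(" ") else value


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Yield (event, payload) pairs from an iterable of SSE lines."""
    event = "message"  # default event name in the SSE format
    data_lines: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield event, json.loads("\n".join(data_lines))
            event, data_lines = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event = _field_value(line, "event:")
        elif line.startswith("data:"):
            data_lines.append(_field_value(line, "data:"))
    if data_lines:
        yield event, json.loads("\n".join(data_lines))
```

A consumer would typically print `stdout` and `stderr` payloads as they arrive and stop once it sees the `done` event.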

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@manzt manzt force-pushed the manzt/agent-cli branch from 101f362 to d6592fb Compare March 12, 2026 21:18
@manzt manzt changed the title from "Add server registry and synchronous scratchpad endpoint" to "Add server registry and SSE execute endpoint" Mar 12, 2026
@manzt manzt merged commit 7f2c978 into main Mar 12, 2026
38 of 44 checks passed
@manzt manzt deleted the manzt/agent-cli branch March 12, 2026 21:40
@github-actions

🚀 Development release published. You may be able to view the changes at https://marimo.app?v=0.20.5-dev62
