
fix: isolation when running multiple notebooks in an app server#8611

Draft
akshayka wants to merge 24 commits into main from aka/fix-run-multiple-notebooks

Conversation


@akshayka akshayka commented Mar 7, 2026

Summary. This PR introduces process-level isolation when serving multiple apps from the same server (`marimo run directory/`, `create_asgi_app()`); clients of any given app still run as threads in that app's process for efficiency. This fixes a critical bug in which different apps shared the same Python globals, leading to undefined behavior such as collisions in `sys.modules`. The tradeoff is that multi-app servers consume slightly more RAM.

Dependency on pyzmq. The proposed implementation also adds a dependency on pyzmq for multi-app servers, paving the way for sandboxed multi-app servers. It would be possible to design a different solution that used multiprocessing instead of pyzmq, at the cost of not supporting package sandboxes. It is perhaps worth discussing whether we are okay with making pyzmq a required dependency of marimo.

Context. When marimo was originally designed, marimo run only ever served a single notebook. A single process could safely serve multiple clients of the same notebook since they all share the same code.

When multiple-app serving was introduced, we continued serving all clients from a single process, even though the clients were potentially running different programs. When two notebooks both import `utils` but expect different implementations from different directories, whichever app loads first wins, and the second app silently gets the wrong module. Similar collisions can affect other process-wide Python state.
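The collision is reproducible with nothing but the standard library; the directory and module names below are illustrative, not marimo's:

```python
import sys
import tempfile
from pathlib import Path

# Two hypothetical apps each ship their own utils.py with different contents.
root = Path(tempfile.mkdtemp())
for app, impl in [("app1", "VALUE = 'from app1'"), ("app2", "VALUE = 'from app2'")]:
    d = root / app
    d.mkdir()
    (d / "utils.py").write_text(impl)

# App 1 imports utils first; its directory is at the front of sys.path.
sys.path.insert(0, str(root / "app1"))
import utils  # loads app1's copy

# App 2 now "imports" utils from its own directory -- but sys.modules
# already holds app1's copy, so the import is a silent cache hit.
sys.path.insert(0, str(root / "app2"))
import utils  # still app1's copy: the collision this PR fixes
```

Because `sys.modules` is keyed by module name alone, the second import never touches the filesystem; only a separate process (with its own `sys.modules`) gives each app its own `utils`.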

This PR. This PR fixes the problem by running each app in its own OS process. Multiple clients of the same app still share a process (as kernel threads), which allows for fast and cheap sessions. The isolation boundary is per-app, not per-client.

Before (shared process — sys.modules collisions):

  ┌──────────────── Main Process ─────────────────┐
  │                                               │
  │   Kernel(app1, client A)    ← all kernels     │
  │   Kernel(app1, client B)      share one       │
  │   Kernel(app2, client C)      sys.modules     │
  │   Kernel(app2, client D)                      │
  └───────────────────────────────────────────────┘

After (per-app process isolation):

  ┌──────────────── Main Process ─────────────────┐
  │           (HTTP, WebSocket, routing)           │
  └──────────────────┬──────────────┬──────────────┘
                     │ ZMQ          │ ZMQ
       ┌─────────────▼──┐   ┌──────▼──────────────┐
       │  App Process    │   │  App Process        │
       │  (app1.py)      │   │  (app2.py)          │
       │                 │   │                     │
       │  Kernel: cl. A  │   │  Kernel: cl. C      │
       │  Kernel: cl. B  │   │  Kernel: cl. D      │
       └─────────────────┘   └─────────────────────┘
        isolated sys.modules  isolated sys.modules

IPC. Each app process communicates with the main process over 4 shared ZeroMQ sockets (not per-client). Kernel commands and stream output are multiplexed over these shared channels using session ID tagging:

IPC channels (4 shared ZMQ sockets per app process):

  Main Process                             App Subprocess
  ────────────                             ──────────────
  mgmt     [PUSH] ─────────────────────▶ [PULL]  mgmt loop
  response [PULL] ◀───────────────────── [PUSH]  (create/stop kernel)
  cmd      [PUSH] ──[sid, channel, msg]─▶ [PULL]  dispatcher ──▶ kernel queues
  stream   [PULL] ◀──[sid, msg]───────── [PUSH]  collector  ◀── kernel output
Per session (main-process side):

  AppProcessQueueManager
    control_queue    ──┐
    ui_element_queue ──┤──▶ cmd socket (tagged with session_id)
    completion_queue ──┤
    input_queue      ──┘
    stream_queue     ◀──── stream receiver thread (regular Queue)
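The tagging scheme above can be sketched with stdlib queues standing in for the shared ZMQ socket pair; names like `TaggedPushQueue` are illustrative, not the PR's actual identifiers:

```python
import queue
from collections import defaultdict

# A shared queue stands in for the cmd PUSH/PULL socket pair.
cmd_channel: queue.Queue = queue.Queue()

class TaggedPushQueue:
    """Queue-like facade: kernel code calls put() as on a plain queue,
    and the message is forwarded to the shared channel tagged with
    (session_id, channel) so the app process can demultiplex it."""

    def __init__(self, shared: queue.Queue, session_id: str, channel: str):
        self._shared, self._sid, self._channel = shared, session_id, channel

    def put(self, msg: object) -> None:
        self._shared.put((self._sid, self._channel, msg))

# App-process side: the dispatcher routes frames to per-session kernel queues.
kernel_queues: dict = defaultdict(lambda: defaultdict(queue.Queue))

def dispatch_one() -> None:
    sid, channel, msg = cmd_channel.get()
    kernel_queues[sid][channel].put(msg)

# Two sessions share one channel without interfering.
TaggedPushQueue(cmd_channel, "sid-A", "control").put({"op": "run"})
TaggedPushQueue(cmd_channel, "sid-B", "ui_element").put({"op": "set"})
dispatch_one()
dispatch_one()
msg_a = kernel_queues["sid-A"]["control"].get()
msg_b = kernel_queues["sid-B"]["ui_element"].get()
```

The facade keeps kernel code agnostic to the transport: it sees an ordinary `put()`, while the shared channel carries every session's traffic.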

When this path is activated.

  • create_asgi_app(): always enables process isolation (the whole point is multi-app)
  • marimo run app1.py app2.py / directory serving: auto-enables when multiple files are detected
  • marimo run app.py (single file): no change, uses existing thread-based kernels
  • marimo edit: no change, uses existing process-based kernels
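The activation rule above amounts to a small predicate; this is a sketch under stated assumptions (`needs_process_isolation` is a hypothetical helper, not marimo's actual code):

```python
from pathlib import Path

def needs_process_isolation(targets: list, is_asgi_app: bool) -> bool:
    """Isolate whenever more than one app may be served from one server."""
    if is_asgi_app:
        # create_asgi_app() exists to serve multiple apps, so always isolate.
        return True
    files: list = []
    for t in targets:
        p = Path(t)
        # A directory target contributes every notebook file inside it.
        files.extend(sorted(p.glob("*.py")) if p.is_dir() else [p])
    return len(files) > 1

# Single file: unchanged thread-based kernels. Multiple files: isolate.
single = needs_process_isolation(["app.py"], is_asgi_app=False)
multi = needs_process_isolation(["app1.py", "app2.py"], is_asgi_app=False)
```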

akshayka added 12 commits March 6, 2026 22:11
When marimo serves multiple apps (via `create_asgi_app()` or
`marimo run app1.py app2.py`), all kernel threads previously shared
the same Python process and `sys.modules`. This caused module clashes
when different apps imported modules with the same name from different
directories.

This change runs different apps in different OS processes while
keeping multiple clients of the same app as kernel threads within
a single worker process for performance.

Architecture:
- WorkerProcessPool: manages worker processes keyed by file path
- WorkerProcess: wraps multiprocessing.Process per notebook file
- WorkerKernelManager: implements KernelManager protocol via ZeroMQ IPC
- worker_entry.py: subprocess entry point, spawns kernel threads

Activation is automatic for all multi-app scenarios. Single-app
serving is unaffected. pyzmq is required (graceful fallback if missing).
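The pool-keyed-by-file-path idea can be sketched as follows (illustrative only, not the PR's actual WorkerProcessPool; `spawn` stands in for launching a subprocess):

```python
class AppProcessPool:
    """At most one process per notebook file, reused across clients."""

    def __init__(self, spawn):
        self._spawn = spawn            # callable: file_path -> process handle
        self._procs: dict = {}         # file_path -> process handle

    def get_or_spawn(self, file_path: str):
        # Clients of the same app share one process; a new app file
        # triggers exactly one new spawn.
        if file_path not in self._procs:
            self._procs[file_path] = self._spawn(file_path)
        return self._procs[file_path]

spawned = []
pool = AppProcessPool(spawn=lambda path: spawned.append(path) or f"proc:{path}")
a1 = pool.get_or_spawn("app1.py")
a2 = pool.get_or_spawn("app1.py")  # same app -> same process, no new spawn
b = pool.get_or_spawn("app2.py")   # different app -> second process
```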
Worker subprocesses are in the same process group as the main process,
so Ctrl-C sends SIGINT to them too. Without this fix, workers crash
on SIGINT, then the main process hangs trying to send shutdown commands
to dead workers via ZMQ/queues.

The fix: workers ignore SIGINT and rely on the main process to send
ShutdownWorkerCmd via the management queue for graceful teardown.
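The SIGINT policy described above is a one-liner in the subprocess entry point; this sketch is illustrative, not marimo's actual code:

```python
import signal

def install_sigint_policy() -> None:
    """Ignore SIGINT in the app subprocess.

    Ctrl-C delivers SIGINT to every process in the foreground process
    group, so without this the worker dies before the main process can
    coordinate a graceful teardown over the management channel.
    """
    signal.signal(signal.SIGINT, signal.SIG_IGN)

install_sigint_policy()
# From here on, Ctrl-C no longer kills this process; shutdown happens
# only when the main process sends an explicit shutdown command.
handler = signal.getsignal(signal.SIGINT)
```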
When shutting down, PUSH sockets with pending messages block
context.destroy() indefinitely (default linger=-1). Setting linger=0
drops unsent messages immediately, allowing clean shutdown when the
remote end is already gone.

The key abstraction is one OS process per app/notebook file.
"AppProcess" conveys this directly, while "Worker" was generic.

Renames:
- WorkerProcess -> AppProcess
- WorkerProcessPool -> AppProcessPool
- WorkerKernelManager -> AppKernelManager
- ShutdownWorkerCmd -> ShutdownAppProcessCmd
- worker_main -> app_process_main
- worker*.py -> app_process*.py

Move the pyzmq check for multi-app process isolation into cli.py using
MarimoCLIMissingDependencyError, which shows a clean error with install
instructions instead of an ugly traceback.

- Move nested imports to module top level in app_process*.py
- Reorder functions bottom-up (leaf helpers before callers)
- Fix unused import (QueueManager), unused param (tmp_path)
- Fix SpawnProcess/Process type mismatch with proper TYPE_CHECKING imports
- Replace assert with ValueError guard for file_path narrowing

Replace multiprocessing.Process + multiprocessing.Queue with
subprocess.Popen + ZeroMQ for app process management. This enables
future support for sandboxed apps where each notebook needs its own
Python interpreter/venv.

Changes:
- AppProcess now uses subprocess.Popen to launch the app process
- Management channel (commands/responses) uses ZMQ PUSH/PULL sockets
  instead of multiprocessing.Queue
- Commands converted from dataclasses to msgspec.Struct with tagged
  JSON serialization
- app_process_entry.py is now launchable as a module
  (python -m marimo._session.managers.app_process_entry)
- Startup args passed via stdin, ready signal via stdout
  (same pattern as launch_kernel.py)
- AppProcess accepts optional python= parameter for custom interpreter
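The stdin/stdout handshake mentioned above can be sketched with `subprocess` alone; the child script and argument names here are illustrative, not marimo's launch code:

```python
import json
import subprocess
import sys

# Hypothetical child: read startup args as one JSON line from stdin,
# then print a ready marker on stdout once initialization is done.
CHILD = r"""
import json, sys
args = json.loads(sys.stdin.readline())
# ... a real entry point would bind its ZMQ sockets here using the
# addresses carried in args ...
print("READY", flush=True)
"""

proc = subprocess.Popen(
    [sys.executable, "-c", CHILD],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)
# Parent: write args, then block until the child signals readiness.
startup_args = {"file": "app1.py", "cmd_addr": "tcp://127.0.0.1:0"}
proc.stdin.write(json.dumps(startup_args) + "\n")
proc.stdin.flush()
ready = proc.stdout.readline().strip()
proc.wait()
```

Passing args over stdin (rather than argv) keeps socket addresses out of the process list, and the ready line gives the parent a cheap synchronization point before it starts routing traffic.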
Move zmq and app_process imports behind TYPE_CHECKING or into local
scope so that importing marimo doesn't fail when pyzmq is absent.
This restores the CLI-friendly missing dependency error.

Instead of creating 12 ZMQ sockets per client connection (6 on each
side via IPCQueueManager), use 4 shared ZMQ channels per app process
that multiplex all kernel communication via session_id tagging.

New clients now just create threading.Queue objects and spawn a kernel
thread, eliminating the ~500ms ZMQ setup overhead per connection.

- Add MuxQueueManager and _MuxPushQueue for multiplexed command sending
- Add cmd dispatcher and stream collector threads in app process
- Pass full ZMQ addresses (not ports) to decouple transport from entry point
- Remove per-kernel Connection.create()/connect() dependency

MuxQueueManager -> AppProcessQueueManager
_MuxPushQueue -> _AppProcessPushQueue

- Shut down dead AppProcess (ZMQ sockets/context) before respawning
- Move install_thread_local_proxies() to process startup (not per-kernel)
- Remove unreachable None checks in AppKernelManager
- Log dropped stream messages for unknown sessions
- Add cross-reference comment for channel name constants

@akshayka akshayka added the bug (Something isn't working), bash-focus (Area to focus on during release bug bash), and breaking (A breaking change) labels on Mar 7, 2026
@mscolnick mscolnick left a comment


I'm starting to think maybe pyzmq should be a required dep. at least in recommended, if not already.
