truncate html output of run stale cells tool #8578
mscolnick merged 2 commits into marimo-team:main
Conversation
Light2Dark left a comment:
I think this is good, thank you! I'm going to clean up the code in a follow-up
Pull request overview
This PR adds output truncation logic to the frontend RunStaleCellsTool to prevent exceeding LLM token limits when running stale cells in marimo's AI agent mode. Large cell outputs (like Plotly figures producing millions of characters of HTML) were breaking conversations by exceeding Anthropic's 200k token input limit.
Changes:
- Added output size limits: `text/html` outputs are summarized to a short description, text outputs are capped at 2,000 chars, and error outputs at 3,000 chars, with a 40k-char total budget across all cells
- Restructured the cell output processing loop to track total output size and skip remaining cells when the budget is exceeded
- Added tests for HTML summarization and text truncation
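To make the behavior concrete, here is a minimal sketch of what the two truncation helpers could look like. The function names, signatures, and exact message strings below are assumptions inferred from the tests (`[TRUNCATED:`, `Full output visible in the notebook UI.`, `HTML Output:`), not code copied from the PR diff:

```typescript
// Hypothetical sketch of the truncation helpers; names and message
// formats are inferred from the PR's tests, not from the actual diff.
const MAX_TEXT_OUTPUT_CHARS = 2000;

function truncateString(text: string, maxChars: number): string {
  if (text.length <= maxChars) {
    return text;
  }
  const omitted = text.length - maxChars;
  return (
    text.slice(0, maxChars) +
    `\n[TRUNCATED: ${omitted} chars omitted] Full output visible in the notebook UI.`
  );
}

function summarizeHtmlOutput(html: string): string {
  // Replace the raw markup with a short description the LLM can act on.
  return `HTML Output: <${html.length} chars of text/html omitted; rendered in the notebook UI>`;
}

const summary = summarizeHtmlOutput(`<div>${"x".repeat(2_000_000)}</div>`);
const truncated = truncateString("a".repeat(10_000), MAX_TEXT_OUTPUT_CHARS);
```

The key design point is that HTML is summarized rather than truncated: a prefix of raw Plotly markup is useless to the model, while a one-line description still tells it the cell rendered successfully.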
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| frontend/src/core/ai/tools/run-cells-tool.ts | Added output truncation constants and logic: HTML outputs are summarized, text/error outputs are truncated, and a global character budget limits total output size |
| frontend/src/core/ai/tools/__tests__/run-cells-tool.test.ts | Added tests for HTML output summarization and text output truncation |
```typescript
describe("output truncation", () => {
  it("should summarize text/html output instead of dumping raw content", async () => {
    const notebook = MockNotebook.notebookState({
      cellData: {
        [cellId1]: { code: "fig.show()", edited: true },
      },
    });
    store.set(notebookAtom, notebook);

    vi.mocked(runCells).mockImplementation(async () => {
      const updatedNotebook = store.get(notebookAtom);
      updatedNotebook.cellRuntime[cellId1] = {
        ...updatedNotebook.cellRuntime[cellId1],
        status: "idle",
      };
      store.set(notebookAtom, updatedNotebook);
    });

    const largeHtml = `<div>${"x".repeat(2_000_000)}</div>`;
    vi.mocked(getCellContextData).mockReturnValue({
      cellOutput: {
        outputType: "text",
        processedContent: null,
        imageUrl: null,
        output: { mimetype: "text/html", data: largeHtml },
      },
      consoleOutputs: null,
      cellName: "cell1",
    } as never);

    const result = await tool.handler({}, toolContext as never);

    expect(result.status).toBe("success");
    const output = result.cellsToOutput?.[cellId1]?.cellOutput ?? "";
    expect(output).toContain("HTML Output:");
    expect(output).toContain("text/html");
    expect(output.length).toBeLessThan(200);
    expect(output).not.toContain(largeHtml);
  });

  it("should truncate large text output to MAX_TEXT_OUTPUT_CHARS", async () => {
    const notebook = MockNotebook.notebookState({
      cellData: {
        [cellId1]: { code: "print(big_string)", edited: true },
      },
    });
    store.set(notebookAtom, notebook);

    vi.mocked(runCells).mockImplementation(async () => {
      const updatedNotebook = store.get(notebookAtom);
      updatedNotebook.cellRuntime[cellId1] = {
        ...updatedNotebook.cellRuntime[cellId1],
        status: "idle",
      };
      store.set(notebookAtom, updatedNotebook);
    });

    const largeText = "a".repeat(10_000);
    vi.mocked(getCellContextData).mockReturnValue({
      cellOutput: {
        outputType: "text",
        processedContent: largeText,
        imageUrl: null,
        output: { mimetype: "text/plain", data: largeText },
      },
      consoleOutputs: null,
      cellName: "cell1",
    } as never);

    const result = await tool.handler({}, toolContext as never);

    const output = result.cellsToOutput?.[cellId1]?.cellOutput ?? "";
    expect(output).toContain("[TRUNCATED:");
    expect(output).toContain("Full output visible in the notebook UI.");
    // Output should be capped (2000 chars content + "Output:\n" prefix + truncation message)
    expect(output.length).toBeLessThan(2200);
  });
});
```
The new truncation logic has several important code paths that lack test coverage:
- Budget limiting (`MAX_TOOL_OUTPUT_CHARS`): No test verifies that when total output exceeds 40k chars, subsequent cells get the "output omitted due to context limits" message. This is the core mechanism for preventing token limit issues with many cells.
- Error output truncation: The code uses a higher limit (`MAX_ERROR_OUTPUT_CHARS = 3000`) for error outputs, but there's no test verifying that error outputs use this separate limit rather than the standard 2000-char limit.
- Console output truncation: Console outputs are truncated via `this.truncateString(consoleOutputString, MAX_TEXT_OUTPUT_CHARS)`, but there's no test covering large console output truncation.
```typescript
// Output size limits to prevent exceeding LLM token limits.
const MAX_TEXT_OUTPUT_CHARS = 2000;
const MAX_ERROR_OUTPUT_CHARS = 3000;
const MAX_TOOL_OUTPUT_CHARS = 40_000;
```
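These constants drive a shared per-call budget across all cells. A minimal sketch of how such a budget check might work follows; the function name, loop structure, and omission message are illustrative assumptions, not the PR's exact code:

```typescript
// Hypothetical sketch of the shared-budget mechanism described in the review.
const MAX_TOOL_OUTPUT_CHARS = 40_000;

function applyBudget(cellOutputs: Map<string, string>): Map<string, string> {
  const limited = new Map<string, string>();
  let total = 0;
  for (const [cellId, output] of cellOutputs) {
    if (total + output.length > MAX_TOOL_OUTPUT_CHARS) {
      // Once the shared budget is spent, remaining cells get a stub
      // instead of their output, so the tool response stays bounded.
      limited.set(cellId, "[output omitted due to context limits]");
      continue;
    }
    total += output.length;
    limited.set(cellId, output);
  }
  return limited;
}

// Two 30k-char outputs: the first fits, the second exceeds the 40k budget.
const result = applyBudget(
  new Map([
    ["cell-1", "x".repeat(30_000)],
    ["cell-2", "y".repeat(30_000)],
  ]),
);
```

Per-cell truncation alone is not enough: a notebook with dozens of cells could still blow past the token limit at 2,000 chars each, which is why the global cap exists on top of the per-output limits.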
The PR description states that backend changes were made to cells.py (applying HTML summarization and truncation to GetCellOutputs tool) and shared utilities (truncate_string, summarize_html_output) were added to output_cleaning.py. However, none of these backend changes are present in this PR. The GetCellOutputs.handle() method in marimo/_ai/_tools/tools/cells.py still passes raw visual_output (including potentially large HTML) without any truncation. This means the backend MCP tool path is still vulnerable to the same large-output issue that the frontend fix addresses.
It does look like this is missing based on your PR description @PranavGopinath
Oh, I forgot to remove it from the PR description. I initially implemented the change, but figured: what is the point of GetCellOutputs if it's not returning the full output? That said, the output is still ridiculously large, but I find the function doesn't get called unless you explicitly ask the agent to use it. Run stale cells, on the other hand, is used after every notebook edit, so it's more of a necessary fix.
Got it, thank you. Will merge this in for next release
🚀 Development release published. You may be able to view the changes at https://marimo.app?v=0.20.5-dev20
📝 Summary
Truncate large cell outputs in run stale cells tool responses to prevent exceeding LLM token limits.
🔍 Description of Changes
When marimo's agent runs stale cells or retrieves cell outputs, large outputs such as Plotly figures or rich dataframes (text/html) can produce millions of characters. In larger notebooks this can reach 10M characters, which far exceeds Anthropic's 200k-token input limit and breaks the conversation. Down the line, it might be possible to make the limits configurable in marimo.toml, which could work better.
Frontend (run-cells-tool.ts):
Tested end-to-end: 3 Plotly cells went from ~125k chars → ~300 chars (245x reduction).
📋 Checklist
- Discussions (please provide a link if applicable): messaged @mscolnick over Slack