fix(daytona): fix execute implementation by eyurtsev · Pull Request #1756 · langchain-ai/deepagents

Eugene Yurtsev (eyurtsev) · 2026-03-10T02:33:45Z

Fixes the Daytona sandbox execute path so timeout is actually honored instead of being dropped by the underlying SDK call. The backend now runs commands through a session, polls for completion, and raises a timeout when the requested limit is exceeded, while preserving the existing execute API and collecting stdout/stderr for the final response. This PR also adds regression coverage in the core sandbox tests for the heredoc-based command templates and payload formatting used by BaseSandbox.

Created with Deep Agents CLI.

Copilot

Pull request overview

Updates the Daytona sandbox backend to work around Daytona’s exec timeout propagation issue by executing commands via session APIs and polling for completion, and adjusts Deep Agents’ heredoc command templates to always terminate with a trailing newline (with a unit test to enforce it).

Changes:

Replace Daytona process.exec(...) usage with session-based execution + log polling to enforce timeouts.
Ensure heredoc-based command templates (write/edit/read) end with a trailing newline.
Add a unit test asserting heredoc templates end with \n.

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 4 comments.

File	Description
libs/partners/daytona/langchain_daytona/sandbox.py	Implements session-based command execution with manual timeout enforcement and log retrieval.
libs/deepagents/deepagents/backends/sandbox.py	Adds trailing newline to heredoc terminators in command templates.
libs/deepagents/tests/unit_tests/backends/test_sandbox_backend.py	Adds a regression test ensuring heredoc templates end with a newline.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Copilot · 2026-03-11T13:28:27Z

libs/deepagents/tests/unit_tests/backends/test_sandbox_backend.py

+    payload = json.dumps({"path": "/test/file.txt", "offset": 0, "limit": 100})
+    payload_b64 = base64.b64encode(payload.encode("utf-8")).decode("ascii")
+
+    write_cmd = _WRITE_COMMAND_TEMPLATE.format(payload_b64=payload_b64)
+    edit_cmd = _EDIT_COMMAND_TEMPLATE.format(payload_b64=payload_b64, replace_all=False)
+    read_cmd = _READ_COMMAND_TEMPLATE.format(payload_b64=payload_b64)
+


This test reuses a read-style payload (path/offset/limit) to format the write/edit templates, even though those templates expect different JSON schemas at runtime. Since the test is only checking for a trailing newline, consider using minimal payloads that match each template’s expected schema to avoid confusing future maintainers.

Suggested change

payload = json.dumps({"path": "/test/file.txt", "offset": 0, "limit": 100})

payload_b64 = base64.b64encode(payload.encode("utf-8")).decode("ascii")

write_cmd = _WRITE_COMMAND_TEMPLATE.format(payload_b64=payload_b64)

edit_cmd = _EDIT_COMMAND_TEMPLATE.format(payload_b64=payload_b64, replace_all=False)

read_cmd = _READ_COMMAND_TEMPLATE.format(payload_b64=payload_b64)

write_payload = json.dumps({"path": "/test/file.txt", "content": "test content"})

write_payload_b64 = base64.b64encode(write_payload.encode("utf-8")).decode("ascii")

edit_payload = json.dumps({"path": "/test/file.txt", "old": "foo", "new": "bar"})

edit_payload_b64 = base64.b64encode(edit_payload.encode("utf-8")).decode("ascii")

read_payload = json.dumps({"path": "/test/file.txt", "offset": 0, "limit": 100})

read_payload_b64 = base64.b64encode(read_payload.encode("utf-8")).decode("ascii")

write_cmd = _WRITE_COMMAND_TEMPLATE.format(payload_b64=write_payload_b64)

edit_cmd = _EDIT_COMMAND_TEMPLATE.format(payload_b64=edit_payload_b64, replace_all=False)

read_cmd = _READ_COMMAND_TEMPLATE.format(payload_b64=read_payload_b64)

Copilot · 2026-03-11T13:28:28Z

libs/partners/daytona/langchain_daytona/sandbox.py

+        output = logs.stdout or ""
+
+        if logs.stderr is not None and logs.stderr.strip():
+            output += f"\n<stderr>{logs.stderr.strip()}</stderr>"

        return ExecuteResponse(
-            output=result.result,
-            exit_code=result.exit_code,
+            output=output,
+            exit_code=command_result.exit_code,


logs.stdout is assigned directly to output, but Daytona SDK log fields may be None (and ExecuteResponse.output expects a string). Consider normalizing with output = logs.stdout or "" before concatenation to avoid TypeError and ensure consistent return type. Also, wrapping stderr in <stderr>...</stderr> is inconsistent with other sandbox backends in this repo, which append stderr as plain text; this could break consumers that expect the same formatting across backends.

Copilot · 2026-03-11T13:28:28Z

libs/partners/daytona/langchain_daytona/sandbox.py

+        finally:
+            self._sandbox.process.delete_session(session_id)
+


delete_session(session_id) in the finally block can raise and mask the original exception (including TimeoutError from the loop). Consider suppressing cleanup errors (or logging them) so that command execution failures/timeouts aren’t replaced by a session-deletion error.

Copilot · 2026-03-11T13:28:28Z

libs/partners/daytona/langchain_daytona/sandbox.py

+                if command_result.exit_code is not None:
+                    break
+                # sleep for 100 ms
+                time.sleep(0.1)


The loop polls get_session_command(...) every 100ms. With the default 30-minute timeout this can generate ~180,000 API calls for a single long-running command, which is likely to be expensive and may hit rate limits. Consider increasing the polling interval and/or using a backoff strategy (or a blocking/wait API if Daytona provides one).

Fixes the Daytona sandbox execute path so `timeout` is actually honored instead of being dropped by the underlying SDK call. The backend now runs commands through a session, polls for completion, and raises a timeout when the requested limit is exceeded, while preserving the existing execute API and collecting stdout/stderr for the final response. This PR also adds regression coverage in the core sandbox tests for the heredoc-based command templates and payload formatting used by `BaseSandbox`. Created with [Deep Agents CLI](https://docs.langchain.com/oss/python/deepagents/cli/overview).

Eugene Yurtsev (eyurtsev) added 2 commits March 9, 2026 22:17

x

a456fdf

x

7c2aa7e

github-actions bot added the internal User is a member of the `langchain-ai` GitHub organization label Mar 10, 2026

Eugene Yurtsev (eyurtsev) changed the title ~~partners(daytona): fix execute implementation~~ fix(daytona): fix execute implementation Mar 10, 2026

github-actions bot added the fix A bug fix (PATCH) label Mar 10, 2026

x

9cdf7c4

github-actions bot added the size: S 50-199 LOC label Mar 10, 2026

Eugene Yurtsev (eyurtsev) added 2 commits March 11, 2026 09:22

x

18189b0

x

9aacf62

github-actions bot added the deepagents Related to the `deepagents` SDK / agent harness label Mar 11, 2026

Eugene Yurtsev (eyurtsev) marked this pull request as ready for review March 11, 2026 13:23

Copilot AI review requested due to automatic review settings March 11, 2026 13:23

Eugene Yurtsev (eyurtsev) added 2 commits March 11, 2026 09:23

x

074b8bd

x

6ad826e

Copilot started reviewing on behalf of Eugene Yurtsev (eyurtsev) March 11, 2026 13:24 View session

Copilot AI reviewed Mar 11, 2026

View reviewed changes

Eugene Yurtsev (eyurtsev) merged commit 080f3a5 into main Mar 11, 2026
34 checks passed

Eugene Yurtsev (eyurtsev) deleted the eugene/fix_execute branch March 11, 2026 13:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(daytona): fix execute implementation#1756

fix(daytona): fix execute implementation#1756
Eugene Yurtsev (eyurtsev) merged 7 commits intomainfrom
eugene/fix_execute

Eugene Yurtsev (eyurtsev) commented Mar 10, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 11, 2026

Uh oh!

Copilot AI Mar 11, 2026

Uh oh!

Copilot AI Mar 11, 2026

Uh oh!

Copilot AI Mar 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

-    payload = json.dumps({"path": "/test/file.txt", "offset": 0, "limit": 100})
-    payload_b64 = base64.b64encode(payload.encode("utf-8")).decode("ascii")
-    write_cmd = _WRITE_COMMAND_TEMPLATE.format(payload_b64=payload_b64)
-    edit_cmd = _EDIT_COMMAND_TEMPLATE.format(payload_b64=payload_b64, replace_all=False)
-    read_cmd = _READ_COMMAND_TEMPLATE.format(payload_b64=payload_b64)
+    write_payload = json.dumps({"path": "/test/file.txt", "content": "test content"})
+    write_payload_b64 = base64.b64encode(write_payload.encode("utf-8")).decode("ascii")
+    edit_payload = json.dumps({"path": "/test/file.txt", "old": "foo", "new": "bar"})
+    edit_payload_b64 = base64.b64encode(edit_payload.encode("utf-8")).decode("ascii")
+    read_payload = json.dumps({"path": "/test/file.txt", "offset": 0, "limit": 100})
+    read_payload_b64 = base64.b64encode(read_payload.encode("utf-8")).decode("ascii")
+    write_cmd = _WRITE_COMMAND_TEMPLATE.format(payload_b64=write_payload_b64)
+    edit_cmd = _EDIT_COMMAND_TEMPLATE.format(payload_b64=edit_payload_b64, replace_all=False)
+    read_cmd = _READ_COMMAND_TEMPLATE.format(payload_b64=read_payload_b64)

Conversation

Eugene Yurtsev (eyurtsev) commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Eugene Yurtsev (eyurtsev) commented Mar 10, 2026 •

edited

Loading