Skip to content

fix(daytona): fix execute implementation#1756

Merged
Eugene Yurtsev (eyurtsev) merged 7 commits intomainfrom
eugene/fix_execute
Mar 11, 2026
Merged

fix(daytona): fix execute implementation#1756
Eugene Yurtsev (eyurtsev) merged 7 commits intomainfrom
eugene/fix_execute

Conversation

@eyurtsev
Copy link
Collaborator

@eyurtsev Eugene Yurtsev (eyurtsev) commented Mar 10, 2026

Fixes the Daytona sandbox execute path so timeout is actually honored instead of being dropped by the underlying SDK call. The backend now runs commands through a session, polls for completion, and raises a timeout when the requested limit is exceeded, while preserving the existing execute API and collecting stdout/stderr for the final response. This PR also adds regression coverage in the core sandbox tests for the heredoc-based command templates and payload formatting used by BaseSandbox.

Created with Deep Agents CLI.

@github-actions github-actions bot added the internal User is a member of the `langchain-ai` GitHub organization label Mar 10, 2026
@eyurtsev Eugene Yurtsev (eyurtsev) changed the title partners(daytona): fix execute implementation fix(daytona): fix execute implementation Mar 10, 2026
@github-actions github-actions bot added the fix A bug fix (PATCH) label Mar 10, 2026
@github-actions github-actions bot added the size: S 50-199 LOC label Mar 10, 2026
@github-actions github-actions bot added the deepagents Related to the `deepagents` SDK / agent harness label Mar 11, 2026
@eyurtsev Eugene Yurtsev (eyurtsev) marked this pull request as ready for review March 11, 2026 13:23
Copilot AI review requested due to automatic review settings March 11, 2026 13:23
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the Daytona sandbox backend to work around Daytona’s exec timeout propagation issue by executing commands via session APIs and polling for completion, and adjusts Deep Agents’ heredoc command templates to always terminate with a trailing newline (with a unit test to enforce it).

Changes:

  • Replace Daytona process.exec(...) usage with session-based execution + log polling to enforce timeouts.
  • Ensure heredoc-based command templates (write/edit/read) end with a trailing newline.
  • Add a unit test asserting heredoc templates end with \n.

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 4 comments.

File Description
libs/partners/daytona/langchain_daytona/sandbox.py Implements session-based command execution with manual timeout enforcement and log retrieval.
libs/deepagents/deepagents/backends/sandbox.py Adds trailing newline to heredoc terminators in command templates.
libs/deepagents/tests/unit_tests/backends/test_sandbox_backend.py Adds a regression test ensuring heredoc templates end with a newline.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment on lines +101 to +107
payload = json.dumps({"path": "/test/file.txt", "offset": 0, "limit": 100})
payload_b64 = base64.b64encode(payload.encode("utf-8")).decode("ascii")

write_cmd = _WRITE_COMMAND_TEMPLATE.format(payload_b64=payload_b64)
edit_cmd = _EDIT_COMMAND_TEMPLATE.format(payload_b64=payload_b64, replace_all=False)
read_cmd = _READ_COMMAND_TEMPLATE.format(payload_b64=payload_b64)

Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test reuses a read-style payload (path/offset/limit) to format the write/edit templates, even though those templates expect different JSON schemas at runtime. Since the test is only checking for a trailing newline, consider using minimal payloads that match each template’s expected schema to avoid confusing future maintainers.

Suggested change
payload = json.dumps({"path": "/test/file.txt", "offset": 0, "limit": 100})
payload_b64 = base64.b64encode(payload.encode("utf-8")).decode("ascii")
write_cmd = _WRITE_COMMAND_TEMPLATE.format(payload_b64=payload_b64)
edit_cmd = _EDIT_COMMAND_TEMPLATE.format(payload_b64=payload_b64, replace_all=False)
read_cmd = _READ_COMMAND_TEMPLATE.format(payload_b64=payload_b64)
write_payload = json.dumps({"path": "/test/file.txt", "content": "test content"})
write_payload_b64 = base64.b64encode(write_payload.encode("utf-8")).decode("ascii")
edit_payload = json.dumps({"path": "/test/file.txt", "old": "foo", "new": "bar"})
edit_payload_b64 = base64.b64encode(edit_payload.encode("utf-8")).decode("ascii")
read_payload = json.dumps({"path": "/test/file.txt", "offset": 0, "limit": 100})
read_payload_b64 = base64.b64encode(read_payload.encode("utf-8")).decode("ascii")
write_cmd = _WRITE_COMMAND_TEMPLATE.format(payload_b64=write_payload_b64)
edit_cmd = _EDIT_COMMAND_TEMPLATE.format(payload_b64=edit_payload_b64, replace_all=False)
read_cmd = _READ_COMMAND_TEMPLATE.format(payload_b64=read_payload_b64)

Copilot uses AI. Check for mistakes.
Comment on lines +90 to +97
output = logs.stdout or ""

if logs.stderr is not None and logs.stderr.strip():
output += f"\n<stderr>{logs.stderr.strip()}</stderr>"

return ExecuteResponse(
output=result.result,
exit_code=result.exit_code,
output=output,
exit_code=command_result.exit_code,
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logs.stdout is assigned directly to output, but Daytona SDK log fields may be None (and ExecuteResponse.output expects a string). Consider normalizing with output = logs.stdout or "" before concatenation to avoid TypeError and ensure consistent return type. Also, wrapping stderr in <stderr>...</stderr> is inconsistent with other sandbox backends in this repo, which append stderr as plain text; this could break consumers that expect the same formatting across backends.

Copilot uses AI. Check for mistakes.
Comment on lines +87 to +89
finally:
self._sandbox.process.delete_session(session_id)

Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete_session(session_id) in the finally block can raise and mask the original exception (including TimeoutError from the loop). Consider suppressing cleanup errors (or logging them) so that command execution failures/timeouts aren’t replaced by a session-deletion error.

Copilot uses AI. Check for mistakes.
Comment on lines +79 to +82
if command_result.exit_code is not None:
break
# sleep for 100 ms
time.sleep(0.1)
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loop polls get_session_command(...) every 100ms. With the default 30-minute timeout this can generate ~180,000 API calls for a single long-running command, which is likely to be expensive and may hit rate limits. Consider increasing the polling interval and/or using a backoff strategy (or a blocking/wait API if Daytona provides one).

Copilot uses AI. Check for mistakes.
@eyurtsev Eugene Yurtsev (eyurtsev) merged commit 080f3a5 into main Mar 11, 2026
34 checks passed
@eyurtsev Eugene Yurtsev (eyurtsev) deleted the eugene/fix_execute branch March 11, 2026 13:31
james8814 pushed a commit to james8814/deepagents that referenced this pull request Mar 13, 2026
Fixes the Daytona sandbox execute path so `timeout` is actually honored
instead of being dropped by the underlying SDK call. The backend now
runs commands through a session, polls for completion, and raises a
timeout when the requested limit is exceeded, while preserving the
existing execute API and collecting stdout/stderr for the final
response. This PR also adds regression coverage in the core sandbox
tests for the heredoc-based command templates and payload formatting
used by `BaseSandbox`.

Created with [Deep Agents
CLI](https://docs.langchain.com/oss/python/deepagents/cli/overview).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

deepagents Related to the `deepagents` SDK / agent harness fix A bug fix (PATCH) internal User is a member of the `langchain-ai` GitHub organization size: S 50-199 LOC

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants