feat(sdk): add baseten-hosted models to evals workflow #1682

Merged
Eugene Yurtsev (eyurtsev) merged 2 commits into main from eugene/add-baseten-evals on Mar 6, 2026

@eyurtsev
Collaborator

Adds GLM-5 (zai-org/GLM-5) and MiniMax-M2.5 (MiniMaxAI/MiniMax-M2.5) via Baseten's OpenAI-compatible inference endpoint as eval targets, mirroring the existing Ollama Cloud entries for these models. Uses ChatOpenAI with base_url=https://inference.baseten.co/v1 following the same pattern as the NVIDIA special case in conftest.py. Requires a BASETEN_API_KEY repo secret.

Created with Deep Agents CLI.

@github-actions (bot) added labels on Mar 6, 2026: github_actions (PR touching `.github`), deepagents (Related to the `deepagents` SDK / agent harness), internal (User is a member of the `langchain-ai` GitHub organization), feature (New feature/enhancement or request for one)
# Conflicts:
#	.github/scripts/get_eval_models.py
#	.github/workflows/evals.yml
@eyurtsev Eugene Yurtsev (eyurtsev) marked this pull request as ready for review March 6, 2026 21:33
Copilot AI review requested due to automatic review settings March 6, 2026 21:33
Contributor

Copilot AI left a comment

Pull request overview

Adds Baseten-hosted models as additional eval targets by routing baseten: model specs through Baseten’s OpenAI-compatible endpoint in the eval harness, and wiring the required secret + model entries into the GitHub Actions eval workflow.

Changes:

  • Add a baseten: model prefix handler in the eval model fixture that instantiates ChatOpenAI with base_url=https://inference.baseten.co/v1.
  • Extend the evals GitHub Actions workflow matrix/options and environment to include Baseten models and BASETEN_API_KEY.
  • Update the model-matrix generator script to include the Baseten models in both MODELS and SET1.
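For readers without the diff open, the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code merged in conftest.py: the helper names `baseten_chat_kwargs` and `build_baseten_model` are hypothetical, and the assumption that the text after the `baseten:` prefix is passed through as the model ID is mine. Only the prefix, the base URL, `ChatOpenAI`, and the `BASETEN_API_KEY` variable come from the PR description.

```python
import os
from typing import Mapping, Optional

# Values taken from the PR description.
BASETEN_PREFIX = "baseten:"
BASETEN_BASE_URL = "https://inference.baseten.co/v1"


def baseten_chat_kwargs(
    model_spec: str, env: Optional[Mapping[str, str]] = None
) -> Optional[dict]:
    """Return ChatOpenAI keyword arguments for a `baseten:` spec, else None.

    Hypothetical helper: the real fixture may be structured differently.
    """
    if not model_spec.startswith(BASETEN_PREFIX):
        return None
    env = os.environ if env is None else env
    return {
        # Assumption: everything after the prefix is the Baseten model ID.
        "model": model_spec[len(BASETEN_PREFIX):],
        "base_url": BASETEN_BASE_URL,
        "api_key": env.get("BASETEN_API_KEY"),
    }


def build_baseten_model(model_spec: str):
    """Instantiate ChatOpenAI against Baseten's OpenAI-compatible endpoint."""
    kwargs = baseten_chat_kwargs(model_spec)
    if kwargs is None:
        raise ValueError(f"not a baseten: model spec: {model_spec!r}")
    # Imported lazily so the kwargs helper can be used without langchain-openai.
    from langchain_openai import ChatOpenAI

    return ChatOpenAI(**kwargs)
```

Splitting the keyword construction from the instantiation keeps the prefix handling checkable without network access or an API key.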

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| libs/deepagents/tests/evals/conftest.py | Adds Baseten-specific model initialization via ChatOpenAI using Baseten’s OpenAI-compatible base_url. |
| .github/workflows/evals.yml | Adds Baseten models to workflow input options/matrix and injects BASETEN_API_KEY into the eval job environment. |
| .github/scripts/get_eval_models.py | Registers Baseten models in the “all” and “set1” model selections used to build the workflow matrix. |
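The matrix-generator change can be pictured with a sketch like the one below. It is an assumption-laden illustration rather than the contents of get_eval_models.py: the `build_matrix` function, the `{"model": [...]}` output shape, and the exact spec strings are hypothetical, and the real MODELS and SET1 lists contain other providers that are not shown on this page. Only the names MODELS and SET1, the "all" and "set1" selections, the `baseten:` prefix, and the two model IDs come from the PR.

```python
import json

# Hypothetical, Baseten-only excerpts of the two registries named in the PR.
MODELS = [
    "baseten:zai-org/GLM-5",
    "baseten:MiniMaxAI/MiniMax-M2.5",
]
SET1 = [
    "baseten:zai-org/GLM-5",
    "baseten:MiniMaxAI/MiniMax-M2.5",
]


def build_matrix(selection: str) -> str:
    """Return a JSON string usable as a GitHub Actions job matrix.

    Assumption: "all" and "set1" are presets, and any other value is
    treated as a comma-separated list of model specs.
    """
    presets = {"all": MODELS, "set1": SET1}
    if selection in presets:
        models = list(presets[selection])
    else:
        models = [m.strip() for m in selection.split(",") if m.strip()]
    return json.dumps({"model": models})
```

A workflow would typically read such a string through `fromJSON` when defining `strategy.matrix`, with BASETEN_API_KEY supplied to the job from the repo secret.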

@eyurtsev Eugene Yurtsev (eyurtsev) merged commit 04dc92b into main Mar 6, 2026
41 checks passed
@eyurtsev Eugene Yurtsev (eyurtsev) deleted the eugene/add-baseten-evals branch March 6, 2026 21:37
Labels

  • deepagents: Related to the `deepagents` SDK / agent harness
  • feature: New feature/enhancement or request for one
  • github_actions: PR touching `.github`
  • internal: User is a member of the `langchain-ai` GitHub organization

2 participants