Mixed Python/JS evals #71

Draft
Olmo Maldonado (ibolmo) wants to merge 2 commits into main from mixed-evals

@ibolmo
Contributor

Adds support for running Python and JavaScript eval files together in a single bt eval invocation. Previously, mixing .py and .ts/.js files in one command would fail with a "mixed eval file types are not supported yet" error.

What changed

Language inference & parallel execution (src/eval.rs)

  • Deprecated the --language / BT_EVAL_LANGUAGE flag: it is now hidden and emits a deprecation warning, and the language is always inferred from file extensions.
  • Replaced build_eval_plan (single plan) with partition_files_by_language + build_eval_plans (one plan per language), which splits the input file list into JS and Python partitions.
  • Plans are now executed concurrently via future::try_join_all; the combined exit status prefers the first failure.
  • Split the single --runner / BT_EVAL_RUNNER flag into two language-scoped flags:
    • --runner-js / BT_EVAL_JS_RUNNER (JS/TS runner, e.g. tsx, bun, deno)
    • --runner-python / BT_EVAL_PYTHON_RUNNER (Python runner, e.g. python3, uv run python)
    • The old BT_EVAL_RUNNER env var is kept as a backward-compatible alias for --runner-js.
  • Removed the ad-hoc std::env::var("BT_EVAL_PYTHON_RUNNER") lookup from build_python_command; the runner is now always passed through the CLI layer.
  • Dev-server (--dev) path updated to carry separate js_runner_override / python_runner_override fields instead of a single language_override + runner_override.
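
For reviewers, the partitioning and exit-status rules above can be sketched roughly as follows. This is a minimal illustration, not the code in src/eval.rs: `EvalLanguage`, `infer_language`, `partition_by_language`, and `combine_exit_codes` are hypothetical names, the extension list is an assumption, and the exit status is modeled as a plain `i32`.

```rust
use std::path::{Path, PathBuf};

/// Languages that can be inferred from a file extension (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EvalLanguage {
    JavaScript,
    Python,
}

/// Infer the language from the final extension.
/// The set of recognized extensions here is an assumption.
fn infer_language(path: &Path) -> Option<EvalLanguage> {
    match path.extension()?.to_str()? {
        "ts" | "tsx" | "js" | "jsx" | "mjs" | "cjs" => Some(EvalLanguage::JavaScript),
        "py" => Some(EvalLanguage::Python),
        _ => None,
    }
}

/// Split the inputs into (js, python) partitions, preserving input order.
/// A file with an unrecognized extension is reported as an error.
fn partition_by_language(files: &[PathBuf]) -> Result<(Vec<PathBuf>, Vec<PathBuf>), String> {
    let mut js = Vec::new();
    let mut py = Vec::new();
    for file in files {
        match infer_language(file) {
            Some(EvalLanguage::JavaScript) => js.push(file.clone()),
            Some(EvalLanguage::Python) => py.push(file.clone()),
            None => return Err(format!("cannot infer language for {}", file.display())),
        }
    }
    Ok((js, py))
}

/// Combine per-plan exit codes: the first non-zero code wins, otherwise 0.
fn combine_exit_codes(codes: &[i32]) -> i32 {
    codes.iter().copied().find(|&c| c != 0).unwrap_or(0)
}
```

In this sketch each non-empty partition would become one plan, and the codes returned by the concurrently executed plans would be passed to `combine_exit_codes` in plan order.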

Test fixtures (tests/eval_fixtures.rs, tests/evals/)

  • fixture.json schema updated: the runtime/runner/runners fields are replaced by runners_js and runners_python arrays.
  • Test runner now discovers all category subdirectories (not just hard-coded js/ and py/) so the new mixed/ category is picked up automatically.
  • Mixed fixtures get a Cartesian-product runner matrix (js_runner × py_runner), exercising every combination.
  • All existing JS and Python fixture configs updated to the new schema.
  • New tests/evals/mixed/mixed-py-js/ fixture with a paired .eval.ts + eval_basic.py that both write to the same project, verified against tsx, bun, and deno JS runners.
  • New unit tests covering partition_files_by_language, build_eval_plans, and the BT_EVAL_RUNNER → BT_EVAL_JS_RUNNER backward-compat env alias.
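
The Cartesian-product runner matrix for mixed fixtures can be pictured with a small sketch. `runner_matrix` is a hypothetical helper, not the function in tests/eval_fixtures.rs, and it assumes the two arrays have already been read from runners_js and runners_python.

```rust
/// Build every (js_runner, py_runner) pair for a mixed fixture.
/// Pairs are ordered by JS runner first, then by Python runner.
fn runner_matrix(js_runners: &[&str], py_runners: &[&str]) -> Vec<(String, String)> {
    js_runners
        .iter()
        .flat_map(|js| {
            // For each JS runner, pair it with every Python runner.
            py_runners
                .iter()
                .map(move |py| (js.to_string(), py.to_string()))
        })
        .collect()
}
```

With three JS runners and two Python runners this yields six combinations; an empty list on either side yields no combinations at all, so a real test harness would likely need a fallback for that case.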

Replace explicit `--language` and `--runner` flags with automatic
detection based on file extensions. Introduce separate `--runner-js`
and `--runner-python` flags for explicit runner overrides. Hide the
deprecated `--language` flag and maintain backward compatibility for
the `BT_EVAL_RUNNER` environment variable via `--runner-js-legacy-env`.
Update test fixtures to use `runners_js` and `runners_python` instead
of `runtime` and `runners`.
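
One way to picture the backward-compatible alias is a pure resolution function over the three possible sources of a JS runner. This is a sketch under assumptions: `resolve_js_runner` and `non_empty` are hypothetical names, and the precedence shown (explicit flag, then BT_EVAL_JS_RUNNER, then the legacy BT_EVAL_RUNNER) is assumed rather than confirmed by this PR.

```rust
/// Treat missing, empty, or whitespace-only values as "not set".
fn non_empty(value: Option<&str>) -> Option<&str> {
    value.map(str::trim).filter(|s| !s.is_empty())
}

/// Resolve the JS runner override.
/// Assumed precedence: --runner-js value, then BT_EVAL_JS_RUNNER,
/// then the legacy BT_EVAL_RUNNER alias.
fn resolve_js_runner(
    flag: Option<&str>,
    js_env: Option<&str>,
    legacy_env: Option<&str>,
) -> Option<String> {
    non_empty(flag)
        .or(non_empty(js_env))
        .or(non_empty(legacy_env))
        .map(str::to_owned)
}
```

Passing the environment values in as arguments, instead of reading them inside the function, keeps the sketch easy to unit test and mirrors the PR's stated goal of routing runner selection through the CLI layer.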
@github-actions

Latest downloadable build artifacts for this PR commit 8fe84573011c:

Available artifact names
  • artifacts-build-global
  • artifacts-build-local-x86_64-apple-darwin
  • artifacts-build-local-x86_64-pc-windows-msvc
  • artifacts-build-local-x86_64-unknown-linux-musl
  • artifacts-build-local-x86_64-unknown-linux-gnu
  • artifacts-build-local-aarch64-unknown-linux-musl
  • artifacts-build-local-aarch64-unknown-linux-gnu
  • artifacts-build-local-aarch64-apple-darwin
  • artifacts-plan-dist-manifest
  • cargo-dist-cache
