Guide for contributing to the Braintrust Python SDK repository.
- Use mise as the source of truth for tools and environment.
- Prefer py/ commands over root make targets when working on the SDK itself.
- Keep changes narrow and run the smallest relevant test session first.
- Do not rely on optional provider packages being installed unless the active nox session installs them.
- py/: main Python package, tests, examples, nox sessions, release build
- integrations/: separate integration packages
- internal/golden/: compatibility and golden projects
- docs/: supporting docs
Important code areas in py/src/braintrust/:
- core SDK modules: top-level package files
- wrappers/integrations: wrappers/
- temporal: contrib/temporal/
- CLI/devserver: cli/, devserver/
- tests: colocated test_*.py
Preferred repo bootstrap:
mise install
make develop

Package-focused setup:
cd py
make install-dev

Install optional provider dependencies only if needed:
cd py
make install-optional

Preferred SDK workflow:
cd py
make lint
make test-core
nox -l

For larger or cross-cutting changes, also run make pylint from py/ before handing work off.
Targeted wrapper/session runs:
cd py
nox -s "test_openai(latest)"
nox -s "test_openai(latest)" -- -k "test_chat_metrics"

The root Makefile exists as a convenience wrapper. The authoritative SDK workflow is in py/Makefile and py/noxfile.py.
py/noxfile.py is the source of truth for compatibility coverage.
Key facts:
- test_core runs without optional vendor packages.
- wrapper coverage is split across dedicated nox sessions by provider/version.
- pylint installs the broad dependency surface before checking files.
- cd py && make pylint runs only pylint; cd py && make lint runs pre-commit hooks first and then pylint.
- test-wheel is a wheel sanity check and requires a built wheel first.
When changing behavior, run the narrowest affected session first, then expand only if needed.
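The narrowest-first ordering can be sketched as a small helper that builds nox invocations. The function name nox_command and the particular ordering of the plan are illustrative assumptions; only the session names and the -k filter come from the commands in this guide.

```python
import shlex


def nox_command(session, pytest_args=()):
    """Build the argv for one nox session (hypothetical helper, not part of the repo)."""
    argv = ["nox", "-s", session]
    if pytest_args:
        # Everything after "--" is forwarded by nox to pytest.
        argv += ["--", *pytest_args]
    return argv


# Illustrative plan: one filtered test, then the whole session, then the core suite.
plan = [
    nox_command("test_openai(latest)", ["-k", "test_chat_metrics"]),
    nox_command("test_openai(latest)"),
    nox_command("test_core"),
]

for argv in plan:
    # shlex.join quotes the parenthesised session name so the line is shell-safe.
    print(shlex.join(argv))
```

Run the printed commands from py/ and stop at the first step that gives enough confidence; later steps are only needed when the change reaches beyond one wrapper.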
VCR cassette directories:
- py/src/braintrust/cassettes/
- py/src/braintrust/wrappers/cassettes/
- py/src/braintrust/devserver/cassettes/
- py/src/braintrust/wrappers/claude_agent_sdk/cassettes/ for Claude Agent SDK subprocess transport recordings
Behavior from py/src/braintrust/conftest.py:
- local default: record_mode="once"
- CI default: record_mode="none"
- wheel-mode skips VCR-marked tests
- test fixtures inject dummy API keys and reset global state
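The local/CI split above could look roughly like the following. This is a minimal sketch, not the code in conftest.py: the function name default_record_mode and the use of a CI environment variable to detect CI are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional


def default_record_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick a VCR record mode (hypothetical helper; CI detection is an assumption)."""
    env = os.environ if env is None else env
    in_ci = env.get("CI", "").lower() in {"1", "true", "yes"}
    # "none": never record; a request with no matching cassette entry fails.
    # "once": record only when the cassette file does not exist yet.
    return "none" if in_ci else "once"
```

With "none" in CI, a test that makes an unrecorded request fails instead of silently hitting the network, which is why new cassettes have to be recorded locally and committed.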
Common commands:
cd py
nox -s "test_openai(latest)"
nox -s "test_openai(latest)" -- --disable-vcr
nox -s "test_openai(latest)" -- --vcr-record=all -k "test_openai_chat_metrics"

Claude Agent SDK does not use VCR because the SDK talks to the bundled claude subprocess over stdin/stdout. Those tests use a transport-level cassette helper instead.
Common Claude Agent SDK cassette commands:
cd py
nox -s "test_claude_agent_sdk(latest)"
BRAINTRUST_CLAUDE_AGENT_SDK_RECORD_MODE=all nox -s "test_claude_agent_sdk(latest)"
BRAINTRUST_CLAUDE_AGENT_SDK_RECORD_MODE=all nox -s "test_claude_agent_sdk(latest)" -- -k "test_calculator_with_multiple_operations"

Only re-record HTTP or subprocess cassettes when the behavior change is intentional. If in doubt, ask the user.
Build from py/:
cd py
make build

Important caveat:
- py/scripts/template-version.py rewrites py/src/braintrust/version.py during build.
- py/Makefile restores that file afterward with git checkout.
Avoid editing py/src/braintrust/version.py while also running build commands: the git checkout that restores the file after the build would discard any uncommitted edits to it.
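One way to guard against losing work is to check for uncommitted edits to version.py before building. The sketch below is a hypothetical pre-build check, not something the repo provides; it parses the text produced by git status --porcelain, whose paths are always relative to the repository root.

```python
VERSION_FILE = "py/src/braintrust/version.py"


def modified_paths(porcelain_output: str) -> set:
    """Return the paths listed in `git status --porcelain` (v1) output."""
    paths = set()
    for line in porcelain_output.splitlines():
        if len(line) < 4:
            continue
        # Each line is two status characters, one space, then the path.
        path = line[3:]
        if " -> " in path:
            # Renames are shown as "old -> new"; keep the new name.
            path = path.split(" -> ", 1)[1]
        paths.add(path.strip('"'))
    return paths


def safe_to_build(porcelain_output: str) -> bool:
    """True when version.py has no uncommitted changes (hypothetical check)."""
    return VERSION_FILE not in modified_paths(porcelain_output)
```

In practice the input would come from something like subprocess.run(["git", "status", "--porcelain"], capture_output=True, text=True).stdout, and make build would only be run when safe_to_build returns True.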
- Keep tests near the code they cover.
- Reuse existing fixtures and cassette patterns.
- If a change affects examples or integrations, update the nearest example or focused test.
- For CLI/devserver changes, consider whether wheel-mode behavior also needs coverage.
- Do not add from __future__ import annotations unless it is absolutely required (e.g., a genuine forward reference that cannot be resolved any other way). This import changes annotation evaluation semantics at runtime and can silently break get_type_hints(), Pydantic models, and other runtime introspection. Prefer quoted string literals ("MyClass") or TYPE_CHECKING guards for forward references instead.
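A minimal sketch of the two preferred alternatives follows. The Node class and as_text function are made-up examples, not SDK code.

```python
from typing import TYPE_CHECKING, Optional, get_type_hints

if TYPE_CHECKING:
    # Imported for type checkers only; never executed at runtime.
    from decimal import Decimal


class Node:
    """Illustrative class with a self-referential annotation."""

    def __init__(self, value: int, parent: Optional["Node"] = None) -> None:
        self.value = value
        self.parent = parent

    def child(self, value: int) -> "Node":
        # The quoted return type is a forward reference to the enclosing class.
        return Node(value, parent=self)


def as_text(amount: "Decimal") -> str:
    # Quoted because Decimal exists only under TYPE_CHECKING.
    return str(amount)


# Only the quoted names are deferred; localns is passed explicitly so the
# lookup also works when this snippet is pasted into a function scope.
hints = get_type_hints(Node.child, localns={"Node": Node})
```

Here hints maps value to int and return to the Node class itself. Calling get_type_hints(as_text) would raise NameError, because Decimal is not importable at runtime; that is the trade-off of a TYPE_CHECKING guard, so keep it for types that runtime introspection never needs.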