|
1 | 1 | --- |
2 | 2 | name: sdk-integrations |
3 | | -description: Create or update a Braintrust Python SDK integration using the integrations API. Use when asked to add an integration, update an existing integration, add or update patchers, update auto_instrument, add integration tests, or work in py/src/braintrust/integrations/. |
| 3 | +description: Create or update Braintrust Python SDK integrations built on the integrations API. Use for work in `py/src/braintrust/integrations/`, including new providers, patchers, tracing, `auto_instrument()` updates, integration exports, and integration tests. |
4 | 4 | --- |
5 | 5 |
|
6 | 6 | # SDK Integrations |
7 | 7 |
|
8 | | -SDK integrations define how Braintrust discovers a provider, patches it safely, and keeps provider-specific tracing local to that integration. Read the existing integration closest to your task before writing a new one. If there is no closer example, `py/src/braintrust/integrations/anthropic/` is a useful reference implementation. |
| 8 | +Use this skill for integrations API work under `py/src/braintrust/integrations/`. |
9 | 9 |
|
10 | | -## Workflow |
11 | | - |
12 | | -1. Read the shared integration primitives and the closest provider example. |
13 | | -2. Choose the task shape: new provider, existing provider update, or `auto_instrument()` update. |
14 | | -3. Implement the smallest integration, patcher, tracing, and export changes needed. |
15 | | -4. Add or update VCR-backed integration tests and only re-record cassettes when behavior changed intentionally. |
16 | | -5. Run the narrowest provider session first, then expand to shared validation only if the change touched shared code. |
| 10 | +Start from the nearest existing provider instead of designing from scratch: |
17 | 11 |
|
18 | | -## Commands |
| 12 | +- ADK (`py/src/braintrust/integrations/adk/`) is the best reference for direct method patching, `target_module`, `CompositeFunctionWrapperPatcher`, and public `wrap_*()` helpers. |
| 13 | +- Anthropic (`py/src/braintrust/integrations/anthropic/`) is the best reference for constructor patching with `FunctionWrapperPatcher`. |
19 | 14 |
|
20 | | -```bash |
21 | | -cd py && nox -s "test_<provider>(latest)" |
22 | | -cd py && nox -s "test_<provider>(latest)" -- -k "test_name" |
23 | | -cd py && nox -s "test_<provider>(latest)" -- --vcr-record=all -k "test_name" |
24 | | -cd py && make test-core |
25 | | -cd py && make lint |
26 | | -``` |
| 15 | +## Workflow |
27 | 16 |
|
28 | | -## Creating or Updating an Integration |
| 17 | +1. Read the shared primitives and the nearest provider example. |
| 18 | +2. Decide whether the task is a new provider, an existing provider update, or an `auto_instrument()` change. |
| 19 | +3. Change only the affected integration, patchers, tracing, exports, and tests. |
| 20 | +4. Update tests and cassettes only where behavior changed intentionally. |
| 21 | +5. Run the narrowest provider session first, then expand only if shared code changed. |
29 | 22 |
|
30 | | -### 1. Read the nearest existing implementation |
| 23 | +## Read First |
31 | 24 |
|
32 | | -Always inspect these first: |
| 25 | +Always read: |
33 | 26 |
|
34 | 27 | - `py/src/braintrust/integrations/base.py` |
35 | | -- `py/src/braintrust/integrations/runtime.py` |
36 | 28 | - `py/src/braintrust/integrations/versioning.py` |
37 | | -- `py/src/braintrust/integrations/config.py` |
38 | | - |
39 | | -Relevant example implementation: |
40 | | - |
41 | | -- `py/src/braintrust/integrations/anthropic/` |
42 | | - |
43 | | -Read these additional files only when the task needs them: |
44 | | - |
45 | | -- changing `auto_instrument()`: `py/src/braintrust/auto.py` and `py/src/braintrust/auto_test_scripts/test_auto_anthropic_patch_config.py` |
46 | | -- adding or updating VCR tests: `py/src/braintrust/conftest.py` and `py/src/braintrust/integrations/anthropic/test_anthropic.py` |
47 | | - |
48 | | -Then choose the path that matches the task: |
49 | 29 |
|
50 | | -- new provider: create `py/src/braintrust/integrations/<provider>/` |
51 | | -- existing provider: read the provider package first and change only the affected patchers, tracing, tests, or exports |
52 | | -- `auto_instrument()` only: keep the integration package unchanged unless the option shape or patcher surface also changed |
| 30 | +Read when relevant: |
53 | 31 |
|
54 | | -### 2. Create or extend the integration module |
| 32 | +- `py/src/braintrust/auto.py` for `auto_instrument()` work |
| 33 | +- `py/src/braintrust/conftest.py` for VCR behavior |
| 34 | +- `py/src/braintrust/integrations/adk/test_adk.py` for integration test patterns |
| 35 | +- `py/src/braintrust/integrations/auto_test_scripts/` for subprocess auto-instrument tests |
55 | 36 |
|
56 | | -For a new provider, create a package under `py/src/braintrust/integrations/<provider>/`. |
| 37 | +## Package Layout |
57 | 38 |
|
58 | | -For an existing provider, keep the module layout unless the current structure is actively causing problems. |
| 39 | +Create new providers under `py/src/braintrust/integrations/<provider>/`. Keep the existing layout for provider updates unless the current structure is the problem. |
59 | 40 |
|
60 | 41 | Typical files: |
61 | 42 |
|
62 | | -- `__init__.py`: public exports for the integration type and any public helpers |
63 | | -- `integration.py`: the `BaseIntegration` subclass, patcher registration, and high-level orchestration |
64 | | -- `patchers.py`: one patcher per patch target, with version gating and existence checks close to the patch |
65 | | -- `tracing.py`: provider-specific span creation, metadata extraction, stream handling, and output normalization |
66 | | -- `test_<provider>.py`: integration tests for `wrap(...)`, `setup()`, sync/async behavior, streaming, and error handling |
67 | | -- `cassettes/`: recorded provider traffic for VCR-backed integration tests when the provider uses HTTP |
| 43 | +- `__init__.py`: export the integration class, `setup_<provider>()`, and public `wrap_*()` helpers |
| 44 | +- `integration.py`: define the `BaseIntegration` subclass and register patchers |
| 45 | +- `patchers.py`: define patchers and `wrap_*()` helpers |
| 46 | +- `tracing.py`: keep provider-specific tracing, stream handling, and normalization |
| 47 | +- `test_<provider>.py`: keep provider behavior tests next to the integration |
| 48 | +- `cassettes/`: keep VCR recordings next to the integration tests when the provider uses HTTP |
68 | 49 |
|
69 | | -### 3. Define the integration class |
| 50 | +## Integration Rules |
70 | 51 |
|
71 | | -Implement a `BaseIntegration` subclass in `integration.py`. |
72 | | - |
73 | | -Set: |
| 52 | +Keep `integration.py` thin. Set: |
74 | 53 |
|
75 | 54 | - `name` |
76 | 55 | - `import_names` |
77 | | -- `min_version` and `max_version` only when needed |
78 | 56 | - `patchers` |
| 57 | +- `min_version` and `max_version` only when needed |
79 | 58 |
|
80 | | -Keep the class focused on orchestration. Provider-specific tracing logic should stay in `tracing.py`. |
| 59 | +Keep provider behavior in the provider package, not in shared integration code. Put span creation, metadata extraction, stream aggregation, error logging, and output normalization in `tracing.py`. |
81 | 60 |
|
82 | | -### 4. Add one patcher per coherent patch target |
| 61 | +Preserve provider behavior. Do not let tracing-only code break the provider call. |
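One way to read this rule: the provider call runs outside any tracing guard, and only the tracing bookkeeping is wrapped in `try`/`except`. A minimal sketch, assuming a hypothetical `record_span` callback (not a Braintrust API):

```python
import functools
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)


def traced(record_span: Callable[[str, Any], None]) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Wrap a provider call so tracing failures never reach the caller.

    `record_span` is a hypothetical tracing hook used only for illustration.
    """

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            # The provider call is unguarded: its return value and its
            # exceptions reach the caller unchanged.
            result = fn(*args, **kwargs)
            try:
                record_span(fn.__name__, result)
            except Exception:
                # Tracing-only failure: log it and keep the provider result.
                logger.debug("tracing failed for %s", fn.__name__, exc_info=True)
            return result

        return wrapper

    return decorator
```

Provider exceptions still propagate because the call itself sits outside the `try` block.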
83 | 62 |
|
84 | | -Put patchers in `patchers.py`. |
| 63 | +## Patcher Rules |
85 | 64 |
|
86 | | -Use `FunctionWrapperPatcher` when patching a single import path with `wrapt.wrap_function_wrapper`. Good examples: |
| 65 | +Create one patcher per coherent patch target. If targets are unrelated, split them. |
87 | 66 |
|
88 | | -- constructor patchers like `ProviderClient.__init__` |
89 | | -- single API surfaces like `client.responses.create` |
90 | | -- one sync and one async constructor patcher instead of one patcher doing both |
| 67 | +Use `FunctionWrapperPatcher` for one import path or one constructor/method surface, for example: |
91 | 68 |
|
92 | | -Keep patchers narrow. If you need to patch multiple unrelated targets, create multiple patchers rather than one large patcher. |
| 69 | +- `ProviderClient.__init__` |
| 70 | +- `client.responses.create` |
93 | 71 |
|
94 | | -Patchers are responsible for: |
| 72 | +Use `CompositeFunctionWrapperPatcher` when several closely related targets should appear as one patcher, for example: |
95 | 73 |
|
96 | | -- stable patcher ids via `name` |
97 | | -- optional version gating |
98 | | -- existence checks |
99 | | -- idempotence through the base patcher marker |
| 74 | +- sync and async variants of the same method |
| 75 | +- the same function patched across multiple modules |
100 | 76 |
|
101 | | -### 5. Keep tracing provider-local |
| 77 | +Set `target_module` when the patch target lives outside the module named by `import_names`, especially for optional or deep submodules. Failed `target_module` imports should cause the patcher to skip cleanly through `applies()`. |
102 | 78 |
|
103 | | -Put span creation, metadata extraction, stream aggregation, error logging, and output normalization in `tracing.py`. |
| 79 | +Expose manual wrapping helpers through `wrap_target()`: |
104 | 80 |
|
105 | | -This layer should: |
| 81 | +```python |
| 82 | +def wrap_agent(Agent: Any) -> Any: |
| 83 | +    return AgentRunAsyncPatcher.wrap_target(Agent)
| 84 | +``` |
106 | 85 |
|
107 | | -- preserve provider behavior |
108 | | -- support sync, async, and streaming paths as needed |
109 | | -- avoid raising from tracing-only code when that would break the provider call |
| 86 | +Use lower `priority` values only when ordering matters, such as context propagation before tracing. |
110 | 87 |
|
111 | | -If the provider has complex streaming internals, keep that logic local instead of forcing it into shared abstractions. |
| 88 | +Patchers must provide: |
112 | 89 |
|
113 | | -### 6. Wire public exports |
| 90 | +- stable `name` values |
| 91 | +- version gating only when needed |
| 92 | +- existence checks |
| 93 | +- idempotence through the base patcher marker |
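The existence check and idempotence requirements can be pictured with a generic marker-based patch. This is a sketch under stated assumptions: `patch_once` and the marker name are hypothetical, and the real base patcher marker may work differently.

```python
import functools
from typing import Any, Callable

# Hypothetical marker attribute; the real base patcher marker may differ.
_MARKER = "__example_patched__"


def patch_once(
    owner: Any,
    attr: str,
    wrapper_factory: Callable[[Callable[..., Any]], Callable[..., Any]],
) -> bool:
    """Replace `owner.attr` with a wrapped version at most once.

    Returns True if the patch was applied and False if it was skipped.
    """
    original = getattr(owner, attr, None)
    if original is None:
        # Existence check: the target is missing in this provider version.
        return False
    if getattr(original, _MARKER, False):
        # Idempotence: an earlier call already wrapped this target.
        return False
    wrapped = functools.wraps(original)(wrapper_factory(original))
    setattr(wrapped, _MARKER, True)
    setattr(owner, attr, wrapped)
    return True
```

Calling `patch_once` twice on the same target applies the wrapper once and returns `False` the second time.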
114 | 94 |
|
115 | | -Update public exports only as needed: |
| 95 | +Use `IntegrationPatchConfig` only when users need patcher-level selection. Let `BaseIntegration.resolve_patchers()` reject unknown patcher ids instead of silently ignoring them. |
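Rejecting unknown ids rather than ignoring them might look like the sketch below. `select_patchers` is a hypothetical stand-in; it does not reproduce the signature of `BaseIntegration.resolve_patchers()` or the shape of `IntegrationPatchConfig`.

```python
from typing import Iterable, List, Optional


def select_patchers(known: Iterable[str], requested: Optional[Iterable[str]] = None) -> List[str]:
    """Return the patcher ids to apply, rejecting ids that are not registered."""
    known_ids = list(known)
    if requested is None:
        # No selection given: apply every registered patcher.
        return known_ids
    requested_ids = set(requested)
    unknown = sorted(requested_ids - set(known_ids))
    if unknown:
        # Fail loudly so a typo in a patcher id is not silently ignored.
        raise ValueError(f"Unknown patcher ids: {', '.join(unknown)}")
    # Preserve registration order rather than request order.
    return [pid for pid in known_ids if pid in requested_ids]
```

A test for patcher selection would then cover the enabled case, the disabled case, and the unknown-id error.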
116 | 96 |
|
117 | | -- `py/src/braintrust/integrations/__init__.py` |
118 | | -- `py/src/braintrust/__init__.py` |
| 97 | +## Patching Patterns |
119 | 98 |
|
120 | | -### 7. Update auto_instrument only if this integration should be auto-patched |
| 99 | +Use constructor patching when the goal is to instrument future clients created by the provider SDK. Patch the constructor, then attach traced surfaces after the real constructor runs. |
121 | 100 |
|
122 | | -If the provider belongs in `braintrust.auto.auto_instrument()`, add a branch in `py/src/braintrust/auto.py`. |
| 101 | +Use direct method patching with `target_module` when the provider exposes a flatter API and there is no useful constructor patch point. |
123 | 102 |
|
124 | | -Match the current pattern: |
| 103 | +Keep public `wrap_*()` helpers in `patchers.py` and export them from the integration package. |
125 | 104 |
|
126 | | -- plain `bool` options for simple on/off integrations |
127 | | -- `IntegrationPatchConfig` only when users need patcher-level selection |
| 105 | +## Versioning |
128 | 106 |
|
129 | | -## Tests |
| 107 | +Prefer feature detection first and version checks second. |
130 | 108 |
|
131 | | -Keep integration tests with the integration package. |
| 109 | +Use: |
132 | 110 |
|
133 | | -Provider behavior tests should use `@pytest.mark.vcr` whenever the provider uses network calls. Avoid mocks and fakes. |
| 111 | +- `detect_module_version(...)` |
| 112 | +- `version_in_range(...)` |
| 113 | +- `version_matches_spec(...)` |
134 | 114 |
|
135 | | -Cover: |
| 115 | +Do not add `packaging` just for integration routing. |
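"Feature detection first, version checks second" can be sketched with the standard library alone. The helpers below are hypothetical stand-ins written for illustration; they are not the signatures of `detect_module_version(...)`, `version_in_range(...)`, or `version_matches_spec(...)`.

```python
import importlib
from importlib import metadata
from typing import Optional, Tuple


def has_feature(module_name: str, attr_path: str) -> bool:
    """Feature detection: True if the module imports and exposes `attr_path`."""
    try:
        obj = importlib.import_module(module_name)
    except ImportError:
        return False
    for part in attr_path.split("."):
        if not hasattr(obj, part):
            return False
        obj = getattr(obj, part)
    return True


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse the leading numeric components of a version string."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_version(dist_name: str) -> Optional[Tuple[int, ...]]:
    """Version check fallback: installed version as a tuple, or None if absent."""
    try:
        return parse_version(metadata.version(dist_name))
    except metadata.PackageNotFoundError:
        return None
```

A patcher would consult `has_feature(...)` first and fall back to comparing `installed_version(...)` tuples only when the feature check cannot distinguish behaviors.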
136 | 116 |
|
137 | | -- direct `wrap(...)` behavior |
138 | | -- `setup()` patching new clients |
139 | | -- sync behavior |
140 | | -- async behavior |
141 | | -- streaming behavior |
142 | | -- idempotence |
143 | | -- failure/error logging |
144 | | -- patcher selection if using `IntegrationPatchConfig` |
| 117 | +## `auto_instrument()` |
145 | 118 |
|
146 | | -Preferred locations: |
| 119 | +Update `py/src/braintrust/auto.py` only if the integration should be auto-patched. |
147 | 120 |
|
148 | | -- provider behavior tests: `py/src/braintrust/integrations/<provider>/test_<provider>.py` |
149 | | -- version helper tests: `py/src/braintrust/integrations/test_versioning.py` |
150 | | -- auto-instrument subprocess tests: `py/src/braintrust/auto_test_scripts/` |
| 121 | +Match the existing option shape: |
151 | 122 |
|
152 | | -If the provider uses VCR, keep cassettes next to the integration test file under `py/src/braintrust/integrations/<provider>/cassettes/`. |
| 123 | +- use plain `bool` for simple on/off integrations that do not use the integrations API |
| 124 | +- use `InstrumentOption` for integrations API providers that support `IntegrationPatchConfig` |
153 | 125 |
|
154 | | -Only re-record cassettes when the behavior change is intentional. |
| 126 | +For integrations API providers, use `_normalize_instrument_option()` and `_instrument_integration(...)` instead of adding a custom `_instrument_*` function: |
155 | 127 |
|
156 | | -Use mocks or fakes only for cases that are hard to drive through recorded provider traffic, such as narrowly scoped error injection, local version-routing logic, or patcher existence checks. |
| 128 | +```python |
| 129 | +enabled, config = _normalize_instrument_option("provider", provider) |
| 130 | +if enabled: |
| 131 | +    results["provider"] = _instrument_integration(ProviderIntegration, patch_config=config)
| 132 | +``` |
157 | 133 |
|
158 | | -## Patterns |
| 134 | +Add the integration import near the other integration imports in `auto.py`. |
159 | 135 |
|
160 | | -### Constructor patching |
| 136 | +## Tests |
161 | 137 |
|
162 | | -If instrumenting future clients created by the SDK is the goal, patch constructors and attach traced surfaces after the real constructor runs. Anthropic is an example of this pattern. |
| 138 | +Keep integration tests in the provider package. |
163 | 139 |
|
164 | | -### Patcher selection |
| 140 | +Use `@pytest.mark.vcr` for real provider network behavior. Prefer recorded provider traffic over mocks or fakes. Use mocks or fakes only for cases that are hard to drive through recordings, such as: |
165 | 141 |
|
166 | | -Use `IntegrationPatchConfig` only when users benefit from enabling or disabling specific patchers. Validate unknown patcher ids through `BaseIntegration.resolve_patchers()` instead of silently ignoring them. |
| 142 | +- narrow error injection |
| 143 | +- local version-routing logic |
| 144 | +- patcher existence checks |
167 | 145 |
|
168 | | -### Versioning |
| 146 | +Cover the surfaces that changed: |
169 | 147 |
|
170 | | -Prefer feature detection first and version checks second. |
| 148 | +- direct `wrap(...)` behavior |
| 149 | +- `setup()` patching new clients |
| 150 | +- sync behavior |
| 151 | +- async behavior |
| 152 | +- streaming behavior |
| 153 | +- idempotence |
| 154 | +- failure and error logging |
| 155 | +- patcher selection when using `IntegrationPatchConfig` |
171 | 156 |
|
172 | | -Use: |
| 157 | +Keep VCR cassettes in `py/src/braintrust/integrations/<provider>/cassettes/`. Re-record them only for intentional behavior changes. |
173 | 158 |
|
174 | | -- `detect_module_version(...)` |
175 | | -- `version_in_range(...)` |
176 | | -- `version_matches_spec(...)` |
| 159 | +## Commands |
177 | 160 |
|
178 | | -Do not add `packaging` just for integration routing. |
| 161 | +```bash |
| 162 | +cd py && nox -s "test_<provider>(latest)" |
| 163 | +cd py && nox -s "test_<provider>(latest)" -- -k "test_name" |
| 164 | +cd py && nox -s "test_<provider>(latest)" -- --vcr-record=all -k "test_name" |
| 165 | +cd py && make test-core |
| 166 | +cd py && make lint |
| 167 | +``` |
179 | 168 |
|
180 | 169 | ## Validation |
181 | 170 |
|
182 | 171 | - Run the narrowest provider session first. |
183 | | -- Run `cd py && make test-core` if you changed shared integration code. |
| 172 | +- Run `cd py && make test-core` if shared integration code changed. |
184 | 173 | - Run `cd py && make lint` before handing off broader integration changes. |
185 | | -- If you changed `auto_instrument()`, run the relevant subprocess auto-instrument tests. |
186 | | - |
187 | | -## Done When |
188 | | - |
189 | | -- the provider package contains only the integration, patcher, tracing, export, and test changes required by the task |
190 | | -- provider behavior tests use VCR unless recorded traffic cannot cover the behavior |
191 | | -- cassette changes are present only when provider behavior changed intentionally |
192 | | -- the narrowest affected provider session passes |
193 | | -- `cd py && make test-core` has been run if shared integration code changed |
194 | | -- `cd py && make lint` has been run before handoff |
| 174 | +- Run the relevant auto-instrument subprocess tests if `auto_instrument()` changed. |
195 | 175 |
|
196 | | -## Common Pitfalls |
| 176 | +## Pitfalls |
197 | 177 |
|
198 | | -- Leaving provider behavior in `BaseIntegration` instead of the provider package. |
199 | | -- Combining multiple unrelated patch targets into one patcher. |
| 178 | +- Moving provider-specific behavior into shared integration code. |
| 179 | +- Combining unrelated targets into one patcher. |
200 | 180 | - Forgetting async or streaming coverage. |
201 | | -- Defaulting to mocks or fakes when the provider flow can be covered with VCR. |
202 | | -- Moving tests but not moving their cassettes. |
203 | 181 | - Adding patcher selection without tests for enabled and disabled cases. |
204 | | -- Editing `auto_instrument()` in a way that implies a registry exists when it does not. |
| 182 | +- Re-recording cassettes when behavior did not intentionally change. |
| 183 | +- Using `_normalize_bool_option()` for an integrations API provider. |
| 184 | +- Adding a custom `_instrument_*` helper where `_instrument_integration()` already fits. |
| 185 | +- Forgetting `target_module` for deep or optional submodule patch targets. |