skills(execution-compute): drop bundled compute.py, document the CLI surface by jamesbroadhead · Pull Request #90 · databricks/databricks-agent-skills

jamesbroadhead · 2026-05-26T08:52:08Z

Summary

Per Lennart's audit on #73, item #7: "for getting this out of experimental I think we need to just rely on the Databricks CLI. And if that somehow has gaps then we should fix them!"

Doing this now rather than waiting for stable promotion — the CLI already covers everything scripts/compute.py wrapped, just verbosely for the cluster-code-execution path.

Removed

experimental/databricks-execution-compute/scripts/compute.py (743 lines, three subcommands).

Replacement mapping

`compute.py` subcommand	CLI replacement
`list-compute --resource clusters`	`databricks clusters list --output json`
`list-compute --resource node-types`	`databricks clusters list-node-types`
`list-compute --resource spark-versions`	`databricks clusters spark-versions`
`list-compute --auto-select`	`databricks clusters list --cluster-sources UI,API --output json \| jq '.[] \| select(.state=="RUNNING")'`
`manage-cluster --action create/start/restart/terminate/delete`	`databricks clusters create/start/restart/delete/permanent-delete`
`execute-code --compute-type serverless` (upload+submit+wait+cleanup)	The existing `databricks workspace import` + `databricks jobs submit` + `databricks jobs get-run` + `databricks jobs get-run-output` flow that SKILL.md already documents. The "Convenience wrapper" section pointing at `compute.py` is removed.
`execute-code --compute-type cluster` (code on a running cluster with persisted context)	Multi-step `databricks api post /api/1.2/contexts/create` → `/api/1.2/commands/execute` → poll `/api/1.2/commands/status` → optional `/api/1.2/contexts/destroy`. Full recipe (create context, submit, poll, fetch results, follow-up commands reusing `contextId`, `%pip install`, language switching) added to `references/3-interactive-cluster.md`.

Other changes

Removed the <SKILL_ROOT> path-convention note from SKILL.md (was only there to reference scripts/compute.py paths).
"CLI Commands" table replaced with a more useful "CLI Command Map" covering upload, submit, run state, run output, cluster lifecycle, code-execution-on-cluster, and warehouses.
Common-issues table for serverless updated to reference the tasks[].timeout_seconds field on submit JSON instead of --timeout on the wrapper.

Test plan

python3 scripts/skills.py generate clean.
python3 scripts/skills.py validate passes.
CI green.
Reviewer sanity-check of the cluster-execution recipe in references/3-interactive-cluster.md.

This pull request and its description were written by Claude.

…surface Per Lennart's audit on #73 (item #7): "for getting this out of experimental I think we need to just rely on the Databricks CLI." Going ahead and doing this now rather than waiting for stable promotion — the CLI already covers everything compute.py wrapped, just verbosely for the cluster-code-execution path. Removed: experimental/databricks-execution-compute/scripts/compute.py (743 lines, three subcommands). Replacement mapping: - `compute.py list-compute` (clusters/node-types/spark-versions) → `databricks clusters list/list-node-types/spark-versions` - `compute.py manage-cluster --action ...` → `databricks clusters create/start/restart/delete/permanent-delete/resize` - `compute.py execute-code --compute-type serverless` (upload+submit+ wait+cleanup) → existing `databricks workspace import` + `databricks jobs submit [--no-wait]` + `databricks jobs get-run` + `databricks jobs get-run-output` flow (already documented in SKILL.md / 2-serverless-job.md; the "Convenience wrapper" section that pointed at compute.py is removed). - `compute.py execute-code --compute-type cluster` (code execution on a running cluster with persisted context) → multi-step `databricks api post /api/1.2/contexts/create` → `/api/1.2/commands/execute` → poll `/api/1.2/commands/status` → optional `/api/1.2/contexts/destroy`. The full recipe (create context, submit, poll, fetch results, follow-up commands reusing contextId, %pip install, language switching) is in references/3-interactive-cluster.md. Side effects: - Removed the `<SKILL_ROOT>` path-convention note from SKILL.md (was only there to reference scripts/compute.py paths). - Added a "CLI Command Map" table to SKILL.md replacing the three-row compute.py table — now covers upload/submit/run-state/ output, cluster list/get/start/stop/create, code-execution-on-cluster, and warehouses. Manifest regenerated. `python3 scripts/skills.py validate` passes. Co-authored-by: Isaac

lennartkats-db

Minor comments

lennartkats-db · 2026-05-26T14:01:18Z

-Run code on Databricks. Three execution modes—choose based on workload.
-
-> **Path convention:** `<SKILL_ROOT>` in examples below = the directory containing this SKILL.md. Resolve it to the absolute path in your install (e.g. `~/.claude/skills/databricks-execution-compute`). Commands like `python <SKILL_ROOT>/scripts/compute.py ...` work from any cwd.
+Run code on Databricks. Three execution modes—choose based on workload. All examples below use the Databricks CLI; install with `pip install databricks-cli` (or follow the workspace-native `databricks` quickstart) and authenticate with `databricks auth login` (see the parent `databricks-core` skill for profile setup).


This installation instruction is for the legacy CLI; we don't use pip anymore

(Claude here, on James's behalf.)

Fixed in 3d4d88b — dropped the pip install databricks-cli reference and now point readers at the databricks-core skill for install + auth.

Also did a wider grep — found one more carryover in skills/databricks-pipelines/references/workflows.md (came in from the a-d-k port in #85, originally databricks-skills/databricks-spark-declarative-pipelines/references/1-project-initialization.md:418 upstream). Fixed in the same commit. There's also a stale error message in ai-dev-kit's databricks-mcp-server/databricks_mcp_server/tools/workspace.py:211 recommending the same legacy package — that one's out of scope here but worth flagging upstream.

lennartkats-db · 2026-05-26T14:02:00Z

 | Dependencies listed but not installed | `"client": "1"` silently drops `dependencies`; use `"client": "4"` |
 | `get-run-output` returns empty `notebook_output` | You passed the parent run_id, not `.tasks[0].run_id` |
-| Job times out | Default 1800 s on the script wrapper; raise `--timeout` or use `jobs submit --no-wait` + your own polling |
+| Job times out | Use `databricks jobs submit --no-wait` and poll `get-run` yourself, or set `tasks[].timeout_seconds` in the submit JSON to extend the per-task limit |


databricks api get … --json '{…}' for status polling is fishy. In references/3-interactive-cluster.md §3, the polling loop sends a JSON body on an HTTP GET to /api/1.2/commands/status. That endpoint takes clusterId/contextId/commandId as query string parameters, and HTTP GET bodies are usually dropped. The replacement should be: databricks api get "/api/1.2/commands/status?clusterId=$CID&contextId=$CTX&commandId=$CMD

(Claude here.)

Confirmed your read and fixed in 7ebcdb0.

Looked at alternatives before patching: the modern Databricks CLI doesn't ship a command-execution subcommand (just checked v0.299.2 — no group covers the 1.2 endpoints), and the modern SDKs route the same way under the hood. So databricks api get with query-string params is the only correct surface for this legacy API. Done as you suggested:

databricks api get "/api/1.2/commands/status?clusterId=${CID}&contextId=${CTX}&commandId=${CMD}"

Added a one-liner above the snippet noting why the GET body has to go in the URL.

# Conflicts: # manifest.json

@lennartkats-db

That PyPI package is the legacy CLI; the modern CLI is a binary. Per @lennartkats-db review on #90, point readers at the `databricks-core` skill (which has install + auth references) instead. Two hits fixed: - experimental/databricks-execution-compute/SKILL.md intro - skills/databricks-pipelines/references/workflows.md troubleshooting table (carried over from the a-d-k port in #85) Co-authored-by: Isaac

@lennartkats-db

Per @lennartkats-db review: that endpoint takes clusterId/contextId/ commandId as query string parameters, and HTTP GET bodies are typically dropped by intermediaries. The legacy 1.2 execution-context API is the only surface here (no native CLI subcommand exists), so the fix is to move the params into the URL. Co-authored-by: Isaac

… from a-d-k#534 (#97) ## Summary Mirrors the four fixes shipping in [databricks-solutions/ai-dev-kit#534](databricks-solutions/ai-dev-kit#534) to the same files under `experimental/`. Each of these files was carried over verbatim from a-d-k and still had the bugs that PR fixes. ## Files | File | Fix | |------|-----| | `experimental/databricks-apps-python/examples/fm-parallel-calls.py` | guard div-by-zero on speedup calc when `total_time` is below clock resolution; convert trailing module-level docstring (unreachable as docs) into a real comment block | | `experimental/databricks-apps-python/examples/llm_config.py` | stop echoing the raw token-endpoint response and HTTP body in error messages — they can contain credentials. Surface the shape (payload keys / status code) instead | | `experimental/databricks-python-sdk/examples/5-serving-and-vector-search.py` | replace syntactically-broken example `query_vector=[0.1, 0.2, 0.3, ...]` (literal Ellipsis tuple) with a real `list[float]` stand-in + a comment about matching the index's embedding dimension | | `experimental/databricks-apps-python/examples/fm-minimal-chat.py` | fix `streamlit run 2-minimal-chat-app.py` / `command: ["streamlit", "run", "2-minimal-chat-app.py"]` in the docstring — the actual filename is `fm-minimal-chat.py`. Skill-name reference left as `databricks-model-serving` to match DAS naming | The `compute.py` fix from #534 doesn't apply here — that file was removed in #90. ## Test plan - [x] `python3 scripts/skills.py validate` clean - [ ] CI green This pull request and its description were written by Claude.

QuentinAmbard · 2026-05-27T10:14:06Z


 ```bash
-python <SKILL_ROOT>/scripts/compute.py list-compute --resource clusters
+databricks clusters list --output json | jq '.[] | select(.state == "RUNNING") | {cluster_id, cluster_name, state, cluster_source}'


I believe we need to filter (server side) by default on
--cluster-sources UI,API
Otherwise it's super slow in our env so we want to have this default in the example

Done in 377b1c1. Also applied to line 163 which had the same issue — line 36 already had the filter so just bringing the other two in line with that convention.

jamesbroadhead · 2026-05-27T14:41:43Z

@lennartkats-db — (Claude on James's behalf.) Both your review comments are addressed:

pip install databricks-cli → dropped; points at the databricks-core skill for install + auth (commit 3d4d88b).
GET-with-body on /api/1.2/commands/status → rewritten to use query-string params; the modern CLI has no command-execution subcommand so this is the canonical surface (commit 7ebcdb0).

Replies on both threads. Quentin approved separately. When you have a minute, re-review so this can land.

@QuentinAmbard

…,API Per @QuentinAmbard: on busy workspaces the cluster listing is dominated by job clusters and the unfiltered API call is slow. Apply the server-side filter to the "first action" snippet at line 24 and the secondary reference at line 163, matching the convention the file already uses at line 36. This PR was prepared by Claude.

jamesbroadhead requested review from a team, dustinvannoy-db, lennartkats-db and simonfaltum as code owners May 26, 2026 08:52

jamesbroadhead requested review from dustinvannoy-db, lennartkats-db and simonfaltum May 26, 2026 08:52

lennartkats-db requested changes May 26, 2026

View reviewed changes

jamesbroadhead added 2 commits May 26, 2026 16:12

Merge remote-tracking branch 'origin/main' into jb/drop-compute-py

cf450ca

# Conflicts: # manifest.json

jamesbroadhead requested a review from camielstee-db as a code owner May 26, 2026 16:46

jamesbroadhead mentioned this pull request May 26, 2026

skills(experimental): propagate ace-review correctness/security fixes from a-d-k#534 #97

Merged

2 tasks

QuentinAmbard approved these changes May 27, 2026

View reviewed changes

jamesbroadhead requested a review from QuentinAmbard May 28, 2026 11:12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

skills(execution-compute): drop bundled compute.py, document the CLI surface#90

skills(execution-compute): drop bundled compute.py, document the CLI surface#90
jamesbroadhead wants to merge 5 commits into
mainfrom
jb/drop-compute-py

jamesbroadhead commented May 26, 2026

Uh oh!

lennartkats-db left a comment

Uh oh!

lennartkats-db May 26, 2026

Uh oh!

jamesbroadhead May 26, 2026

Uh oh!

lennartkats-db May 26, 2026

Uh oh!

jamesbroadhead May 26, 2026

Uh oh!

QuentinAmbard May 27, 2026

Uh oh!

jamesbroadhead May 28, 2026

Uh oh!

jamesbroadhead commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

jamesbroadhead commented May 26, 2026

Summary

Removed

Replacement mapping

Other changes

Test plan

Uh oh!

lennartkats-db left a comment

Choose a reason for hiding this comment

Uh oh!

lennartkats-db May 26, 2026

Choose a reason for hiding this comment

Uh oh!

jamesbroadhead May 26, 2026

Choose a reason for hiding this comment

Uh oh!

lennartkats-db May 26, 2026

Choose a reason for hiding this comment

Uh oh!

jamesbroadhead May 26, 2026

Choose a reason for hiding this comment

Uh oh!

QuentinAmbard May 27, 2026

Choose a reason for hiding this comment

Uh oh!

jamesbroadhead May 28, 2026

Choose a reason for hiding this comment

Uh oh!

jamesbroadhead commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants