From 0da2ae5f1a36688113a9e013a081fff96cb4e41f Mon Sep 17 00:00:00 2001
From: James Broadhead
Date: Wed, 27 May 2026 15:39:50 +0000
Subject: [PATCH 1/2] docs(skills): use top-level `databricks aitools` everywhere

Follow-up to #60, which only touched README/manifest and the jobs +
pipelines SKILL.md. This sweeps the remaining `databricks experimental
aitools` references across the core/apps skills, the experimental
skills, and reference docs. The old paths still work as deprecated
aliases (see databricks/cli#4917), but new readers should be pointed at
the supported command.

Co-authored-by: Isaac
---
 experimental/databricks-agent-bricks/SKILL.md |  2 +-
 .../databricks-aibi-dashboards/SKILL.md       | 12 ++--
 experimental/databricks-apps-python/SKILL.md  |  2 +-
 experimental/databricks-dbsql/SKILL.md        |  2 +-
 experimental/databricks-metric-views/SKILL.md |  6 +-
 .../databricks-synthetic-data-gen/SKILL.md    |  4 +-
 .../references/2-troubleshooting.md           | 14 ++---
 .../databricks-unity-catalog/SKILL.md         |  4 +-
 .../references/7-data-profiling.md            |  2 +-
 .../references/end-to-end-rag.md              |  4 +-
 skills/databricks-apps/SKILL.md               |  2 +-
 .../references/appkit/genie.md                |  4 +-
 skills/databricks-core/SKILL.md               | 10 +--
 skills/databricks-core/data-exploration.md    | 62 +++++++++----------
 .../references/workflows.md                   |  2 +-
 15 files changed, 66 insertions(+), 66 deletions(-)

diff --git a/experimental/databricks-agent-bricks/SKILL.md b/experimental/databricks-agent-bricks/SKILL.md
index cb0bfb6..756caa4 100644
--- a/experimental/databricks-agent-bricks/SKILL.md
+++ b/experimental/databricks-agent-bricks/SKILL.md
@@ -19,7 +19,7 @@ Agent Bricks are pre-built AI tiles in Databricks that provide conversational in
 ```bash
 # Find volumes
 databricks volumes list CATALOG SCHEMA
-databricks experimental aitools tools query --warehouse WH "LIST '/Volumes/catalog/schema/volume/'"
+databricks aitools tools query --warehouse WH "LIST '/Volumes/catalog/schema/volume/'"
 
 # Create KA
 databricks knowledge-assistants
create-knowledge-assistant "Name" "Description" diff --git a/experimental/databricks-aibi-dashboards/SKILL.md b/experimental/databricks-aibi-dashboards/SKILL.md index 54defbb..92afa3f 100644 --- a/experimental/databricks-aibi-dashboards/SKILL.md +++ b/experimental/databricks-aibi-dashboards/SKILL.md @@ -14,15 +14,15 @@ A dashboard should be showing something relevant for a human, typically some KPI | Task | Command | |------|---------| | List warehouses | `databricks warehouses list` | -| List tables | `databricks experimental aitools tools query --warehouse WH "SHOW TABLES IN catalog.schema"` | -| Get schema | `databricks experimental aitools tools discover-schema catalog.schema.table1 catalog.schema.table2` | -| Test query | `databricks experimental aitools tools query --warehouse WH "SELECT..."` | +| List tables | `databricks aitools tools query --warehouse WH "SHOW TABLES IN catalog.schema"` | +| Get schema | `databricks aitools tools discover-schema catalog.schema.table1 catalog.schema.table2` | +| Test query | `databricks aitools tools query --warehouse WH "SELECT..."` | | Create dashboard | `databricks lakeview create --display-name "X" --warehouse-id "WH" --dataset-catalog CATALOG --dataset-schema SCHEMA --serialized-dashboard "$(cat file.json)" --json '{"parent_path": "/Workspace/Users//path"}'` — `--dataset-catalog` / `--dataset-schema` are **flag-only** (REQUIRED; CLI silently drops them if put in `--json`); `parent_path` is JSON-only (no flag). Queries must use bare table names. 
| | Update dashboard | `databricks lakeview update DASHBOARD_ID --serialized-dashboard "$(cat file.json)"` | | Publish | `databricks lakeview publish DASHBOARD_ID --warehouse-id WH` | | Delete | `databricks lakeview trash DASHBOARD_ID` | -> **`--warehouse` flag**: if `databricks experimental aitools tools query --warehouse WH "..."` fails with `unknown flag: --warehouse` on your CLI version, set `DATABRICKS_WAREHOUSE_ID=WH` in the environment instead and drop the flag — the command auto-picks it from there. +> **`--warehouse` flag**: if `databricks aitools tools query --warehouse WH "..."` fails with `unknown flag: --warehouse` on your CLI version, set `DATABRICKS_WAREHOUSE_ID=WH` in the environment instead and drop the flag — the command auto-picks it from there. --- @@ -57,9 +57,9 @@ A good dashboard comes from knowing the data first. Spend time here — the expl Use `discover-schema` as the default — one call returns columns, types, sample rows, null counts, and row count. If you only know the schema, list tables first with `query "SHOW TABLES IN ..."`. -`databricks experimental aitools tools discover-schema catalog.schema.orders catalog.schema.customers` +`databricks aitools tools discover-schema catalog.schema.orders catalog.schema.customers` -Sample rows alone don't tell you what to build. you can write aggregate SQL through `databricks experimental aitools tools query --warehouse "..."` to probe typically: +Sample rows alone don't tell you what to build. you can write aggregate SQL through `databricks aitools tools query --warehouse "..."` to probe typically: - **Cardinality** of candidate grouping columns → decides chart color-group vs. table (≤8 distinct values for charts, see Cardinality & Readability below). - **Top categorical values** → populates filter options and chart legends meaningfully. 
diff --git a/experimental/databricks-apps-python/SKILL.md b/experimental/databricks-apps-python/SKILL.md index 9b14627..68c2cb1 100644 --- a/experimental/databricks-apps-python/SKILL.md +++ b/experimental/databricks-apps-python/SKILL.md @@ -39,7 +39,7 @@ databricks apps deploy ### AI-assisted development ```bash # Install agent skills for AI-powered scaffolding -databricks experimental aitools skills install +databricks aitools install # Query AppKit docs inline npx @databricks/appkit docs "your question here" diff --git a/experimental/databricks-dbsql/SKILL.md b/experimental/databricks-dbsql/SKILL.md index b2d8ced..b74e4bb 100644 --- a/experimental/databricks-dbsql/SKILL.md +++ b/experimental/databricks-dbsql/SKILL.md @@ -297,4 +297,4 @@ Load these for detailed syntax, full parameter lists, and advanced patterns: - **Star schema in Gold layer** for BI; OBT acceptable in Silver - **Define PK/FK constraints** on dimensional models for query optimization - **Use `COLLATE UTF8_LCASE`** for user-facing string columns that need case-insensitive search -- **Test SQL via CLI** (`databricks experimental aitools tools query`) or notebooks before deploying. If `--warehouse` is rejected on your CLI version, set `DATABRICKS_WAREHOUSE_ID` in the environment instead. +- **Test SQL via CLI** (`databricks aitools tools query`) or notebooks before deploying. If `--warehouse` is rejected on your CLI version, set `DATABRICKS_WAREHOUSE_ID` in the environment instead. diff --git a/experimental/databricks-metric-views/SKILL.md b/experimental/databricks-metric-views/SKILL.md index c2c396a..09107da 100644 --- a/experimental/databricks-metric-views/SKILL.md +++ b/experimental/databricks-metric-views/SKILL.md @@ -29,9 +29,9 @@ Use this skill when: Before authoring a metric view, inspect the source tables. Use `discover-schema` as the default — one call returns columns, types, sample rows, null counts, and row count. 
If you only know the schema, list tables first with `query "SHOW TABLES IN ..."`. -`databricks experimental aitools tools discover-schema catalog.schema.orders catalog.schema.customers` +`databricks aitools tools discover-schema catalog.schema.orders catalog.schema.customers` -For dimensions and measures, probe distribution beyond sampling — cardinality of candidate dimensions, min/max/percentiles for measures, top categorical values. Write aggregate SQL through `databricks experimental aitools tools query --warehouse "..."`. Both commands auto-pick the default warehouse; set `DATABRICKS_WAREHOUSE_ID` or pass `--warehouse ` to override. +For dimensions and measures, probe distribution beyond sampling — cardinality of candidate dimensions, min/max/percentiles for measures, top categorical values. Write aggregate SQL through `databricks aitools tools query --warehouse "..."`. Both commands auto-pick the default warehouse; set `DATABRICKS_WAREHOUSE_ID` or pass `--warehouse ` to override. ### Create a Metric View @@ -157,7 +157,7 @@ DROP VIEW IF EXISTS catalog.schema.orders_metrics; ```bash # Execute SQL via CLI -databricks experimental aitools tools query --warehouse WAREHOUSE_ID " +databricks aitools tools query --warehouse WAREHOUSE_ID " CREATE OR REPLACE VIEW catalog.schema.orders_metrics WITH METRICS LANGUAGE YAML diff --git a/experimental/databricks-synthetic-data-gen/SKILL.md b/experimental/databricks-synthetic-data-gen/SKILL.md index 510f576..e017a9d 100644 --- a/experimental/databricks-synthetic-data-gen/SKILL.md +++ b/experimental/databricks-synthetic-data-gen/SKILL.md @@ -128,10 +128,10 @@ Show a clear specification with **the business story and your assumptions surfac ### Post-Generation Validation -Use `databricks experimental aitools tools query` to validate generated data (row counts, distributions, referential integrity). 
Query parquet files directly: +Use `databricks aitools tools query` to validate generated data (row counts, distributions, referential integrity). Query parquet files directly: ```bash -databricks experimental aitools tools query --warehouse $WAREHOUSE_ID " +databricks aitools tools query --warehouse $WAREHOUSE_ID " SELECT COUNT(*) FROM parquet.\`/Volumes/CATALOG/SCHEMA/raw_data/customers\` " ``` diff --git a/experimental/databricks-synthetic-data-gen/references/2-troubleshooting.md b/experimental/databricks-synthetic-data-gen/references/2-troubleshooting.md index 793b64f..76d4400 100644 --- a/experimental/databricks-synthetic-data-gen/references/2-troubleshooting.md +++ b/experimental/databricks-synthetic-data-gen/references/2-troubleshooting.md @@ -293,30 +293,30 @@ WAREHOUSE_ID="your-warehouse-id" VOLUME_PATH="/Volumes/CATALOG/SCHEMA/raw_data" # 1. Check row counts -databricks experimental aitools tools query --warehouse $WAREHOUSE_ID " +databricks aitools tools query --warehouse $WAREHOUSE_ID " SELECT 'customers' as table_name, COUNT(*) as row_count FROM parquet.\`${VOLUME_PATH}/customers\` UNION ALL SELECT 'orders', COUNT(*) FROM parquet.\`${VOLUME_PATH}/orders\` " # 2. Preview schema and sample data -databricks experimental aitools tools query --warehouse $WAREHOUSE_ID " +databricks aitools tools query --warehouse $WAREHOUSE_ID " DESCRIBE SELECT * FROM parquet.\`${VOLUME_PATH}/customers\` " -databricks experimental aitools tools query --warehouse $WAREHOUSE_ID " +databricks aitools tools query --warehouse $WAREHOUSE_ID " SELECT * FROM parquet.\`${VOLUME_PATH}/customers\` LIMIT 5 " # 3. Verify distributions -databricks experimental aitools tools query --warehouse $WAREHOUSE_ID " +databricks aitools tools query --warehouse $WAREHOUSE_ID " SELECT tier, COUNT(*) as count, ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER(), 1) as pct FROM parquet.\`${VOLUME_PATH}/customers\` GROUP BY tier ORDER BY tier " # 4. 
Check amount statistics -databricks experimental aitools tools query --warehouse $WAREHOUSE_ID " +databricks aitools tools query --warehouse $WAREHOUSE_ID " SELECT MIN(amount) as min_amount, MAX(amount) as max_amount, @@ -326,7 +326,7 @@ FROM parquet.\`${VOLUME_PATH}/orders\` " # 5. Check referential integrity -databricks experimental aitools tools query --warehouse $WAREHOUSE_ID " +databricks aitools tools query --warehouse $WAREHOUSE_ID " SELECT COUNT(*) as orphan_orders FROM parquet.\`${VOLUME_PATH}/orders\` o LEFT JOIN parquet.\`${VOLUME_PATH}/customers\` c ON o.customer_id = c.customer_id @@ -334,7 +334,7 @@ WHERE c.customer_id IS NULL " # 6. Verify date range -databricks experimental aitools tools query --warehouse $WAREHOUSE_ID " +databricks aitools tools query --warehouse $WAREHOUSE_ID " SELECT MIN(order_date) as min_date, MAX(order_date) as max_date FROM parquet.\`${VOLUME_PATH}/orders\` " diff --git a/experimental/databricks-unity-catalog/SKILL.md b/experimental/databricks-unity-catalog/SKILL.md index bc2a9c8..e790bfd 100644 --- a/experimental/databricks-unity-catalog/SKILL.md +++ b/experimental/databricks-unity-catalog/SKILL.md @@ -101,11 +101,11 @@ GROUP BY workspace_id, sku_name; ## SQL Queries via CLI -Use `databricks experimental aitools tools query` for system table queries: +Use `databricks aitools tools query` for system table queries: ```bash # Query lineage via CLI -databricks experimental aitools tools query --warehouse WAREHOUSE_ID " +databricks aitools tools query --warehouse WAREHOUSE_ID " SELECT source_table_full_name, target_table_full_name FROM system.access.table_lineage WHERE event_date >= current_date() - 7 diff --git a/experimental/databricks-unity-catalog/references/7-data-profiling.md b/experimental/databricks-unity-catalog/references/7-data-profiling.md index 930d29a..0c64a80 100644 --- a/experimental/databricks-unity-catalog/references/7-data-profiling.md +++ b/experimental/databricks-unity-catalog/references/7-data-profiling.md 
@@ -68,7 +68,7 @@ DROP QUALITY MONITOR catalog.schema.my_table; ### Execute via CLI ```bash -databricks experimental aitools tools query --warehouse WAREHOUSE_ID " +databricks aitools tools query --warehouse WAREHOUSE_ID " CREATE OR REPLACE QUALITY MONITOR catalog.schema.my_table OPTIONS (OUTPUT_SCHEMA 'catalog.schema') " diff --git a/experimental/databricks-vector-search/references/end-to-end-rag.md b/experimental/databricks-vector-search/references/end-to-end-rag.md index 00959f9..c402420 100644 --- a/experimental/databricks-vector-search/references/end-to-end-rag.md +++ b/experimental/databricks-vector-search/references/end-to-end-rag.md @@ -6,7 +6,7 @@ Build a complete Retrieval-Augmented Generation pipeline: prepare documents, cre | Command | Step | |---------|------| -| `databricks experimental aitools tools query` | Create source table, insert documents | +| `databricks aitools tools query` | Create source table, insert documents | | `databricks vector-search-endpoints create-endpoint` | Create compute endpoint | | `databricks vector-search-indexes create-index` | Create Delta Sync index with managed embeddings | | `databricks vector-search-indexes sync-index` | Trigger index sync | @@ -37,7 +37,7 @@ INSERT INTO catalog.schema.knowledge_base VALUES Or via CLI: ```bash -databricks experimental aitools tools query --warehouse WAREHOUSE_ID " +databricks aitools tools query --warehouse WAREHOUSE_ID " CREATE TABLE IF NOT EXISTS catalog.schema.knowledge_base ( doc_id STRING, title STRING, diff --git a/skills/databricks-apps/SKILL.md b/skills/databricks-apps/SKILL.md index 95f3066..8d3c8c0 100644 --- a/skills/databricks-apps/SKILL.md +++ b/skills/databricks-apps/SKILL.md @@ -168,7 +168,7 @@ npx @databricks/appkit docs ./docs/plugins/analytics.md # example: specific doc Optionally use `--version ` to target a specific AppKit version. - **Required**: `--name`, `--profile`. Name: ≤26 chars, lowercase letters/numbers/hyphens only. 
Use `--features` only for **optional** plugins the user wants (plugins with `requiredByTemplate: false` or absent); mandatory plugins must not be listed in `--features`. - **Resources**: Pass `--set` for every required resource (each field in `resources.required`) for (1) all plugins with `requiredByTemplate: true`, and (2) any optional plugins you added to `--features`. Add `--set` for `resources.optional` only when the user requests them. - - **Discovery**: Use the parent `databricks-core` skill to resolve IDs (e.g. warehouse: `databricks warehouses list --profile ` or `databricks experimental aitools tools get-default-warehouse --profile `). + - **Discovery**: Use the parent `databricks-core` skill to resolve IDs (e.g. warehouse: `databricks warehouses list --profile ` or `databricks aitools tools get-default-warehouse --profile `). **DO NOT guess** plugin names, resource keys, or property names — always derive them from `databricks apps manifest` output. Example: if the manifest shows plugin `analytics` with a required resource `resourceKey: "sql-warehouse"` and `fields: { "id": ... }`, include `--set analytics.sql-warehouse.id=`. diff --git a/skills/databricks-apps/references/appkit/genie.md b/skills/databricks-apps/references/appkit/genie.md index 21734d1..63ab37f 100644 --- a/skills/databricks-apps/references/appkit/genie.md +++ b/skills/databricks-apps/references/appkit/genie.md @@ -44,14 +44,14 @@ databricks genie get-space --include-serialized-space --profile +databricks aitools tools get-default-warehouse --profile ``` ## Scaffolding a New Genie App ```bash # 1. Discover warehouse -databricks experimental aitools tools get-default-warehouse --profile +databricks aitools tools get-default-warehouse --profile # 2. 
Create Genie space (see syntax above) databricks genie create-space '' \ diff --git a/skills/databricks-core/SKILL.md b/skills/databricks-core/SKILL.md index b41f8df..3e3715f 100644 --- a/skills/databricks-core/SKILL.md +++ b/skills/databricks-core/SKILL.md @@ -61,13 +61,13 @@ databricks apps list # profile not set! ```bash # discover table structure (columns, types, sample data, stats) -databricks experimental aitools tools discover-schema catalog.schema.table --profile +databricks aitools tools discover-schema catalog.schema.table --profile # run ad-hoc SQL queries -databricks experimental aitools tools query "SELECT * FROM table LIMIT 10" --profile +databricks aitools tools query "SELECT * FROM table LIMIT 10" --profile # find the default warehouse -databricks experimental aitools tools get-default-warehouse --profile +databricks aitools tools get-default-warehouse --profile ``` See [Data Exploration](data-exploration.md) for details. @@ -100,8 +100,8 @@ databricks tables get .. --profile # databricks schemas list --catalog-name ← WILL FAIL # databricks tables list --catalog ← WILL FAIL # databricks sql-warehouses list ← doesn't exist, use `warehouses list` -# databricks execute-statement ← doesn't exist, use `experimental aitools tools query` -# databricks sql execute ← doesn't exist, use `experimental aitools tools query` +# databricks execute-statement ← doesn't exist, use `aitools tools query` +# databricks sql execute ← doesn't exist, use `aitools tools query` # When in doubt, check help: # databricks schemas list --help diff --git a/skills/databricks-core/data-exploration.md b/skills/databricks-core/data-exploration.md index 5908cc5..329cd1c 100644 --- a/skills/databricks-core/data-exploration.md +++ b/skills/databricks-core/data-exploration.md @@ -10,17 +10,17 @@ Use `information_schema` to search for tables by keyword — do NOT manually ite ```bash # Find tables matching a keyword -databricks experimental aitools tools query \ +databricks aitools tools 
query \ "SELECT table_catalog, table_schema, table_name FROM system.information_schema.tables WHERE table_name LIKE '%keyword%'" \ --profile # Then discover schema for the tables you found -databricks experimental aitools tools discover-schema catalog.schema.table1 catalog.schema.table2 --profile +databricks aitools tools discover-schema catalog.schema.table1 catalog.schema.table2 --profile ``` ## Overview -The `databricks experimental aitools tools` command group provides tools for data discovery and exploration: +The `databricks aitools tools` command group provides tools for data discovery and exploration: - **discover-schema**: Batch discover table metadata, columns, types, sample data, and statistics - **query**: Execute SQL queries against Databricks SQL warehouses @@ -43,7 +43,7 @@ Batch discover table metadata including columns, types, sample data, and null co ### Command Syntax ```bash -databricks experimental aitools tools discover-schema TABLE... [flags] +databricks aitools tools discover-schema TABLE... [flags] ``` Tables must be specified in **CATALOG.SCHEMA.TABLE** format. @@ -60,16 +60,16 @@ For each table, returns: ```bash # Discover schema for a single table -databricks experimental aitools tools discover-schema samples.nyctaxi.trips --profile my-workspace +databricks aitools tools discover-schema samples.nyctaxi.trips --profile my-workspace # Discover schema for multiple tables -databricks experimental aitools tools discover-schema \ +databricks aitools tools discover-schema \ catalog.schema.table1 \ catalog.schema.table2 \ --profile my-workspace # Get JSON output -databricks experimental aitools tools discover-schema \ +databricks aitools tools discover-schema \ samples.nyctaxi.trips \ --output json \ --profile my-workspace @@ -79,12 +79,12 @@ databricks experimental aitools tools discover-schema \ 1. 
**Understanding table structure before querying** ```bash - databricks experimental aitools tools discover-schema catalog.schema.customer_data --profile my-workspace + databricks aitools tools discover-schema catalog.schema.customer_data --profile my-workspace ``` 2. **Comparing schemas across multiple tables** ```bash - databricks experimental aitools tools discover-schema \ + databricks aitools tools discover-schema \ catalog.schema.table_v1 \ catalog.schema.table_v2 \ --profile my-workspace @@ -100,7 +100,7 @@ Execute SQL statements against a Databricks SQL warehouse and return results. ### Command Syntax ```bash -databricks experimental aitools tools query "SQL" [flags] +databricks aitools tools query "SQL" [flags] ``` ### Warehouse Selection @@ -112,7 +112,7 @@ The command **auto-detects** an available warehouse unless: To check which warehouse will be used: ```bash # Get the default warehouse that would be auto-detected -databricks experimental aitools tools get-default-warehouse --profile my-workspace +databricks aitools tools get-default-warehouse --profile my-workspace ``` ### Output @@ -126,23 +126,23 @@ Returns: ```bash # Simple SELECT query -databricks experimental aitools tools query \ +databricks aitools tools query \ "SELECT * FROM samples.nyctaxi.trips LIMIT 5" \ --profile my-workspace # Aggregation query -databricks experimental aitools tools query \ +databricks aitools tools query \ "SELECT vendor_id, COUNT(*) as trip_count FROM samples.nyctaxi.trips GROUP BY vendor_id" \ --profile my-workspace # With JSON output -databricks experimental aitools tools query \ +databricks aitools tools query \ "SELECT * FROM catalog.schema.table WHERE date > '2024-01-01'" \ --output json \ --profile my-workspace # Using specific warehouse -DATABRICKS_WAREHOUSE_ID=abc123 databricks experimental aitools tools query \ +DATABRICKS_WAREHOUSE_ID=abc123 databricks aitools tools query \ "SELECT * FROM samples.nyctaxi.trips LIMIT 10" \ --profile my-workspace ``` @@ -152,17 
+152,17 @@ DATABRICKS_WAREHOUSE_ID=abc123 databricks experimental aitools tools query \ 1. **Exploratory data analysis** ```bash # Check table size - databricks experimental aitools tools query \ + databricks aitools tools query \ "SELECT COUNT(*) FROM catalog.schema.table" \ --profile my-workspace # View sample data - databricks experimental aitools tools query \ + databricks aitools tools query \ "SELECT * FROM catalog.schema.table LIMIT 10" \ --profile my-workspace # Get column statistics - databricks experimental aitools tools query \ + databricks aitools tools query \ "SELECT MIN(column), MAX(column), AVG(column) FROM catalog.schema.table" \ --profile my-workspace ``` @@ -170,12 +170,12 @@ DATABRICKS_WAREHOUSE_ID=abc123 databricks experimental aitools tools query \ 2. **Data validation** ```bash # Check for null values - databricks experimental aitools tools query \ + databricks aitools tools query \ "SELECT COUNT(*) FROM catalog.schema.table WHERE column IS NULL" \ --profile my-workspace # Verify data freshness - databricks experimental aitools tools query \ + databricks aitools tools query \ "SELECT MAX(timestamp_column) FROM catalog.schema.table" \ --profile my-workspace ``` @@ -183,7 +183,7 @@ DATABRICKS_WAREHOUSE_ID=abc123 databricks experimental aitools tools query \ 3. **Quick analytics** ```bash # Group by analysis - databricks experimental aitools tools query \ + databricks aitools tools query \ "SELECT category, COUNT(*), AVG(value) FROM catalog.schema.table GROUP BY category" \ --profile my-workspace ``` @@ -194,12 +194,12 @@ Here's a typical workflow combining both commands: ```bash # 1. Discover the schema first -databricks experimental aitools tools discover-schema \ +databricks aitools tools discover-schema \ samples.nyctaxi.trips \ --profile my-workspace # 2. 
Based on discovered columns, run targeted queries -databricks experimental aitools tools query \ +databricks aitools tools query \ "SELECT vendor_id, payment_type, COUNT(*) as trips, AVG(fare_amount) as avg_fare FROM samples.nyctaxi.trips GROUP BY vendor_id, payment_type @@ -208,7 +208,7 @@ databricks experimental aitools tools query \ --profile my-workspace # 3. Investigate specific patterns found in the data -databricks experimental aitools tools query \ +databricks aitools tools query \ "SELECT * FROM samples.nyctaxi.trips WHERE fare_amount > 100 LIMIT 20" \ @@ -221,15 +221,15 @@ Remember that each Bash command in Claude Code runs in a separate shell: ```bash # ✅ RECOMMENDED: Use --profile flag -databricks experimental aitools tools discover-schema samples.nyctaxi.trips --profile my-workspace +databricks aitools tools discover-schema samples.nyctaxi.trips --profile my-workspace # ✅ ALTERNATIVE: Chain with && export DATABRICKS_CONFIG_PROFILE=my-workspace && \ - databricks experimental aitools tools query "SELECT * FROM samples.nyctaxi.trips LIMIT 5" + databricks aitools tools query "SELECT * FROM samples.nyctaxi.trips LIMIT 5" # ❌ DOES NOT WORK: Separate export export DATABRICKS_CONFIG_PROFILE=my-workspace -databricks experimental aitools tools query "SELECT * FROM samples.nyctaxi.trips LIMIT 5" +databricks aitools tools query "SELECT * FROM samples.nyctaxi.trips LIMIT 5" ``` ## Flags @@ -264,7 +264,7 @@ Both commands support: **Solution**: 1. Check for default warehouse: ```bash - databricks experimental aitools tools get-default-warehouse --profile my-workspace + databricks aitools tools get-default-warehouse --profile my-workspace ``` 2. List available warehouses: ```bash @@ -272,7 +272,7 @@ Both commands support: ``` 3. Set specific warehouse: ```bash - DATABRICKS_WAREHOUSE_ID= databricks experimental aitools tools query "SELECT 1" --profile my-workspace + DATABRICKS_WAREHOUSE_ID= databricks aitools tools query "SELECT 1" --profile my-workspace ``` 4. 
Start a stopped warehouse: ```bash @@ -310,12 +310,12 @@ Both commands support: 2. **Use LIMIT for exploration** - When exploring large tables, always use LIMIT to avoid long-running queries: ```bash - databricks experimental aitools tools query "SELECT * FROM large_table LIMIT 100" --profile my-workspace + databricks aitools tools query "SELECT * FROM large_table LIMIT 100" --profile my-workspace ``` 3. **JSON output for parsing** - Use `--output json` when you need to process results programmatically: ```bash - databricks experimental aitools tools query "SELECT * FROM table" --output json --profile my-workspace | jq '.results' + databricks aitools tools query "SELECT * FROM table" --output json --profile my-workspace | jq '.results' ``` 4. **Check table existence** - Before querying, verify the table exists: diff --git a/skills/databricks-pipelines/references/workflows.md b/skills/databricks-pipelines/references/workflows.md index 671157c..f9f4428 100644 --- a/skills/databricks-pipelines/references/workflows.md +++ b/skills/databricks-pipelines/references/workflows.md @@ -295,7 +295,7 @@ databricks pipelines start-update Even on `COMPLETED`, verify the data: ```bash -databricks experimental aitools tools discover-schema \ +databricks aitools tools discover-schema \ my_catalog.my_schema.bronze_orders \ my_catalog.my_schema.silver_orders \ my_catalog.my_schema.gold_summary From 212d3efd68aa9d080975dd37738f2303edf6dad5 Mon Sep 17 00:00:00 2001 From: James Broadhead Date: Wed, 27 May 2026 19:00:44 +0000 Subject: [PATCH 2/2] docs(skills): keep `experimental aitools tools` refs (only install was promoted) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Verified against CLI v1.1.0: - `databricks aitools install/list/uninstall/update/version` — top-level (correct) - `databricks aitools tools discover-schema|query|get-default-warehouse|statement` — does NOT exist; still under `experimental` Reverts the `databricks experimental 
aitools tools X` → `databricks aitools tools X` substitutions across SKILL.md / reference files. Keeps the lone correct rename (`experimental aitools skills install` → `aitools install`) in `experimental/databricks-apps-python/SKILL.md`. Co-authored-by: Isaac --- experimental/databricks-agent-bricks/SKILL.md | 2 +- .../databricks-aibi-dashboards/SKILL.md | 12 ++-- experimental/databricks-dbsql/SKILL.md | 2 +- experimental/databricks-metric-views/SKILL.md | 6 +- .../databricks-synthetic-data-gen/SKILL.md | 4 +- .../references/2-troubleshooting.md | 14 ++--- .../databricks-unity-catalog/SKILL.md | 4 +- .../references/7-data-profiling.md | 2 +- .../references/end-to-end-rag.md | 4 +- skills/databricks-apps/SKILL.md | 2 +- .../references/appkit/genie.md | 4 +- skills/databricks-core/SKILL.md | 10 +-- skills/databricks-core/data-exploration.md | 62 +++++++++---------- .../references/workflows.md | 2 +- 14 files changed, 65 insertions(+), 65 deletions(-) diff --git a/experimental/databricks-agent-bricks/SKILL.md b/experimental/databricks-agent-bricks/SKILL.md index 756caa4..cb0bfb6 100644 --- a/experimental/databricks-agent-bricks/SKILL.md +++ b/experimental/databricks-agent-bricks/SKILL.md @@ -19,7 +19,7 @@ Agent Bricks are pre-built AI tiles in Databricks that provide conversational in ```bash # Find volumes databricks volumes list CATALOG SCHEMA -databricks aitools tools query --warehouse WH "LIST '/Volumes/catalog/schema/volume/'" +databricks experimental aitools tools query --warehouse WH "LIST '/Volumes/catalog/schema/volume/'" # Create KA databricks knowledge-assistants create-knowledge-assistant "Name" "Description" diff --git a/experimental/databricks-aibi-dashboards/SKILL.md b/experimental/databricks-aibi-dashboards/SKILL.md index 92afa3f..54defbb 100644 --- a/experimental/databricks-aibi-dashboards/SKILL.md +++ b/experimental/databricks-aibi-dashboards/SKILL.md @@ -14,15 +14,15 @@ A dashboard should be showing something relevant for a human, typically some KPI 
| Task | Command | |------|---------| | List warehouses | `databricks warehouses list` | -| List tables | `databricks aitools tools query --warehouse WH "SHOW TABLES IN catalog.schema"` | -| Get schema | `databricks aitools tools discover-schema catalog.schema.table1 catalog.schema.table2` | -| Test query | `databricks aitools tools query --warehouse WH "SELECT..."` | +| List tables | `databricks experimental aitools tools query --warehouse WH "SHOW TABLES IN catalog.schema"` | +| Get schema | `databricks experimental aitools tools discover-schema catalog.schema.table1 catalog.schema.table2` | +| Test query | `databricks experimental aitools tools query --warehouse WH "SELECT..."` | | Create dashboard | `databricks lakeview create --display-name "X" --warehouse-id "WH" --dataset-catalog CATALOG --dataset-schema SCHEMA --serialized-dashboard "$(cat file.json)" --json '{"parent_path": "/Workspace/Users//path"}'` — `--dataset-catalog` / `--dataset-schema` are **flag-only** (REQUIRED; CLI silently drops them if put in `--json`); `parent_path` is JSON-only (no flag). Queries must use bare table names. | | Update dashboard | `databricks lakeview update DASHBOARD_ID --serialized-dashboard "$(cat file.json)"` | | Publish | `databricks lakeview publish DASHBOARD_ID --warehouse-id WH` | | Delete | `databricks lakeview trash DASHBOARD_ID` | -> **`--warehouse` flag**: if `databricks aitools tools query --warehouse WH "..."` fails with `unknown flag: --warehouse` on your CLI version, set `DATABRICKS_WAREHOUSE_ID=WH` in the environment instead and drop the flag — the command auto-picks it from there. +> **`--warehouse` flag**: if `databricks experimental aitools tools query --warehouse WH "..."` fails with `unknown flag: --warehouse` on your CLI version, set `DATABRICKS_WAREHOUSE_ID=WH` in the environment instead and drop the flag — the command auto-picks it from there. --- @@ -57,9 +57,9 @@ A good dashboard comes from knowing the data first. 
Spend time here — the expl Use `discover-schema` as the default — one call returns columns, types, sample rows, null counts, and row count. If you only know the schema, list tables first with `query "SHOW TABLES IN ..."`. -`databricks aitools tools discover-schema catalog.schema.orders catalog.schema.customers` +`databricks experimental aitools tools discover-schema catalog.schema.orders catalog.schema.customers` -Sample rows alone don't tell you what to build. you can write aggregate SQL through `databricks aitools tools query --warehouse "..."` to probe typically: +Sample rows alone don't tell you what to build. you can write aggregate SQL through `databricks experimental aitools tools query --warehouse "..."` to probe typically: - **Cardinality** of candidate grouping columns → decides chart color-group vs. table (≤8 distinct values for charts, see Cardinality & Readability below). - **Top categorical values** → populates filter options and chart legends meaningfully. diff --git a/experimental/databricks-dbsql/SKILL.md b/experimental/databricks-dbsql/SKILL.md index b74e4bb..b2d8ced 100644 --- a/experimental/databricks-dbsql/SKILL.md +++ b/experimental/databricks-dbsql/SKILL.md @@ -297,4 +297,4 @@ Load these for detailed syntax, full parameter lists, and advanced patterns: - **Star schema in Gold layer** for BI; OBT acceptable in Silver - **Define PK/FK constraints** on dimensional models for query optimization - **Use `COLLATE UTF8_LCASE`** for user-facing string columns that need case-insensitive search -- **Test SQL via CLI** (`databricks aitools tools query`) or notebooks before deploying. If `--warehouse` is rejected on your CLI version, set `DATABRICKS_WAREHOUSE_ID` in the environment instead. +- **Test SQL via CLI** (`databricks experimental aitools tools query`) or notebooks before deploying. If `--warehouse` is rejected on your CLI version, set `DATABRICKS_WAREHOUSE_ID` in the environment instead. 
diff --git a/experimental/databricks-metric-views/SKILL.md b/experimental/databricks-metric-views/SKILL.md index 09107da..c2c396a 100644 --- a/experimental/databricks-metric-views/SKILL.md +++ b/experimental/databricks-metric-views/SKILL.md @@ -29,9 +29,9 @@ Use this skill when: Before authoring a metric view, inspect the source tables. Use `discover-schema` as the default — one call returns columns, types, sample rows, null counts, and row count. If you only know the schema, list tables first with `query "SHOW TABLES IN ..."`. -`databricks aitools tools discover-schema catalog.schema.orders catalog.schema.customers` +`databricks experimental aitools tools discover-schema catalog.schema.orders catalog.schema.customers` -For dimensions and measures, probe distribution beyond sampling — cardinality of candidate dimensions, min/max/percentiles for measures, top categorical values. Write aggregate SQL through `databricks aitools tools query --warehouse "..."`. Both commands auto-pick the default warehouse; set `DATABRICKS_WAREHOUSE_ID` or pass `--warehouse ` to override. +For dimensions and measures, probe distribution beyond sampling — cardinality of candidate dimensions, min/max/percentiles for measures, top categorical values. Write aggregate SQL through `databricks experimental aitools tools query --warehouse "..."`. Both commands auto-pick the default warehouse; set `DATABRICKS_WAREHOUSE_ID` or pass `--warehouse ` to override. 
### Create a Metric View @@ -157,7 +157,7 @@ DROP VIEW IF EXISTS catalog.schema.orders_metrics; ```bash # Execute SQL via CLI -databricks aitools tools query --warehouse WAREHOUSE_ID " +databricks experimental aitools tools query --warehouse WAREHOUSE_ID " CREATE OR REPLACE VIEW catalog.schema.orders_metrics WITH METRICS LANGUAGE YAML diff --git a/experimental/databricks-synthetic-data-gen/SKILL.md b/experimental/databricks-synthetic-data-gen/SKILL.md index e017a9d..510f576 100644 --- a/experimental/databricks-synthetic-data-gen/SKILL.md +++ b/experimental/databricks-synthetic-data-gen/SKILL.md @@ -128,10 +128,10 @@ Show a clear specification with **the business story and your assumptions surfac ### Post-Generation Validation -Use `databricks aitools tools query` to validate generated data (row counts, distributions, referential integrity). Query parquet files directly: +Use `databricks experimental aitools tools query` to validate generated data (row counts, distributions, referential integrity). Query parquet files directly: ```bash -databricks aitools tools query --warehouse $WAREHOUSE_ID " +databricks experimental aitools tools query --warehouse $WAREHOUSE_ID " SELECT COUNT(*) FROM parquet.\`/Volumes/CATALOG/SCHEMA/raw_data/customers\` " ``` diff --git a/experimental/databricks-synthetic-data-gen/references/2-troubleshooting.md b/experimental/databricks-synthetic-data-gen/references/2-troubleshooting.md index 76d4400..793b64f 100644 --- a/experimental/databricks-synthetic-data-gen/references/2-troubleshooting.md +++ b/experimental/databricks-synthetic-data-gen/references/2-troubleshooting.md @@ -293,30 +293,30 @@ WAREHOUSE_ID="your-warehouse-id" VOLUME_PATH="/Volumes/CATALOG/SCHEMA/raw_data" # 1. 
Check row counts -databricks aitools tools query --warehouse $WAREHOUSE_ID " +databricks experimental aitools tools query --warehouse $WAREHOUSE_ID " SELECT 'customers' as table_name, COUNT(*) as row_count FROM parquet.\`${VOLUME_PATH}/customers\` UNION ALL SELECT 'orders', COUNT(*) FROM parquet.\`${VOLUME_PATH}/orders\` " # 2. Preview schema and sample data -databricks aitools tools query --warehouse $WAREHOUSE_ID " +databricks experimental aitools tools query --warehouse $WAREHOUSE_ID " DESCRIBE SELECT * FROM parquet.\`${VOLUME_PATH}/customers\` " -databricks aitools tools query --warehouse $WAREHOUSE_ID " +databricks experimental aitools tools query --warehouse $WAREHOUSE_ID " SELECT * FROM parquet.\`${VOLUME_PATH}/customers\` LIMIT 5 " # 3. Verify distributions -databricks aitools tools query --warehouse $WAREHOUSE_ID " +databricks experimental aitools tools query --warehouse $WAREHOUSE_ID " SELECT tier, COUNT(*) as count, ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER(), 1) as pct FROM parquet.\`${VOLUME_PATH}/customers\` GROUP BY tier ORDER BY tier " # 4. Check amount statistics -databricks aitools tools query --warehouse $WAREHOUSE_ID " +databricks experimental aitools tools query --warehouse $WAREHOUSE_ID " SELECT MIN(amount) as min_amount, MAX(amount) as max_amount, @@ -326,7 +326,7 @@ FROM parquet.\`${VOLUME_PATH}/orders\` " # 5. Check referential integrity -databricks aitools tools query --warehouse $WAREHOUSE_ID " +databricks experimental aitools tools query --warehouse $WAREHOUSE_ID " SELECT COUNT(*) as orphan_orders FROM parquet.\`${VOLUME_PATH}/orders\` o LEFT JOIN parquet.\`${VOLUME_PATH}/customers\` c ON o.customer_id = c.customer_id @@ -334,7 +334,7 @@ WHERE c.customer_id IS NULL " # 6. 
Verify date range -databricks aitools tools query --warehouse $WAREHOUSE_ID " +databricks experimental aitools tools query --warehouse $WAREHOUSE_ID " SELECT MIN(order_date) as min_date, MAX(order_date) as max_date FROM parquet.\`${VOLUME_PATH}/orders\` " diff --git a/experimental/databricks-unity-catalog/SKILL.md b/experimental/databricks-unity-catalog/SKILL.md index e790bfd..bc2a9c8 100644 --- a/experimental/databricks-unity-catalog/SKILL.md +++ b/experimental/databricks-unity-catalog/SKILL.md @@ -101,11 +101,11 @@ GROUP BY workspace_id, sku_name; ## SQL Queries via CLI -Use `databricks aitools tools query` for system table queries: +Use `databricks experimental aitools tools query` for system table queries: ```bash # Query lineage via CLI -databricks aitools tools query --warehouse WAREHOUSE_ID " +databricks experimental aitools tools query --warehouse WAREHOUSE_ID " SELECT source_table_full_name, target_table_full_name FROM system.access.table_lineage WHERE event_date >= current_date() - 7 diff --git a/experimental/databricks-unity-catalog/references/7-data-profiling.md b/experimental/databricks-unity-catalog/references/7-data-profiling.md index 0c64a80..930d29a 100644 --- a/experimental/databricks-unity-catalog/references/7-data-profiling.md +++ b/experimental/databricks-unity-catalog/references/7-data-profiling.md @@ -68,7 +68,7 @@ DROP QUALITY MONITOR catalog.schema.my_table; ### Execute via CLI ```bash -databricks aitools tools query --warehouse WAREHOUSE_ID " +databricks experimental aitools tools query --warehouse WAREHOUSE_ID " CREATE OR REPLACE QUALITY MONITOR catalog.schema.my_table OPTIONS (OUTPUT_SCHEMA 'catalog.schema') " diff --git a/experimental/databricks-vector-search/references/end-to-end-rag.md b/experimental/databricks-vector-search/references/end-to-end-rag.md index c402420..00959f9 100644 --- a/experimental/databricks-vector-search/references/end-to-end-rag.md +++ b/experimental/databricks-vector-search/references/end-to-end-rag.md @@ -6,7 
+6,7 @@ Build a complete Retrieval-Augmented Generation pipeline: prepare documents, cre | Command | Step | |---------|------| -| `databricks aitools tools query` | Create source table, insert documents | +| `databricks experimental aitools tools query` | Create source table, insert documents | | `databricks vector-search-endpoints create-endpoint` | Create compute endpoint | | `databricks vector-search-indexes create-index` | Create Delta Sync index with managed embeddings | | `databricks vector-search-indexes sync-index` | Trigger index sync | @@ -37,7 +37,7 @@ INSERT INTO catalog.schema.knowledge_base VALUES Or via CLI: ```bash -databricks aitools tools query --warehouse WAREHOUSE_ID " +databricks experimental aitools tools query --warehouse WAREHOUSE_ID " CREATE TABLE IF NOT EXISTS catalog.schema.knowledge_base ( doc_id STRING, title STRING, diff --git a/skills/databricks-apps/SKILL.md b/skills/databricks-apps/SKILL.md index 8d3c8c0..95f3066 100644 --- a/skills/databricks-apps/SKILL.md +++ b/skills/databricks-apps/SKILL.md @@ -168,7 +168,7 @@ npx @databricks/appkit docs ./docs/plugins/analytics.md # example: specific doc Optionally use `--version ` to target a specific AppKit version. - **Required**: `--name`, `--profile`. Name: ≤26 chars, lowercase letters/numbers/hyphens only. Use `--features` only for **optional** plugins the user wants (plugins with `requiredByTemplate: false` or absent); mandatory plugins must not be listed in `--features`. - **Resources**: Pass `--set` for every required resource (each field in `resources.required`) for (1) all plugins with `requiredByTemplate: true`, and (2) any optional plugins you added to `--features`. Add `--set` for `resources.optional` only when the user requests them. - - **Discovery**: Use the parent `databricks-core` skill to resolve IDs (e.g. warehouse: `databricks warehouses list --profile ` or `databricks aitools tools get-default-warehouse --profile `). 
+ - **Discovery**: Use the parent `databricks-core` skill to resolve IDs (e.g. warehouse: `databricks warehouses list --profile ` or `databricks experimental aitools tools get-default-warehouse --profile `). **DO NOT guess** plugin names, resource keys, or property names — always derive them from `databricks apps manifest` output. Example: if the manifest shows plugin `analytics` with a required resource `resourceKey: "sql-warehouse"` and `fields: { "id": ... }`, include `--set analytics.sql-warehouse.id=`. diff --git a/skills/databricks-apps/references/appkit/genie.md b/skills/databricks-apps/references/appkit/genie.md index 63ab37f..21734d1 100644 --- a/skills/databricks-apps/references/appkit/genie.md +++ b/skills/databricks-apps/references/appkit/genie.md @@ -44,14 +44,14 @@ databricks genie get-space --include-serialized-space --profile +databricks experimental aitools tools get-default-warehouse --profile ``` ## Scaffolding a New Genie App ```bash # 1. Discover warehouse -databricks aitools tools get-default-warehouse --profile +databricks experimental aitools tools get-default-warehouse --profile # 2. Create Genie space (see syntax above) databricks genie create-space '' \ diff --git a/skills/databricks-core/SKILL.md b/skills/databricks-core/SKILL.md index 3e3715f..b41f8df 100644 --- a/skills/databricks-core/SKILL.md +++ b/skills/databricks-core/SKILL.md @@ -61,13 +61,13 @@ databricks apps list # profile not set! 
```bash # discover table structure (columns, types, sample data, stats) -databricks aitools tools discover-schema catalog.schema.table --profile +databricks experimental aitools tools discover-schema catalog.schema.table --profile # run ad-hoc SQL queries -databricks aitools tools query "SELECT * FROM table LIMIT 10" --profile +databricks experimental aitools tools query "SELECT * FROM table LIMIT 10" --profile # find the default warehouse -databricks aitools tools get-default-warehouse --profile +databricks experimental aitools tools get-default-warehouse --profile ``` See [Data Exploration](data-exploration.md) for details. @@ -100,8 +100,8 @@ databricks tables get ..
--profile # databricks schemas list --catalog-name ← WILL FAIL # databricks tables list --catalog ← WILL FAIL # databricks sql-warehouses list ← doesn't exist, use `warehouses list` -# databricks execute-statement ← doesn't exist, use `aitools tools query` -# databricks sql execute ← doesn't exist, use `aitools tools query` +# databricks execute-statement ← doesn't exist, use `experimental aitools tools query` +# databricks sql execute ← doesn't exist, use `experimental aitools tools query` # When in doubt, check help: # databricks schemas list --help diff --git a/skills/databricks-core/data-exploration.md b/skills/databricks-core/data-exploration.md index 329cd1c..5908cc5 100644 --- a/skills/databricks-core/data-exploration.md +++ b/skills/databricks-core/data-exploration.md @@ -10,17 +10,17 @@ Use `information_schema` to search for tables by keyword — do NOT manually ite ```bash # Find tables matching a keyword -databricks aitools tools query \ +databricks experimental aitools tools query \ "SELECT table_catalog, table_schema, table_name FROM system.information_schema.tables WHERE table_name LIKE '%keyword%'" \ --profile # Then discover schema for the tables you found -databricks aitools tools discover-schema catalog.schema.table1 catalog.schema.table2 --profile +databricks experimental aitools tools discover-schema catalog.schema.table1 catalog.schema.table2 --profile ``` ## Overview -The `databricks aitools tools` command group provides tools for data discovery and exploration: +The `databricks experimental aitools tools` command group provides tools for data discovery and exploration: - **discover-schema**: Batch discover table metadata, columns, types, sample data, and statistics - **query**: Execute SQL queries against Databricks SQL warehouses @@ -43,7 +43,7 @@ Batch discover table metadata including columns, types, sample data, and null co ### Command Syntax ```bash -databricks aitools tools discover-schema TABLE... 
[flags] +databricks experimental aitools tools discover-schema TABLE... [flags] ``` Tables must be specified in **CATALOG.SCHEMA.TABLE** format. @@ -60,16 +60,16 @@ For each table, returns: ```bash # Discover schema for a single table -databricks aitools tools discover-schema samples.nyctaxi.trips --profile my-workspace +databricks experimental aitools tools discover-schema samples.nyctaxi.trips --profile my-workspace # Discover schema for multiple tables -databricks aitools tools discover-schema \ +databricks experimental aitools tools discover-schema \ catalog.schema.table1 \ catalog.schema.table2 \ --profile my-workspace # Get JSON output -databricks aitools tools discover-schema \ +databricks experimental aitools tools discover-schema \ samples.nyctaxi.trips \ --output json \ --profile my-workspace @@ -79,12 +79,12 @@ databricks aitools tools discover-schema \ 1. **Understanding table structure before querying** ```bash - databricks aitools tools discover-schema catalog.schema.customer_data --profile my-workspace + databricks experimental aitools tools discover-schema catalog.schema.customer_data --profile my-workspace ``` 2. **Comparing schemas across multiple tables** ```bash - databricks aitools tools discover-schema \ + databricks experimental aitools tools discover-schema \ catalog.schema.table_v1 \ catalog.schema.table_v2 \ --profile my-workspace @@ -100,7 +100,7 @@ Execute SQL statements against a Databricks SQL warehouse and return results. 
### Command Syntax ```bash -databricks aitools tools query "SQL" [flags] +databricks experimental aitools tools query "SQL" [flags] ``` ### Warehouse Selection @@ -112,7 +112,7 @@ The command **auto-detects** an available warehouse unless: To check which warehouse will be used: ```bash # Get the default warehouse that would be auto-detected -databricks aitools tools get-default-warehouse --profile my-workspace +databricks experimental aitools tools get-default-warehouse --profile my-workspace ``` ### Output @@ -126,23 +126,23 @@ Returns: ```bash # Simple SELECT query -databricks aitools tools query \ +databricks experimental aitools tools query \ "SELECT * FROM samples.nyctaxi.trips LIMIT 5" \ --profile my-workspace # Aggregation query -databricks aitools tools query \ +databricks experimental aitools tools query \ "SELECT vendor_id, COUNT(*) as trip_count FROM samples.nyctaxi.trips GROUP BY vendor_id" \ --profile my-workspace # With JSON output -databricks aitools tools query \ +databricks experimental aitools tools query \ "SELECT * FROM catalog.schema.table WHERE date > '2024-01-01'" \ --output json \ --profile my-workspace # Using specific warehouse -DATABRICKS_WAREHOUSE_ID=abc123 databricks aitools tools query \ +DATABRICKS_WAREHOUSE_ID=abc123 databricks experimental aitools tools query \ "SELECT * FROM samples.nyctaxi.trips LIMIT 10" \ --profile my-workspace ``` @@ -152,17 +152,17 @@ DATABRICKS_WAREHOUSE_ID=abc123 databricks aitools tools query \ 1. 
**Exploratory data analysis** ```bash # Check table size - databricks aitools tools query \ + databricks experimental aitools tools query \ "SELECT COUNT(*) FROM catalog.schema.table" \ --profile my-workspace # View sample data - databricks aitools tools query \ + databricks experimental aitools tools query \ "SELECT * FROM catalog.schema.table LIMIT 10" \ --profile my-workspace # Get column statistics - databricks aitools tools query \ + databricks experimental aitools tools query \ "SELECT MIN(column), MAX(column), AVG(column) FROM catalog.schema.table" \ --profile my-workspace ``` @@ -170,12 +170,12 @@ DATABRICKS_WAREHOUSE_ID=abc123 databricks aitools tools query \ 2. **Data validation** ```bash # Check for null values - databricks aitools tools query \ + databricks experimental aitools tools query \ "SELECT COUNT(*) FROM catalog.schema.table WHERE column IS NULL" \ --profile my-workspace # Verify data freshness - databricks aitools tools query \ + databricks experimental aitools tools query \ "SELECT MAX(timestamp_column) FROM catalog.schema.table" \ --profile my-workspace ``` @@ -183,7 +183,7 @@ DATABRICKS_WAREHOUSE_ID=abc123 databricks aitools tools query \ 3. **Quick analytics** ```bash # Group by analysis - databricks aitools tools query \ + databricks experimental aitools tools query \ "SELECT category, COUNT(*), AVG(value) FROM catalog.schema.table GROUP BY category" \ --profile my-workspace ``` @@ -194,12 +194,12 @@ Here's a typical workflow combining both commands: ```bash # 1. Discover the schema first -databricks aitools tools discover-schema \ +databricks experimental aitools tools discover-schema \ samples.nyctaxi.trips \ --profile my-workspace # 2. 
Based on discovered columns, run targeted queries -databricks aitools tools query \ +databricks experimental aitools tools query \ "SELECT vendor_id, payment_type, COUNT(*) as trips, AVG(fare_amount) as avg_fare FROM samples.nyctaxi.trips GROUP BY vendor_id, payment_type @@ -208,7 +208,7 @@ databricks aitools tools query \ --profile my-workspace # 3. Investigate specific patterns found in the data -databricks aitools tools query \ +databricks experimental aitools tools query \ "SELECT * FROM samples.nyctaxi.trips WHERE fare_amount > 100 LIMIT 20" \ @@ -221,15 +221,15 @@ Remember that each Bash command in Claude Code runs in a separate shell: ```bash # ✅ RECOMMENDED: Use --profile flag -databricks aitools tools discover-schema samples.nyctaxi.trips --profile my-workspace +databricks experimental aitools tools discover-schema samples.nyctaxi.trips --profile my-workspace # ✅ ALTERNATIVE: Chain with && export DATABRICKS_CONFIG_PROFILE=my-workspace && \ - databricks aitools tools query "SELECT * FROM samples.nyctaxi.trips LIMIT 5" + databricks experimental aitools tools query "SELECT * FROM samples.nyctaxi.trips LIMIT 5" # ❌ DOES NOT WORK: Separate export export DATABRICKS_CONFIG_PROFILE=my-workspace -databricks aitools tools query "SELECT * FROM samples.nyctaxi.trips LIMIT 5" +databricks experimental aitools tools query "SELECT * FROM samples.nyctaxi.trips LIMIT 5" ``` ## Flags @@ -264,7 +264,7 @@ Both commands support: **Solution**: 1. Check for default warehouse: ```bash - databricks aitools tools get-default-warehouse --profile my-workspace + databricks experimental aitools tools get-default-warehouse --profile my-workspace ``` 2. List available warehouses: ```bash @@ -272,7 +272,7 @@ Both commands support: ``` 3. Set specific warehouse: ```bash - DATABRICKS_WAREHOUSE_ID= databricks aitools tools query "SELECT 1" --profile my-workspace + DATABRICKS_WAREHOUSE_ID= databricks experimental aitools tools query "SELECT 1" --profile my-workspace ``` 4. 
Start a stopped warehouse: ```bash @@ -310,12 +310,12 @@ Both commands support: 2. **Use LIMIT for exploration** - When exploring large tables, always use LIMIT to avoid long-running queries: ```bash - databricks aitools tools query "SELECT * FROM large_table LIMIT 100" --profile my-workspace + databricks experimental aitools tools query "SELECT * FROM large_table LIMIT 100" --profile my-workspace ``` 3. **JSON output for parsing** - Use `--output json` when you need to process results programmatically: ```bash - databricks aitools tools query "SELECT * FROM table" --output json --profile my-workspace | jq '.results' + databricks experimental aitools tools query "SELECT * FROM table" --output json --profile my-workspace | jq '.results' ``` 4. **Check table existence** - Before querying, verify the table exists: diff --git a/skills/databricks-pipelines/references/workflows.md b/skills/databricks-pipelines/references/workflows.md index f9f4428..671157c 100644 --- a/skills/databricks-pipelines/references/workflows.md +++ b/skills/databricks-pipelines/references/workflows.md @@ -295,7 +295,7 @@ databricks pipelines start-update Even on `COMPLETED`, verify the data: ```bash -databricks aitools tools discover-schema \ +databricks experimental aitools tools discover-schema \ my_catalog.my_schema.bronze_orders \ my_catalog.my_schema.silver_orders \ my_catalog.my_schema.gold_summary