65 changes: 65 additions & 0 deletions skills/databricks-dabs/references/bundle-structure.md
@@ -99,6 +99,54 @@ resources:
timezone_id: 'America/Los_Angeles'
```

### Job Resource (Shared Classic Cluster)

When multiple tasks should reuse the same cluster, declare it once under `job_clusters` and reference it via `job_cluster_key`:

```yaml
resources:
  jobs:
    sample_job:
      name: sample_job
      tasks:
        - task_key: notebook_task
          notebook_task:
            notebook_path: ../src/sample_notebook.ipynb
          job_cluster_key: job_cluster
          libraries:
            - whl: ../dist/*.whl

        - task_key: main_task
          depends_on:
            - task_key: notebook_task
          python_wheel_task:
            package_name: my_project
            entry_point: main
          job_cluster_key: job_cluster
          libraries:
            - whl: ../dist/*.whl

      job_clusters:
        - job_cluster_key: job_cluster
          new_cluster:
            spark_version: 16.4.x-scala2.12
            node_type_id: i3.xlarge
            data_security_mode: SINGLE_USER
            autoscale:
              min_workers: 1
              max_workers: 4
```

## Registered Model Resources

```yaml
resources:
  registered_models:
    customer_churn:
      name: '${var.catalog}.${var.schema}.customer_churn_model'
      description: 'Customer churn prediction model'
```

## Volume Resources

```yaml
@@ -177,6 +225,23 @@ databricks bundle generate app <app-name>

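DABs also supports schemas, models, experiments, clusters, warehouses, and other resource types. Run `databricks bundle schema` to print the JSON Schema for bundle configuration and check which fields each resource accepts.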

## Substitutions

Substitutions are resolved at deploy time and are usable in any string field across `databricks.yml`, resource files, and variable defaults.

| Substitution | Resolves to |
| --------------------------------------- | ------------------------------------------------------ |
| `${var.my_variable}` | User-defined variable from `variables:` block |
| `${bundle.name}` | The bundle's `bundle.name` |
| `${bundle.target}` | The active target (`dev`, `staging`, `prod`, …) |
| `${workspace.current_user.userName}` | Deployer's email |
| `${workspace.current_user.short_name}` | Deployer's short name (handle before `@`) |
| `${workspace.file_path}` | Bundle's workspace file path |
| `${resources.jobs.<key>.id}` | ID of another job in the same bundle |
| `${resources.pipelines.<key>.id}` | ID of another pipeline in the same bundle |

Variables themselves are declared in `databricks.yml` (with optional `default:` or `lookup:`) and overridden per target.
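
As a rough illustration of the declaration and override pattern described above, the sketch below declares one variable with a `default:`, one with a `lookup:`, overrides the first in a `prod` target, and then uses both kinds of substitution in a job. The bundle name, variable names, warehouse name, and job key are hypothetical placeholders, not values from this repository.

```yaml
# databricks.yml (illustrative sketch; all names below are assumptions)
bundle:
  name: my_project

variables:
  catalog:
    description: Catalog used by resources in this bundle
    default: dev_catalog            # used when a target does not override it
  warehouse_id:
    description: SQL warehouse resolved by display name
    lookup:
      warehouse: 'Shared Warehouse' # hypothetical warehouse name

targets:
  dev:
    default: true                   # inherits catalog = dev_catalog
  prod:
    variables:
      catalog: prod_catalog         # per-target override

resources:
  jobs:
    sample_job:
      name: 'sample_job_${bundle.target}'
      parameters:
        - name: catalog
          default: ${var.catalog}
```

With this layout, deploying to `dev` would resolve `${var.catalog}` to `dev_catalog`, while deploying with `-t prod` would resolve it to `prod_catalog`. A real `prod` target would normally also set workspace details, which are omitted here for brevity.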

## Key Principles

1. **Path resolution**: `../src/` in resources/\*.yml, `./src/` in databricks.yml
43 changes: 43 additions & 0 deletions skills/databricks-dabs/references/deploy-and-run.md
@@ -1,5 +1,39 @@
# Deploy and Run Declarative Automation Bundles

## Initialization

Start a new bundle interactively:

```bash
databricks bundle init
```

Built-in templates:

| Template | Use for |
| -------------------- | -------------------------------------------------- |
| `default-python` | Python project with jobs and a pipeline |
| `default-sql` | SQL project with jobs |
| `default-scala` | Scala/Java project |
| `lakeflow-pipelines` | Lakeflow Declarative Pipelines (Python or SQL) |
| `dbt-sql` | dbt integration |
| `default-minimal` | Minimal bundle skeleton |

Pass a template name or a Git URL pointing at a template directory to skip the interactive picker.

## Generate from Existing Resources

If a workspace already has the resource, generate its bundle YAML instead of writing it by hand:

```bash
databricks bundle generate job <job-id>
databricks bundle generate pipeline <pipeline-id>
databricks bundle generate dashboard <dashboard-id>
databricks bundle generate app <app-name>
```

This writes a resource file under `resources/` plus any referenced source assets.

## Validation

Validate bundle configuration:
@@ -32,6 +66,13 @@ Run resources:

View status: `bundle summary`

## Destroy

`bundle destroy` removes everything the bundle previously deployed to the target workspace. It is destructive; confirm the target before running it.

- `bundle destroy -t dev`
- `bundle destroy -t prod`

## Monitoring and Logs

```bash
@@ -56,3 +97,5 @@ databricks apps logs <app-name> --profile <profile-name>
| **App not starting after deploy** | Apps require `databricks bundle run <resource_key>` to start |
| **App env vars not working** | Environment variables go in `app.yaml` (source dir), not databricks.yml |
| **Debugging any app issue** | First step: `databricks apps logs <app-name>` |
| **Variable shows as `${var.name}` literal** | Variable not declared in `databricks.yml` `variables:`, missing from the active target, or wrong syntax (use `${var.<name>}`) |
| **Validation errors unclear** | Re-run with `databricks bundle validate --strict --debug` |