When running tests remotely on Databricks, you need to ensure that required libraries (including dbx_test itself) are available on the cluster. This guide shows you how to configure library installation.
Add the libraries configuration to your `config/test_config.yml`:

```yaml
cluster:
  libraries:
    - pypi:
        package: "dbx_test==0.1.0"
```

That's it! The dbx_test library will now be automatically installed on the cluster when running tests.
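Once parsed, that YAML becomes a list of single-key mappings. A minimal sketch of a validator for that shape can make the expected structure explicit (`validate_libraries` is a hypothetical helper for illustration, not part of dbx_test's actual API):

```python
# Sketch: validate the shape of a `cluster.libraries` list after YAML parsing.
# `validate_libraries` is a hypothetical helper, not part of dbx_test itself.

ALLOWED_KEYS = {"pypi", "whl", "jar", "maven"}

def validate_libraries(libraries: list) -> list[str]:
    """Return a list of problems; an empty list means the config looks well-formed."""
    problems = []
    for i, entry in enumerate(libraries):
        if not isinstance(entry, dict) or len(entry) != 1:
            problems.append(f"entry {i}: expected exactly one of {sorted(ALLOWED_KEYS)}")
            continue
        key, value = next(iter(entry.items()))
        if key not in ALLOWED_KEYS:
            problems.append(f"entry {i}: unknown library type {key!r}")
        elif key == "pypi" and not (isinstance(value, dict) and "package" in value):
            problems.append(f"entry {i}: pypi entries need a 'package' field")
    return problems

# The snippet above, as Python data:
libraries = [{"pypi": {"package": "dbx_test==0.1.0"}}]
print(validate_libraries(libraries))  # []
```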
If your package is published on PyPI:

```yaml
cluster:
  libraries:
    - pypi:
        package: "dbx_test==0.1.0"
    - pypi:
        package: "pandas>=2.0.0"
    - pypi:
        package: "numpy>=1.24.0"
```

Upload your wheel file to DBFS or the workspace, then reference it:
```yaml
cluster:
  libraries:
    - whl: "dbfs:/FileStore/wheels/dbx_test-0.1.0-py3-none-any.whl"
    - whl: "/Workspace/Users/youruser@company.com/libs/custom_lib-1.0.0-py3-none-any.whl"
```

To upload a wheel file:
```bash
# Build the wheel
cd /Users/james.parham/development/dbx_test
python -m build

# Upload to DBFS
databricks fs cp dist/dbx_test-0.1.0-py3-none-any.whl dbfs:/FileStore/wheels/ --profile aws-west

# Or upload to the workspace
databricks workspace import dist/dbx_test-0.1.0-py3-none-any.whl /Workspace/Users/youruser@company.com/libs/dbx_test-0.1.0-py3-none-any.whl --profile aws-west
```

Install directly from a Git repository:
```yaml
cluster:
  libraries:
    - pypi:
        package: "git+https://github.com/jsparhamii/dbx_test.git"
    # Or a specific branch/tag
    - pypi:
        package: "git+https://github.com/jsparhamii/dbx_test.git@main"
    - pypi:
        package: "git+https://github.com/jsparhamii/dbx_test.git@v0.1.0"
```

For JVM dependencies, install from Maven coordinates:

```yaml
cluster:
  libraries:
    - maven:
        coordinates: "com.databricks:spark-xml_2.12:0.17.0"
```

Or reference a JAR file directly:

```yaml
cluster:
  libraries:
    - jar: "dbfs:/FileStore/jars/my-custom-lib.jar"
```

A complete configuration combining these options:

```yaml
# config/test_config.yml
workspace:
  profile: "aws-west"

cluster:
  # Libraries to install
  libraries:
    # Core testing framework
    - pypi:
        package: "dbx_test==0.1.0"
    # Other dependencies your tests need
    - pypi:
        package: "pandas>=2.0.0"
    - pypi:
        package: "pyspark>=3.5.0"
    # Custom wheel from DBFS
    - whl: "dbfs:/FileStore/wheels/my_custom_lib-1.0.0-py3-none-any.whl"
  # Use serverless compute (recommended)
  # Libraries will be installed automatically

execution:
  timeout: 600
  parallel: false

reporting:
  output_dir: ".dbx-test-results"
  formats:
    - "console"
    - "junit"
```

How it works:

- Configuration Loading: When you run `dbx_test run --remote`, the framework loads your config file
- Job Submission: The framework submits a notebook job to Databricks with the specified libraries
- Library Installation: Databricks automatically installs the libraries before running your tests
- Test Execution: Your test notebooks can now import and use the libraries
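The Databricks Jobs API expects libraries as an array of single-key objects, which closely mirrors the YAML shape above. A rough sketch of that translation follows; the real mapping is internal to dbx_test, so treat this as an illustration of the idea only:

```python
# Sketch: translate the test config's `cluster.libraries` list into the
# `libraries` field of a Jobs API job submission. The config shape mirrors
# the API shape closely, so this is mostly a pass-through with light checks.
# This is an illustration, not dbx_test's actual code.

def to_jobs_api_libraries(config: dict) -> list[dict]:
    out = []
    for entry in config.get("cluster", {}).get("libraries", []):
        (kind, spec), = entry.items()  # each entry is a single-key mapping
        if kind == "pypi":
            out.append({"pypi": {"package": spec["package"]}})
        elif kind in ("whl", "jar"):
            out.append({kind: spec})  # spec is a path string
        elif kind == "maven":
            out.append({"maven": {"coordinates": spec["coordinates"]}})
        else:
            raise ValueError(f"unknown library type: {kind}")
    return out

config = {
    "cluster": {
        "libraries": [
            {"pypi": {"package": "dbx_test==0.1.0"}},
            {"whl": "dbfs:/FileStore/wheels/my_lib-1.0.0-py3-none-any.whl"},
        ]
    }
}
print(to_jobs_api_libraries(config))
```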
If you're using Databricks Asset Bundles (DAB), you might already have libraries configured in your `databricks.yml`. You can still use dbx_test configuration for test-specific libraries:

```yaml
# databricks.yml (your bundle config)
resources:
  jobs:
    my_job:
      libraries:
        - pypi:
            package: "my-production-lib==1.0.0"
```

```yaml
# config/test_config.yml (your test config)
cluster:
  libraries:
    - pypi:
        package: "dbx_test==0.1.0"
    - pypi:
        package: "pytest>=7.0.0"
```

Once configured, just run your tests normally:

```bash
dbx_test run --remote \
  --tests-dir /Workspace/Users/james.parham@databricks.com/.bundle/my_app/dev/files/tests \
  --profile aws-west
```

The framework will:
- ✅ Load the library configuration from `config/test_config.yml`
- ✅ Submit the job with library installation instructions
- ✅ Wait for the libraries to install
- ✅ Run your tests with all libraries available
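The "wait for the libraries to install" step amounts to polling until installation reaches a terminal state or a timeout elapses. A generic sketch of such a wait loop, with a stand-in status callable (dbx_test's real polling is internal and may differ):

```python
import time

# Sketch: poll an installation-status callable until it reports a terminal
# state or a timeout elapses. `get_status` is a stand-in for whatever the
# framework actually queries; this is not dbx_test's implementation.

def wait_for_install(get_status, timeout=600, interval=5):
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status == "INSTALLED":
            return True
        if status == "FAILED":
            raise RuntimeError("library installation failed")
        time.sleep(interval)
    raise TimeoutError(f"libraries not installed within {timeout}s")

# Stubbed usage: the status reads PENDING twice, then INSTALLED.
states = iter(["PENDING", "PENDING", "INSTALLED"])
assert wait_for_install(lambda: next(states), timeout=30, interval=0) is True
```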
If a library import fails on the cluster:

**Cause**: The library is not configured or failed to install.

**Solution**:
- Check that your `config/test_config.yml` includes the library
- Run with `--verbose` to see the installation logs
- Verify the library format matches Databricks SDK expectations
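From inside a test notebook, you can confirm which distributions actually landed on the cluster before digging deeper. A small diagnostic sketch using only the standard library:

```python
from importlib import metadata

# Sketch: report installed versions for the libraries your tests expect.
# Run this in a notebook cell on the cluster to confirm installation.

def check_libraries(names):
    """Map each distribution name to its installed version, or None if missing."""
    report = {}
    for name in names:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report

# Example: a missing distribution shows up as None.
print(check_libraries(["definitely-not-installed-xyz"]))
# {'definitely-not-installed-xyz': None}
```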
If library installation times out:

**Cause**: Large libraries or a slow network.

**Solution**:

```yaml
execution:
  timeout: 1200  # Increase timeout to 20 minutes
```

If you hit version conflicts:

**Cause**: Multiple versions of the same library specified.

**Solution**:

```yaml
cluster:
  libraries:
    # Use specific versions
    - pypi:
        package: "pandas==2.0.3"  # Not "pandas>=2.0.0"
```

If a wheel file is not found:

**Cause**: Incorrect path, or the file was not uploaded.

**Solution**:

```bash
# Verify the file exists
databricks fs ls dbfs:/FileStore/wheels/ --profile aws-west

# Re-upload if needed
databricks fs cp dist/dbx_test-0.1.0-py3-none-any.whl \
  dbfs:/FileStore/wheels/ --profile aws-west --overwrite
```

A few best practices:

- **Pin Versions**: Use exact versions (`==`) for reproducibility.

  ```yaml
  - pypi:
      package: "dbx_test==0.1.0"  # Good
  # - pypi:
  #     package: "dbx_test"  # Bad: unpredictable
  ```

- **Use Wheel Files for Development**: Faster iteration for custom libraries.

  ```yaml
  - whl: "dbfs:/FileStore/wheels/dbx_test-0.1.0-py3-none-any.whl"
  ```

- **Group by Purpose**: Organize libraries logically.

  ```yaml
  libraries:
    # Testing framework
    - pypi:
        package: "dbx_test==0.1.0"
    # Data processing
    - pypi:
        package: "pandas==2.0.3"
    - pypi:
        package: "numpy==1.24.3"
    # Custom libraries
    - whl: "dbfs:/FileStore/wheels/my_lib-1.0.0-py3-none-any.whl"
  ```

- **Test Locally First**: Ensure the libraries work together.

  ```bash
  pip install dbx_test pandas numpy
  python -m pytest tests/
  ```
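The "pin versions" advice is easy to see concretely: a `>=` specifier accepts whatever is newest at install time, so two runs can resolve differently. A small sketch comparing version tuples (hand-rolled parsing for illustration only; real resolvers follow the full PEP 440 rules via the `packaging` library):

```python
# Sketch: why ">=2.0.0" is unpredictable while "==2.0.3" is not.
# Hand-rolled version parsing for illustration; real tools use the
# `packaging` library's full PEP 440 rules.

def parse(version: str) -> tuple[int, ...]:
    return tuple(int(part) for part in version.split("."))

def satisfies_at_least(candidate: str, floor: str) -> bool:
    return parse(candidate) >= parse(floor)

# Both of these satisfy "pandas>=2.0.0", so the version you get depends on
# what PyPI has available on the day the cluster installs libraries:
assert satisfies_at_least("2.0.3", "2.0.0")
assert satisfies_at_least("2.2.1", "2.0.0")

# An exact pin matches only itself:
assert parse("2.0.3") == parse("2.0.3")
assert parse("2.1.0") != parse("2.0.3")
```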
Install dbx_test from PyPI:

```yaml
cluster:
  libraries:
    - pypi:
        package: "dbx_test==0.1.0"
```

```bash
dbx_test run --remote --tests-dir /Workspace/Users/me/tests --profile prod
```

Install from a wheel file:

```bash
# Build the wheel
cd /path/to/dbx_test
python -m build

# Upload to DBFS
databricks fs cp dist/dbx_test-0.1.0-py3-none-any.whl \
  dbfs:/FileStore/wheels/ --profile aws-west --overwrite
```

```yaml
cluster:
  libraries:
    - whl: "dbfs:/FileStore/wheels/dbx_test-0.1.0-py3-none-any.whl"
```

```bash
dbx_test run --remote --tests-dir /Workspace/Users/me/tests --profile aws-west
```

Install from Git, pinned to a commit via an environment variable:

```yaml
cluster:
  libraries:
    - pypi:
        package: "git+https://github.com/jsparhamii/dbx_test.git@${GIT_COMMIT}"
```

A fuller configuration combining several sources:

```yaml
cluster:
  libraries:
    # Framework
    - pypi:
        package: "dbx_test==0.1.0"
    # Data science stack
    - pypi:
        package: "pandas==2.0.3"
    - pypi:
        package: "numpy==1.24.3"
    - pypi:
        package: "scikit-learn==1.3.0"
    # Databricks libraries
    - pypi:
        package: "databricks-sdk>=0.20.0"
    # Custom wheel
    - whl: "dbfs:/FileStore/wheels/company_lib-2.1.0-py3-none-any.whl"
```

To ensure dbx_test is available when running remote tests:
- Configure libraries in `config/test_config.yml`
- Choose an installation method: PyPI, wheel, or Git
- Run tests with the `--remote` flag
- Libraries auto-install on the cluster

The framework handles the rest!