- This project uses uv. Do not use the `uv pip` wrapper; only use uv-native commands like `uv add`.
- Always run Python using `uv run <name-of-file>`. No need to activate a venv.
- Use the `logging` module instead of `print()`; use the logger created in `common/logging_config.py`.
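A minimal sketch of the logging pattern, assuming `common/logging_config.py` exposes a configured logger (the import path and logger name below are assumptions, not the real module's API):

```python
import logging

# Hypothetical stand-in: in this project you would import the shared logger,
# e.g. from common.logging_config, instead of configuring one here.
logger = logging.getLogger("benchmark")

logger.info("starting query batch")   # use this instead of print("starting query batch")
logger.warning("query slower than expected")
```

The point is that all output goes through the one shared logger so formatting and handlers stay consistent across modules.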
- Use the Snowflake CLI (`snow`) to run test queries or to do any setup. For example, if you need to figure out if your suggested SQL is valid, just run it real quick using `snow sql -q ...`.
- When using the Snowflake CLI, use the `--connection` flag with the value specified in the `.env` file's `SNOWFLAKE_CONNECTION` variable.
- The goal is to create a Python file that will run multiple queries against Snowflake and Databricks.
- The overall project plan is at `project_plan.md`.
- dbx is short for Databricks.
- Run `source ~/.zshrc` to get the Databricks credentials loaded.
- Any time you need to connect to Snowflake or dbx with Python, use the global connection modules in the common folder: `connections/databricks_connection.py` and `connections/snowflake_connection.py`.
- Whenever I mention duckDB, I mean the file called `benchmark_results.duckdb` in the project root.
- In duckdb, the schema we use is always `benchmark_results.main`.
- If duckdb is locked when you try to use it, just stop and ask me to close it. Don't start doing crazy workarounds.
- After you write a bunch of Python code, run `uvx ruff check --fix` to make sure unused imports and other linting issues are cleaned up. Please inspect the output of this command to see if there were any unfixable violations, and help fix them.
- Separation of Concerns: benchmark modules are split into:
  - `warehouse_manager.py` - warehouse lifecycle (create, destroy, suspend/resume)
  - `query_executor.py` - query execution and metrics collection
  - `benchmark.py` - orchestration using the managers
- Scenario Support: warehouse names include the scenario (e.g. `BENCHMARK_WH_XLARGE_NORMAL_001`) to prevent conflicts when running multiple scenarios with the same run_id.
- Run Type Classification: queries are classified as `cold` (first query), `semi-warm` (new query on a warm warehouse), or `warm` (repeated query).
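The naming and classification rules above can be sketched like this (helper names and the exact logic in `warehouse_manager.py` / `query_executor.py` may differ; the naming pattern is inferred from the single example name):

```python
def warehouse_name(size: str, scenario: str, run_id: int) -> str:
    # Assumed pattern, inferred from BENCHMARK_WH_XLARGE_NORMAL_001.
    return f"BENCHMARK_WH_{size.upper()}_{scenario.upper()}_{run_id:03d}"


def classify_run(query_id: str, seen: set, warehouse_warm: bool) -> str:
    # Sketch of the cold / semi-warm / warm rule described above.
    if not warehouse_warm:
        return "cold"        # first query, warehouse not yet warm
    if query_id not in seen:
        return "semi-warm"   # new query, but the warehouse is already warm
    return "warm"            # repeated query on a warm warehouse


print(warehouse_name("xlarge", "normal", 1))   # BENCHMARK_WH_XLARGE_NORMAL_001
print(classify_run("q1", set(), False))        # cold
print(classify_run("q2", {"q1"}, True))        # semi-warm
print(classify_run("q1", {"q1"}, True))        # warm
```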
- Always run tests after refactoring: `./run_tests.sh` or `uv run pytest tests/ -v`.
- Keep tests updated when changing core logic (warehouse naming, scenario handling, run types).
- Test coverage: 34 tests covering warehouse managers, query executors, and scenario integration
- Tests use mocks - no database connections required
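A sketch of the mock-based style, with `unittest.mock.MagicMock` standing in for a real connection (the helper and its call pattern are illustrative, not the project's actual API):

```python
from unittest.mock import MagicMock


def run_query(conn, sql):
    # Illustrative helper: executes a query and returns its rows.
    conn.execute(sql)
    return conn.fetchall()


def test_run_query_never_touches_a_database():
    conn = MagicMock()                       # fake connection, no Snowflake/dbx needed
    conn.fetchall.return_value = [("ok",)]   # canned result
    assert run_query(conn, "SELECT 1") == [("ok",)]
    conn.execute.assert_called_once_with("SELECT 1")


test_run_query_never_touches_a_database()
```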
- Run `uv run dbt build --profiles-dir .` in the transformations directory to rebuild all views if needed. Views don't really need to be updated unless something in the SQL changed.