Enable GDS and nsys for cluster usage #299
Closed
kingcrimsontianyu wants to merge 25 commits into
Superseded by #330
rapids-bot Bot pushed a commit that referenced this pull request on May 19, 2026
This PR adds optional capabilities to the Presto-Velox TPC-H benchmark runner on the NVL72 EPG cluster, all controlled by new flags on `launch-run.sh`. Default behavior is unchanged except that GDS is now on by default.

## GDS I/O (`--disable-gds` to opt out, on by default)

Workers run with `KVIKIO_COMPAT_MODE=OFF` so that KvikIO uses GPUDirect Storage (GDS). With `--disable-gds`, workers fall back to POSIX I/O via KvikIO compatibility mode.

## Tunable worker env vars (`--worker-env-file`)

Environment variables to be set in each worker container can now be declared in a sourced file rather than buried in the bash scripts. Defaults live in `worker.env` (currently `KVIKIO_TASK_SIZE=16MiB` and `KVIKIO_NTHREADS=16`); override the path with `--worker-env-file`.

## nsys profiling (`-p, --profile`)

Captures one `.nsys-rep` file per query for a single worker (selectable via `--nsys-worker-id`). The worker image must include the `nsys` CLI. After pytest exits, the slurm job waits up to 10 minutes for nsys to finish flushing reports before tearing down the containers.

## Metrics collection (`-m, --metrics`)

After each query, pytest pulls per-query stats from the coordinator's REST API and writes them to `result_dir/metrics/<query>.json`.

## Nsys report and metrics uploading

Updates `post_results.py` so that nsys reports and metrics can be uploaded to the online database. Because nsys reports are large, they are uploaded via S3.

## Other changes

- New `-q, --queries LIST` flag forwards a comma-separated query list through to pytest, which is useful for narrowing profile and metrics runs.
- README updated with full parameter documentation for `launch-run.sh`.
- `run_benchmark.sh` gains `--profile-script-path` so that the slurm path can supply its own profiler functions instead of the docker default.
This PR supersedes #299

Authors:
- Tianyu Liu (https://github.com/kingcrimsontianyu)
- Karthikeyan (https://github.com/karthikeyann)

Approvers:
- Tom Augspurger (https://github.com/TomAugspurger)
- Karthikeyan (https://github.com/karthikeyann)

URL: #330
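To illustrate how the GDS toggle and the sourced worker env file described above might combine, here is a minimal sketch. It is not the code from this PR: `build_worker_env` is a hypothetical helper, the demo file is a temporary stand-in for `worker.env`, and `KVIKIO_COMPAT_MODE=ON` is an assumed value for the POSIX fallback (the description only states that `OFF` selects GDS).

```shell
#!/bin/sh
# Sketch only: build_worker_env is a hypothetical helper, not part of launch-run.sh.
set -eu

# Print the KVIKIO_* variables a worker would see.
#   $1: path to an env file of KEY=VALUE lines (stand-in for worker.env)
#   $2: 1 to mimic --disable-gds, 0 for the default (GDS on)
build_worker_env() {
  (
    set -a            # export every variable assigned while sourcing
    . "$1"
    set +a
    if [ "$2" = "1" ]; then
      export KVIKIO_COMPAT_MODE=ON    # assumed value for the POSIX fallback
    else
      export KVIKIO_COMPAT_MODE=OFF   # GDS path, as stated in the description
    fi
    env | grep '^KVIKIO_' | sort
  )
}

# Temporary file mirroring the defaults quoted in the description.
demo_env="$(mktemp)"
printf 'KVIKIO_TASK_SIZE=16MiB\nKVIKIO_NTHREADS=16\n' > "$demo_env"

gds_env="$(build_worker_env "$demo_env" 0)"
posix_env="$(build_worker_env "$demo_env" 1)"
rm -f "$demo_env"

printf 'default:\n%s\n\n--disable-gds:\n%s\n' "$gds_env" "$posix_env"
```

Sourcing inside a subshell keeps the variables out of the caller's environment, which is one way to avoid leaking worker-only settings into the launcher; the actual scripts may pass them to the containers differently.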
No description provided.