Merged
27 commits
81f711b
slurm: rebuild presto-nvl72 launcher layer from SpaceMicePOC
misiugodfrey Apr 19, 2026
3238142
config: extend coordinator query.max-execution-time to 60m
misiugodfrey Apr 19, 2026
5203575
generate_presto_config: fix CPU multi-worker config generation
misiugodfrey Apr 19, 2026
ec58e17
py_env: pin ncurses<6.5 to work around conda symlink conflict on aarch64
misiugodfrey Apr 19, 2026
d8f5505
post_results: capture node_count, image_digest, engine repo/branch me…
misiugodfrey Apr 19, 2026
151a578
slurm: set worker LD_LIBRARY_PATH inside container instead of via pyxis
misiugodfrey Apr 19, 2026
13e1071
slurm: stage static jq for coord, drop validate_results from run_queries
misiugodfrey Apr 19, 2026
9ab7a2d
slurm: zero gpu_count/gpu_name in injected metadata when cudf is disa…
misiugodfrey Apr 19, 2026
160c361
slurm: add shareable Hive metastore snapshot layer
misiugodfrey Apr 19, 2026
0d3b578
slurm: forward HIVE_METASTORE_{VERSION,SHARED_ROOT} through launchers
misiugodfrey Apr 19, 2026
677e5c3
slurm: export shared-metastore vars from slurm scripts to child bash
misiugodfrey Apr 19, 2026
c28013d
slurm: adopt the published cluster-wide metastore as the default
misiugodfrey Apr 20, 2026
a6ac563
slurm: consolidate IMAGES_DIR into IMAGE_DIR; add --overwrite to pull…
misiugodfrey Apr 20, 2026
4084e50
slurm: let Slurm pick any available node for run_interactive.sh
misiugodfrey Apr 20, 2026
f3c4b8a
slurm: drop DEFAULT_SINGLE_NODE from defaults.env
misiugodfrey Apr 20, 2026
86cc385
slurm: drop DEFAULT_NODELIST; let launch-run.sh pick any available nodes
misiugodfrey Apr 20, 2026
1acd024
slurm: correct GPU-to-NUMA mapping for GB200 NVL72 workers
misiugodfrey Apr 20, 2026
77e7195
slurm: drop 4 unused helpers from functions.sh
misiugodfrey Apr 20, 2026
18f0e59
remove run_multiple.sh
misiugodfrey Apr 20, 2026
5b07779
Fix path to enroot-decompress.sh
quasiben Apr 27, 2026
ec1ec4b
Review comments
misiugodfrey Apr 28, 2026
79adceb
Merge branch 'main' into misiug/slurm-nvl72
misiugodfrey Apr 28, 2026
02725ad
cleanup
misiugodfrey Apr 28, 2026
48a09d8
Merge branch 'misiug/slurm-nvl72' of https://github.com/rapidsai/velo…
misiugodfrey Apr 28, 2026
1675daf
fixed lib path for ucx deps updates
misiugodfrey Apr 28, 2026
f5978f2
pre-commit
misiugodfrey Apr 28, 2026
b6b2f41
Remove references to specific paths
misiugodfrey Apr 28, 2026
71 changes: 66 additions & 5 deletions benchmark_reporting_tools/post_results.py
@@ -93,10 +93,12 @@ class BenchmarkMetadata:
kind: str | None = None
execution_number: int = 1
worker_count: int | None = None
+ node_count: int | None = None
Contributor:
Would it make sense to also capture the number of workers per node?

Contributor, Author:
I'm not sure a "number of workers per node" field adds much, since we already record the number of workers and the number of nodes. Unless workers can be split unevenly across nodes (not currently a supported use case), that value is always derivable from the existing fields.

Contributor:
For instance, if we have 6 workers and 2 nodes, is there a guarantee that each node would have 3 workers, or can it be 4 workers on one node and 2 on another node (assuming each node has 4 GPUs)?

Contributor, Author:
Right now it's guaranteed to be an even split. In CPU mode you always get one worker per node. In GPU mode you always get NUM_GPUS_PER_NODE workers per node (default 4).

The launch script requires you to specify the number of nodes (-n) and, optionally, the number of GPUs (workers) per node (-g). The worker count is computed from these two values, which guarantees an even split of workers across nodes.
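
The even-split rule described in this thread can be sketched in a few lines of Python. This is only an illustration, not the launch script's code: the function names `compute_worker_count` and `workers_per_node` and their parameters are hypothetical, and the only facts taken from the thread are the `-n`/`-g` semantics, one worker per node in CPU mode, and the default of 4 GPUs per node.

```python
def compute_worker_count(num_nodes: int, gpus_per_node: int = 4, gpu_mode: bool = True) -> int:
    """Total workers for a run (hypothetical helper).

    CPU mode: one worker per node. GPU mode: one worker per GPU on each node.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    if gpu_mode and gpus_per_node < 1:
        raise ValueError("gpus_per_node must be at least 1 in GPU mode")
    per_node = gpus_per_node if gpu_mode else 1
    return num_nodes * per_node


def workers_per_node(worker_count: int, node_count: int) -> int:
    """Derive workers per node from the two fields that are already recorded."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    if worker_count % node_count != 0:
        # An uneven split cannot be produced by the rule above.
        raise ValueError("worker_count is not an even multiple of node_count")
    return worker_count // node_count
```

Under these assumptions, the 6-worker, 2-node case raised above would come from `-n 2 -g 3` and always yields 3 workers per node; a 4/2 split has no way to be expressed.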

scale_factor: int | None = None
gpu_count: int | None = None
num_drivers: int | None = None
gpu_name: str | None = None
+ image_digest: str | None = None
Contributor:
I believe, by default, the images are deleted after 30 days. Is this information that we want to persist in the database?

Contributor, Author:
Right now the benchmarking DB requires an identifier for run submissions, and we default to the image_digest. Even if the image itself is later deleted, I think the digest is still useful for identifying it uniquely.
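
The defaulting behaviour mentioned here (explicit `--identifier-hash` first, then the recorded `image_digest`) can be sketched as below. This is a simplified stand-in, not the code in `post_results.py`: `RunContext` and `resolve_identifier_hash` are hypothetical names, and the sketch raises an exception where the PR prints an error and returns exit code 1.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class RunContext:
    """Hypothetical stand-in for the metadata parsed from benchmark_result.json."""

    image_digest: str | None = None


def resolve_identifier_hash(cli_value: str | None, context: RunContext) -> str:
    """Prefer the CLI value; otherwise fall back to the recorded image digest."""
    if cli_value is not None:
        return cli_value
    if context.image_digest is not None:
        return context.image_digest
    # Neither source is available, so the submission cannot be identified.
    raise ValueError("no --identifier-hash given and no image_digest recorded")
```

Because the digest is stored with the results rather than looked up in a registry, the fallback keeps working after the image has been deleted.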


@classmethod
def from_parsed(cls, raw: dict) -> "BenchmarkMetadata":
@@ -259,8 +261,9 @@ def _parse_args() -> argparse.Namespace:
)
parser.add_argument(
"--identifier-hash",
- help="Unique identifier hash for software environment (e.g. a container image digest).",
- required=True,
+ default=None,
+ help="Unique identifier hash for software environment (e.g. a container image digest). "
+ "If omitted, the image_digest from benchmark_result.json context is used.",
)
parser.add_argument(
"--version",
@@ -299,6 +302,26 @@ def _parse_args() -> argparse.Namespace:
help="Benchmark definition name",
required=True,
)
+ parser.add_argument(
+ "--velox-branch",
+ default=None,
+ help="Velox branch used to build the worker image.",
+ )
+ parser.add_argument(
+ "--velox-repo",
+ default=None,
+ help="Velox repository used to build the worker image.",
+ )
+ parser.add_argument(
+ "--presto-branch",
+ default=None,
+ help="Presto branch used to build the worker image.",
+ )
+ parser.add_argument(
+ "--presto-repo",
+ default=None,
+ help="Presto repository used to build the worker image.",
+ )
parser.add_argument(
"--concurrency-streams",
help="Number of concurrency streams to use for the benchmark run",
@@ -355,6 +378,10 @@ def _build_submission_payload(
is_official: bool,
asset_ids: list[int] | None = None,
concurrency_streams: int = 1,
+ velox_branch: str | None = None,
+ velox_repo: str | None = None,
+ presto_branch: str | None = None,
+ presto_repo: str | None = None,
) -> dict:
"""Build a BenchmarkSubmission payload from parsed dataclasses.

@@ -449,6 +476,16 @@ def _query_sort_key(name: str):
if v is not None
}

+ engine_config_payload = engine_config.serialize() if engine_config else {}
+ if velox_branch or velox_repo or presto_branch or presto_repo:
+ engine_config_payload = {
+ **engine_config_payload,
+ "velox_branch": velox_branch,
+ "velox_repo": velox_repo,
+ "presto_branch": presto_branch,
+ "presto_repo": presto_repo,
+ }
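
A standalone sketch of the merge added above may make its behaviour easier to see. The helper name `merge_build_provenance` is hypothetical and the payload is reduced to a plain dict; the condition and the four key names follow the added lines.

```python
from __future__ import annotations


def merge_build_provenance(
    engine_config_payload: dict,
    *,
    velox_branch: str | None = None,
    velox_repo: str | None = None,
    presto_branch: str | None = None,
    presto_repo: str | None = None,
) -> dict:
    """Return a copy of the payload with repo/branch fields merged in (hypothetical helper)."""
    provenance = {
        "velox_branch": velox_branch,
        "velox_repo": velox_repo,
        "presto_branch": presto_branch,
        "presto_repo": presto_repo,
    }
    # Same test as `a or b or c or d`: skip the merge when nothing was supplied.
    if not any(provenance.values()):
        return dict(engine_config_payload)
    # All four keys are written once any one is set; unset ones stay None.
    return {**engine_config_payload, **provenance}
```

One consequence of this shape is that passing only `--velox-branch` still writes `velox_repo`, `presto_branch`, and `presto_repo` into the payload with `None` values.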

return {
"sku_name": sku_name,
"storage_configuration_name": storage_configuration_name,
@@ -461,11 +498,11 @@ def _query_sort_key(name: str):
"commit_hash": commit_hash,
},
"run_at": benchmark_metadata.timestamp.isoformat(),
- "node_count": 1,
+ "node_count": benchmark_metadata.node_count or 1,
"gpu_count": benchmark_metadata.gpu_count or 0,
"query_logs": query_logs,
"concurrency_streams": concurrency_streams,
- "engine_config": engine_config.serialize() if engine_config else {},
+ "engine_config": engine_config_payload,
"extra_info": extra_info,
"is_official": is_official,
"asset_ids": asset_ids,
@@ -550,7 +587,7 @@ async def _process_benchmark_dir(
storage_configuration_name: str,
cache_state: str,
engine_name: str | None,
- identifier_hash: str,
+ identifier_hash: str | None,
version: str | None,
commit_hash: str | None,
is_official: bool,
@@ -563,6 +600,10 @@ async def _process_benchmark_dir(
concurrency_streams: int = 1,
config_dir: Path | None = None,
logs_dir: Path | None = None,
+ velox_branch: str | None = None,
+ velox_repo: str | None = None,
+ presto_branch: str | None = None,
+ presto_repo: str | None = None,
) -> int:
"""Process a benchmark directory and post results to API.

@@ -589,6 +630,18 @@ async def _process_benchmark_dir(
print(f" Error loading metadata: {e}", file=sys.stderr)
return 1

+ # Fall back to the container image_digest captured in the benchmark
+ # results context when no explicit identifier_hash was provided on the CLI.
+ if identifier_hash is None:
+ identifier_hash = benchmark_metadata.image_digest
+ if identifier_hash is None:
+ print(
+ " Error: --identifier-hash was not provided and benchmark_result.json "
+ "context has no image_digest to fall back to.",
+ file=sys.stderr,
+ )
+ return 1

# Resolve config directory: explicit override → auto-detect from variant
effective_config_dir = config_dir
variant = _ENGINE_TO_VARIANT.get(benchmark_metadata.engine)
@@ -669,6 +722,10 @@ async def _process_benchmark_dir(
is_official=is_official,
asset_ids=asset_ids,
concurrency_streams=concurrency_streams,
+ velox_branch=velox_branch,
+ velox_repo=velox_repo,
+ presto_branch=presto_branch,
+ presto_repo=presto_repo,
)
except Exception as e:
print(f" Error building payload for '{bench_name}': {e}", file=sys.stderr)
@@ -747,6 +804,10 @@ async def main() -> int:
concurrency_streams=args.concurrency_streams,
config_dir=Path(args.config_dir) if args.config_dir else None,
logs_dir=Path(args.logs_dir) if args.logs_dir else None,
+ velox_branch=args.velox_branch,
+ velox_repo=args.velox_repo,
+ presto_branch=args.presto_branch,
+ presto_repo=args.presto_repo,
)

return result
@@ -58,7 +58,7 @@ query.execution-policy=phased
# Kill queries based on total reservation on blocked nodes to recover memory.
query.low-memory-killer.policy=total-reservation-on-blocked-nodes
# Upper limit on query wall time to keep tests bounded.
- query.max-execution-time=10m
+ query.max-execution-time=60m
# Keep metadata of up to 1000 queries for UI and debugging.
query.max-history=1000
# Memory quotas per node and cluster to protect stability.
7 changes: 5 additions & 2 deletions presto/scripts/generate_presto_config.sh
@@ -135,7 +135,10 @@ EOF
fi

if [[ "${VARIANT_TYPE}" == "cpu" ]]; then
- echo "cluster-tag=native-cpu" >>${COORD_CONFIG}
+ echo "cluster-tag=native-cpu" >> ${COORD_CONFIG}
+ # cuDF has no effect in CPU mode but leaving cudf.enabled=true in the worker
+ # config causes noisy startup warnings; force it off for CPU runs.
+ sed -i 's/^cudf\.enabled=true/cudf.enabled=false/' ${WORKER_CONFIG}
fi

# for Java variant, disable some Parquet properties which are now rejected
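
For readers less familiar with sed, the substitution added in this hunk is roughly equivalent to the Python sketch below. It works on the config text as a string rather than editing the file in place, and the function name `force_cudf_off` is hypothetical.

```python
import re


def force_cudf_off(config_text: str) -> str:
    """Rewrite a line-initial `cudf.enabled=true` to `cudf.enabled=false`.

    re.MULTILINE makes `^` match at the start of every line, which mirrors
    sed applying the anchored substitution line by line.
    """
    return re.sub(
        r"^cudf\.enabled=true",
        "cudf.enabled=false",
        config_text,
        flags=re.MULTILINE,
    )
```

As with the sed expression, a line that does not start with `cudf.enabled=true` (for example one that is indented or commented out) is left unchanged.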
@@ -162,7 +165,7 @@ fi

# We want to propagate any changes from the original worker config to the new worker configs even if
# we did not re-generate the configs.
- if [[ -n "$NUM_WORKERS" && "$VARIANT_TYPE" == "gpu" ]]; then
+ if [[ -n "$NUM_WORKERS" ]]; then
if [[ -n ${GPU_IDS:-} ]]; then
WORKER_IDS=($(echo "$GPU_IDS" | tr ',' ' '))
else