
feat(server): add optional GPU metrics collection during prove #363

Open
Andrurachi wants to merge 1 commit into eth-act:master from Andrurachi:feat-flag-gpu-metrics-265


@Andrurachi
Contributor

closes #265

Adds optional hardware metrics collection during GPU proving operations.

Implementation Details:

  • Adds --collect-gpu-metrics and --gpu-metrics-dir flags.
  • Uses the ERE_GPU_METRICS env var to allow overriding the default nvidia-smi query fields.
  • Wraps nvidia-smi using std::process::Command to write a timestamped CSV during zkvm.prove().
  • Implements Drop to guarantee the nvidia-smi subprocess is killed and reaped.
  • Catches NotFound errors; if nvidia-smi is missing, it logs a warning and continues proving without interrupting the server.
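
For reviewers who want a picture of the shape described above, here is a minimal sketch of the pattern, not the code in this PR. The names GpuMetricsCollector, resolve_fields, and DEFAULT_FIELDS are hypothetical, the default field list and the one-second sampling interval are assumptions, and the program name is passed in as a parameter only so the missing-binary path can be exercised without a GPU.

```rust
use std::fs::File;
use std::io::ErrorKind;
use std::path::Path;
use std::process::{Child, Command, Stdio};

/// Default query fields. An assumption for this sketch, not the PR's actual list.
const DEFAULT_FIELDS: &str = "timestamp,utilization.gpu,memory.used";

/// Picks the query fields: a non-empty override (for example the value of
/// the ERE_GPU_METRICS env var) wins, otherwise the default list is used.
fn resolve_fields(env_override: Option<String>) -> String {
    env_override
        .filter(|v| !v.trim().is_empty())
        .unwrap_or_else(|| DEFAULT_FIELDS.to_string())
}

/// Owns the sampling subprocess for the duration of one prove call.
struct GpuMetricsCollector {
    child: Child,
}

impl GpuMetricsCollector {
    /// Spawns `program` (normally "nvidia-smi") with its stdout redirected
    /// into `csv_path`. Returns None instead of an error so that a missing
    /// binary never interrupts proving.
    fn start(program: &str, fields: &str, csv_path: &Path) -> Option<Self> {
        let out = match File::create(csv_path) {
            Ok(file) => file,
            Err(e) => {
                eprintln!("warning: cannot create {}: {}", csv_path.display(), e);
                return None;
            }
        };
        let spawned = Command::new(program)
            .arg(format!("--query-gpu={}", fields))
            .arg("--format=csv")
            .arg("--loop=1") // sample once per second (assumed interval)
            .stdin(Stdio::null())
            .stdout(Stdio::from(out))
            .stderr(Stdio::null())
            .spawn();
        match spawned {
            Ok(child) => Some(Self { child }),
            Err(e) => {
                if e.kind() == ErrorKind::NotFound {
                    eprintln!("warning: {} not found, skipping GPU metrics", program);
                } else {
                    eprintln!("warning: failed to start {}: {}", program, e);
                }
                // Do not leave an empty CSV behind when nothing was collected.
                let _ = std::fs::remove_file(csv_path);
                None
            }
        }
    }
}

impl Drop for GpuMetricsCollector {
    fn drop(&mut self) {
        // kill() stops the sampler; wait() reaps it so no zombie is left.
        // Errors are ignored because the child may already have exited.
        let _ = self.child.kill();
        let _ = self.child.wait();
    }
}
```

A caller would hold the returned value in a named binding such as `let _collector = GpuMetricsCollector::start(...)` across the call to zkvm.prove(). Binding it with a bare `let _ =` would drop the collector immediately and stop sampling before proving starts.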

Testing Note:
Due to CUDA architecture limitations on my local machine, I was unable to verify the CSV output. This is ready for verification on a capable machine :)


Development

Successfully merging this pull request may close these issues.

Consider an optional flag to collect and dump GPU metrics
