Add GPU support; Use resource limits for resource detection #24

Merged
dciangot merged 1 commit into interlink-hq:main from chihchun1011:feature/gpu-translation
Apr 27, 2026

Conversation

@chihchun1011

Contributor

Summary

This PR makes two changes:

  1. Switches resource detection from requests to limits for CPU, memory, and GPU.

  2. Adds automatic GPU passthrough. When nvidia.com/gpu is present in limits, the plugin automatically injects the --nv flag into Singularity options to enable NVIDIA runtime support inside the container.

A test manifest (tests/gpu_k8s.yaml) is included to validate GPU end-to-end via nvidia-smi.
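As a rough illustration of the first change, the sketch below reads CPU, memory, and GPU values from resources.limits and ignores resources.requests. The function name detect_limits, the GPU_RESOURCE constant, and the dictionary layout of the returned result are assumptions made for this example; the plugin's actual code is not shown in this conversation and may be structured differently.

```python
from typing import Any, Dict

# Extended resource name used by the NVIDIA device plugin.
GPU_RESOURCE = "nvidia.com/gpu"


def detect_limits(container: Dict[str, Any]) -> Dict[str, Any]:
    """Return CPU, memory, and GPU count taken from resources.limits only.

    Hypothetical helper: values under resources.requests are ignored on purpose.
    """
    limits = (container.get("resources") or {}).get("limits") or {}
    return {
        "cpu": limits.get("cpu"),        # raw Kubernetes quantity, e.g. "2" or "500m"
        "memory": limits.get("memory"),  # raw Kubernetes quantity, e.g. "4Gi"
        "gpus": int(limits.get(GPU_RESOURCE, 0)),  # GPU counts are whole numbers
    }


# Example container spec in which requests and limits differ.
container = {
    "name": "gpu-test",
    "resources": {
        "requests": {"cpu": "1", "memory": "1Gi"},
        "limits": {"cpu": "2", "memory": "4Gi", "nvidia.com/gpu": "2"},
    },
}

detected = detect_limits(container)
print(detected)
```

With this approach a container that sets only requests would yield no CPU or memory value and zero GPUs, so any defaults would have to be applied elsewhere.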

Changes

  • Plugin: Read CPU, memory, and GPU from resources.limits instead of resources.requests.
  • Plugin: Auto-inject --nv into Singularity options when nvidia.com/gpu is detected.
  • Test: Add tests/gpu_k8s.yaml — runs nvidia-smi via the Interlink HTCondor virtual node.
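The --nv injection described above could look roughly like the following. This is a minimal sketch, not the plugin's implementation: the function name with_gpu_flag and the representation of Singularity options as a list of strings are assumptions for illustration. Only the --nv flag itself comes from the PR description.

```python
from typing import List


def with_gpu_flag(singularity_options: List[str], gpus: int) -> List[str]:
    """Return a copy of the options with --nv appended when GPUs are requested.

    Hypothetical helper: the input list is left unmodified, and the flag is
    not duplicated if the user already supplied it.
    """
    options = list(singularity_options)
    if gpus > 0 and "--nv" not in options:
        options.append("--nv")
    return options


print(with_gpu_flag(["--cleanenv"], gpus=2))
print(with_gpu_flag(["--cleanenv"], gpus=0))
```

Checking for an existing --nv keeps the operation idempotent, which matters if the user passes the flag explicitly as well.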

Testing

kubectl apply -f tests/gpu_k8s.yaml
# Expected: nvidia-smi lists GPUs, CUDA_VISIBLE_DEVICES=0,1

chihchun1011 force-pushed the feature/gpu-translation branch from b4c3f9d to 27fe0d9 on April 27, 2026 10:13
@dciangot

Member

Thank you @chihchun1011, we are almost there with the checks!

Warning: orkspace/tests/gpu_k8s.yaml:1:1: [warning] missing document start "---" (document-start)
Warning: orkspace/tests/gpu_k8s.yaml:8:3: [warning] wrong indentation: expected 4 but found 2 (indentation)
Warning: orkspace/tests/gpu_k8s.yaml:28:3: [warning] wrong indentation: expected 4 but found 2 (indentation)

chihchun1011 force-pushed the feature/gpu-translation branch from 27fe0d9 to 9480993 on April 27, 2026 10:37
Signed-off-by: Chih-Chun Kuo <b06901164@ntu.edu.tw>
chihchun1011 force-pushed the feature/gpu-translation branch from 9480993 to f9180ce on April 27, 2026 11:08
dciangot requested review from dciangot on April 27, 2026 11:40

dciangot left a comment

Member

LGTM

@dciangot

Member

I'm approving this; the only remaining test failure is a flake.

dciangot merged commit e271db1 into interlink-hq:main on Apr 27, 2026
3 of 4 checks passed

2 participants