13 changes: 13 additions & 0 deletions src/cluster-configuration/deploy/start.sh.template
@@ -48,6 +48,19 @@ echo kubectl label nodes {{ cluster_cfg['layout']['machine-list'][host]['hostnam
{%- endif %}
{%- if 'pai-worker' in cluster_cfg['layout']['machine-list'][host] and cluster_cfg['layout']['machine-list'][host]['pai-worker'] == 'true' %}
echo kubectl label --overwrite=true nodes {{ cluster_cfg['layout']['machine-list'][host]['hostname'] }} pai-worker=true || exit $?
{%- set machine_type = cluster_cfg['layout']['machine-list'][host]['machine-type'] %}
{%- if machine_type in cluster_cfg['layout']['machine-sku'] and 'computing-device' in cluster_cfg['layout']['machine-sku'][machine_type] %}
{%- set device_type = cluster_cfg['layout']['machine-sku'][machine_type]['computing-device']['type'] %}
{%- if device_type == 'nvidia.com/gpu' %}
echo kubectl label --overwrite=true nodes {{ cluster_cfg['layout']['machine-list'][host]['hostname'] }} vendor=nvidia || exit $?
{%- elif device_type == 'amd.com/gpu' %}
echo kubectl label --overwrite=true nodes {{ cluster_cfg['layout']['machine-list'][host]['hostname'] }} vendor=amd || exit $?
{%- else %}
echo kubectl label --overwrite=true nodes {{ cluster_cfg['layout']['machine-list'][host]['hostname'] }} vendor=unknown || exit $?
{%- endif %}
{%- else %}
echo kubectl label --overwrite=true nodes {{ cluster_cfg['layout']['machine-list'][host]['hostname'] }} vendor=cpu || exit $?
{%- endif %}
{%- else %}
Comment on lines +51 to 64

Copilot AI Jan 15, 2026

The indentation of the Jinja2 template blocks is inconsistent. Lines 51-63 use 8 leading spaces, whereas the surrounding conditional blocks use 4. This makes the code harder to read and maintain.

Suggested change
{%- set machine_type = cluster_cfg['layout']['machine-list'][host]['machine-type'] %}
{%- if machine_type in cluster_cfg['layout']['machine-sku'] and 'computing-device' in cluster_cfg['layout']['machine-sku'][machine_type] %}
{%- set device_type = cluster_cfg['layout']['machine-sku'][machine_type]['computing-device']['type'] %}
{%- if device_type == 'nvidia.com/gpu' %}
echo kubectl label --overwrite=true nodes {{ cluster_cfg['layout']['machine-list'][host]['hostname'] }} vendor=nvidia || exit $?
{%- elif device_type == 'amd.com/gpu' %}
echo kubectl label --overwrite=true nodes {{ cluster_cfg['layout']['machine-list'][host]['hostname'] }} vendor=amd || exit $?
{%- else %}
echo kubectl label --overwrite=true nodes {{ cluster_cfg['layout']['machine-list'][host]['hostname'] }} vendor=unknown || exit $?
{%- endif %}
{%- else %}
echo kubectl label --overwrite=true nodes {{ cluster_cfg['layout']['machine-list'][host]['hostname'] }} vendor=cpu || exit $?
{%- endif %}
{%- else %}

echo kubectl label nodes {{ cluster_cfg['layout']['machine-list'][host]['hostname'] }} pai-worker- || exit $?
{%- endif %}
10 changes: 8 additions & 2 deletions src/device-plugin/deploy/start.sh.template
@@ -36,7 +36,8 @@ pushd $(dirname "$0") > /dev/null
s/^([[:space:]]*)allowPrivilegeEscalation: false.*$/\1privileged: false/
G
s/(^[[:space:]]*allowPrivilegeEscalation: false.*)\n([[:space:]]*privileged: false)/\1\n\2/
}';
}' \
| sed '/^[[:space:]]*tolerations:/i\ nodeSelector:\n vendor: nvidia';

Copilot AI Jan 15, 2026

The sed command uses || (logical OR operator) on line 39, which should be | (pipe operator) to chain the sed commands. The || operator will only execute the second sed if the first one fails, which is not the intended behavior.
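
To illustrate the distinction this comment draws between `|` and `||`, here is a minimal sketch. The input string and the substitutions are made up for illustration and are not taken from the PR; only the behaviour of the two operators is the point.

```shell
#!/bin/sh
# Hypothetical one-line input standing in for a manifest.
input='image: old/name'

# `|` feeds the first sed's stdout into the second sed,
# so both substitutions are applied.
piped=$(printf '%s\n' "$input" | sed 's/old/new/' | sed 's/name/plugin/')

# `||` runs the second sed only if the first one exits non-zero.
# The first sed succeeds here, so the second never runs and only
# one substitution is applied. </dev/null guards against the second
# sed blocking on stdin if it ever did run.
ored=$(printf '%s\n' "$input" | sed 's/old/new/' || sed 's/name/plugin/' </dev/null)

echo "piped: $piped"
echo "ored:  $ored"
```

With `|` the result is `image: new/plugin`; with `||` it is `image: new/name`, because the second edit is silently skipped.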

cat <<'YAML'
imagePullSecrets:
- name: {{ cluster_cfg["cluster"]["docker-registry"]["secret-name"] }}
@@ -50,7 +51,12 @@ YAML
{% if 'amd.com/gpu' in cluster_cfg['device-plugin']['devices'] %}

{ curl -s https://raw.githubusercontent.com/ROCm/k8s-device-plugin/master/k8s-ds-amdgpu-dp.yaml \
| sed 's|rocm/k8s-device-plugin|{{ cluster_cfg['cluster']['docker-registry']['prefix'] }}k8s-rocm-device-plugin:{{ cluster_cfg['cluster']['docker-registry']['tag'] }}|';
| sed 's|rocm/k8s-device-plugin|{{ cluster_cfg['cluster']['docker-registry']['prefix'] }}k8s-rocm-device-plugin:{{ cluster_cfg['cluster']['docker-registry']['tag'] }}|' \
| sed -E '/^[[:space:]]*nodeSelector:[[:space:]]*$/{
n
s/^([[:space:]]*)(.*)$/\1vendor: amd\
\1\2/
}';
Comment on lines +54 to +59

Copilot AI Jan 15, 2026

The sed command uses || (logical OR operator) on line 54, which should be | (pipe operator) to chain the sed commands. The || operator will only execute the second sed if the first one fails, which is not the intended behavior.

Comment on lines +55 to +59

Copilot AI Jan 15, 2026

Missing closing brace for the opening brace on line 53. The AMD device plugin section starts with { but the sed command chain ends with a semicolon without a corresponding } and the cat command structure that exists in the NVIDIA section.
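
The structure this comment refers to is a shell brace group: `{ list; }` runs its commands in order and exposes their combined stdout as one stream, and the closing `}` must follow a `;` or a newline. A minimal sketch of that shape, assuming a hypothetical `manifest` function in place of the `curl` download and made-up image and secret names:

```shell
#!/bin/sh
# Hypothetical stand-in for the downloaded DaemonSet manifest.
manifest() {
  printf 'kind: DaemonSet\nimage: rocm/k8s-device-plugin\n'
}

# The brace group concatenates the edited manifest and the heredoc
# into a single stream, which is captured here for inspection.
combined=$(
  {
    manifest | sed 's|rocm/k8s-device-plugin|example.invalid/plugin:tag|'
    cat <<'YAML'
imagePullSecrets:
- name: example-secret
YAML
  }
)

echo "$combined"
```

If the closing `}` were missing, the shell would report a syntax error at end of input rather than run a partial group, so an unbalanced brace in the rendered script fails at parse time.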

cat <<'YAML'
imagePullSecrets:
- name: {{ cluster_cfg["cluster"]["docker-registry"]["secret-name"] }}