
feat(slurm): default to flat YAML topology if partition topology is absent #235

Merged
dmitsh merged 1 commit into main from ds-empty-partition
Mar 18, 2026

Conversation

@dmitsh
Collaborator

@dmitsh dmitsh commented Mar 11, 2026

No description provided.

@greptile-apps
Contributor

greptile-apps Bot commented Mar 11, 2026

Greptile Summary

This PR updates the Slurm YAML topology generation logic so that when a partition's node list cannot be resolved to any valid topology data (e.g., all nodes are offline or absent from the topology graph), the output defaults to a flat: true topology unit instead of emitting an empty tree or block structure. The change affects both getTreeTopologyUnit (checks len(tree) == 0 after calling getPartitionTree) and getBlockTopologyUnit (checks len(blockMap) == 0 after iterating resolved nodes), and is covered by a new test TestEmptyPartitionTopology.

  • getBlockTopologyUnit: when no nodes from the partition spec can be mapped to block data, tu.Flat = true is set rather than returning an empty BlockTopo.
  • getTreeTopologyUnit: when getPartitionTree returns an empty map (no resolvable node paths), tu.Flat = true is set rather than returning an empty TreeTopo.
  • The Tree and Block fields on TopologyUnit are now only populated inside their respective else branches, so neither field is left non-nil but empty in the flat-fallback path.
  • A new test exercises all five combinations: valid tree, down-node tree, mixed-valid block, down-node block, and explicit flat — verifying the expected YAML output for each.
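
The block-side fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in pkg/translate/yaml.go: only the names TopologyUnit, BlockTopo, Flat, Block, blockMap and nBlocks come from the summary, while blockUnit, the nodeToBlock lookup table and the shape of BlockTopo are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// BlockTopo and TopologyUnit are simplified stand-ins for the real types.
// Their internal layout here is an assumption; only the field names Flat
// and Block are taken from the PR summary.
type BlockTopo struct {
	Blocks []string
}

type TopologyUnit struct {
	Flat  bool
	Block *BlockTopo
}

// blockUnit is a hypothetical helper. It groups the partition's nodes by
// block using nodeToBlock (an assumed lookup table) and skips nodes that
// cannot be resolved, e.g. nodes that are down or missing from the graph.
func blockUnit(partitionNodes []string, nodeToBlock map[string]string) TopologyUnit {
	blockMap := make(map[string][]string)
	for _, n := range partitionNodes {
		if b, ok := nodeToBlock[n]; ok {
			blockMap[b] = append(blockMap[b], n)
		}
	}

	var tu TopologyUnit
	if nBlocks := len(blockMap); nBlocks == 0 {
		// Nothing resolved: fall back to flat rather than an empty block list.
		tu.Flat = true
	} else {
		// Block is only populated in the else branch, with sorted block names.
		names := make([]string, 0, nBlocks)
		for b := range blockMap {
			names = append(names, b)
		}
		sort.Strings(names)
		tu.Block = &BlockTopo{Blocks: names}
	}
	return tu
}

func main() {
	lookup := map[string]string{"n1": "b1", "n2": "b1", "n3": "b2"}
	fmt.Printf("%+v\n", blockUnit([]string{"down1"}, lookup))
	fmt.Println(blockUnit([]string{"n3", "n1"}, lookup).Block.Blocks)
}
```

Keeping the Block assignment inside the else branch means a caller that sees Flat set never also has to handle an empty block structure.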

Confidence Score: 5/5

  • This PR is safe to merge — the fallback logic is simple, well-isolated, and fully exercised by the new test.
  • The change is small and localised to two functions in yaml.go. The existing Flat field on TopologyUnit was already supported (used by TopologyFlat plugin), so no new struct changes are needed. The new TestEmptyPartitionTopology test directly validates both the tree and block fallback paths. No regressions to existing tests are expected, and the prior behaviour (empty switch/block lists) was arguably misleading to Slurm.
  • No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| pkg/translate/yaml.go | Refactors getBlockTopologyUnit and getTreeTopologyUnit to fall back to a flat topology (tu.Flat = true) when no valid partition nodes are found, instead of emitting an empty block/tree structure. Logic is correct; the nBlocks variable in the if-initializer is slightly verbose but valid. |
| pkg/translate/yaml_test.go | Adds TestEmptyPartitionTopology covering tree and block topologies with no resolvable nodes, verifying they produce flat: true, alongside existing valid topologies. Good coverage of the new fallback behaviour. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[getTreeTopologyUnit / getBlockTopologyUnit] --> B{Gather candidate\nnodes from spec}
    B --> C{Any valid nodes\nresolved?}
    C -- "No valid nodes\n(all 'down' / not found)" --> D[Build empty\ntree / blockMap]
    C -- "Some valid nodes" --> E[Build partition tree\nor block map]
    D --> F{len == 0?}
    E --> F
    F -- Yes --> G[tu.Flat = true\nReturn flat TopologyUnit]
    F -- No --> H{Plugin type}
    H -- Tree --> I[Traverse tree,\npopulate Switches]
    H -- Block --> J[Sort blocks,\npopulate Blocks]
    I --> K[Return TreeTopo TopologyUnit]
    J --> L[Return BlockTopo TopologyUnit]
    G --> M[YAML: flat: true]
    K --> N[YAML: tree: switches: ...]
    L --> O[YAML: block: blocks: ...]

Last reviewed commit: 15e11ac
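
The tree branch of the flowchart could look something like the sketch below. It is an assumption-laden illustration rather than the merged code: partitionTree stands in for getPartitionTree with a guessed signature, the nodeToSwitch table and the Switches map are invented for the example, and topLevelKey only mirrors the top-level YAML keys named in the flowchart (flat, tree).

```go
package main

import "fmt"

// TreeTopo and TopologyUnit are simplified stand-ins; only the names Flat,
// Tree, TreeTopo and Switches appear in the PR summary and flowchart.
type TreeTopo struct {
	Switches map[string][]string // switch name -> nodes (assumed layout)
}

type TopologyUnit struct {
	Flat bool
	Tree *TreeTopo
}

// partitionTree is a hypothetical stand-in for getPartitionTree. It keeps
// only nodes that resolve to a switch and returns an empty map otherwise.
func partitionTree(nodes []string, nodeToSwitch map[string]string) map[string][]string {
	tree := make(map[string][]string)
	for _, n := range nodes {
		if sw, ok := nodeToSwitch[n]; ok {
			tree[sw] = append(tree[sw], n)
		}
	}
	return tree
}

// treeUnit applies the len(tree) == 0 check from the summary.
func treeUnit(nodes []string, nodeToSwitch map[string]string) TopologyUnit {
	var tu TopologyUnit
	tree := partitionTree(nodes, nodeToSwitch)
	if len(tree) == 0 {
		tu.Flat = true // no resolvable node paths: emit a flat unit
	} else {
		tu.Tree = &TreeTopo{Switches: tree}
	}
	return tu
}

// topLevelKey reports which top-level YAML key the flowchart associates
// with the unit; it does not perform real YAML marshalling.
func topLevelKey(tu TopologyUnit) string {
	switch {
	case tu.Flat:
		return "flat"
	case tu.Tree != nil:
		return "tree"
	default:
		return ""
	}
}

func main() {
	lookup := map[string]string{"n1": "sw1", "n2": "sw1", "n3": "sw2"}
	fmt.Println(topLevelKey(treeUnit([]string{"down1", "down2"}, lookup)))
	fmt.Println(topLevelKey(treeUnit([]string{"n1", "n3"}, lookup)))
}
```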

@codecov

codecov Bot commented Mar 11, 2026

Codecov Report

❌ Patch coverage is 96.07843% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 66.98%. Comparing base (d79be88) to head (15e11ac).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| pkg/translate/yaml.go | 96.07% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #235      +/-   ##
==========================================
+ Coverage   66.96%   66.98%   +0.02%     
==========================================
  Files          82       82              
  Lines        4646     4649       +3     
==========================================
+ Hits         3111     3114       +3     
  Misses       1424     1424              
  Partials      111      111              


@dmitsh dmitsh force-pushed the ds-empty-partition branch from 7ee57bc to bcd8fde Compare March 11, 2026 21:33
…bsent

Signed-off-by: Dmitry Shmulevich <dshmulevich@nvidia.com>
@dmitsh dmitsh force-pushed the ds-empty-partition branch from bcd8fde to 15e11ac Compare March 11, 2026 21:49
@dmitsh dmitsh merged commit a8af994 into main Mar 18, 2026
7 checks passed
@dmitsh dmitsh deleted the ds-empty-partition branch March 18, 2026 16:27