Fix issues #51-#56: scheduler, images, attach, CLI, K8s #57
Merged
…h hang, CLI show, K8s retry/address

Fixes:
- #56: Scheduler crash recovery via catch_unwind; safety check for num_nodes=0; update_pending_reasons now checks constraint/exclusive/fully-consumed (matching find_suitable_nodes) so Reason accurately reflects why a job can't be scheduled
- #55: Agent image_dir() now uses a 3-tier fallback matching the CLI (env → system dir if exists → ~/.spur/images) instead of hardcoding /var/spool/spur/images
- #54: sattach uses per-byte reads instead of line-buffered reads; channel buffer increased 32→256 to prevent deadlock; graceful task shutdown instead of abort
- #53: `spur show node X` now dispatches as `scontrol show node X` by inserting an implicit show subcommand (docs said `spur show node` but the CLI required `spur show show node`)
- #52: K8s operator wraps background tasks (node watcher, job controller, health) in retry loops with exponential backoff (1s→60s cap)
- #51: K8s operator adds an --address flag with POD_IP env var fallback; the Pod hostname is no longer used as the default (unroutable from spurctld)

Tests: 743 passed, 0 failed (+13 new tests)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
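For reviewers who want a concrete picture of the #52 retry behaviour, here is a minimal sketch of a retry loop with exponential backoff from 1s up to a 60s cap. It is not the operator's actual code: `backoff_delay` and `run_with_retry` are hypothetical names, the sketch is synchronous with an injected `sleep` callback so it can be exercised without waiting, and it assumes the delay doubles on every consecutive failure.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): 1s, 2s, 4s, ... capped at 60s.
/// Hypothetical helper; the doubling factor is an assumption.
fn backoff_delay(attempt: u32) -> Duration {
    const CAP_SECS: u64 = 60;
    // checked_shl returns None once the shift would reach 64 bits or more,
    // in which case we fall back to the cap directly.
    let secs = 1u64.checked_shl(attempt).unwrap_or(CAP_SECS);
    Duration::from_secs(secs.min(CAP_SECS))
}

/// Re-runs `task` until it returns Ok, calling `sleep` with the backoff
/// delay after each failure. `sleep` is injected so tests can record delays.
fn run_with_retry<T, E>(
    mut task: impl FnMut() -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
) -> T {
    let mut attempt = 0u32;
    loop {
        match task() {
            Ok(value) => return value,
            Err(_) => {
                sleep(backoff_delay(attempt));
                attempt = attempt.saturating_add(1);
            }
        }
    }
}
```

In a real operator the `sleep` callback would presumably be an async timer, and the attempt counter might be reset after a task has run healthily for a while; both are left out of this sketch.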
This was referenced Apr 8, 2026
Closed
Summary
Fixes 6 open issues (3 reopened from PR #49, 3 new):
- #56: `catch_unwind`; safety for `num_nodes=0`; `update_pending_reasons` now checks constraint/exclusive/fully-consumed to match the scheduler's actual filtering
- #55: `image_dir()` uses 3-tier fallback matching CLI (env → system if exists → `~/.spur/images`)
- #53: `spur show node X` inserts implicit `show` subcommand → dispatches as `scontrol show node X`
- #51: `--address` flag with `POD_IP` env var fallback instead of unroutable Pod hostname

Test plan
- `cargo fmt --check` clean

Closes #51 #52 #53 #54 #55 #56
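
The 3-tier `image_dir()` fallback from #55 could look roughly like the sketch below. This is an illustration under assumptions, not the code in this PR: the environment variable name `SPUR_IMAGE_DIR`, the helper `resolve_image_dir`, the use of `HOME` to find the home directory, and the choice to treat an empty override as unset are all hypothetical. The system path is the one the agent previously hardcoded.

```rust
use std::path::{Path, PathBuf};

/// Pure resolution logic: env override, then the system dir if it exists,
/// then `<home>/.spur/images`. Hypothetical helper, split out for testing.
fn resolve_image_dir(env_override: Option<&str>, system_dir: &Path, home: &Path) -> PathBuf {
    // Tier 1: explicit override (an empty value is treated as unset here).
    if let Some(dir) = env_override.filter(|d| !d.is_empty()) {
        return PathBuf::from(dir);
    }
    // Tier 2: system-wide directory, only when it is actually present.
    if system_dir.is_dir() {
        return system_dir.to_path_buf();
    }
    // Tier 3: per-user fallback.
    home.join(".spur").join("images")
}

/// Wrapper that reads the process environment.
/// `SPUR_IMAGE_DIR` is an assumed variable name, not confirmed by the PR.
fn image_dir() -> PathBuf {
    let env_override = std::env::var("SPUR_IMAGE_DIR").ok();
    let home = std::env::var("HOME").unwrap_or_else(|_| String::from("."));
    resolve_image_dir(
        env_override.as_deref(),
        Path::new("/var/spool/spur/images"),
        Path::new(&home),
    )
}
```

Keeping the decision in a pure function means the agent and the CLI could share one implementation, which is the mismatch #55 describes.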
🤖 Generated with Claude Code