Skip to content

fix(recycle): delete pod on release when NoopRecycler is used#1046

Open
ashishpatel26 wants to merge 2 commits into
opensandbox-group:mainfrom
ashishpatel26:fix/issue-743-v2
Open

fix(recycle): delete pod on release when NoopRecycler is used#1046
ashishpatel26 wants to merge 2 commits into
opensandbox-group:mainfrom
ashishpatel26:fix/issue-743-v2

Conversation

@ashishpatel26

Copy link
Copy Markdown

Summary

  • NoopRecycler.TryRecycle returned StateSucceeded + NeedDelete: false for any pod, including live ones
  • collectRecycleResults treated StateSucceeded as "recycling complete → return pod to pool", so the pod re-entered the pool carrying all state from the previous sandbox run
  • Fix: return StateRecycling + NeedDelete: true for live pods (matching DeleteRecycler semantics); the pool controller deletes the pod and scales up a fresh replacement

Test plan

  • go test ./internal/controller/recycle/... — 3 cases: nil pod → Succeeded; pod with DeletionTimestamp → Succeeded; live pod → Recycling + NeedDelete

Fixes #743

When a sandbox backed by a pre-warmed pool is destroyed, the pod was
returned directly to the pool without being rebuilt because NoopRecycler
returned StateSucceeded immediately — causing the pod (with stale user
data) to be treated as available for the next sandbox.

NoopRecycler now sets NeedDelete=true so the pod is deleted on release,
matching the behaviour of DeleteRecycler. The pool controller then
creates a fresh replacement pod, ensuring no data from a previous
run persists.

Fixes opensandbox-group#743
Copilot AI review requested due to automatic review settings June 12, 2026 17:39
@github-actions

Copy link
Copy Markdown
Contributor

⚠️ This PR has no labels. Please add one based on the changes.

Changed directories: kubernetes.

📋 Recommended labels (based on changed files):

  • component/k8s ⬅️

Other available labels:

  • bug - Something isn't working
  • dependencies - Pull requests that update a dependency file
  • documentation - Improvements or additions to documentation
  • feature - New feature or request
  • packages - Changes for package, image and configuration

💡 Tip: Use feature for new functionality or improvements, bug for fixes.

cc @ashishpatel26

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Updates the Kubernetes “noop” recycle handler to avoid reusing potentially stale pods by signaling the caller to delete existing pods, and aligns unit tests with the new state-machine behavior.

Changes:

  • Change NoopRecycler.TryRecycle to return StateRecycling with NeedDelete=true when a pod still exists.
  • Treat nil / already-deleting pods as StateSucceeded.
  • Expand noop_test.go to cover the new expected states and NeedDelete behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
kubernetes/internal/controller/recycle/noop.go Updates Noop recycler semantics to request pod deletion for existing pods and adjusts status/messages.
kubernetes/internal/controller/recycle/noop_test.go Updates tests to assert the new status state machine and deletion intent.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +41 to +46
if pod == nil || pod.DeletionTimestamp != nil {
return &Status{
State: StateSucceeded,
Message: "noop recycler: pod is deleted",
}, nil
}

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8f5ca47a13

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +48 to +50
State: StateRecycling,
Message: "noop recycler: pod marked for deletion",
NeedDelete: true,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve Noop's no-delete contract

When a Pool explicitly sets recycleStrategy.type: Noop, the public CRD contract still says RecycleTypeNoop does nothing, and the e2e flow at kubernetes/test/e2e/e2e_test.go:2550 verifies those pods are not deleted. Returning NeedDelete: true for every live pod makes collectRecycleResults pass the pod to scalePool, which deletes it, so Noop now behaves like Delete and breaks users/tests that selected Noop to keep warm pods alive; if delete-on-release is the desired behavior, the public contract and Noop-specific tests need to change together rather than silently redefining this enum.

Useful? React with 👍 / 👎.

@ashishpatel26

Copy link
Copy Markdown
Author

Thanks for the reviews. Addressing feedback:

Codex P1 / Copilot – Noop contract and e2e tests: This is the core design question. The original NoopRecycler re-pooled pods immediately without deletion, meaning a returned pod could carry state (processes, files, env changes) from the previous sandbox. The fix ensures a fresh pod is always created by returning NeedDelete: true for live pods.

If the pool's recycleStrategy.type: Noop is documented to mean "return pod to pool as-is without any cleanup" (including keeping the same pod object), then this is a breaking change and the correct fix belongs in a different recycler type (e.g. DeleteAndRecreate). If Noop is meant to mean "no in-pod cleanup steps" (but a fresh pod is still guaranteed), then this change is consistent with that intent.

Requesting maintainer guidance: should RecycleTypeNoop preserve the exact pod, or only skip in-pod cleanup while still ensuring a fresh pod? Happy to revert and introduce a new recycler type if the former is the intended contract.

Copilot – DeletionTimestamp != nil treated as Succeeded: Valid. A pod with a deletion timestamp may still be running (terminating). Returning StateSucceeded before the pod is fully gone can let the pool re-allocate before cleanup finishes. Will change to return StateRecycling while the pod has a deletion timestamp, and only StateSucceeded once the pod is nil.

A pod with DeletionTimestamp is still running/terminating. Returning
StateSucceeded allowed the pool to re-allocate the pod before it fully
disappeared, risking data leakage between sandbox runs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] pre-warmed pod won't be deleted never

2 participants