Skip to content

[Storage] STP: File-Level Restore#121

Open
ema-aka-young wants to merge 14 commits into
RedHatQE:mainfrom
ema-aka-young:file-level-restore-qualityflow-iterations
Open

[Storage] STP: File-Level Restore#121
ema-aka-young wants to merge 14 commits into
RedHatQE:mainfrom
ema-aka-young:file-level-restore-qualityflow-iterations

Conversation

@ema-aka-young

@ema-aka-young ema-aka-young commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

STP Metadata

VEP issue:
kubevirt/enhancements#170

What this PR does

Add STP for File-Level Restore

Special notes for your reviewer

Assisted by Quality Flow and Claude Code

Summary by CodeRabbit

Summary by CodeRabbit

  • Documentation
    • Added a comprehensive test plan for File-Level Restore (Dev Preview), including feature overview for automatic and manual restore modes.
    • Defined quality checkpoints, testing scope/goals, entry criteria, and known limitations.
    • Documented test strategy by priority tiers, including integration/compatibility and upgrade coverage, plus infrastructure, risks, and traceability to related initiatives.

Signed-off-by: Emanuele Prella <eprella@redhat.com>
Signed-off-by: Emanuele Prella <eprella@redhat.com>
Signed-off-by: Emanuele Prella <eprella@redhat.com>
Signed-off-by: Emanuele Prella <eprella@redhat.com>
Signed-off-by: Emanuele Prella <eprella@redhat.com>
Signed-off-by: Emanuele Prella <eprella@redhat.com>
@coderabbitai

coderabbitai Bot commented Jun 26, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@ema-aka-young, you've reached your PR review limit, so we couldn't start this review.

Next review available in: 39 minutes

Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available.
You're only billed for reviews past your plan's rate limits ($0.25/file).

How can I continue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews.

How do review limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please refer docs for additional details.

Review details
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: fbf30da2-d8b2-453c-92eb-645392f62567

📥 Commits

Reviewing files that changed from the base of the PR and between 5a949d8 and fd7c214.

📒 Files selected for processing (1)
  • stps/sig-storage/VIRTSTRAT-480_file_level_restore.md
📝 Walkthrough
📝 Walkthrough
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Title check ✅ Passed The title clearly and concisely summarizes the main change: a storage STP for file-level restore.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@openshift-virtualization-qe-bot-2

Copy link
Copy Markdown

Report bugs in Issues

Welcome! 🎉

This pull request will be automatically processed with the following features:

🔄 Automatic Actions

  • Reviewer Assignment: Reviewers are automatically assigned based on the OWNERS file in the repository root
  • Size Labeling: PR size labels (XS, S, M, L, XL, XXL) are automatically applied based on changes
  • Issue Creation: A tracking issue is created for this PR and will be closed when the PR is merged or closed
  • Branch Labeling: Branch-specific labels are applied to track the target branch
  • Auto-verification: Auto-verified users have their PRs automatically marked as verified
  • Labels: Enabled categories: branch, can-be-merged, cherry-pick, has-conflicts, hold, needs-rebase, size, verified, wip

📋 Available Commands

PR Status Management

  • /wip - Mark PR as work in progress (adds WIP: prefix to title)
  • /wip cancel - Remove work in progress status
  • /hold - Block PR merging (approvers only)
  • /hold cancel - Unblock PR merging
  • /verified - Mark PR as verified
  • /verified cancel - Remove verification status
  • /reprocess - Trigger complete PR workflow reprocessing (useful if webhook failed or configuration changed)
  • /regenerate-welcome - Regenerate this welcome message
  • /security-override - Set security check runs to pass (maintainers only)
  • /security-override cancel - Re-run security checks

Review & Approval

  • /lgtm - Approve changes (looks good to me)
  • /approve - Approve PR (approvers only)
  • /assign-reviewers - Assign reviewers based on OWNERS file
  • /assign-reviewer @username - Assign specific reviewer
  • /check-can-merge - Check if PR meets merge requirements

Testing & Validation

  • /retest tox - Run Python test suite with tox
  • /retest all - Run all available tests

Cherry-pick Operations

  • /cherry-pick <branch> - Schedule cherry-pick to target branch when PR is merged
    • Multiple branches: /cherry-pick branch1 branch2 branch3
  • /cherry-pick-retry <branch> - Retry a failed cherry-pick (merged PRs only)

Branch Management

  • /rebase - Rebase this PR branch onto its base branch

Label Management

  • /<label-name> - Add a label to the PR
  • /<label-name> cancel - Remove a label from the PR

✅ Merge Requirements

This PR will be automatically approved when the following conditions are met:

  1. Approval: /approve from at least one approver
  2. LGTM Count: Minimum 2 /lgtm from reviewers
  3. Status Checks: All required status checks must pass
  4. No Blockers: No wip, hold, has-conflicts labels and PR must be mergeable (no conflicts)

📊 Review Process

Approvers and Reviewers

Approvers:

  • jpeimer

Reviewers:

  • Acedus
  • Ahmad-Hafe
  • Dsanatar
  • ShellyKas13
  • acinko-rh
  • akalenyu
  • alromeros
  • arnongilboa
  • awels
  • dalia-frank
  • ema-aka-young
  • josemacassan
  • jpeimer
  • kgoldbla
  • kshvaika
  • noamasu
Available Labels
  • hold
  • verified
  • wip
  • lgtm
  • approve
AI Features
  • Cherry-Pick Conflict Resolution: Enabled (claude/claude-opus-4-6-1m)
Security Checks
  • Suspicious Path Detection: Monitors paths: .claude/, .vscode/, .cursor/, .devcontainer/, .pi/, .github/workflows/, .github/actions/
  • Committer Identity Check: Verifies last committer matches PR author
  • Mandatory: Security checks block merge (use /security-override to bypass — maintainers only)

💡 Tips

  • WIP Status: Use /wip when your PR is not ready for review
  • Verification: The verified label is removed on new commits unless the push is detected as a clean rebase
  • Cherry-picking: Cherry-pick labels are processed when the PR is merged
  • Permission Levels: Some commands require approver permissions
  • Auto-verified Users: Certain users have automatic verification and merge privileges

For more information, please refer to the project documentation or contact the maintainers.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@stps/sig-storage/VIRTSTRAT-480_file_level_restore.md`:
- Around line 68-81: The acceptance criteria in VIRTSTRAT-480 need a verifiable
non-disruption check instead of only stating that restore happens “without
interrupting the VM’s availability.” Update the acceptance criteria section to
add a condition tied to continuous guest accessibility during the restore, such
as uninterrupted VM network connectivity and guest responsiveness throughout the
operation, so the check would fail if the implementation stopped and restarted
the VM. Keep the wording aligned with the existing seamless restore items and
ensure the new criterion is explicit enough to validate in tests.
- Around line 83-93: The NFRs section is missing required categories, so update
the checklist in the restore spec to explicitly cover UI, Documentation, and
Observability alongside the existing Security/Usability/Monitoring items. In the
NFRs block, use the relevant spec symbols/section headings to add a clear
PM/UX-backed statement for UI (even if no UI changes are planned), note whether
any user-facing docs or configuration guidance will be validated under
Documentation, and make Observability explicit if it is being treated as part of
Monitoring; keep Performance and Scalability as justified deferrals.
- Around line 94-113: Replace every approval-blocking placeholder in the STP
with real sign-offs and names, including the Sign-off entries in Known
Limitations, Test Limitations, and Risks, the PM/Lead Agreement entries in Out
of Scope, and the Approval section placeholders. Update the affected sections in
the document by using the existing labels like Known Limitations, Out of Scope,
Test Limitations, Risks, and Approval, and populate each `[TBD]` or `TBD` field
with the actual reviewer name and date/handle as required.
- Around line 157-189: Update the P2 testing goal about path formatting
variations so it names the missing configuration dimensions needed to make the
test actionable. In the testing goals list, revise the entry for the restore
path-variation case to explicitly call out the guest OS and filesystem contexts,
using the existing restore test items as the reference point for consistency.
Keep the scope focused on the restore workflow, but make sure the goal clearly
states which Linux/Windows and filesystem combinations it applies to.
- Around line 416-423: The file restore scenarios are incorrectly marked as Tier
3 even though the STP guidelines only allow Tier 1 or Tier 2. Update the
affected scenario entries for Windows VM backup PVC restore, Windows VM volume
snapshot restore, Windows VM manual file browsing, and concurrent restore on
different VMs to use Tier 2, or add explicit repository-level justification if
Tier 3 must remain. Keep the scenario wording intact and only change the tier
classification in the relevant bullet items.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 857d770d-6168-49c1-8978-f8f633f8a6c0

📥 Commits

Reviewing files that changed from the base of the PR and between 2dc8cf8 and e375c8a.

📒 Files selected for processing (1)
  • stps/sig-storage/VIRTSTRAT-480_file_level_restore.md

Comment thread stps/sig-storage/VIRTSTRAT-480_file_level_restore.md
Comment thread stps/sig-storage/VIRTSTRAT-480_file_level_restore.md
Comment thread stps/sig-storage/VIRTSTRAT-480_file_level_restore.md
Comment thread stps/sig-storage/VIRTSTRAT-480_file_level_restore.md
@coderabbitai

coderabbitai Bot commented Jun 30, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
stps/sig-storage/VIRTSTRAT-480_file_level_restore.md (2)

196-220: 🎯 Functional Correctness | 🔴 Critical | ⚖️ Poor tradeoff

CRITICAL: Out of Scope items still contain [TBD] placeholders.

All 5 exclusions require actual PM/Lead Agreement names and dates. The UI testing item even claims "PM/Lead confirmed" but still shows [TBD] — this is contradictory and must be resolved.

Replace all [TBD] placeholders or keep the PR in WIP state until sign-offs are obtained.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@stps/sig-storage/VIRTSTRAT-480_file_level_restore.md` around lines 196 - 220,
The out-of-scope section still has unresolved PM/Lead Agreement placeholders and
one contradictory confirmation in the UI testing item. Update the exclusions in
VIRTSTRAT-480_file_level_restore to replace every [TBD] with the actual PM/Lead
names and dates, and make the UI testing note consistent with the documented
sign-off; if sign-offs are not yet available, keep this section marked as WIP
until they are added.

416-526: 📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win

HIGH: Tier 3 classifications remain non-compliant.

The STP guidelines define only Tier 1 (single feature, isolated) and Tier 2 (end-to-end, multi-feature integration, upgrade paths). Four scenarios are still labeled [Tier 3]:

  • Line 424: Windows VM backup PVC restore
  • Line 428: Windows VM volume snapshot restore
  • Line 492: Windows VM manual file browsing
  • Line 525: Concurrent restore on different VMs

Reclassify all [Tier 3] scenarios as [Tier 2] (appropriate for specialized environments and cross-VM workflows) or provide explicit repository-level justification that supersedes the guidelines. The concurrent multi-VM scenario and Windows guest scenarios fit Tier 2 as end-to-end/specialized environment tests.

Also remove "Tier 3" references in Section II.2 (Automation Testing and Scale Testing details) to maintain terminological consistency.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@stps/sig-storage/VIRTSTRAT-480_file_level_restore.md` around lines 416 - 526,
Reclassify the remaining non-compliant Tier 3 items in the restore scenario list
to Tier 2, since the guidelines only allow Tier 1 and Tier 2; update the
scenarios for Windows VM backup PVC restore, Windows VM volume snapshot restore,
Windows VM manual file browsing, and concurrent restore on different VMs
accordingly. Make the same terminology change wherever Tier 3 is referenced in
the Section II.2 automation/scale testing text so the document stays consistent.
Use the scenario labels and the VIRTSTRAT-480 restore section as the primary
anchors when editing.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@stps/sig-storage/VIRTSTRAT-480_file_level_restore.md`:
- Around line 97-116: The Known Limitations section still has placeholder PM
sign-offs, so update each limitation entry in the file-level restore doc with
the actual approver name and date instead of [TBD]. Use the existing limitation
bullets in the “Known Limitations” section and replace every placeholder
consistently; if those sign-offs are not available yet, mark the document/PR as
draft or WIP until they are completed.
- Line 89: The NFR note in this markdown conflicts with the Out of Scope
placeholder: the statement in the UI section says PM confirmation already
exists, while the corresponding agreement entry still shows TBD. Update the
related markdown sections together so the same decision is reflected
consistently, using the UI/NFR text and the Out of Scope item as the matching
symbols to locate the conflicting content.

---

Outside diff comments:
In `@stps/sig-storage/VIRTSTRAT-480_file_level_restore.md`:
- Around line 196-220: The out-of-scope section still has unresolved PM/Lead
Agreement placeholders and one contradictory confirmation in the UI testing
item. Update the exclusions in VIRTSTRAT-480_file_level_restore to replace every
[TBD] with the actual PM/Lead names and dates, and make the UI testing note
consistent with the documented sign-off; if sign-offs are not yet available,
keep this section marked as WIP until they are added.
- Around line 416-526: Reclassify the remaining non-compliant Tier 3 items in
the restore scenario list to Tier 2, since the guidelines only allow Tier 1 and
Tier 2; update the scenarios for Windows VM backup PVC restore, Windows VM
volume snapshot restore, Windows VM manual file browsing, and concurrent restore
on different VMs accordingly. Make the same terminology change wherever Tier 3
is referenced in the Section II.2 automation/scale testing text so the document
stays consistent. Use the scenario labels and the VIRTSTRAT-480 restore section
as the primary anchors when editing.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 88482ff8-9bbd-447f-b0d2-f11445bcedb2

📥 Commits

Reviewing files that changed from the base of the PR and between 82ed029 and ba06f10.

📒 Files selected for processing (1)
  • stps/sig-storage/VIRTSTRAT-480_file_level_restore.md

- Security: Credentials used for guest access are generated per-operation and cleaned up after completion
- Security: Guest helper scripts prevent command injection
- Usability: Clear error reporting through restore status and events
- UI: No UI changes introduced; feature is API-only. Confirmed with PM that no console integration is planned for Dev Preview. See Section "II - Out of Scope".

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟡 Minor | ⚡ Quick win

MEDIUM: UI confirmation in NFRs contradicts Out of Scope placeholder.

Line 89 states "Confirmed with PM that no console integration is planned for Dev Preview," but the corresponding Out of Scope item at line 217 still shows *PM/Lead Agreement:* [TBD]. Either the confirmation exists (and should replace [TBD]) or it doesn't (and the NFR claim is premature). Resolve the inconsistency.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@stps/sig-storage/VIRTSTRAT-480_file_level_restore.md` at line 89, The NFR
note in this markdown conflicts with the Out of Scope placeholder: the statement
in the UI section says PM confirmation already exists, while the corresponding
agreement entry still shows TBD. Update the related markdown sections together
so the same decision is reflected consistently, using the UI/NFR text and the
Out of Scope item as the matching symbols to locate the conflicting content.

Comment on lines +97 to +116
#### **2. Known Limitations**

- **Backup file browsing is not supported; users must know the path of files to restore**
- *PM Sign-off:* [TBD]

- **Parallel file restores of the same VM are not supported**
- *PM Sign-off:* [TBD]

- **Remote storage (S3) source is not supported in Dev Preview; only PVC and VolumeSnapshot sources**
- *PM Sign-off:* [TBD]

- **The `DeclarativeHotplugVolumes` feature gate must be enabled in KubeVirt for the operator to function**
- *PM Sign-off:* [TBD]

- **Guest helper script must be pre-installed in the VM; the operator does not install it automatically**
- *PM Sign-off:* [TBD]

- **SSH access must be configured on the VM with the `filerestore` user; the operator does not configure guest SSH automatically**
- *PM Sign-off:* [TBD]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🔴 Critical | ⚖️ Poor tradeoff

CRITICAL: Known Limitations still contain [TBD] placeholders.

All 6 limitations require actual PM sign-offs with names and dates. Empty placeholders block approval.

Replace [TBD] with actual names/dates for each limitation, or if the PR is not yet ready for final sign-off, mark the PR as draft/WIP and complete before requesting approval.

🧰 Tools
🪛 GitHub Check: can-be-merged

[error] 116-116: CI check failed due to an unresolved review conversation (approval/LGTM requirements not met). Discussion: #121 (comment)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@stps/sig-storage/VIRTSTRAT-480_file_level_restore.md` around lines 97 - 116,
The Known Limitations section still has placeholder PM sign-offs, so update each
limitation entry in the file-level restore doc with the actual approver name and
date instead of [TBD]. Use the existing limitation bullets in the “Known
Limitations” section and replace every placeholder consistently; if those
sign-offs are not available yet, mark the document/PR as draft or WIP until they
are completed.

@ema-aka-young

Copy link
Copy Markdown
Contributor Author

/wip cancel

@ema-aka-young ema-aka-young changed the title WIP: [Storage] STP: File-Level Restore [Storage] STP: File-Level Restore Jun 30, 2026

@rnetser rnetser left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Found 8 issue(s) in this PR:

💡 Suggestions (8)

File Line Issue
stps/sig-storage/VIRTSTRAT-480_file_level_restore.md 73 [CRITICAL] Non-disruptive restore AC lacks falsifiable verification.
stps/sig-storage/VIRTSTRAT-480_file_level_restore.md 17 [HIGH] Feature Maturity uses "TBD" for TP and GA — not an allowed value.
stps/sig-storage/VIRTSTRAT-480_file_level_restore.md 337 [HIGH] Test Environment risk is missing the required supplemental field.
stps/sig-storage/VIRTSTRAT-480_file_level_restore.md 343 [HIGH] Untestable Aspects risk is missing the required supplemental field.
stps/sig-storage/VIRTSTRAT-480_file_level_restore.md 240 [HIGH] Automation Testing strategy references non-existent "Tier 3".
stps/sig-storage/VIRTSTRAT-480_file_level_restore.md 188 [HIGH] Testing Goal P0 #9 overlaps with P0 #1 and #2.
stps/sig-storage/VIRTSTRAT-480_file_level_restore.md 167 [MEDIUM] P0 priority overuse — all 9 testing goals are P0, which dilutes "bl
stps/sig-storage/VIRTSTRAT-480_file_level_restore.md 193 [MEDIUM] Compound testing goal is a grab-bag of 5+ distinct behaviors.

Review generated by pi

- Data protection partners can restore single files or a subset of user data into a VM
- Users do not need to restore an entire VM to recover a small set of files
- Backup vendor can trigger file-level restore from a backup PVC
- VM Admin/User can restore specific files/directories from a VolumeSnapshot into a running VM without interrupting the VM's availability — the VM remains network-reachable and guest-responsive throughout the restore operation (no reboot, no pause)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CRITICAL] Non-disruptive restore AC lacks falsifiable verification.

The AC says "VM remains network-reachable and guest-responsive throughout the restore operation (no reboot, no pause)" — but the matching test scenario (line 381) just says "without interrupting VM availability."

A simple end-state check (VM is running after restore) would pass even after a disruptive stop-and-restart. The AC and its matching scenario need to specify the continuous verification mechanism.

Suggested rewrite for the AC:

VM Admin/User can restore specific files/directories from a VolumeSnapshot into a running VM — a continuous connectivity check (ping or TCP connection) shows zero interruption during restore, and a long-running guest process completes without restart

And for the scenario (line 381):

Verify files and directories are restored from a volume snapshot while a continuous connectivity check confirms zero interruption to the running VM

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why we would need to rewrite the AC, but I'll proceed with the update for the test scenario.

- [CNV-89229](https://issues.redhat.com/browse/CNV-89229) - File-Level Restore: API extensions and support encryption
- **Feature Maturity:**
- DP: CNV 5.0.0
- TP: TBD

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[HIGH] Feature Maturity uses "TBD" for TP and GA — not an allowed value.

AGENTS.md requires: TP: [version or N/A], GA: [version]. Use actual target versions or N/A if not yet determined.

Suggested change
- TP: TBD
- TP: N/A
- GA: N/A

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK - we should enforce this through Quality Flow.


- **Risk:** Windows guest testing may have limited coverage due to image availability and configuration complexity
- **Mitigation:** Prioritize Linux guest testing for Dev Preview. Establish Windows test VM image with SSH pre-configured.
- *Areas with reduced coverage:* Windows guest restore

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[HIGH] Test Environment risk is missing the required supplemental field.

Per AGENTS.md II.5: when a risk exists, the category-specific supplemental field is required. For Test Environment, add:

*Missing or unavailable environments:* LVM-backed storage provisioner in standard CI clusters

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is my bad - I thought Quality Flow was diverging from the templates. I'll update in the next iteration.

**Test Environment**

- **Risk:** LVM-backed storage for UUID collision testing may not be available in standard CI environments
- **Mitigation:** Use dedicated test environment with LVM provisioner or mock the UUID collision scenario at the operator level.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[HIGH] Untestable Aspects risk is missing the required supplemental field.

Per AGENTS.md II.5: when a risk exists, the category-specific supplemental field is required. For Untestable Aspects, add:

*Reason untestable and mitigation approach:* Third-party vendor backup formats are proprietary; tested via CRD API surface and PVC-based restore path

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

Comment thread stps/sig-storage/VIRTSTRAT-480_file_level_restore.md
- [P1] Verify the file restore operator deploys and operates correctly as an HCO-managed component
- [P2] Verify cross-namespace restore from a volume in a different namespace completes with temporary resources cleaned up after completion
- [P2] Verify restore succeeds when the source volume's storage mode differs from the cluster default
- [P2] Verify restore from an LVM-based snapshot handles volume identifier collisions without mount failures

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[HIGH] Testing Goal P0 #9 overlaps with P0 #1 and #2.

"Verify end-to-end restore workflow from restore request creation to file verification using the file restore API" appears to be a summary of P0 #1 (restore from backup volume) and P0 #2 (restore from volume snapshot).

If it tests the same flows, it's redundant — remove it. If it tests something distinct (e.g., a specific multi-step API-driven workflow not covered by #1/#2), clarify what distinguishes it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing it.

- [P0] Verify a backup vendor can restore files from a backup volume into a running Linux VM with file integrity preserved (size, ownership, permissions)
- [P0] Verify a VM user can restore files from a volume snapshot into a running VM without interrupting the VM's availability
- [P0] Verify manual restore mode provides read-only access to backup contents and the system cleans up when the restore request is removed
- [P0] Verify the system correctly detects whether a VM runs Linux or Windows and selects the appropriate restore method

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM] P0 priority overuse — all 9 testing goals are P0, which dilutes "blocks GA" meaning.

5 of the 9 P0 goals are error-handling/validation scenarios (clear errors at each stage, reject invalid requests, cleanup, security). Error handling is important but having ALL goals as P0 means nothing is differentiated.

Consider keeping core happy-path goals (restore from backup, restore from snapshot, manual mode, OS detection) as P0 and moving error-handling scenarios to P1.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand this specific goal

@ema-aka-young ema-aka-young Jul 2, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rnetser I will re-evaluate them, although if P0 equals to "blocks GA" then I'd say error validation does it from my perspective.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arnongilboa I'll rephrase the "selects the appropriate restore method" as per Slack conversation

- [P2] Verify paths with formatting variations (trailing slashes, double slashes) are normalized and restore succeeds on Linux (ext4, XFS) and Windows (NTFS) guests
- [P2] Verify the system handles guest connection loss during file transfer gracefully with partial completion status
- [P2] Verify operator upgrade preserves existing restore resources and their status
- [P2] Verify restore operations handle edge cases including large files, root disk snapshots, cross-filesystem constraints, concurrent multi-VM operations, and API server rate-limiting resilience

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM] Compound testing goal is a grab-bag of 5+ distinct behaviors.

"Verify restore operations handle edge cases including large files, root disk snapshots, cross-filesystem constraints, concurrent multi-VM operations, and API server rate-limiting resilience" — each of these can independently pass or fail.

Per AGENTS.md: "if one scenario can fail while another passes, they are separate items." Split into individual goals or remove this meta-goal since each behavior already has its own dedicated goal/scenario elsewhere in the STP.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm removing it; the corresponding scenarios can trace back to other existing goals.

- [x] **Non-Functional Requirements (NFRs)**
- *List applicable NFRs and their targets:*
- Security: Restore operations use authenticated, restricted guest access with dedicated user and RBAC roles (admin/editor/viewer)
- Security: Credentials used for guest access are generated per-operation and cleaned up after completion

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently it's a single key pair (not per operation), but we will improve it to SSH certificates with TTL.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

Comment on lines +111 to +115
- **Guest helper script must be pre-installed in the VM; the operator does not install it automatically**
- *PM Sign-off:* [TBD]

- **SSH access must be configured on the VM with the `filerestore` user; the operator does not configure guest SSH automatically**
- *PM Sign-off:* [TBD]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe worth mentioning we provide basic setup scripts (U/S) covering both of the above.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

- XFS snapshot mounting requires special options to avoid filesystem identifier collisions
- LVM-based volume snapshots have identifier collision issues that need explicit handling
- Storage mode mismatches between snapshot sources and cluster defaults can cause restore failures, requiring tests across different storage configurations
- Windows guest connectivity support for guest helper execution

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More explicitly - OpenSSH Server (any alternative?)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will mention this

- [x] **API Extensions**
- *List new or modified APIs:*
- New file restore API — users create restore requests specifying a target VM, backup source (PVC or VolumeSnapshot), and file paths. Status provides phase tracking and completion information.
- *Testing impact:* All create, read, update, and delete operations on the file restore API must be tested. Restore phase transitions must be validated. Status must be verified for both success and failure paths.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRUD on the CR I guess?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

- *See environment requirements in Section II.3 and testing tools in Section II.3.1*

- [x] **Topology Considerations**
- *Describe topology requirements:* Multi-node cluster required for hotplug volume attachment. Cross-namespace PVC restore requires namespace-level RBAC verification.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multi-node cluster required for hotplug volume attachment?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible hallucination :)

#### **1. Scope of Testing**

This STP covers the Dev Preview phase of the File-Level Restore feature for CNV 5.0.
Testing validates automatic and manual restore modes, Linux and Windows guest support, PVC and

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note automatic mode should have the main focus.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

- *Details:* Validate compatibility with KubeVirt DeclarativeHotplugVolumes feature gate. Verify file restore API backward compatibility.

- [x] **Upgrade Testing** — Validates upgrade paths from previous versions, data migration, and configuration preservation
- *Details:* Verify operator upgrade preserves existing file restore resources and their status. First release; no prior version to upgrade from, but HCO integration upgrade path must be validated.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No HCO integration in dev preview.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

- *Details:* Verify operator upgrade preserves existing file restore resources and their status. First release; no prior version to upgrade from, but HCO integration upgrade path must be validated.

- [x] **Dependencies** — Blocked by deliverables from other components/products
- *Details:* Depends on HCO team to add file restore operator as an HCO-managed component for downstream delivery (CNV-89642). HCO integration includes operator lifecycle management and certificate rotation.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No HCO integration in dev preview.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

#### **3.1. Testing Tools & Frameworks**

- **Test Framework:**
- Upstream: Go/Ginkgo e2e tests in kubevirt/vm-file-restore-operator repository using kubevirtci for cluster management.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The U/S e2e test is currently minimal. Should we suggest coverage extension here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what Tier 1 tests will cover

- [x] DeclarativeHotplugVolumes feature gate is enabled in KubeVirt CR
- [x] Guest helper scripts are available for installation on test VMs (Linux and Windows)
- [x] VolumeSnapshot-capable StorageClass is configured and functional
- [ ] HCO integration for downstream delivery is available (CNV-89642)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is it a blocker?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially, I thought we would need it to have a proper deploy of the operator on the test cluster, but I think we have work around now ?

Signed-off-by: Emanuele Prella <eprella@redhat.com>
Signed-off-by: Emanuele Prella <eprella@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants