Demonstrate spfs clean can corrupt repository #1282

Closed
jrray wants to merge 1 commit into main from clean-repo-corruption

jrray (Collaborator) commented Oct 21, 2025

Clean can end up violating the invariant that all child objects of a parent must exist. This test demonstrates this with a manifest and a layer but the problem applies to other relationships as well.
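
The invariant can be illustrated with a minimal sketch. This is not the test from this PR and not the spfs API: the `Digest` alias, the `dangling_children` helper, and the object names are hypothetical, and the sketch assumes the layer is the parent that references the manifest.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for an object digest.
type Digest = &'static str;

/// Return every (parent, child) pair where the child is referenced by a
/// parent in the repository but is itself absent from the repository.
fn dangling_children(objects: &BTreeMap<Digest, Vec<Digest>>) -> BTreeSet<(Digest, Digest)> {
    let mut missing = BTreeSet::new();
    for (parent, children) in objects {
        for child in children {
            if !objects.contains_key(child) {
                missing.insert((*parent, *child));
            }
        }
    }
    missing
}

fn main() {
    // Assumed relationship: the layer references the manifest.
    let mut repo: BTreeMap<Digest, Vec<Digest>> = BTreeMap::new();
    repo.insert("layer", vec!["manifest"]);
    repo.insert("manifest", vec![]);
    assert!(dangling_children(&repo).is_empty());

    // Simulate a clean that removes the child but leaves the parent behind.
    repo.remove("manifest");
    let missing = dangling_children(&repo);
    assert!(missing.contains(&("layer", "manifest")));
    println!("dangling references: {:?}", missing);
}
```

A repository for which this check returns a non-empty set is in the corrupt state the PR describes: a parent exists whose child does not.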

Signed-off-by: J Robert Ray <jrray@jrray.org>
jrray added the bug (Something isn't working) label Oct 21, 2025
jrray self-assigned this Oct 24, 2025
jrray added a commit that referenced this pull request Oct 24, 2025
To address the problem demonstrated in #1282, clean needs to delete
things in a top-level order, and skip cleaning children of anything
deemed uncleanable.
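
One way to read the ordering requirement above is "delete parents before their children, and keep everything reachable from an uncleanable item". The sketch below plans deletions that way over an in-memory graph. It is an illustration under those assumptions, not the implementation in the referenced commit; `Id`, `plan_clean`, and the object names are hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Hypothetical stand-in for an item identifier.
type Id = &'static str;

/// Plan deletions so that every parent precedes its children and nothing
/// reachable from an uncleanable item is deleted.
fn plan_clean(children: &BTreeMap<Id, Vec<Id>>, uncleanable: &BTreeSet<Id>) -> Vec<Id> {
    // Everything reachable from an uncleanable item must be kept.
    let mut kept: BTreeSet<Id> = BTreeSet::new();
    let mut stack: Vec<Id> = uncleanable.iter().copied().collect();
    while let Some(id) = stack.pop() {
        if kept.insert(id) {
            if let Some(kids) = children.get(id) {
                stack.extend(kids.iter().copied());
            }
        }
    }

    // Kahn's algorithm: an item becomes ready once all its parents were visited.
    let mut parents_left: BTreeMap<Id, usize> = children.keys().map(|id| (*id, 0)).collect();
    for kids in children.values() {
        for kid in kids {
            *parents_left.entry(*kid).or_insert(0) += 1;
        }
    }
    let mut ready: VecDeque<Id> = parents_left
        .iter()
        .filter(|(_, n)| **n == 0)
        .map(|(id, _)| *id)
        .collect();

    let mut order = Vec::new();
    while let Some(id) = ready.pop_front() {
        if !kept.contains(id) {
            order.push(id);
        }
        for kid in children.get(id).into_iter().flatten() {
            let n = parents_left.get_mut(kid).expect("every child was counted above");
            *n -= 1;
            if *n == 0 {
                ready.push_back(*kid);
            }
        }
    }
    order
}

fn main() {
    let mut children: BTreeMap<Id, Vec<Id>> = BTreeMap::new();
    children.insert("tagged-layer", vec!["shared-manifest"]);
    children.insert("old-layer", vec!["shared-manifest", "old-manifest"]);
    children.insert("shared-manifest", vec![]);
    children.insert("old-manifest", vec![]);

    let mut uncleanable: BTreeSet<Id> = BTreeSet::new();
    uncleanable.insert("tagged-layer");

    let order = plan_clean(&children, &uncleanable);
    // The shared manifest survives because an uncleanable layer still needs it.
    assert_eq!(order, vec!["old-layer", "old-manifest"]);
}
```

Because parents are always deleted first, an interrupted run of this plan leaves orphaned children at worst, never a parent whose child is missing. As a later commit message in this thread notes, such a guarantee only holds when there are no concurrent writers.
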

The plan is to make iter_objects (now iter_items) return more detailed
information including what the item's parent(s) are, so that the
consumer of the stream can build up a graph of items. Since iter_items
can be called over RPC and used as a stream, it isn't practical to
compute the entire graph and return it in a single response. But the
stream items can be used to build up the graph incrementally.

Signed-off-by: J Robert Ray <jrray@jrray.org>
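
The incremental graph construction described in this commit message might look roughly like the following. This is a sketch only: a plain iterator stands in for the stream (which, per the message, may arrive over RPC), and `Item`, `Graph`, `add`, `roots`, and the object names are hypothetical rather than the actual `iter_items` types.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for an item identifier.
type Id = &'static str;

/// Hypothetical stream item: an id plus the ids of its known parents.
struct Item {
    id: Id,
    parents: Vec<Id>,
}

/// Graph assembled one stream item at a time.
#[derive(Default)]
struct Graph {
    children: BTreeMap<Id, BTreeSet<Id>>,
}

impl Graph {
    /// Fold one stream item into the graph; items may arrive in any order.
    fn add(&mut self, item: Item) {
        self.children.entry(item.id).or_default();
        for parent in item.parents {
            self.children.entry(parent).or_default().insert(item.id);
        }
    }

    /// Items that nothing seen so far has named as a child.
    fn roots(&self) -> BTreeSet<Id> {
        let mut roots: BTreeSet<Id> = self.children.keys().copied().collect();
        for kids in self.children.values() {
            for kid in kids {
                roots.remove(kid);
            }
        }
        roots
    }
}

fn main() {
    // Children may be reported before their parents.
    let stream = vec![
        Item { id: "manifest", parents: vec!["layer"] },
        Item { id: "layer", parents: vec!["platform"] },
        Item { id: "platform", parents: vec![] },
    ];

    let mut graph = Graph::default();
    for item in stream {
        graph.add(item);
    }

    assert_eq!(graph.roots().into_iter().collect::<Vec<_>>(), vec!["platform"]);
    assert!(graph.children["layer"].contains("manifest"));
}
```

The consumer never needs the whole graph in a single response: each item only adds edges, so the structure is complete once the stream ends.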
jrray added a commit that referenced this pull request Oct 24, 2025
This is only true when there are no concurrent writers!

This is the test from #1282 but passes as of the changes in this PR.

Signed-off-by: J Robert Ray <jrray@jrray.org>
jrray (Collaborator, Author) commented Oct 24, 2025

Closing now that #1288 can pass this test.

jrray closed this Oct 24, 2025
jrray added a commit that referenced this pull request Oct 25, 2025
To address the problem demonstrated in #1282, clean needs to delete
things in a top-level order, and skip cleaning children of anything
deemed uncleanable.

The plan is to make iter_objects (now iter_items) return more detailed
information including what the item's parent(s) are, so that the
consumer of the stream can build up a graph of items. Since iter_items
can be called over RPC and used as a stream, it isn't practical to
compute the entire graph and return it in a single response. But the
stream items can be used to build up the graph incrementally.

Signed-off-by: J Robert Ray <jrray@jrray.org>
jrray added a commit that referenced this pull request Oct 25, 2025
This is only true when there are no concurrent writers!

This is the test from #1282 but passes as of the changes in this PR.

Signed-off-by: J Robert Ray <jrray@jrray.org>