oci: Add varlink APIs using "splitdirfdstream" #309
|
How will this work with podman-container-tools/container-libs#651? I don't think container-tools is going to add varlink as a dependency again for this use case alone. |
|
One thing to bear in mind is that there are two levels to this: the splitdirfdstream (replacing splitfdstream) and the higher-level IPC mechanism by which that is passed around as representations of OCI layers. Only the latter involves varlink, and we could choose to keep using jsonrpc-fdpass for the latter in container-tools if you prefer. Sorry about the proposed strategy change, but as I dug in more it just feels right, and I somehow again just missed the fd passing support in the varlink ecosystem. I guess one thing that changed is that zlink is relatively new, well maintained, and good code. (The varlink/rust project went through a messy time.) Would varlink be heavier than jsonrpc-fdpass? Hmm, let me see... I spent some tokens on this:
varlink/go vs jsonrpc-fdpass-go: binary size comparison
🤖 Assisted-by: OpenCode (Claude Sonnet 4.6)
Measured against the
With a realistic service stub, both libraries are similar in weight (~68–80 KiB). That said, varlink/go#43 needs doing. |
Or, it'd probably work to use gRPC for most metadata/controlplane but pass fds over a separate negotiated socket. I don't have a really strong opinion. One other thing I'd say here that I think is a big cleanup is that our parsing of Also the converse is now true - it should now be straightforward for external tooling to push content into composefs-rs in an efficient way. |
This is a blocker for the containers/storage integration. I have no preference whether it is varlink or jsonrpc, but should we hold this until we know we can use it in containers/storage too? |
ac7e9f6 to
5ad1961
Compare
I generated a PR in varlink/go#44. I think we don't need to block this, strictly speaking; in the end it seems better for us to temporarily use a patched varlink/go than to use the custom jsonrpc-fdpass, right? Also, if we can take this track to finally replace the current experimental-image-proxy protocol with one thing, it'd be a large win overall. |
@mheon are you fine with that? |
|
Sorry for not following this one closely. Are we talking about doing a hard dependency on Varlink in order to use composefs with c/storage? Do we know what that's going to do to our binary size? |
yes, to add it as the RPC to communicate between Go and Rust. @cgwalters made a comparison here: |
5ad1961 to
b50f5dd
Compare
a5d52a9 to
fb82369
Compare
|
OK, this one I think is ready for review/merge. |
|
I don't think we have a fundamental disagreement with Varlink as an IPC protocol, though I am worried about protocol design. What kind of long term stability are we expecting with this? Are we going to have to do a hard pinning of composefs-rs against c/storage versions to ensure both ends are talking the same version of the protocol? |
Right, this is a confusing topic. At the current time, containers-storage uses We have an effort to fully replace that project with this one, including a new Rust The trajectory is then to ship the But that's not (directly) related to this PR or the PR Giuseppe is working on, which aim to expose varlink APIs on both ends. I think the trajectory that would help the most is to replace In that flow, it's more the other way around again: bootc/composefs(-rs) would be calling into skopeo ➡️ container-libs via varlink. |
fb82369 to
ee3f351
Compare
Pull request was converted to draft
ee3f351 to
b5feb85
Compare
b5feb85 to
df2b109
Compare
|
OK! I've gone over and cleaned this up more. I think it's ready for a wider review; I've trimmed out some garbage etc. One thing I had my agent do was inject random "dummy" fds into the producer stream to ensure that the consumer side was correctly reading the indices (and not just e.g. hardcoding 0 to access the dirfd). That shook out some bugs. |
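For reviewers, the dummy-fd check described above can be sketched roughly as follows. This is only an illustration under assumptions: `FileRef`, `Slot`, `inject_dummies`, and `resolve` are hypothetical names, not the types in this PR, and `Slot` stands in for real file descriptors so the sketch runs without any I/O.

```rust
/// Hypothetical reference to a file: an index into the received fd array
/// plus a name relative to that directory. Not the actual PR type.
#[derive(Debug, Clone, PartialEq)]
struct FileRef {
    dirfd_index: usize,
    name: String,
}

/// Stand-in for a received fd. A real consumer would hold owned fds instead.
#[derive(Debug, Clone, PartialEq)]
enum Slot {
    /// Padding fd injected by the test producer; must never be dereferenced.
    Dummy,
    /// A directory fd, represented here by the path it would refer to.
    Dir(&'static str),
}

/// Test helper: prepend `pad` dummy slots and shift every index to match.
fn inject_dummies(
    slots: Vec<Slot>,
    refs: Vec<FileRef>,
    pad: usize,
) -> (Vec<Slot>, Vec<FileRef>) {
    let mut padded = vec![Slot::Dummy; pad];
    padded.extend(slots);
    let shifted = refs
        .into_iter()
        .map(|r| FileRef { dirfd_index: r.dirfd_index + pad, ..r })
        .collect();
    (padded, shifted)
}

/// A consumer that honors the index. Returns None for a dummy slot or an
/// out-of-range index, which is what a hardcoded index 0 hits after padding.
fn resolve(slots: &[Slot], r: &FileRef) -> Option<String> {
    match slots.get(r.dirfd_index)? {
        Slot::Dir(path) => Some(format!("{path}/{}", r.name)),
        Slot::Dummy => None,
    }
}
```

With three dummies injected, a reference that originally pointed at index 0 now points at index 3, so a consumer that ignores the index and always uses slot 0 fails immediately instead of passing by accident.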
| /// Use reflink/hardlink zero-copy transfer (requires same filesystem and root). | ||
| #[clap(long)] | ||
| zerocopy: bool, | ||
| }, |
Can you split out the oci copy cli support into a separate commit?
| /// | ||
| /// The source repository is selected by the global `--repo`/`--user`/ | ||
| /// `--system` flags. The destination is `--to`. Both repositories must use | ||
| /// the same hash algorithm. |
Is this really the natural approach? I would personally expect --repo to be the destination repo.
| /// | ||
| /// The source repository is selected by the global `--repo`/`--user`/ | ||
| /// `--system` flags. The destination is `--to`. Both repositories must use | ||
| /// the same hash algorithm. |
This says the algorithm must be the same, but later comments (in the implementation) seem to say it supports conversion.
| paths.push(format!("{home}/.local/share/containers/storage")); | ||
| } | ||
|
|
||
| paths.push("/var/lib/containers/storage".to_string()); |
Shouldn't this read storage.conf and get the graphroot and the additional image dirs from there? It should also respect the CONTAINERS_STORAGE_CONF env var.
Yes. That said, this code is basically a "placeholder", the goal is to back it with podman/skopeo.
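As a rough illustration of the lookup suggested in this thread, here is a sketch of resolving storage.conf candidates with the env var taking priority. The function names are hypothetical, and the candidate order (user config, then `/etc`, then `/usr/share`) is an assumption about containers-storage behavior that should be checked against its documentation. Parsing `graphroot` and the additional image stores out of the TOML is left out.

```rust
use std::path::PathBuf;

/// Ordered list of storage.conf candidates (hypothetical helper).
/// Kept pure so it can be tested without touching the environment.
fn storage_conf_candidates(
    conf_override: Option<&str>,
    xdg_config_home: Option<&str>,
    home: Option<&str>,
) -> Vec<PathBuf> {
    // An explicit CONTAINERS_STORAGE_CONF wins over every default.
    if let Some(p) = conf_override.filter(|p| !p.is_empty()) {
        return vec![PathBuf::from(p)];
    }
    let mut out = Vec::new();
    // Per-user config: $XDG_CONFIG_HOME, falling back to ~/.config.
    let user_base = xdg_config_home
        .filter(|p| !p.is_empty())
        .map(PathBuf::from)
        .or_else(|| home.map(|h| PathBuf::from(h).join(".config")));
    if let Some(base) = user_base {
        out.push(base.join("containers/storage.conf"));
    }
    // System-wide locations (assumed order: admin override, then vendor default).
    out.push(PathBuf::from("/etc/containers/storage.conf"));
    out.push(PathBuf::from("/usr/share/containers/storage.conf"));
    out
}

/// Thin wrapper that reads the real environment.
fn storage_conf_candidates_from_env() -> Vec<PathBuf> {
    let var = |k: &str| std::env::var(k).ok();
    storage_conf_candidates(
        var("CONTAINERS_STORAGE_CONF").as_deref(),
        var("XDG_CONFIG_HOME").as_deref(),
        var("HOME").as_deref(),
    )
}
```

The first candidate that exists would then be parsed for the graphroot, replacing the hardcoded paths in the placeholder.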
|
I did a high-level pass through this code, but man, there is a lot of code. Also, I think @giuseppe needs to review the storage-related parts. |
4c9b64b to
a3efaf7
Compare
Replace splitfdstream with the new splitdirfdstream format, which passes directory fd indices plus filenames instead of one fd per file. This works well with varlink fd-passing (which has limited fd capacity) and simplifies the producer/consumer protocol. The containers-storage import path now always goes through splitdirfdstream as an intermediary, giving us a single tested abstraction for both in-process and IPC layer transfer. Varlink endpoints (org.composefs.Oci) are added for pull and push of OCI layers, paving the way for integration with external storage stacks like containers-storage and containerd. Generated-by: OpenCode (Claude Opus 4.8)
a3efaf7 to
e752c18
Compare
Add a CLI subcommand to copy an OCI image (and all its layers) from one composefs repository to another using the splitdirfdstream transport. Cross-algorithm copies are supported (e.g. sha256 source to sha512 destination); the --zerocopy flag requires matching algorithms since it hardlinks objects in-place with shared fs-verity digests. The global --repo flag selects the destination repository and --from specifies the source, consistent with how --repo works elsewhere in the CLI. Generated-by: OpenCode (Claude Opus 4.8) Signed-off-by: Colin Walters <walters@verbum.org>
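The algorithm rule in this commit message can be sketched as a small check. This is a hedged illustration only: `HashAlgorithm`, `CopyMode`, and `plan_copy` are hypothetical names and not the code in this commit.

```rust
/// Hypothetical digest algorithms of a composefs repository.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum HashAlgorithm {
    Sha256,
    Sha512,
}

/// Hypothetical transfer strategies for a repository-to-repository copy.
#[derive(Debug, PartialEq, Eq)]
enum CopyMode {
    /// Objects are streamed and re-digested by the destination.
    Stream,
    /// Objects are reflinked or hardlinked in place.
    ZeroCopy,
}

/// Decide how to copy. Zero-copy shares objects (and their fs-verity
/// digests) between repositories, so both sides must use one algorithm.
fn plan_copy(
    src: HashAlgorithm,
    dst: HashAlgorithm,
    zerocopy: bool,
) -> Result<CopyMode, String> {
    match (zerocopy, src == dst) {
        (true, true) => Ok(CopyMode::ZeroCopy),
        (true, false) => Err(format!(
            "--zerocopy requires matching algorithms (source {src:?}, destination {dst:?})"
        )),
        (false, _) => Ok(CopyMode::Stream),
    }
}
```

Rejecting the mismatch up front, rather than silently falling back to a streaming copy, keeps the flag's meaning unambiguous; whether the real implementation errors or falls back is not stated here.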
e752c18 to
5c233c7
Compare
First, I discovered that actually fd-passing with varlink generally works well, and I was misguided in thinking we needed jsonrpc-fdpass.
Almost: one issue is that varlink doesn't have good support for passing a lot of file descriptors (which jsonrpc-fdpass was designed to handle).
But upon some reflection, I realized we don't need to pass a file descriptor per file, all use cases here are fine with a directory fd plus filename.
So here a new data stream format "splitdirfdstream" is implemented.
We now use that internally first, when we're doing a direct pull from containers-storage for reflinking/hardlinking.
But better: let's expose that data concept over varlink, where a varlink client can either pull or push container image layers that way.
This paves the way to a very clear mechanism for us to integrate with containers-storage or other storage stacks (like containerd) in an agnostic way.
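To make the fd-count argument concrete, here is a minimal sketch. The `Chunk` type and both helper functions are hypothetical, and the assumption that the stream mixes inline bytes with external references is mine; the actual wire format is defined by the code in this PR.

```rust
use std::collections::BTreeSet;

/// Hypothetical stream element, for illustration only.
#[derive(Debug)]
enum Chunk {
    /// Bytes carried inline in the stream (e.g. metadata).
    Inline(Vec<u8>),
    /// File content referenced as (directory fd index, filename)
    /// instead of a dedicated fd for the file itself.
    External { dirfd_index: u32, name: String },
}

/// Number of fds needed if every external file had its own fd.
fn fds_per_file(chunks: &[Chunk]) -> usize {
    chunks
        .iter()
        .filter(|c| matches!(c, Chunk::External { .. }))
        .count()
}

/// Number of fds needed when external files share directory fds:
/// one per distinct directory index, regardless of file count.
fn fds_per_directory(chunks: &[Chunk]) -> usize {
    chunks
        .iter()
        .filter_map(|c| match c {
            Chunk::External { dirfd_index, .. } => Some(*dirfd_index),
            Chunk::Inline(_) => None,
        })
        .collect::<BTreeSet<_>>()
        .len()
}
```

A layer with thousands of files stored in a handful of object directories then needs only a handful of fds, which fits within the limited fd capacity of varlink fd-passing.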
We also now support cfsctl oci copy to copy across composefs repositories, which is also implemented this way.
Generated-by: OpenCode (Claude Opus 4.8)