Indexes 5: Adds spk repo index subcommand for index generation and updates#1340
Indexes 5: Adds spk repo index subcommand for index generation and updates#1340dcookspi wants to merge 12 commits intoindex-4-indexed-repository-and-fbindexfrom
spk repo index subcommand for index generation and updates#1340Conversation
spk repo index subcommand for index generation and updates
9da2f7c to
d88602e
Compare
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
d88602e to
027a156
Compare
8530adc to
68bb519
Compare
68bb519 to
1e28bf4
Compare
027a156 to
cdaf8f9
Compare
1e28bf4 to
854446a
Compare
cdaf8f9 to
b70cba2
Compare
854446a to
a58142a
Compare
b70cba2 to
4698954
Compare
b32582c to
1a43f64
Compare
b8f7428 to
3871cc5
Compare
crates/spk-cli/common/src/flags.rs
Outdated
| pub use_indexes: bool, | ||
|
|
||
| /// Do not get the package data from the repo index, always use | ||
| /// the repo instead. This only applies to non-destructive repo | ||
| /// operations. This option can be configured as the default in | ||
| /// spk's config file. | ||
| #[clap(long, conflicts_with = "use_indexes")] | ||
| pub no_indexes: bool, |
There was a problem hiding this comment.
Why have both? Can a "global" option to disable index use despite what an individual repo is configured to do exist at a higher level?
There was a problem hiding this comment.
It lets a site (or user) enable indexes in the spk config file (so the default for all uses), and disable them for some command line runs, and visa versa - if a site (or user( disables indexes in the spk config file, this lets them be enabled for some command line runs.
We're likely to enable indexes in the config file, and probably use --no-indexes sometimes (if there's an issue as a workaround, or for testing something). But another site might prefer it the other way around for some reason.
There was a problem hiding this comment.
I agree with the concept as you describe the usage pattern but still dislike these two options existing here at the same level in the configuration hierarchy.
We already have a configuration pattern of some config property that can be set in a config file but overridden with an env var (or possibly a command-line option). Having these two with opposite meanings creates confusion about which gets set where and which overrules the other.
This could be a case for needing something other than one or two bool options but use an enum instead:
- an option that disables indexes globally and overrules any repo-specific setting
- an option that delegates to repo-specific settings (default)
- an option that enables indexes globally but doesn't overrule any repo-specific setting (we'd likely use this one in our config file)
- (maybe) an option that enables indexes globally and overrules any repo-specific setting, but this one feels questionable
I'd be okay with having a flag like no_indexes that acts as an alias / shortcut for picking the option that globally disables indexes, but this wouldn't map to a setting that lives in the config file.
There was a problem hiding this comment.
As per our discussion today, I've updated the spk config file structure to remove the global index settings and have them on each spk repository section, replaced the command line options with a single flag with an enum of values, and changed the defaults to use an index if one exists, except for the local repo.
96ad9df to
31d99fa
Compare
f61e97e to
83c8d12
Compare
83c8d12 to
3dcf4e3
Compare
|
Todo:
|
docs/ref/indexes.md
Outdated
|
|
||
| To update an existing index, e.g. after a new `python` package was | ||
| published, run: | ||
| `spk repo index --disable-repo local --update python` |
There was a problem hiding this comment.
A thought on usage, rather than being expected to remember to add --disable-repo local (because something bad would happen if not??), how about we make this index sub-command require specifying the repo to operate on, and error if one isn't specified.
spk repo index -r origin [--update python]
There was a problem hiding this comment.
The usage followed the existing spk repo command's subcommand, but I agree it is cumbersome.
- Update the usage of the
spk repo indexcommand to match this
There was a problem hiding this comment.
I've changed it to only take a -r <reponame> option and not all the other repo flags.
| includes data that isn't read by spk anymore), | ||
|
|
||
|
|
||
| #### Removing an old field |
There was a problem hiding this comment.
I'm not sure the section above is useful, but removing a field (or stopping to populate an existing field), are reasons to bump the version number.
There was a problem hiding this comment.
I've added another section on stopping to populate an existing field. I felt I needed to make it clearer that it isn't the same as removing a field and its implications for reading that field in older spks. I've left the stopping to read a field in as a result and for completeness. Let me know if it's too verbose or still too much.
34d1bbd to
e5ee8b6
Compare
Adds --use-indexes and --no-indexes flags to repository. Updates resolvo solver to get global variables data from an indexed repository. Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
8f7519a to
b0439ac
Compare
Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
…ndles. Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
…dex use. Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
… option Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
39f3a57 to
50c68f1
Compare
times, and fixes bugs when using it to update a specific package version. Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
|
Updated the |
This adds a new
repo indexsubcommand to spk for index generation and updates. It adds the--use-indexesand--no-indexesflags for repository index usage. This also updatesresolvosolver to get global variables data from an indexed repository. This allowsresolvoto solve without needing to restart its solves.Indexing
The index is designed to help the solvers with solve times. It doesn't contain enough data to help with other spk operations like building and testing a package.
Indexing can be enabled or disabled in the spk config file. If indexing is enabled, you have to generate an index, with
spk repo index, prior to trying to use it. They are not generated on the fly (outside of automated tests).To generate an index (for the origin):
spk repo index --disable-repo localTo update an existing index, e.g. after a new python package was published:
spk repo index --disable-repo local --update pythonThe flatbuffer index data is stored in a file in the underlying spfs repo in a
index/spk/sub-directory.If index use is enabled in the config file, it can be disabled with the
--no-indexescommand line flag. If index use is disabled by default, it can be enbled with the--use-indexesflag. If index use is enabled, but no index has been generated, spk will fallback to using the underlying repo directly (it acts as it would before this change).Speed Diferences
Generating the index file on our repo (sizes below) takes about 2 minutes. Updating a package in an existing index, such as after a new build is published, takes a few seconds.
Sample solver time improvements using this indexing
The numbers come from this setup:
The indexing doesn't have a noticable impact (to users) on smaller solves. But it allows our larger solves to finish in under 10 seconds, or about 1/6th of the time they currently do. The times marked with (*) are improved further by the changes in PR6: (#1344).
This is the final 5 of 5 chained PRs for adding indexes to spk solves:
new_unchecked()constructors to spk schema objects #1337version_filterfield in index schema #1344