Indexes 5: Adds spk repo index subcommand for index generation and updates #1340

Open
dcookspi wants to merge 12 commits into index-4-indexed-repository-and-fbindex from index-5-repo-cmds-config-and-cli-flags

Collaborator

@dcookspi dcookspi commented Mar 20, 2026

This adds a new repo index subcommand to spk for index generation and updates. It adds the --use-indexes and --no-indexes flags to control repository index usage. It also updates the resolvo solver to get global variable data from an indexed repository, which allows resolvo to solve without needing to restart its solves.

Indexing

The index is designed to help the solvers with solve times. It doesn't contain enough data to help with other spk operations like building and testing a package.

Indexing can be enabled or disabled in the spk config file. If indexing is enabled, you have to generate an index with spk repo index before trying to use it. Indexes are not generated on the fly (outside of automated tests).

To generate an index (for the origin):

  • spk repo index --disable-repo local

To update an existing index, e.g. after a new python package was published:

  • spk repo index --disable-repo local --update python

The flatbuffer index data is stored in a file in the underlying spfs repo, in an index/spk/ sub-directory.

If index use is enabled in the config file, it can be disabled with the --no-indexes command line flag. If index use is disabled by default, it can be enabled with the --use-indexes flag. If index use is enabled but no index has been generated, spk will fall back to using the underlying repo directly (it acts as it did before this change).
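The precedence described above can be sketched as follows. This is a minimal illustration, not spk's actual code: `DataSource`, `CliOverride`, and `choose_source` are hypothetical names, and the sketch assumes the command line flag always wins over the config file default.

```rust
/// Where package data is read from (hypothetical type, for illustration).
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataSource {
    Index,
    Repo,
}

/// Hypothetical stand-in for the --use-indexes / --no-indexes flags.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CliOverride {
    /// Neither flag was given on the command line.
    Unset,
    /// --use-indexes was given.
    UseIndexes,
    /// --no-indexes was given.
    NoIndexes,
}

/// Decide where to read package data from.
/// Assumption: a command line flag overrides the config file setting,
/// and a missing index always means falling back to the repo.
fn choose_source(config_enabled: bool, cli: CliOverride, index_exists: bool) -> DataSource {
    let wanted = match cli {
        CliOverride::UseIndexes => true,
        CliOverride::NoIndexes => false,
        CliOverride::Unset => config_enabled,
    };
    if wanted && index_exists {
        DataSource::Index
    } else {
        DataSource::Repo
    }
}
```

For example, `choose_source(true, CliOverride::Unset, false)` returns `DataSource::Repo`: index use is enabled, but no index has been generated, so the repo is read directly.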

Speed Differences

Generating the index file on our repo (sizes below) takes about 2 minutes. Updating a package in an existing index, such as after a new build is published, takes a few seconds.

Sample solver time improvements using this indexing

The numbers come from this setup:

  • an origin repo that has 2245 packages, 23540 versions, 82517 builds (11 erroring, about 30% deprecated), and 141 global vars
  • the index loads in ~0.0003 seconds unverified, or ~0.2 seconds verified, and is about 76 MB on disk with trimmed down deprecated builds (107 MB with full deprecated builds)
  • these times are from a rough average of 3-4 runs with index verification disabled
  • a "toolset" below is a set of requests for the named DCC and our typical in-house plugins and tools
Requests        | Solution size | Num.    | Solve time  |  Indexed solve time, no retries
                | (# packages)  | Retries | (seconds)   |  (seconds)
-----------------------------------------------------------------------------------------
python          |        4      |    1    |     0.17    |   0.03 
boost-python    |        8      |    1    |     0.31    |   0.05 
python-torch    |       37      |    2    |     0.58    |   0.15 
widget toolset  |       60      |    2    |     3.44    |   0.48 
katana toolset  |      181      |   10    |    18.32    |   2.97 
nuke toolset    |      280      |   12    |    24.32    |   4.55 (*)
houdini toolset |      211      |   19    |    37.24    |   6.58 (*)
maya toolset    |      403      |   20    |    59.50    |   9.52 (*)

The indexing doesn't have a noticeable impact (to users) on smaller solves. But it allows our larger solves to finish in under 10 seconds, or about 1/6th of the time they currently take. The times marked with (*) are improved further by the changes in PR6: (#1344).
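As a rough check of the "about 1/6th" figure, the speedup for each row can be computed from the two timing columns in the table above. The snippet below is only a worked calculation over those published numbers; the names `TIMINGS`, `speedup`, and `speedups` are made up for this sketch.

```rust
/// (request, solve time in seconds, indexed solve time in seconds),
/// copied from the table above.
const TIMINGS: [(&str, f64, f64); 8] = [
    ("python", 0.17, 0.03),
    ("boost-python", 0.31, 0.05),
    ("python-torch", 0.58, 0.15),
    ("widget toolset", 3.44, 0.48),
    ("katana toolset", 18.32, 2.97),
    ("nuke toolset", 24.32, 4.55),
    ("houdini toolset", 37.24, 6.58),
    ("maya toolset", 59.50, 9.52),
];

/// How many times faster the indexed solve is than the unindexed one.
fn speedup(baseline_secs: f64, indexed_secs: f64) -> f64 {
    baseline_secs / indexed_secs
}

/// Speedup factor for every row of the table.
fn speedups() -> Vec<(&'static str, f64)> {
    TIMINGS
        .iter()
        .map(|&(name, base, indexed)| (name, speedup(base, indexed)))
        .collect()
}
```

With these figures the three largest toolsets come out between roughly 5.3x and 6.3x faster (maya: 59.50 / 9.52 = 6.25), which is consistent with the 1/6th estimate.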

This is PR 5 in the chain of PRs (now six, listed below) for adding indexes to spk solves:

  1. Indexes 1: Change Package and related traits to not return references to fields #1336
  2. Indexes 2: Add new_unchecked() constructors to spk schema objects #1337
  3. Indexes 3: Adds flatbuffers schema and SolverPackageSpec for indexes to spk #1338
  4. Indexes 4: Adds Indexes for SPK repositories #1339
  5. this PR
  6. Indexes 6: Changes version_filter field in index schema #1344

@dcookspi dcookspi self-assigned this Mar 20, 2026
@dcookspi dcookspi added enhancement New feature or request SPI AOI Area of interest for SPI pr-chain This PR doesn't target the main branch, don't merge! labels Mar 20, 2026
@dcookspi dcookspi changed the title Indexes 5: Adds 'spk repo index' subcommand for index generation and updates Indexes 5: Adds spk repo index subcommand for index generation and updates Mar 20, 2026
@dcookspi dcookspi force-pushed the index-5-repo-cmds-config-and-cli-flags branch from 9da2f7c to d88602e Compare March 20, 2026 01:22
codecov bot commented Mar 20, 2026

@dcookspi dcookspi requested review from jrray and rydrman March 20, 2026 19:15
@dcookspi dcookspi force-pushed the index-5-repo-cmds-config-and-cli-flags branch from d88602e to 027a156 Compare March 20, 2026 19:27
@dcookspi dcookspi force-pushed the index-4-indexed-repository-and-fbindex branch from 8530adc to 68bb519 Compare March 20, 2026 19:30
@dcookspi dcookspi force-pushed the index-4-indexed-repository-and-fbindex branch from 68bb519 to 1e28bf4 Compare March 20, 2026 19:53
@dcookspi dcookspi force-pushed the index-5-repo-cmds-config-and-cli-flags branch from 027a156 to cdaf8f9 Compare March 20, 2026 19:55
@dcookspi dcookspi force-pushed the index-4-indexed-repository-and-fbindex branch from 1e28bf4 to 854446a Compare March 21, 2026 01:07
@dcookspi dcookspi force-pushed the index-5-repo-cmds-config-and-cli-flags branch from cdaf8f9 to b70cba2 Compare March 21, 2026 01:08
@dcookspi dcookspi force-pushed the index-4-indexed-repository-and-fbindex branch from 854446a to a58142a Compare March 25, 2026 01:13
@dcookspi dcookspi force-pushed the index-5-repo-cmds-config-and-cli-flags branch from b70cba2 to 4698954 Compare March 25, 2026 01:15
@dcookspi dcookspi force-pushed the index-4-indexed-repository-and-fbindex branch 2 times, most recently from b32582c to 1a43f64 Compare March 27, 2026 19:30
@dcookspi dcookspi force-pushed the index-5-repo-cmds-config-and-cli-flags branch 2 times, most recently from b8f7428 to 3871cc5 Compare March 27, 2026 23:28
Comment on lines +1070 to +1077
pub use_indexes: bool,

/// Do not get the package data from the repo index, always use
/// the repo instead. This only applies to non-destructive repo
/// operations. This option can be configured as the default in
/// spk's config file.
#[clap(long, conflicts_with = "use_indexes")]
pub no_indexes: bool,
Collaborator

Why have both? Can a "global" option to disable index use despite what an individual repo is configured to do exist at a higher level?

Collaborator Author

It lets a site (or user) enable indexes in the spk config file (so they are the default for all uses) and disable them for some command line runs, and vice versa: if a site (or user) disables indexes in the spk config file, this lets them be enabled for some command line runs.

We're likely to enable indexes in the config file, and probably use --no-indexes sometimes (as a workaround if there's an issue, or for testing something). But another site might prefer it the other way around for some reason.

Collaborator

I agree with the concept as you describe the usage pattern but still dislike these two options existing here at the same level in the configuration hierarchy.

We already have a configuration pattern of some config property that can be set in a config file but overridden with an env var (or possibly a command-line option). Having these two with opposite meanings creates confusion about which gets set where and which overrules the other.

This could be a case for replacing the one or two bool options with an enum:

  • an option that disables indexes globally and overrules any repo-specific setting
  • an option that delegates to repo-specific settings (default)
  • an option that enables indexes globally but doesn't overrule any repo-specific setting (we'd likely use this one in our config file)
  • (maybe) an option that enables indexes globally and overrules any repo-specific setting, but this one feels questionable

I'd be okay with having a flag like no_indexes that acts as an alias / shortcut for picking the option that globally disables indexes, but this wouldn't map to a setting that lives in the config file.
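The enum suggested above might look something like the sketch below. Everything here is hypothetical: the variant names, the `use_index` helper, and the representation of a repo-specific setting as `Option<bool>` are illustrative choices, not names from the PR. The sketch also assumes that, when delegating, a repo with no setting of its own does not use an index.

```rust
/// Hypothetical global index mode; variant names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Default)]
enum IndexMode {
    /// Disable indexes globally, overruling any repo-specific setting.
    DisableAll,
    /// Delegate to each repo's own setting (the default).
    #[default]
    PerRepo,
    /// Enable indexes globally, but let a repo-specific setting overrule this.
    EnableByDefault,
    /// Enable indexes globally, overruling any repo-specific setting
    /// (the option described above as questionable).
    EnableAll,
}

/// Resolve whether one repo should use its index.
/// `repo_setting` is `None` when the repo has no setting of its own.
fn use_index(mode: IndexMode, repo_setting: Option<bool>) -> bool {
    match mode {
        IndexMode::DisableAll => false,
        // Assumption: an unset repo setting means "no index" when delegating.
        IndexMode::PerRepo => repo_setting.unwrap_or(false),
        IndexMode::EnableByDefault => repo_setting.unwrap_or(true),
        IndexMode::EnableAll => true,
    }
}
```

Under this shape, a `no_indexes` shortcut flag would simply select `IndexMode::DisableAll`, so there is only one value to reason about when working out which setting overrules which.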

Collaborator Author

As per our discussion today, I've updated the spk config file structure to remove the global index settings and have them on each spk repository section, replaced the command line options with a single flag with an enum of values, and changed the defaults to use an index if one exists, except for the local repo.

@dcookspi dcookspi force-pushed the index-4-indexed-repository-and-fbindex branch 3 times, most recently from 96ad9df to 31d99fa Compare April 1, 2026 16:26
@dcookspi dcookspi force-pushed the index-5-repo-cmds-config-and-cli-flags branch from f61e97e to 83c8d12 Compare April 1, 2026 19:08
@dcookspi dcookspi requested a review from jrray April 2, 2026 01:18
@dcookspi dcookspi force-pushed the index-5-repo-cmds-config-and-cli-flags branch from 83c8d12 to 3dcf4e3 Compare April 2, 2026 01:18
Collaborator Author

dcookspi commented Apr 2, 2026

Todo:

  • Following on from today's meeting, move the index subdirectories' location and creation down into spfs


To update an existing index, e.g. after a new `python` package was
published, run:
`spk repo index --disable-repo local --update python`
Collaborator

A thought on usage: rather than being expected to remember to add --disable-repo local (because something bad would happen if not??), how about we make this index sub-command require specifying the repo to operate on, and error if one isn't specified?

spk repo index -r origin [--update python]

Collaborator Author

@dcookspi dcookspi Apr 9, 2026

The usage followed the existing spk repo command's subcommand, but I agree it is cumbersome.

  • Update the usage of the spk repo index command to match this

Collaborator Author

I've changed it to only take a -r <reponame> option and not all the other repo flags.

includes data that isn't read by spk anymore),


#### Removing an old field
Collaborator

I'm not sure the section above is useful, but removing a field, or no longer populating an existing field, is a reason to bump the version number.

Collaborator Author

I've added another section on no longer populating an existing field. I felt I needed to make it clearer that this isn't the same as removing a field, and to spell out its implications for reading that field in older spks. I've left the section on no longer reading a field in as a result, and for completeness. Let me know if it's too verbose or still too much.

@dcookspi dcookspi force-pushed the index-4-indexed-repository-and-fbindex branch from 34d1bbd to e5ee8b6 Compare April 8, 2026 00:37
dcookspi added 4 commits April 7, 2026 17:38
Adds --use-indexes and --no-indexes flags to repository.
Updates resolvo solver to get global variables data from an indexed repository.

Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
@dcookspi dcookspi force-pushed the index-5-repo-cmds-config-and-cli-flags branch 2 times, most recently from 8f7519a to b0439ac Compare April 8, 2026 18:24
dcookspi added 7 commits April 9, 2026 09:49
Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
…ndles.

Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
…dex use.

Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
… option

Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
@dcookspi dcookspi force-pushed the index-5-repo-cmds-config-and-cli-flags branch from 39f3a57 to 50c68f1 Compare April 9, 2026 18:17
times, and fixes bugs when using it to update a specific package
version.

Signed-off-by: David Gilligan-Cook <dcook@imageworks.com>
Collaborator Author

Updated the spk repo index --update ... option so it can be specified multiple times, and fixed a couple of bugs related to updating specific package/versions, or a deleted package/version, in the index.
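The argument shape the command ended up with (a single -r <reponame> plus a repeatable --update) can be illustrated with a small std-only parser. This is only a sketch: the diff in this PR uses clap, so the hand-rolled parsing, the `IndexArgs` struct, and `parse_index_args` are hypothetical, and treating a missing -r as an error follows the reviewer's suggestion rather than anything confirmed in the thread.

```rust
/// Hypothetical parsed form of `spk repo index -r <repo> [--update <pkg>]...`.
#[derive(Debug, PartialEq)]
struct IndexArgs {
    /// Name of the repo whose index is generated or updated.
    repo: String,
    /// Packages to update in an existing index; empty means a full generation.
    updates: Vec<String>,
}

/// Parse the arguments that follow `spk repo index`.
/// Assumption: -r is mandatory and --update may be repeated.
fn parse_index_args(args: &[&str]) -> Result<IndexArgs, String> {
    let mut repo: Option<String> = None;
    let mut updates = Vec::new();
    let mut iter = args.iter().copied();
    while let Some(arg) = iter.next() {
        match arg {
            "-r" => {
                let name = iter.next().ok_or("-r requires a repo name")?;
                repo = Some(name.to_string());
            }
            "--update" => {
                let pkg = iter.next().ok_or("--update requires a package")?;
                updates.push(pkg.to_string());
            }
            other => return Err(format!("unexpected argument: {other}")),
        }
    }
    // Error out instead of silently picking a default repo.
    let repo = repo.ok_or_else(|| "a repo must be specified with -r".to_string())?;
    Ok(IndexArgs { repo, updates })
}
```

Calling `parse_index_args(&["-r", "origin", "--update", "python", "--update", "boost"])` yields the repo `origin` with both packages queued for update, while omitting -r returns an error.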

@dcookspi dcookspi requested a review from jrray April 10, 2026 00:54