Adding dependency graph submission workflow#3502
Conversation
epugh
left a comment
There was a problem hiding this comment.
LGTM. I occasionally have angst about using all the Github features, cause someday they will turn obviously evil and then we'll be locked in so tight with them...
Also, I love how clear your issue descriptions are!
|
For now I’m not removing anything like the OWASP task. My main intention is to make this more visible, and I think GitHub is the perfect place for that. The UI makes it easier (and honestly cooler) for people to actually pay attention to CVE info, which usually feels pretty boring. |
| uses: gradle/actions/dependency-submission@v4 | ||
| with: | ||
| build-scan-publish: true | ||
| develocity-access-key: ${{ secrets.SOLR_DEVELOCITY_ACCESS_KEY }} |
There was a problem hiding this comment.
do we really want Develocity integration for this workflow?
There was a problem hiding this comment.
Not really actually. I'll remove it.
| DEPENDENCY_GRAPH_INCLUDE_CONFIGURATIONS: "(?i)(^|:)(compileClasspath|runtimeClasspath|testCompileClasspath|testRuntimeClasspath)$" | ||
| DEPENDENCY_GRAPH_EXCLUDE_CONFIGURATIONS: "(?i)(^|:)(classpath|.*PluginClasspath|kotlinCompilerClasspath|kaptClasspath|annotationProcessor|detachedConfiguration.*)$" | ||
| DEPENDENCY_GRAPH_RUNTIME_INCLUDE_CONFIGURATIONS: "(?i)(^|:)runtimeClasspath$" | ||
| DEPENDENCY_GRAPH_RUNTIME_EXCLUDE_CONFIGURATIONS: "(?i)(^|:)testRuntimeClasspath$" |
There was a problem hiding this comment.
how did you come to choose this config?
Short answer: ChatGPT. Long answer:
Cleaner Dependency Graph/SBOM and more relevant Dependabot alerts—JUnit/Mockito/tooling CVEs won’t show up as runtime risks. We can filter by scope and fix runtime issues first. Please review https://github.com/gradle/github-dependency-graph-gradle-plugin |
malliaridis
left a comment
There was a problem hiding this comment.
I spent some time in the past with the dependency graph and I looked into the build.gradle file of the solr:packaging module and the package.gradle to figure out what is shipped and what not. So I was considering of adding our custom configurations instead of the typical runtimeClasspath, so I used only packaging configuration and was thinking of adding packaging, runtimeLibs and / or solrPlatformLibs.
But I wasn't sure if that was correct thinking. Are those covered by the configurations you already include?
|
Regarding packaging, runtimeLibs, and/or solrPlatformLibs: these custom configurations mainly exist to “do the math” of wiring things together. For example, solrPlatformLibs is defined to pull in several Solr modules: Similarly, runtimeLibs just extends from runtimeElements: So if I look at the current config for the dependency graph, it’s basically a superset — everything required to develop or run the project across all these places: |
Looks great, thank you for sorting that out! 💯 One of the reasons I had dismissed the Gradle Dependency Submission action in my investigations was precisely the large number of non-runtime dependencies it was reporting. As you’ve shown, I just hadn’t looked deep enough into the plugin’s configuration options. My approach was heading in a different direction:
The motivation for focusing on the
That’s why, at least initially, I would prefer to limit submissions to dependencies that are actually shipped, ignoring the broader I didn’t finish mapping all configurations that represent shipped artifacts, but I did confirm that // === Custom configurations used to assemble the Solr server binary distribution ===
// 1. Jetty Bootstrap JAR
// Output Path: server/start.jar
startJar
// 2. Server Libraries
// Output Path: server/lib/
// Description: Core runtime libraries (mostly Jetty-related JARs).
serverLib
// 3. Extended Server Libraries
// Output Path: server/lib/ext/
// Description: Optional runtime libs like logging (SLF4J/Log4j) and metrics (Dropwizard, etc.).
libExt
// 4. Solr Core JAR
// Output Path: server/solr-webapp/webapp/WEB-INF/lib/
solrCore
// 5. Solr Web Application Libraries
// Output Path: server/solr-webapp/webapp/WEB-INF/lib/
webappSo far I’ve only analyzed the DEPENDENCY_GRAPH_INCLUDE_CONFIGURATIONS="startJar|serverLib|libExt|solrCore|webapp"I'll look at the remaining components of the distribution archive this week. PS: I’ve also looked into Dependabot’s support for updating Gradle lock files. While basic support has been available since June, there are still unresolved issues when a project uses both lock files and version catalogs (see dependabot/dependabot-core#12557). |
|
With the configuration you shared, the total dependency count is down to 138 (from 518 under the Maven ecosystem). Dependabot alerts have also dropped to 4 from 21. Does this reduction also cover Solr modules? That said, I’m only seeing 1 high and 1 moderate vulnerability, whereas you mentioned there should be at least half a dozen—so we might still be missing some then. I should also try to generate SBOM on tar.gz to see the difference against the one which we can export from the Github. |
No, the configuration I provided only covers the
So in total, we should expect roughly 488 JARs across all shipped components.
I may have overstated it a little, but there are at least these open issues:
I also have a script ready that can automatically submit these to JIRA. I’ve held off running it so far, because I want to first clean up the dependency graph in my fork, otherwise I will be creating JIRA issues for irrelevant build-time dependencies. |
|
The one I shared earlier comes out to about 518, so not too far off from the ~488 you counted. It’s just using the usual Gradle configs (compile, runtime, etc.), so fine as a starting point. Our main concern is really the high/moderate vulnerabilities, and that count isn’t too high. This setup doesn’t add much noise—maybe a bit in the dependency graph—but from Dependabot’s side we can always scope things based on runtime vs. development. |
ppkarwasz
left a comment
There was a problem hiding this comment.
Hi @iamsanjay,
I ran some additional checks to compare the dependencies reported as Runtime in the GitHub Dependency Graph against the actual Solr binary distribution, and the results look very promising:
-
The Apache Solr 9.9.0 binary distribution ships with 421 third-party JARs, plus 24 Solr JARs, 23 duplicates, and one example JAR: 489 JARs in total. The full list is available in this gist.
-
With the configuration from this PR (plus
startJar), all of those dependencies are reported asRuntimedependencies to GitHub, which aligns nicely with what is actually shipped. -
A few dependencies are reported as
Runtimebut are not included in the binary distribution:com.carrotsearch.randomizedtesting:randomizedtesting-runner:2.8.1 com.squareup.okio:okio:3.6.0 junit:junit:4.13.2 org.apache.lucene:lucene-test-framework:9.12.2 org.hamcrest:hamcrest:3.0 org.hamcrest:hamcrest-core:3.0 org.jctools:jctools-core:4.0.5 org.openjdk.jmh:jmh-core:1.37 org.quicktheories:quicktheories:0.26These are not as important as the shipped JARs, but won't generate too much CVE noise in practice.
-
The distribution also contains 9 shaded artifacts. Since GitHub won’t warn us about CVEs in shaded dependencies, we’ll probably want to keep these manually upgraded to the latest available versions:
com.fasterxml.woodstox:woodstox-core:7.0.0 com.ibm.icu:icu4j:74.2 io.netty:netty-common:4.1.114.Final io.opentelemetry:opentelemetry-sdk-trace:1.40.0 org.apache.curator:curator-client:5.7.0 org.apache.hadoop:hadoop-client-api:3.4.0 org.apache.hadoop:hadoop-client-runtime:3.4.0 org.apache.hadoop.thirdparty:hadoop-shaded-guava:1.2.0 org.tallison.xmp:xmpcore-shaded:6.1.10 -
One interesting observation:
log4j-coreis included in the reported dependencies. Normally, artifacts published to Maven Central should not expose logging implementations (see ZOOKEEPER-4820), but in this case it’s being pulled in bysolr-prometheus-exporter. I’ll open a follow-up issue + PR to address that. Once it’s removed fromruntimeClasspath, we’ll also need to update this workflow.
Overall, this configuration looks like a solid match for what we distribute. Nice work! 💯
| DEPENDENCY_GRAPH_INCLUDE_CONFIGURATIONS: "(?i)(^|:)(compileClasspath|runtimeClasspath|testCompileClasspath|testRuntimeClasspath)$" | ||
| DEPENDENCY_GRAPH_EXCLUDE_CONFIGURATIONS: "(?i)(^|:)(classpath|.*PluginClasspath|kotlinCompilerClasspath|kaptClasspath|annotationProcessor|detachedConfiguration.*)$" | ||
| DEPENDENCY_GRAPH_RUNTIME_INCLUDE_CONFIGURATIONS: "(?i)(^|:)runtimeClasspath$" | ||
| DEPENDENCY_GRAPH_RUNTIME_EXCLUDE_CONFIGURATIONS: "(?i)(^|:)testRuntimeClasspath$" |
There was a problem hiding this comment.
I think this workflow could be simplified and clarified a bit:
- Since we’re already explicitly listing the configurations we care about, we don’t really need an additional exclude list.
- From what I understand, the regex is evaluated against the configuration name itself (not prefixed with the module name), so the
(^|:)and$anchors aren’t necessary. Empirical tests confirm this. - The
startJarconfiguration isn’t currently included, so we should add it to the list. - We should also exclude the
:solr:test-frameworkproject from the runtime dependencies to avoid pulling it in unnecessarily.
| DEPENDENCY_GRAPH_INCLUDE_CONFIGURATIONS: "(?i)(^|:)(compileClasspath|runtimeClasspath|testCompileClasspath|testRuntimeClasspath)$" | |
| DEPENDENCY_GRAPH_EXCLUDE_CONFIGURATIONS: "(?i)(^|:)(classpath|.*PluginClasspath|kotlinCompilerClasspath|kaptClasspath|annotationProcessor|detachedConfiguration.*)$" | |
| DEPENDENCY_GRAPH_RUNTIME_INCLUDE_CONFIGURATIONS: "(?i)(^|:)runtimeClasspath$" | |
| DEPENDENCY_GRAPH_RUNTIME_EXCLUDE_CONFIGURATIONS: "(?i)(^|:)testRuntimeClasspath$" | |
| DEPENDENCY_GRAPH_INCLUDE_CONFIGURATIONS: "(?i)(test)?(compile|runtime)classpath|startJar" | |
| DEPENDENCY_GRAPH_RUNTIME_EXCLUDE_PROJECTS: ":solr:test-framework" | |
| DEPENDENCY_GRAPH_RUNTIME_INCLUDE_CONFIGURATIONS: "runtimeClasspath|startJar" |
|
Hey @ppkarwasz, thanks so much for looking into this. This is exactly what we needed to cross-check, and it looks like you’ve captured all the finer details. But I have one question currently I am only running for main, Should I include branch_9x. wdyt? |
It really depends on what you want to achieve.
|
Inspired by: Lucene issue #15114
This adds GitHub’s dependency graph setup so it’s easier for us to keep an eye on dependencies.
Submits a full dependency graph (including transitives) that shows up under Insights → Dependency graph, so we can quickly check what we’re pulling in.
[Future PR] Runs dependency review on each PR, comparing against main and flagging any unexpected changes.
Hooks in Dependabot alerts so we get notified about High/Critical CVEs right away.
Overall, this makes troubleshooting, checking vulnerability status, and acting on alerts a lot smoother within the GitHub ecosystem. There’s plenty more potential here as we explore the feature further and as GitHub keeps expanding its toolkit.