chore: move flaky dependency audits out of lint job #227

Merged
caitlon merged 1 commit into develop from ci/isolate-dependency-audit on Jun 6, 2026

caitlon (Owner) commented Jun 6, 2026

Greptile Summary

This PR extracts the pip-audit and npm audit steps from the required lint job into a new standalone dependency-audit job, and wraps each audit command in a 3-attempt retry loop to absorb transient registry timeouts. The intent is to prevent a flaky network call from blocking the lint gate and therefore the downstream test jobs.

  • The new dependency-audit job is correctly decoupled from lint and from all test jobs, so a registry outage can no longer cascade into a CI blockage.
  • The retry logic does not differentiate between a transient network error and an actual vulnerability finding. Both cause a non-zero exit, so a real CVE triggers two unnecessary retries, each attempt emits a warning carrying the misleading "(network?)" hint, and the job ultimately fails under the generic "failed after 3 attempts" banner rather than the audit tool's own report.

Confidence Score: 4/5

Safe to merge — the change correctly isolates flaky network calls from the required lint gate, and the overall CI flow remains sound.

The restructuring is straightforward and achieves its goal. The retry loops treat a real vulnerability finding the same as a network error, causing redundant retries and a misleading failure message, but the job still fails correctly when a CVE is present.

The retry logic in the dependency-audit job steps (lines 82-111) is worth a second look regarding how failures are surfaced to the team.
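
To see how the failure surfaces, the loop shape from the workflow can be exercised locally with a stub in place of the real audit command. The sketch below is illustrative only: `stub_audit`, `run_audit_loop`, `SLEEP_SECONDS`, and the report line are hypothetical names and text introduced here, not part of the PR.

```shell
#!/usr/bin/env bash
# Reproduction sketch: same loop shape as the workflow step, with the real
# audit replaced by a hypothetical stub that always reports a finding.

# Delay between attempts; overridable so the loop can be exercised quickly.
SLEEP_SECONDS="${SLEEP_SECONDS:-15}"
calls=0

stub_audit() {
  calls=$((calls + 1))
  echo "1 moderate severity vulnerability"  # made-up report line
  return 1                                  # non-zero, as for a real finding
}

run_audit_loop() {
  local attempt
  for attempt in 1 2 3; do
    stub_audit && return 0
    # The loop cannot tell why the command failed, so it always blames the network.
    echo "::warning::audit attempt $attempt failed (network?); retrying in ${SLEEP_SECONDS}s"
    sleep "$SLEEP_SECONDS"
  done
  echo "::error::audit failed after 3 attempts"
  return 1
}
```

Calling `run_audit_loop` with `SLEEP_SECONDS=0` invokes the stub three times and prints three "(network?)" warnings before the final error, even though every attempt returned the same finding. With this loop shape the sleep also runs after the third attempt, so a full run waits three intervals rather than two.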

Important Files Changed

Filename: .github/workflows/ci.yml
Overview: Moves pip-audit and npm audit from the lint job into a new independent dependency-audit job with 3-attempt retry loops; the retry logic does not distinguish transient network failures from genuine vulnerability findings, which can produce misleading output but does not affect overall correctness.
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
.github/workflows/ci.yml:102-111
**Retry loop can't distinguish network errors from vulnerability findings**

Both `npm audit` and `pip-audit` exit non-zero for *two* distinct reasons: a transient network error and an actual vulnerability finding. The retry loop treats both identically, so when a real CVE is detected the job will silently retry it twice more (burning ~30 s) before finally failing with the misleading banner `npm audit failed after 3 attempts` instead of the audit's own vulnerability report. The `(network?)` hint in the warning will actively mislead anyone trying to diagnose a real security hit.

The same logic applies to the `pip-audit` retry block above (lines 82-89).

Reviews (1): Last reviewed commit: "ci: move flaky dependency audits out of ..."

Greptile also left 1 inline comment on this PR.

caitlon added this to the v1.1-polish milestone on Jun 6, 2026
caitlon added the type:tech-debt (Code quality improvement) and area:infra (CI/CD, Docker) labels on Jun 6, 2026
caitlon self-assigned this on Jun 6, 2026
Comment thread .github/workflows/ci.yml
Comment on lines +102 to +111
      - name: Audit npm dependencies
        working-directory: frontend
        run: |
          for attempt in 1 2 3; do
            npm audit --omit=dev --audit-level=moderate && exit 0
            echo "::warning::npm audit attempt $attempt failed (network?); retrying in 15s"
            sleep 15
          done
          echo "::error::npm audit failed after 3 attempts"
          exit 1

P1 Retry loop can't distinguish network errors from vulnerability findings

Both `npm audit` and `pip-audit` exit non-zero for two distinct reasons: a transient network error and an actual vulnerability finding. The retry loop treats both identically, so when a real CVE is detected the job will silently retry it twice more (burning ~30 s) before finally failing with the misleading banner `npm audit failed after 3 attempts` instead of the audit's own vulnerability report. The `(network?)` hint in the warning will actively mislead anyone trying to diagnose a real security hit.

The same logic applies to the `pip-audit` retry block above (lines 82-89).
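
One possible way to address this is to retry only when the output looks like a network failure and to fail immediately otherwise. The following is a minimal sketch under stated assumptions, not the fix adopted in this PR: the function name `retry_on_network_error`, the `RETRY_DELAY` variable, and the `NETWORK_PATTERN` list are all hypothetical. The pattern list in particular is an assumption about what the tools print on connectivity problems and should be checked against real logs. The sketch inspects output because it does not assume the exit code alone separates the two cases.

```shell
#!/usr/bin/env bash
# Sketch: retry an audit command only when its output carries a network error
# signature. The pattern list is an assumption, not an official or complete list.
NETWORK_PATTERN='ENOTFOUND|ETIMEDOUT|ECONNRESET|EAI_AGAIN|ConnectionError'

# Delay between attempts; overridable so the function can be exercised quickly.
RETRY_DELAY="${RETRY_DELAY:-15}"

retry_on_network_error() {
  local max_attempts=3 attempt output status
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    # Capture stdout and stderr; the && / || form keeps this safe under `set -e`.
    output="$("$@" 2>&1)" && status=0 || status=$?
    printf '%s\n' "$output"  # always replay the tool's own report
    if [ "$status" -eq 0 ]; then
      return 0
    fi
    # Non-zero exit without a network signature: treat it as a finding, fail now.
    if ! grep -Eq "$NETWORK_PATTERN" <<<"$output"; then
      echo "::error::audit exited with status $status and no network error signature; treating as a finding"
      return "$status"
    fi
    # Only wait when another attempt will actually follow.
    if [ "$attempt" -lt "$max_attempts" ]; then
      echo "::warning::audit attempt $attempt hit a network error; retrying in ${RETRY_DELAY}s"
      sleep "$RETRY_DELAY"
    fi
  done
  echo "::error::audit could not complete after $max_attempts attempts (network errors)"
  return 1
}
```

In the workflow step this would be called as `retry_on_network_error npm audit --omit=dev --audit-level=moderate`, with the function defined earlier in the same `run:` script. One caveat of matching on output: an advisory whose text happens to contain one of the patterns would be misclassified as a network error, so the list is best kept narrow.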

sonarqubecloud Bot commented Jun 6, 2026

caitlon merged commit 6071d3a into develop on Jun 6, 2026
9 checks passed
caitlon deleted the ci/isolate-dependency-audit branch on June 6, 2026 14:24