Summary
The Code Coverage job in the CI GitHub Actions workflow (.github/workflows/ci.yaml, the coverage: job, step Run cargo-tarpaulin) is failing on main. The Run cargo-tarpaulin step has timeout-minutes: 90, and recent runs hit that cap: the job runs ~1h30m and then errors without producing or uploading coverage XML.
This has been failing across multiple recent main commits, so it is not specific to any single PR. (It is unrelated to the recent cargo-deny failure, which was already fixed in 113204c.)
Affected runs (both failure on main)
61c1cb85, "engine: infer and check macro unit polymorphism (#619) (#621)": Code Coverage in 1h30m31s, X Run cargo-tarpaulin.
334c0e35, "engine: WebAssembly simulation backend with full VM parity (#620)": same ~1h30m failure.
All other CI jobs (build, frontend, pysimlin, simlin-serve smoke, Lint/cargo-deny) pass.
Why it matters
Masks real coverage signal: when the step times out, no coverage XML is produced, so the Upload Coverage step has nothing to send to Codecov. Coverage data on main goes stale.
Wastes CI minutes: every main push burns ~90 minutes of runner time on a job that always fails.
Erodes trust in CI: a perpetually-red job trains people to ignore the CI status.
Hypotheses (not yet confirmed)
cargo-tarpaulin's instrumentation may interact badly with the WASM build and/or the wasm-interpreter git dependency, inflating wall-clock time.
The existing inline comment in ci.yaml (just above the step) already documents prior history of this same job blowing past a 45-minute cap once the simlin-serve suite was added, which is why the cap was raised to 90 minutes. That budget now appears to be exhausted again.
Component(s) affected
CI / build: .github/workflows/ci.yaml (coverage: job, Run cargo-tarpaulin step around lines 104-150).
src/simlin-engine WASM-backend tests and/or the wasm-interpreter git dependency (from #620).
Possible approaches for resolution
The inline comment already names the intended next move: switch to source-based coverage via cargo tarpaulin --engine llvm (or cargo-llvm-cov) so coverage runs at native test speed instead of ptrace-instrumented speed.
Alternatively, exclude/limit the slow WASM-backend test(s) from the coverage run, or split coverage into faster shards.
How it was discovered
Identified while fixing an unrelated cargo-deny failure on main; noticed the Code Coverage job had been red across several recent main runs. Not yet investigated beyond gh run view output (no local reproduction of the tarpaulin run).
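For reference, the engine switch mentioned under the possible approaches might look roughly like the fragment below. This is only a sketch: the actual step in ci.yaml is not quoted in this issue, so the --workspace and --out Xml flags and the commented-out cargo-llvm-cov variant are assumptions about what the job needs, not the existing configuration, and any flags the current step already passes would have to be carried over.

```yaml
    # Hypothetical shape of the coverage step; not copied from the repository's ci.yaml.
    - name: Run cargo-tarpaulin
      timeout-minutes: 90
      # --engine llvm selects source-based (LLVM) instrumentation instead of ptrace.
      # --out Xml asks tarpaulin for a Cobertura-style XML report (cobertura.xml).
      run: cargo tarpaulin --engine llvm --workspace --out Xml

    # Alternative sketch using cargo-llvm-cov (assumes the tool is installed in an earlier step):
    # - name: Run cargo-llvm-cov
    #   timeout-minutes: 90
    #   run: cargo llvm-cov --workspace --cobertura --output-path cobertura.xml
```

Whether either variant actually brings the job back under the 90-minute cap is untested; it would need a trial run on a branch before the timeout is adjusted.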