Conversation
| // Print muted and failed tests | ||
| mutedTests := runResult.MutedTests() | ||
| if len(mutedTests) > 0 { | ||
| if statistics.Total > 0 { |
There was a problem hiding this comment.
printReport function is now being called regardless the error and run result. But the statistics will only be printed if we get actual test results recorded in the result object.
Only muted tests failure is ignored, and all other errors will bubbled up
It expects any errors from the command to be returned
f4f39ca to
94b9e8d
Compare
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 814a12994c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
internal/command/run.go
Outdated
| // If the run result status is Passed but there are failed muted tests, | ||
| // the failures were suppressed due to muting. | ||
| // Therefore, we should ignore the error to prevent the build from failing. | ||
| if runResult.Status() == runner.RunStatusPassed && len(runResult.FailedMutedTests()) > 0 { |
There was a problem hiding this comment.
I wonder if it's worth mentioning that the pytest runner will capture non standard codes, early exits without a RunStatusUnknown so non-standard error codes won't be swallowed by this line.
I asked an LLM if it's possible for runner to produce a report while still reporting a non-standard exit code:
Yes, this is possible, and the current code would silently swallow that error. Here's how:
Consider a runner like pytest. Pytest has well-defined exit codes (https://docs.pytest.org/en/7.1.x/reference/exit-codes.html):
- 1 = tests failed
- 2 = user interrupted
- 3 = internal error
- 4 = usage error
- 5 = no tests collected
If pytest exits with code 2 (interrupted) but happened to flush a partial report before being interrupted, and that report only contains muted test failures, then:
1. runResult.Status() would be RunStatusPassed (failures are muted)
2. len(runResult.FailedMutedTests()) > 0 would be true
3. The function returns nil -- build passes
The interruption error (exit code 2) is swallowed.
But the LLM also said the individual runner (pytest) callers handles non standard error codes and doesn't return a report.
However, looking at the actual runner implementations, pytest and pytest_pants have an early return for non-1 exit codes:
if exitError := new(exec.ExitError); errors.As(cmdErr, &exitError) && exitError.ExitCode() != 1 {
return cmdErr
}
This means pytest/pytest_pants would bail before even parsing the report for codes != 1, so runResult would have no tests recorded, and Status() would be RunStatusUnknown, not RunStatusPassed. The muted-test suppression wouldn't trigger. So pytest is safe.
Other runners don't have this guard apparently.
But the other runners (rspec, jest, cucumber, playwright) don't have this guard. They always attempt to parse the report regardless of exit code. So if, say, rspec exits with code 7 (which is a non-standard rspec exit (https://github.com/rspec/rspec-core/issues/2956)) but produces a valid JSON report where the only failures are muted tests, this code would suppress that exit code 7 error.
In practice, exit code 7 in rspec is unlikely to produce a clean report. And most "non-standard" exit codes (signal deaths, OOM kills) would either not produce a report at all, or produce an unparseable one (triggering the parseErr path which returns cmdErr). But it's not impossible.
The gotest runner has the same early-return guard as pytest for non-1 exit codes, so it's also safe.
So the realistic risk is narrow but exists for rspec/jest/cucumber/playwright. A possible improvement would be to add a check at the caller level:
if runResult.Status() == runner.RunStatusPassed && len(runResult.FailedMutedTests()) > 0 {
// Only suppress known test-failure exit codes (typically 1)
if exitError := new(exec.ExitError); errors.As(runErr, &exitError) && exitError.ExitCode() == 1 {
return nil
}
}
This would only suppress the error when the exit code is exactly 1 (the conventional "tests failed" code), letting non-standard exit codes propagate even when muted tests are present.
There was a problem hiding this comment.
Yes, I agree with everything LLM said. There still a risk of swallowing the error for other test runners, because there is no foolprof way to distinguish test failure and other failures (some runners exit with 1 for non test failure).
So the realistic risk is narrow but exists for rspec/jest/cucumber/playwright. A possible improvement would be to add a check at the caller level
Agree that we can narrow down the risk by checking the exit code, and only swallow exit 1.
internal/command/run.go
Outdated
| // Therefore, we should ignore the error to prevent the build from failing. | ||
| if runResult.Status() == runner.RunStatusPassed && len(runResult.FailedMutedTests()) > 0 { | ||
| return nil | ||
| } |
There was a problem hiding this comment.
I wonder if it's safer to not suppress signal errors? Should we move the ProcessSignaledError handling before this block so we don't miss non-1 errors?
LLM:
A normal test-failure exit code 1 would not match ProcessSignaledError
There was a problem hiding this comment.
Yeah good point. We should handle signal error first before suppressing the muted test failure.
gchan
left a comment
There was a problem hiding this comment.
Latest commit addresses my questions and it more tightly only swallows exit code 1. I can approve if it's blocking but would be great if another pair of eyes worked through the signal handling logic!
malclocke
left a comment
There was a problem hiding this comment.
I have a few points to discuss, but a great change overall 👍
| if runResult.Status() == runner.RunStatusFailed || runResult.Status() == runner.RunStatusError { | ||
| os.Exit(1) | ||
| if exitError := new(exec.ExitError); errors.As(runErr, &exitError) { | ||
| if exitError.ExitCode() == 1 && runResult.Status() == runner.RunStatusPassed && len(runResult.FailedMutedTests()) > 0 { |
There was a problem hiding this comment.
nit We could implement something like this on RunResult
func (r *RunResult) onlyMutedFailures() bool {
...
}The check could be a little more defensive too, like len(runResult.FailedMutedTests()) == len(runResult.FailedTestsIncludingMuted())
| case runner.RunStatusPassed: | ||
| fmt.Println("✅ All tests passed.") | ||
| if len(runResult.FailedMutedTests()) > 0 { | ||
| fmt.Println("✅ Build passed. Some muted tests failed.") |
| // execute tests | ||
| var timeline []api.Timeline | ||
| runResult, err := runTestsWithRetry(testRunner, &thisNodeTask.Tests, cfg.MaxRetries, testPlan.MutedTests, &timeline, cfg.RetryForMutedTest, cfg.FailOnNoTests) | ||
| runResult, runErr := runTestsWithRetry(testRunner, &thisNodeTask.Tests, cfg.MaxRetries, testPlan.MutedTests, &timeline, cfg.RetryForMutedTest, cfg.FailOnNoTests) |
There was a problem hiding this comment.
question there are quite a few of these s/err/xyzErr/ in this PR, is this a gofumpt rule or something else?
| fmt.Printf("Buildkite Test Engine Client: Failed to read gotestsum output, tests will not be retried: %v\n", parseErr) | ||
| // We don't want to fail the build if we fail to parse the report, | ||
| // therefore we return the command error (which can be nil), instead of the parse error. | ||
| return cmdErr |
There was a problem hiding this comment.
note this is a change of behaviour, parse error was returned previously.
| } | ||
|
|
||
| if len(r.tests) == 0 { | ||
| if len(r.tests) == 0 || r.unknownResultTestsCount() > 0 { |
There was a problem hiding this comment.
thought I noticed while reviewing that this method "fails open", e.g. the fall through case is return RunResultPassed on line 152.
Maybe we should tackle that at the same time and make an explicit case for RunResultPassed?
Problem
There were recurring bugs where
bktecreports a build as passed even though the test runner exited with a non-zero code. This is dangerous becausebktecshould exits with a non-zero if there are any failures, so customers can take action on it.We've attempted to fix the issues before, for example:
rspecexits with non-zero but no failure #436 - Return error when rspec exits with non-zero but no failureThose fixes addressed specific scenarios for individual runners but didn’t address the underlying problem.
The root cause is that
bktecswallows the runner's exit error after parsing the report. This was originally done to avoid failing the build when the only failures come from muted tests, but it has the side effect of hiding legitimate non-test failures.Changes
This PR reworks the error handling across all runners to consistently propagate the command error instead of swallowing it. Each runner now follows the same approach:
RunResultOn the caller side (the
mainpackage), if the run result is Passed but there are failed muted tests, we can assume the error originates from those muted test failures. Therefore, we should ignore them to prevent the build from failing. Otherwise the error propagates and causes the build to fail.Test
I've tested locally against RSpec. We should test it in examples pipeline before releasing this.