Skip to content

Improve timeout retry observability and test coverage#4467

Open
joshuacolvin0 wants to merge 1 commit intopmikolajczyk/nit-3341-timeout-errorsfrom
nit-3341-additions
Open

Improve timeout retry observability and test coverage#4467
joshuacolvin0 wants to merge 1 commit intopmikolajczyk/nit-3341-timeout-errorsfrom
nit-3341-additions

Conversation

@joshuacolvin0
Copy link
Member

  • Add unit tests for retry wrapper dual-counter logic (10 test cases)
  • Add log.Warn for non-timeout retries in retry wrapper
  • Replace duplicate timeout log in advanceValidations with log.Trace
  • Add comment documenting string-matching false-positive tradeoff
  • Assert proxy timeouts were consumed in system test

Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com

- Add unit tests for retry wrapper dual-counter logic (10 test cases)
- Add log.Warn for non-timeout retries in retry wrapper
- Replace duplicate timeout log in advanceValidations with log.Trace
- Add comment documenting string-matching false-positive tradeoff
- Assert proxy timeouts were consumed in system test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@codecov
Copy link

codecov bot commented Mar 5, 2026

Codecov Report

❌ Patch coverage is 0% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 29.21%. Comparing base (431659f) to head (b3dac72).

Additional details and impacted files
@@                           Coverage Diff                            @@
##           pmikolajczyk/nit-3341-timeout-errors    #4467      +/-   ##
========================================================================
- Coverage                                 32.89%   29.21%   -3.69%     
========================================================================
  Files                                       494      494              
  Lines                                     58331    58337       +6     
========================================================================
- Hits                                      19189    17044    -2145     
- Misses                                    35805    38202    +2397     
+ Partials                                   3337     3091     -246     

@github-actions
Copy link
Contributor

github-actions bot commented Mar 5, 2026

❌ 111 Tests Failed:

Tests completed Failed Passed Skipped
4276 111 4165 0
View the top 3 failed tests by shortest run time
TestL2BlockRangeForL1
Stack Traces | -0.000s run time
... [CONTENT TRUNCATED: Keeping last 20 lines]
TRACE[03-05|04:32:07.844] Pooled new future transaction            hash=399184..b60683                   from=0x57Ff0F473737a1c161bfF9efDF016F7991585088 to=&lt;nil&gt;
TRACE[03-05|04:32:07.845] Removing old queued transactions         count=0
TRACE[03-05|04:32:07.845] Removing unpayable queued transactions   count=0
DEBUG[03-05|04:32:07.845] Executing EVM call finished              runtime="175.909µs"
TRACE[03-05|04:32:07.845] Promoting queued transactions            count=1
DEBUG[03-05|04:32:07.845] Served eth_call                          reqid=1555 duration="966.921µs"
TRACE[03-05|04:32:07.845] Handled RPC response                     reqid=1549 duration="7.874µs"
TRACE[03-05|04:32:07.845] got new header from L1                   number=96 hash=c9e897..0708c1                   header="&amp;{ParentHash:0xf5d79700476b1be63f70e738f2eb44c72076c65fe341c6b7fcdda25a28c643d8 UncleHash:0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347 Coinbase:0x0000000000000000000000000000000000000000 Root:0x2cb1ed35d2fe5f504ce3b9c740b458091a67d87d0dfed2d646c00af550e3cdf0 TxHash:0x5bc5ff32c4b71246006599d52649d0d632d5616afcf76928674067a7fb0f6b8f ReceiptHash:0xf78dfb743fbd92ade140711c8bbc542b5e307f0ab7984eff35d751969fe57efa Bloom:[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] Difficulty:+0 Number:+96 GasLimit:16473376 GasUsed:21000 Time:1772685220 Extra:[216 131 1 16 8 132 103 101 116 104 136 103 111 49 46 50 53 46 55 133 108 105 110 117 120] MixDigest:0x8c7ef8a47656f55cd886feae29da2f9d0df2ac4515f7deffafc03c28c7879e88 Nonce:[0 0 0 0 0 0 0 0] BaseFee:+491178 WithdrawalsHash:0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421 BlobGasUsed:0xc08442cc48 ExcessBlobGas:0xc08442cc50 ParentBeaconRoot:0x0000000000000000000000000000000000000000000000000000000000000000 RequestsHash:0xe3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855}"
DEBUG[03-05|04:32:07.846] Executing EVM call finished              runtime="183.934µs"
DEBUG[03-05|04:32:07.846] Served eth_call                          reqid=847  duration="310.944µs"
DEBUG[03-05|04:32:07.846] Served eth_getBlockByNumber              reqid=1557 duration="200.955µs"
TRACE[03-05|04:32:07.846] Handled RPC response                     reqid=847  duration="1.022µs"
TRACE[03-05|04:32:07.846] Pooled new future transaction            hash=fd6680..67c0a6                   from=0xb386a74Dcab67b66F8AC07B4f08365d37495Dd23 to=0x4e7Fe50cDeC4515DCB7F1110782adC963E66E20b
WARN [03-05|04:32:07.846] Getting file info                        dir= error="stat : no such file or directory"
TRACE[03-05|04:32:07.846] Removing old queued transactions         count=0
TRACE[03-05|04:32:07.846] Removing unpayable queued transactions   count=0
TRACE[03-05|04:32:07.846] Handled RPC response                     reqid=1555 duration="1.372µs"
TRACE[03-05|04:32:07.846] Handled RPC response                     reqid=1557 duration=762ns
INFO [03-05|04:32:07.845] Chain head was updated                   number=2  hash=635415..154752                   root=2e6c5a..d7ac81 elapsed="261.489µs"
DEBUG[03-05|04:32:07.846] Executing EVM call finished              runtime="87.764µs"
TestBatchPosterWithDelayProofsAndBacklog
Stack Traces | -0.000s run time
=== RUN   TestBatchPosterWithDelayProofsAndBacklog
=== PAUSE TestBatchPosterWithDelayProofsAndBacklog
TestBlockValidatorSimpleOnchainUpgradeArbOs
Stack Traces | -0.000s run time
=== RUN   TestBlockValidatorSimpleOnchainUpgradeArbOs
=== PAUSE TestBlockValidatorSimpleOnchainUpgradeArbOs

📣 Thoughts on this report? Let Codecov know! | Powered by Codecov

Comment on lines +39 to +43
if r.err != nil {
promise.ProduceError(r.err)
} else {
promise.Produce(r.state)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if r.err != nil {
promise.ProduceError(r.err)
} else {
promise.Produce(r.state)
}
promise.ProduceResult(r.state, r.err)

Comment on lines +47 to +52
func (s *testSpawner) WasmModuleRoots() ([]common.Hash, error) { return nil, nil }
func (s *testSpawner) Start(context.Context) error { return nil }
func (s *testSpawner) Stop() {}
func (s *testSpawner) Name() string { return "test" }
func (s *testSpawner) StylusArchs() []rawdb.WasmTarget { return nil }
func (s *testSpawner) Capacity() int { return 1 }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func (s *testSpawner) WasmModuleRoots() ([]common.Hash, error) { return nil, nil }
func (s *testSpawner) Start(context.Context) error { return nil }
func (s *testSpawner) Stop() {}
func (s *testSpawner) Name() string { return "test" }
func (s *testSpawner) StylusArchs() []rawdb.WasmTarget { return nil }
func (s *testSpawner) Capacity() int { return 1 }
func (s *testSpawner) WasmModuleRoots() ([]common.Hash, error) { return nil, nil }
func (s *testSpawner) Start(context.Context) error { return nil }
func (s *testSpawner) Stop() {}
func (s *testSpawner) Name() string { return "test" }
func (s *testSpawner) StylusArchs() []rawdb.WasmTarget { return nil }
func (s *testSpawner) Capacity() int { return 1 }

{err: errTimeout},
})
_, err := launchAndAwait(t, wrapper, 1, 2)
if !errors.Is(err, errTimeout) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if !errors.Is(err, errTimeout) {
if !validator.IsTimeoutError(err) {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we want to use this utility here and in the other assertions

Comment on lines +63 to +72
ctx := context.Background()
run := wrapper.LaunchWithNAllowedAttempts(nil, testRoot, allowedAttempts, allowedTimeouts)
return run.Await(ctx)
}

func setupWrapper(t *testing.T, results []launchResult) *ValidationSpawnerRetryWrapper {
t.Helper()
spawner := &testSpawner{results: results}
wrapper := NewValidationSpawnerRetryWrapper(spawner)
wrapper.StopWaiter.Start(context.Background(), wrapper)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering - shouldn't we use a common parent context and derive from it?

@@ -59,6 +59,12 @@ func (v *ValidationSpawnerRetryWrapper) LaunchWithNAllowedAttempts(entry *valida
if nonTimeoutAttempts >= allowedAttempts {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have a strong comparison here - if we allow 3 attempts, then the 3rd failure shouldn't cause an error

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants