optimistic sig share creation for managed keys #7831

Merged
AdoAdoAdo merged 33 commits into feat/testnet-fixes from optimistic-sig-share-creation
Jun 18, 2026

Conversation

@ssd04 ssd04 commented Apr 15, 2026

Contributor

Reasoning behind the pull request

  • When there is a competing block and we have to wait before sending the signatures for the next round, we should trigger signature share creation optimistically

Testing procedure

  • Standard system test

Pre-requisites

Based on the Contributing Guidelines the PR author and the reviewers must check the following requirements are met:

  • was the PR targeted to the correct branch?
  • if this is a larger feature that probably needs more than one PR, is there a feat branch created?
  • if this is a feat branch merging, do all satellite projects have a proper tag inside go.mod?

@ssd04 ssd04 self-assigned this Apr 15, 2026
return false
}

func (sr *subroundSignature) createSignaturesForManagedKeys(ctx context.Context) bool {

Contributor Author

these pre-checks are very similar to the ones in doSignatureJobForManagedKeys, but I would keep it like this for simplicity

sr.GetHeader().GetEpoch(),
pkBytes,
)
signatureShare, err := sr.SigningHandler().SignatureShare(uint16(idx))

Contributor Author

at the start of each subround there is a Reset of all consensus state, including SigningHandler, so there shouldn't be any index collision with the signatures from the previous round; the wait time is related to the proof arrival, not to the signatures

}
}

// Wait once for the entire node if competing block detected

Collaborator

shouldn't L125-128 be inside the previous if? they basically do similar loops as waitForCompetingBlockEarlyChecks

@ssd04 ssd04 Apr 15, 2026

Contributor Author

here there is also the check for single key, will check how to integrate them better

Contributor Author

refactored to include them in the same condition branch

sr.signatureThrottler.StartProcessing()
wg.Add(1)

go func(ctx context.Context, idx int, pk string) {

Collaborator

this starts a goroutine, but does not handle the context, so goroutines may be hanging if the process stops

Contributor Author

added context check in goroutine

if err != nil {
log.Debug("sendSignatureForManagedKey.CreateSignatureShareForPublicKey", "error", err.Error())
return false
// signature share not found (optimistic signature share creation was not triggered)

Collaborator

what if failure reason was different?

Contributor Author

I think we should try to create the signature share (if not already created) whatever the reason; previously we tried to create it directly each time, now there is an additional check
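
The reuse-or-create behaviour described in this reply can be sketched as follows. This is an illustration under assumptions: `shareStore`, `get`, `create` and `getOrCreate` are hypothetical stand-ins, not the SigningHandler API. The lookup is tried first, and any lookup error falls through to creating the share.

```go
package main

import (
	"errors"
	"fmt"
)

var errShareNotFound = errors.New("signature share not found")

// shareStore is a hypothetical stand-in for the signing handler's share storage.
type shareStore struct {
	shares  map[uint16][]byte
	created int // number of shares created on demand
}

// get returns a previously stored share, or an error if none exists.
func (s *shareStore) get(idx uint16) ([]byte, error) {
	share, ok := s.shares[idx]
	if !ok {
		return nil, errShareNotFound
	}
	return share, nil
}

// create builds a placeholder share and stores it.
func (s *shareStore) create(idx uint16) ([]byte, error) {
	share := []byte(fmt.Sprintf("share-%d", idx))
	s.shares[idx] = share
	s.created++
	return share, nil
}

// getOrCreate reuses an optimistically created share when one is available
// and falls back to creating it on any lookup error.
func (s *shareStore) getOrCreate(idx uint16) ([]byte, error) {
	share, err := s.get(idx)
	if err == nil {
		return share, nil
	}
	return s.create(idx)
}

func main() {
	store := &shareStore{shares: map[uint16][]byte{1: []byte("optimistic")}}

	a, _ := store.getOrCreate(1) // reused
	b, _ := store.getOrCreate(2) // created on demand
	fmt.Println(string(a), string(b), store.created) // prints: optimistic share-2 1
}
```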

sstanculeanu previously approved these changes May 5, 2026
codecov Bot commented May 19, 2026

Codecov Report

❌ Patch coverage is 76.14213% with 47 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.56%. Comparing base (6387f16) to head (58fa573).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| consensus/spos/bls/v2/subroundSignature.go | 75.67% | 13 Missing and 5 partials ⚠️ |
| consensus/spos/consensusState.go | 22.22% | 13 Missing and 1 partial ⚠️ |
| consensus/spos/bls/v2/subroundBlock.go | 86.56% | 6 Missing and 3 partials ⚠️ |
| factory/crypto/signingHandler.go | 62.50% | 5 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@                 Coverage Diff                  @@
##           feat/testnet-fixes    #7831    +/-   ##
====================================================
  Coverage               77.55%   77.56%            
====================================================
  Files                     884      885     +1     
  Lines                  125242   125400   +158     
====================================================
+ Hits                    97132    97261   +129     
- Misses                  21655    21677    +22     
- Partials                 6455     6462     +7     

Copilot AI left a comment

Contributor

Pull request overview

This PR introduces optimistic (early) creation of BLS signature shares for managed keys in SPoS v2, aiming to reduce latency when competing blocks force validators to delay sending signatures into the next round/subround. It also threads context.Context into signature-share creation to allow timeout/cancel behavior.

Changes:

  • Add context.Context to SigningHandler.CreateSignatureShareForPublicKey and propagate it through production code, simulator code, and tests.
  • Add optimistic signature-share creation triggered during the Block subround (on header send/receive), plus synchronization/cancel hooks via ConsensusState.
  • Update SPoS v2 Signature subround to wait for optimistic signature creation and reuse already-created shares when available.

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 7 comments.

Show a summary per file
| File | Description |
|---|---|
| testscommon/consensus/signingHandlerStub.go | Update stub to accept context.Context in signature-share creation. |
| testscommon/consensus/consensusStateMock.go | Extend mock with optimistic-signature waitgroup/cancel hooks. |
| node/chainSimulator/process/processor.go | Adapt simulator to pass a context to signature-share creation. |
| node/chainSimulator/process/processor_test.go | Update simulator unit test stub to accept context. |
| factory/crypto/signingHandler.go | Add context-aware signature-share creation with timeout checks. |
| factory/crypto/signingHandler_test.go | Update signing-handler tests to pass context. |
| factory/crypto/errors.go | Add timeout error for signature handling. |
| consensus/spos/worker.go | Cancel optimistic signature context on Extend(). |
| consensus/spos/interface.go | Extend ConsensusStateHandler with waitgroup/cancel API for optimistic signatures. |
| consensus/spos/consensusState.go | Add optimistic-signature waitgroup and cancel-func storage. |
| consensus/spos/bls/v2/subroundSignatureCompetingBlock_test.go | Update stub signature-share creation callback signature. |
| consensus/spos/bls/v2/subroundSignature.go | Wait for optimistic signature creation; reuse pre-created shares; refactor throttler check. |
| consensus/spos/bls/v2/subroundSignature_test.go | Update tests and add coverage for “reuse existing sig share” behavior. |
| consensus/spos/bls/v2/subroundBlock.go | Trigger optimistic signature-share creation on header send/receive; require throttler in constructor. |
| consensus/spos/bls/v2/subroundBlock_test.go | Update constructor call sites and add tests for optimistic signature triggering. |
| consensus/spos/bls/v2/export_test.go | Update exported test helpers for new ctx-aware APIs. |
| consensus/spos/bls/v2/common.go | Introduce shared throttler-wait helper. |
| consensus/spos/bls/v2/blsSubroundsFactory.go | Pass signature throttler into v2 Block subround creation. |
| consensus/spos/bls/v2/benchmark_verify_signatures_test.go | Update benchmark to pass context. |
| consensus/spos/bls/v2/benchmark_send_proof_test.go | Update benchmark to pass context. |
| consensus/spos/bls/v1/subroundSignature.go | Thread context into v1 signature-share creation as well. |
| consensus/spos/bls/v1/subroundSignature_test.go | Update v1 tests for context-aware signature-share creation. |
| consensus/interface.go | Update SigningHandler interface signature for context-aware signature-share creation. |

Comment on lines 146 to +154
if message == nil {
	return nil, ErrNilMessage
}

select {
case <-ctx.Done():
	return nil, ErrTimeIsOut
default:
}

Contributor Author

done

Comment on lines +220 to +241
func (sr *subroundSignature) waitForSingatures() {
	done := make(chan struct{})
	go func() {
		sr.SignaturesWaitGroup().Wait()
		close(done)
	}()

	timeLeft := sr.RoundHandler().RemainingTime(sr.RoundHandler().TimeStamp(), time.Duration(sr.EndTime()))

	select {
	case <-done:
		sr.SignaturesCtxCancel()
		return
	case <-time.After(timeLeft):
		log.Debug("timeout while waiting for signatures to be created")
		return
	}
}

func (sr *subroundSignature) doSignatureJobForManagedKeys(ctx context.Context) bool {
	// wait for optimistic signatures creation to finish
	sr.waitForSingatures()

Contributor Author

done

Comment on lines +366 to +382
sigCtx, cancel := context.WithTimeout(ctx, timeLeft)
sr.SetSignaturesCtxCancelFunc(cancel)

for idx, pk := range sr.ConsensusGroup() {
	pkBytes := []byte(pk)
	if !sr.IsKeyManagedBySelf(pkBytes) {
		continue
	}

	err := checkGoRoutinesThrottler(ctx, sr.signatureThrottler)
	if err != nil {
		log.Debug("triggerCreateSignaturesForManagedKeys.checkGoRoutinesThrottler", "err", err)
		return false
	}
	sr.signatureThrottler.StartProcessing()
	sr.SignaturesWaitGroup().Add(1)

Comment thread consensus/spos/bls/v2/subroundBlock.go Outdated
Comment on lines 714 to 718
sr.triggerCreateSignaturesForManagedKeys(context.Background())

ctx, cancel := context.WithTimeout(context.Background(), sr.RoundHandler().TimeDuration())
defer cancel()

Contributor Author

the other context is for process block and is cancelled afterwards; I think it's better to rely here only on the internal sig timeout context

Comment on lines +529 to +531
func (cns *ConsensusState) SignaturesWaitGroup() *sync.WaitGroup {
	return cns.signaturesWaitGroup
}

Contributor Author

this was done on another PR

Comment on lines +534 to +539
func (cns *ConsensusState) SetSignaturesCtxCancelFunc(cancelFunc context.CancelFunc) {
	cns.mutState.Lock()
	defer cns.mutState.Unlock()

	cns.signaturesTimeoutCtxCancel = cancelFunc
}

Contributor Author

done

Comment thread factory/crypto/signingHandler.go Outdated
Comment on lines 175 to 181
select {
case <-ctx.Done():
	return nil, ErrTimeIsOut
default:
}

sh.data.sigShares[index] = sigShareBytes

Contributor Author

done

Comment thread consensus/spos/bls/v2/subroundBlock.go Outdated
return
}
sr.signatureThrottler.StartProcessing()
sr.SignaturesWaitGroup().Add(1)

Contributor

you can read the wg before the loop and then reuse that one (defensive in case there is a reset of the subround mid-process, e.g. if we promote the trigger to run on a goroutine or if the timeout crosses the round boundary)

same for the header hash, epoch and header nil check (if the round boundary is crossed, these may be different if read in the routine).

Contributor Author

done
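
The suggestion in this thread (read the wait group and header data once, before the loop) can be illustrated with a small sketch. `roundState`, `snapshot` and `signWithSnapshot` are hypothetical names, not the PR's types; the point is that goroutines started for one round keep using that round's values even if the state is reset while they run.

```go
package main

import (
	"fmt"
	"sync"
)

// roundState is a hypothetical stand-in for consensus state that is reset every round.
type roundState struct {
	mut  sync.RWMutex
	hash string
	wg   *sync.WaitGroup
}

// reset replaces the per-round data, as a new round would.
func (rs *roundState) reset(hash string) {
	rs.mut.Lock()
	defer rs.mut.Unlock()
	rs.hash = hash
	rs.wg = &sync.WaitGroup{}
}

// snapshot returns the current hash and wait group under the lock.
func (rs *roundState) snapshot() (string, *sync.WaitGroup) {
	rs.mut.RLock()
	defer rs.mut.RUnlock()
	return rs.hash, rs.wg
}

// signWithSnapshot reads the hash and wait group once, before the loop, so a
// reset that happens mid-flight (simulated by midReset) cannot change what
// the already started goroutines see.
func signWithSnapshot(rs *roundState, keys []string, midReset func()) []string {
	hash, wg := rs.snapshot()
	results := make([]string, len(keys))

	for i, pk := range keys {
		wg.Add(1)
		go func(i int, pk string) {
			defer wg.Done()
			results[i] = pk + ":" + hash // uses the captured hash, not rs.hash
		}(i, pk)
	}

	midReset()
	wg.Wait() // waits on the captured wait group, not the replaced one
	return results
}

func main() {
	rs := &roundState{}
	rs.reset("hashA")

	res := signWithSnapshot(rs, []string{"pk1", "pk2"}, func() { rs.reset("hashB") })
	fmt.Println(res) // prints [pk1:hashA pk2:hashA]
}
```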

Comment thread consensus/spos/bls/v2/subroundBlock.go Outdated
continue
}

err := checkGoRoutinesThrottler(sigCtx, sr.signatureThrottler)

Contributor

this may block here when the number of keys in consensus managed by the node is greater than the number of routines allowed by the throttler.

the check will wait for the excess routines to finish before releasing access for the validation.

Contributor Author

triggered on a goroutine

// will try to create it
log.Debug("sendSignatureForManagedKey.SignatureShare: sig not already created, will try to create it", "error", err)

signatureShare, err = sr.SigningHandler().CreateSignatureShareForPublicKey(

Contributor

here we create the sig share if it is missing, but do we still have time to create it?

}

sh.data.sigShares[index] = sigShareBytes
// check again before setting signatures shares data

Contributor

here we already have the signature; bypassing its storage will not save a lot of time, so maybe store it.

Comment thread consensus/spos/bls/v2/subroundBlock.go Outdated
@@ -360,12 +360,21 @@ func (sr *subroundBlock) sendBlockHeader(
}

func (sr *subroundBlock) triggerCreateSignaturesForManagedKeys(ctx context.Context) {

Contributor

since we now trigger this on a goroutine, maybe pass the headerHash and headerHandler as arguments, so that they are not read inside the goroutine.

Contributor Author

done

currentHash := sr.GetData()
currentEpoch := sr.GetHeader().GetEpoch()

sigSubroundEndTime := time.Duration(float64(sr.RoundHandler().TimeDuration()) * srSignatureEndTime)

Contributor

also if triggered on a goroutine, maybe check that the round on the header is still the round in the RoundHandler, otherwise exit without triggering signing.

Contributor Author

done

Comment thread consensus/spos/bls/v2/subroundBlock.go Outdated
log.Debug("triggerCreateSignaturesForManagedKeys: context done", "timeLeft", timeLeft)
go func() {
for _, pk := range keys {
log.Info("aaaa")

Collaborator

remove comment

select {
case <-sigCtx.Done():
log.Debug("triggerCreateSignaturesForManagedKeys: context done", "timeLeft", timeLeft)
go func() {

Collaborator

is this needed?

Contributor Author

yes, that was the intention: to trigger asynchronously, because the goroutines throttler might block when its limit is reached

Comment thread consensus/spos/bls/v2/subroundBlock.go Outdated
return
default:
}
log.Info("bbb")

Collaborator

same here

Comment thread consensus/spos/bls/v2/subroundBlock.go Outdated
defer sr.signatureThrottler.EndProcessing()
defer wg.Done()

log.Info("ccc")

Collaborator

here too

Contributor Author

done


pkBytes := []byte(pk)
currentHash := sr.GetData()
log.Debug("step 1: multi keys signatures creation has been triggered", "num", len(keys))

Collaborator

perhaps call cancel()?

Collaborator

or call it on defer after creation

Contributor Author

the context is set to be cancelled in the signature subround, after waiting with a timeout; here we are just triggering the creation

default:
}

err := checkGoRoutinesThrottler(ctx, sr.signatureThrottler)


should be sigCtx here?

sstanculeanu previously approved these changes Jun 17, 2026
Comment thread consensus/spos/bls/v2/subroundBlock.go
@AdoAdoAdo AdoAdoAdo merged commit 850ffe1 into feat/testnet-fixes Jun 18, 2026
10 of 11 checks passed
@AdoAdoAdo AdoAdoAdo deleted the optimistic-sig-share-creation branch June 18, 2026 14:30
Labels

None yet

Projects

None yet

5 participants