cuprated: Add graceful shutdown, pt2: service error propagation #586
Open
redsh4de wants to merge 5 commits into
redsh4de
commented
May 25, 2026
What
Depends on pt1, PR #585
Building on part 1, this replaces the panic-on-error patterns in cuprated's services with error propagation that uses the graceful shutdown mechanism.
Why
Shutdown should be graceful: an error should not make the node panic and crash immediately. Before this change, internal errors still caused panics via `PANIC_CRITICAL_SERVICE_ERROR`.
Where
cuprated:
- `blockchain/error.rs` (new):
  - `BlockValidationError`: `HardFork(HardForkError)`, `Other(ExtendedConsensusError)`
  - `BlockManagerError`: `Validation(BlockValidationError)`, `Internal(#[from] tower::BoxError)`
  - `IncomingBlockError`: `Validation(BlockValidationError)`, `Internal(tower::BoxError)`, `Orphan`, `UnknownTransactions(_, _)`, `TooManyTxs`, `ChannelClosed`
- `txpool/error.rs` (new):
  - `TxValidationError`: `Parse(io::Error)`, `Consensus(ExtendedConsensusError)`, `DuplicateTransaction`, `RelayRule(RelayRuleError)`
  - `IncomingTxError`: `Validation(#[from] TxValidationError)`, `Internal(#[from] tower::BoxError)`
- `monitor.rs` - add `TaskExecutor::spawn_critical` and a `panic_message` helper.
- `logging.rs` - return the log guard and hold it until shutdown completes.
- `constants.rs` - rename `PANIC_CRITICAL_SERVICE_ERROR` to `CRITICAL_SERVICE_ERROR`.
- `lib.rs` - `Node::launch` returns `Result`; on init failure it cancels partially spawned subsystems before returning the error.
- `blockchain.rs` - `check_add_genesis` returns `Result`.
- `blockchain/manager.rs` - run loop returns `Result`, `spawn_critical` for syncer + manager.
- `blockchain/manager/handler.rs` - `.expect(...)` -> `?` throughout, handlers return `Result<_, BlockManagerError>` or `Result<_, tower::BoxError>`, `handle_command` routes `Internal` via `?` and `Validation` via the response channel. Invariant violations are kept as panics.
- `txpool/manager.rs` - uniform DB-error escalation: all 6 handlers propagate `TxPoolError` via `?`. Reorder `promote_tx` and `remove_tx_from_pool` to write to the DB first, then apply in memory.
- `txpool/incoming_tx.rs` - `IncomingTxHandler::init` returns `Result`.
- `tor.rs`, `p2p.rs` - `transport_*_config` / `initialize_clearnet_p2p` / `initialize_tor_p2p` return `Result`; Tor logs and skips on failure.
- `rpc/server.rs` - `init_rpc_servers` returns `Result`, per-server task uses `spawn`. If an RPC server fails to start, the error is logged but doesn't initiate shutdown.

cuprate-types:
- `TxConversionError` re-exported.

How
1. `spawn_critical` wraps each subsystem's future:
   - `Ok(())`
   - `Err(_)`
   - `panic_message` -> trigger shutdown
2. Added layered errors.
   Per subsystem: a typed `ValidationError` for peer-fault paths, plus a union with `Validation` and `Internal(tower::BoxError)` arms. `From` impls route (`ExtendedConsensusError::DBErr` -> `Internal`; else -> `Validation`). The manager matches between the two:
   - `Internal(_)` -> `?` -> `spawn_critical`
   - `Validation(_)`
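
For reviewers who want the two steps side by side, here is a rough, std-only sketch of the pattern rather than the code in this PR. It assumes plain threads and an `AtomicBool` in place of cuprated's async tasks and real shutdown mechanism. `ConsensusError`, `ManagerError`, `split` and the `Vec`-backed response channel are hypothetical stand-ins, and treating `Ok(())` as "no shutdown" is also an assumption.

```rust
use std::any::Any;
use std::panic::{self, AssertUnwindSafe};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

type BoxError = Box<dyn std::error::Error + Send + Sync>;

/// Hypothetical stand-in for `ExtendedConsensusError`.
#[derive(Debug)]
enum ConsensusError {
    DbErr(String),
    InvalidBlock(String),
}

/// Peer-fault error: reported back to the caller, never fatal.
#[derive(Debug, PartialEq)]
struct ValidationError(String);

/// Hypothetical union type with a peer-fault arm and an internal arm.
#[derive(Debug)]
enum ManagerError {
    Validation(ValidationError),
    Internal(BoxError),
}

impl From<ConsensusError> for ManagerError {
    fn from(e: ConsensusError) -> Self {
        match e {
            // Database failures are our fault: escalate.
            ConsensusError::DbErr(msg) => Self::Internal(msg.into()),
            // Everything else is treated as the peer's fault.
            ConsensusError::InvalidBlock(msg) => Self::Validation(ValidationError(msg)),
        }
    }
}

/// Separates the two arms so that only `Internal` flows through `?`.
fn split(r: Result<(), ManagerError>) -> Result<Result<(), ValidationError>, BoxError> {
    match r {
        Ok(()) => Ok(Ok(())),
        Err(ManagerError::Validation(e)) => Ok(Err(e)),
        Err(ManagerError::Internal(e)) => Err(e),
    }
}

/// `responses` stands in for the response channel.
fn handle_command(
    outcome: Result<(), ConsensusError>,
    responses: &mut Vec<Result<(), ValidationError>>,
) -> Result<(), BoxError> {
    let reply = split(outcome.map_err(ManagerError::from))?; // Internal exits here
    responses.push(reply); // Validation (or success) goes back to the caller
    Ok(())
}

/// Extracts a readable message from a panic payload.
fn panic_message(payload: &(dyn Any + Send)) -> String {
    if let Some(s) = payload.downcast_ref::<&str>() {
        (*s).to_string()
    } else if let Some(s) = payload.downcast_ref::<String>() {
        s.clone()
    } else {
        "unknown panic".to_string()
    }
}

/// Runs `task` on a thread; an `Err` or a panic sets the shutdown flag.
fn spawn_critical<F>(name: &'static str, shutdown: Arc<AtomicBool>, task: F) -> thread::JoinHandle<()>
where
    F: FnOnce() -> Result<(), BoxError> + Send + 'static,
{
    thread::spawn(move || match panic::catch_unwind(AssertUnwindSafe(task)) {
        // Assumption: a clean exit does not request shutdown.
        Ok(Ok(())) => {}
        Ok(Err(e)) => {
            eprintln!("{name}: critical service error: {e}");
            shutdown.store(true, Ordering::SeqCst);
        }
        Err(payload) => {
            eprintln!("{name}: panicked: {}", panic_message(&*payload));
            shutdown.store(true, Ordering::SeqCst);
        }
    })
}
```

In this sketch a validation failure never reaches `spawn_critical`: it is pushed onto the response list and the loop keeps going. Only an internal error leaves the handler through `?`, which lets the wrapper request shutdown in one place.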