api: estimate counts by default, cache /blockchain burn totals #144
Open
NayiemW wants to merge 1 commit into
estimateTotalCount in defaults previously evaluated to false when CORE_API_ESTIMATED_TOTAL_COUNT was unset, which routed every list endpoint through a real COUNT(*) inside a REPEATABLE READ transaction on the api connection. statement_timeout (3s) kills COUNT(*) on a mainnet-sized blocks or transactions table; the catch returns totalCount: 0 with an empty data array under a 200, so clients see silently empty results. Flip the default to true and treat the env var as an explicit opt-out (=false).

BlockchainController.index calls two unbounded SUMs over transactions per request, each ~30s on mainnet. Under explorer polling these pile up holding api connections (load average 8+ observed under load). Cache the result in-process with a 10-minute TTL and a single-flight background refresh. The values only change on blocks carrying burned fees or burn transactions, and the endpoint's main consumers are dashboards where that staleness is fine.
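The opt-out semantics for `CORE_API_ESTIMATED_TOTAL_COUNT` described in the commit message could be sketched roughly as follows. This is an illustration only: `resolveEstimateTotalCount` and the `Env` type are hypothetical names, and the actual expression in `packages/api/src/defaults.ts` may differ (for example in how it treats casing or whitespace).

```typescript
// Hypothetical helper, not the PR's code: models "estimate by default,
// opt out only with an explicit =false".
type Env = Record<string, string | undefined>;

function resolveEstimateTotalCount(env: Env): boolean {
    const raw = env["CORE_API_ESTIMATED_TOTAL_COUNT"];
    // Unset, empty, or any value other than the literal "false" keeps estimates on.
    return raw !== "false";
}
```

In a Node process this would be called as `resolveEstimateTotalCount(process.env)`. Taking the environment as a parameter is a choice made here to keep the sketch easy to test; it is not something the PR states.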
Two performance fixes for the public API that surface on long-running mainnet nodes. Both came out of debugging why `api.solar.org` was returning empty lists and 500s under explorer poll load.

## 1. Default `estimateTotalCount` to `true`

`packages/api/src/defaults.ts` had:

With the env var unset (the case on most installs) this evaluates to `false`, which means `AbstractRepository.listByExpression` skips the EXPLAIN-based estimate and runs a real `COUNT(*)` inside a `REPEATABLE READ` transaction on the `api` connection. The api connection has `statement_timeout = apiConnectionTimeout` (3000ms by default). `SELECT COUNT(*) FROM blocks` on a 16M-row mainnet table takes ~30s, so it always times out, hits the outer catch in `listByExpression`, and returns:

The controller wraps that into a 200 response with `totalCount: 0` and an empty `data` array. Every list endpoint (`/blocks`, `/transactions`, `/blocks/missed`, etc.) silently returns no data, while `/blocks/:id` keeps working because it uses `findManyByExpression`, which doesn't go through this path. Single-row lookups work; lists are broken.

This patch flips the default to `true` and turns the env var into an explicit opt-out:

`totalCountIsEstimate` is already exposed in the response meta, so clients that care can distinguish estimated counts from exact ones. Anyone who needs the exact count on a small node or in tests can still set `CORE_API_ESTIMATED_TOTAL_COUNT=false`.

## 2. Cache `/blockchain` burn totals

`BlockchainController.index` recomputes the burn totals on every request:

`getFeesBurned` is `SELECT COALESCE(SUM(burned_fee), 0) FROM transactions`, an unbounded SUM. On a 13M-row table that's a 25-40s parallel seq scan. Postgres won't switch to an index-only scan on the existing `transactions_burned_fee` index until the visibility map is current, which doesn't happen between autovacuums.

Under explorer poll load (every dashboard hits `/blockchain` for the "API is healthy" indicator, supply, and burn stats) these queries pile up holding api connections. Load average reached 8+ when bumping the timeout to let them complete, because each request kicked off another 40s scan while previous ones were still running. Keeping the 3s timeout makes them return 500 instead of piling up, but the explorer still shows "Unable to reach Solar API".

The two SUM results only change when a new block with burned fees or a burn transaction is forged. Cache them in-process with a 10-minute TTL and a single-flight refresh guard so concurrent requests can't trigger multiple background refreshes. Foreground requests always return immediately from the cache; the refresh runs at most once every TTL window regardless of load. On a cold cache (process start) the first response is zeros for ~30s while the first refresh runs, which is acceptable for an endpoint where the alternative is 500ing forever.

`height`, `id`, and `supply` are still computed fresh on each request; those are cheap.

## Testing

Verified on `api.solar.org` (Solar Core 4.3.1):

Before the patch:

- `/blocks?limit=1` → `{ totalCount: 0, data: [] }`.
- `/blockchain` → 500 after 3s.

After the patch:

- `/blocks?limit=1` → `totalCount: 16,266,352` with real data in ~60ms.
- `/blockchain` → 200 in ~25ms, returning cached burn totals (refreshed in the background).
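For readers who want to see the shape of the caching approach described for the burn totals, here is a minimal sketch of an in-process cache with a TTL and a single-flight background refresh. It is not the code from this PR: `BurnTotalsCache`, `BurnTotals`, the field names `feesBurned` and `transactionsBurned`, and the injectable `load` and `now` functions are all assumptions made for illustration. Retrying on the next stale read after a failed refresh is likewise a choice of this sketch rather than something the PR specifies.

```typescript
// Hypothetical shape of the two cached SUM results (names are assumptions).
interface BurnTotals {
    feesBurned: string;
    transactionsBurned: string;
}

class BurnTotalsCache {
    // Cold cache: zeros are served until the first refresh completes.
    private value: BurnTotals = { feesBurned: "0", transactionsBurned: "0" };
    private fetchedAt: number | undefined = undefined;
    private refreshing = false; // single-flight guard
    private readonly load: () => Promise<BurnTotals>;
    private readonly ttlMs: number;
    private readonly now: () => number;

    constructor(load: () => Promise<BurnTotals>, ttlMs: number = 10 * 60 * 1000, now: () => number = () => Date.now()) {
        this.load = load;
        this.ttlMs = ttlMs;
        this.now = now;
    }

    // Returns the cached value immediately; starts at most one background
    // refresh when the cached value is missing or older than the TTL.
    public get(): BurnTotals {
        const stale = this.fetchedAt === undefined || this.now() - this.fetchedAt >= this.ttlMs;
        if (stale && !this.refreshing) {
            this.refreshing = true;
            void this.load().then(
                (fresh) => {
                    this.value = fresh;
                    this.fetchedAt = this.now();
                    this.refreshing = false;
                },
                () => {
                    // Keep serving the previous value; the next stale read retries.
                    this.refreshing = false;
                },
            );
        }
        return this.value;
    }
}
```

A controller would construct one instance at startup, passing a `load` function that runs the two SUM queries, and call `get()` on each request. Because `get()` never awaits the loader, a slow query delays only the freshness of the totals and not the response.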