Merge upstream 17 6 v1 #43
Merged
By popular demand, this modifies fd_alloc to support larger allocations before falling back to the slow, locking, underlying allocator of last resort (fd_wksp). This couldn't be done simply by updating the sizeclass config, as the underlying fd_alloc memory layout couldn't support the required larger superblock footprints. Thus, fd_alloc was tweaked to be more like fd_vinyl_data (which already had a notion of sizeclasses large enough to hold account state as large as 10 MiB).

Most important, fd_alloc_superblock_t is now where block sizeclass metadata is stored (previously it was specified in the fd_alloc_hdr_t of allocated blocks in the sizeclass). This supports many more sizeclasses (because there is more space in the fd_alloc_superblock_t to store it) and much larger sizeclass block sizes (because moving the sizeclass frees up the space in the fd_alloc_hdr_t needed to support larger superblock footprints). This also supports better allocator diagnostics (because it is now possible to find every live user allocation in an fd_alloc instance without user assistance).

Additionally, like fd_vinyl_data, user allocations are treated distinctly from superblock allocations. This allows more compact superblock nesting and streamlined allocation recursion.

Accordingly, fd_alloc_superblock_t was modified to include the block sizeclass, and fd_alloc_superblock_t and blocks are now all guaranteed to have 8 byte alignment. Though it naively looks 8 bytes larger, the previous alignment requirements and the implicitly prepended fd_alloc_hdr_t mean it is in practice comparable to or smaller than before, even though it holds more info. Likewise, fd_alloc_hdr_t was modified to encode a 2-bit type (user small, user large, nested superblock or root superblock) and, for allocations contained in a superblock (user small or nested superblock), a 6-bit block_idx and the offset to the containing superblock (26 bits encoded in 24 bits).
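The header encoding described above adds up to 32 bits (2 + 6 + 24). The following is a minimal sketch of one way such a header could be packed, not the actual fd_alloc_hdr_t layout: the field order, the type code values, the helper names, and the assumption that the 26-bit offset is a multiple of 4 (so its two low bits can be dropped) are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type codes; the real fd_alloc values may differ. */
enum {
  HDR_TYPE_USER_SMALL        = 0,
  HDR_TYPE_USER_LARGE        = 1,
  HDR_TYPE_NESTED_SUPERBLOCK = 2,
  HDR_TYPE_ROOT_SUPERBLOCK   = 3
};

/* Pack a 2-bit type, a 6-bit block index and a 26-bit byte offset into
   32 bits. ASSUMPTION: the offset is a multiple of 4, so only its
   upper 24 bits need to be stored. */
uint32_t hdr_pack( uint32_t type, uint32_t block_idx, uint32_t off ) {
  assert( type      <  4u );
  assert( block_idx < 64u );
  assert( off < (1u<<26) && (off & 3u)==0u );
  return type | (block_idx<<2) | ((off>>2)<<8);
}

/* Field accessors: the inverse of hdr_pack. */
uint32_t hdr_type     ( uint32_t hdr ) { return hdr & 3u;         }
uint32_t hdr_block_idx( uint32_t hdr ) { return (hdr>>2) & 63u;   }
uint32_t hdr_off      ( uint32_t hdr ) { return (hdr>>8)<<2;      }
```

Under these assumptions, a header packed with type 2, block index 63 and offset 2^26 - 4 round-trips through the three accessors unchanged.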
TL;DR This allows sizeclass blocks as large as ~32 MiB to be supported while keeping the allocator memory footprint overhead comparable.

Various cleanups in support of the above:
- Added gen_szc_cfg sizeclass configuration generation for all the above. This has been tuned to reduce the number of sizeclasses (hence preallocations) for a given relative amount of overallocation from rounding up to the nearest block size.
- fd_alloc was turned into a quasi-opaque handle instead of a fully opaque handle. Alignment requirements were also made less stringent. This makes it easier to inline various operations, streamline various unit tests and use different sizeclass configurations.
- Two compile time sizeclass configurations are provided: small (which carves small allocations out of 256 KiB wksp partitions where allocations less than ~37 KiB in size are considered small ... similar to previous) and large (64 MiB partitions and ~10 MiB small allocations). Default is large.
- The use of 16-byte wide atomics on x86-like platforms was eliminated. This makes the fd_alloc shared memory layout and so forth identical across target platforms of the same endianness. This was made possible by limiting the maximum size workspace backing an fd_alloc to 1 PiB (which in turn allows an aligned workspace global address and a suitably wide lockfree ABA version number to be encoded in a ulong).
- Accordingly, FD_ALLOC_MAGIC was updated (and is now independent of build target).
- fd_alloc_delete allows more fine grained control over how aggressively to clean up leftover allocations in the underlying workspace.
- The number of iterations fd_alloc_preferred_sizeclass needs is now explicitly provided by the sizeclass configuration.
- Added minor block_set optimizations including atomic operations slightly optimized to use op_then_fetch style (instead of fetch_then_op) and the fd_alloc_block_set_all API that returns the full block set for a given block count.
- Compile-time FD_ALLOC_STYLE define to enable alignment pad clearing for more robust and faster discovery of user allocations.
- fd_alloc_is_empty was simplified to use fd_wksp_tag functionality.
- ASAN and MSAN support was cleaned up and fixed to work correctly under concurrent load.
- fd_alloc_fprintf updated for all the above, with more extensive diagnostics, including a strong guarantee it will find all current allocations under the conditions of an idle fd_alloc with alignment padding clearing (FD_ALLOC_STYLE==1).
- Updated test_alloc to match the above (including verifying sizeclassing and ability to do concurrent testing under ASAN and MSAN options).
- Usual drive-by include and comment cleanups.
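The 1 PiB limit mentioned in the list above is what lets a version number and a workspace address share one 64-bit word, which an ordinary 8-byte compare-and-swap can then update. The sketch below illustrates the bit budget only. The 8-byte address alignment (and therefore the 47/17 bit split) and all names are assumptions for illustration, not the actual fd_alloc encoding.

```c
#include <assert.h>
#include <stdint.h>

#define GADDR_LG_MAX   50 /* 1 PiB == 2^50 bytes (limit stated in the text) */
#define GADDR_LG_ALIGN  3 /* ASSUMED 8-byte aligned addresses */
#define OFF_BITS (GADDR_LG_MAX - GADDR_LG_ALIGN) /* 47 bits of address */
#define VER_BITS (64 - OFF_BITS)                 /* 17 bits of version */

/* Pack an ABA version counter and an aligned global address into one
   64-bit word. Version bits above VER_BITS fall off the top, so the
   counter wraps naturally. */
uint64_t ver_gaddr_pack( uint64_t ver, uint64_t gaddr ) {
  assert( gaddr < (1ULL<<GADDR_LG_MAX) );
  assert( (gaddr & ((1ULL<<GADDR_LG_ALIGN)-1ULL))==0ULL );
  return (ver<<OFF_BITS) | (gaddr>>GADDR_LG_ALIGN);
}

/* Extract the (wrapped) version counter. */
uint64_t ver_gaddr_ver( uint64_t packed ) {
  return packed>>OFF_BITS;
}

/* Extract the global address, restoring the dropped alignment bits. */
uint64_t ver_gaddr_gaddr( uint64_t packed ) {
  return (packed & ((1ULL<<OFF_BITS)-1ULL))<<GADDR_LG_ALIGN;
}
```

A stricter alignment would free more bits for the version counter; the split chosen here is only one possibility.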
currently we would crash if NULL were to be passed to the function
By popular demand, updated the sizeclass configuration for fd_vinyl_data to be less aggressive about preemptively allocating blocks. This uses superblocks that are 2 to 4 times smaller than previously and 4 times fewer sizeclasses, which implies there will be roughly an order of magnitude less preemptive allocation. The tradeoff is that superblocks hold 2 to 4 times fewer allocation blocks and nesting is deeper (which implies some increase in computational overhead). Likewise, this increases the worst case rounding up of allocations to sizeclass block sizes by roughly a factor of 2 (which is a bit of a mixed bag ... a few percent extra overhead on fixed sized allocations but a cheaper ability to do incremental resizing). The volume footprint increased somewhat with this. But gen_szc_cfg can be easily modified to use the old volume footprint (or even a smaller or more power-of-2-ish volume footprint).
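To make the rounding tradeoff concrete, here is a small sketch of picking the smallest sizeclass that fits a request and measuring the bytes lost to rounding. Both tables and all function names are made up for illustration; they are not the fd_vinyl_data configuration or the output of gen_szc_cfg.

```c
#include <stddef.h>
#include <stdint.h>

/* Two hypothetical ascending block-size tables covering the same range.
   The sparse one has 4 times fewer steps between 64 and 128 bytes. */
const uint32_t szc_dense [] = { 64u, 80u, 96u, 112u, 128u };
const uint32_t szc_sparse[] = { 64u, 128u };

/* Binary search for the index of the smallest block size >= sz.
   Returns cnt if the request is larger than every sizeclass. */
size_t szc_pick( const uint32_t * block_sz, size_t cnt, uint32_t sz ) {
  size_t lo = 0, hi = cnt;
  while( lo<hi ) {
    size_t mid = lo + (hi-lo)/2;
    if( block_sz[mid]<sz ) lo = mid+1;
    else                   hi = mid;
  }
  return lo;
}

/* Bytes wasted by rounding sz up to its sizeclass block size
   (0 if no sizeclass is large enough). */
uint32_t szc_waste( const uint32_t * block_sz, size_t cnt, uint32_t sz ) {
  size_t idx = szc_pick( block_sz, cnt, sz );
  return idx<cnt ? block_sz[idx]-sz : 0u;
}
```

With these toy tables, a 65 byte request wastes 15 bytes under the dense table and 63 bytes under the sparse one, while the sparse table needs far fewer preallocated superblocks.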
Use a static signing domain instead of a random key. This makes it easier to prove that there are no signing confusion bugs.
* fix(secp256r1): add one more point

p256_scalarmulbase() is called with blocksize=6, so it scans, in constant time, 43 windows * 32 = 1376 affine points of the base point table. The table had only 1375 entries, so every secp256r1 signature verify read one point (64 bytes) past the end of fd_secp256r1_base_point_table. Add the missing 2^252*32*G entry and a static assert pinning the table size.

* Update src/ballet/secp256r1/fd_secp256r1_s2n.c

Co-authored-by: David Rubin <drubin+git@jumptrading.com>
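A compile-time check of this kind can be sketched as follows. The type, macro and table names are stand-ins rather than the identifiers in fd_secp256r1_s2n.c. The demo table is explicitly sized and zero-filled only so the snippet compiles; a real table would normally be declared unsized, so that a missing initializer changes the element count and trips the assert.

```c
#include <stddef.h>
#include <stdint.h>

#define P256_WINDOW_CNT        43 /* windows scanned, per the commit message */
#define P256_POINTS_PER_WINDOW 32 /* affine points per window */
#define P256_TABLE_POINT_CNT   (P256_WINDOW_CNT*P256_POINTS_PER_WINDOW) /* 1376 */

/* Stand-in affine point: x and y as four 64-bit limbs each (64 bytes). */
typedef struct { uint64_t x[4]; uint64_t y[4]; } p256_affine_t;

/* Placeholder table (all zeros), standing in for the real constants. */
const p256_affine_t demo_base_point_table[ P256_TABLE_POINT_CNT ] = {{{0}}};

/* Pin both the point size and the number of points at compile time. */
_Static_assert( sizeof(p256_affine_t)==64, "affine point must be 64 bytes" );
_Static_assert( sizeof(demo_base_point_table)/sizeof(demo_base_point_table[0])
                ==P256_TABLE_POINT_CNT,
                "base point table must hold 43*32 points" );

/* Runtime view of the same count, for callers that need it. */
size_t demo_base_point_cnt( void ) {
  return sizeof(demo_base_point_table)/sizeof(demo_base_point_table[0]);
}
```

Because the check runs at compile time, a table that is one entry short fails the build instead of causing an out-of-bounds read on every verify.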
Allow remembering Make config using shell variables
Add support for subscribing to votes via the RPC tile. https://solana.com/docs/rpc/websocket/votesubscribe
sendmmsg was only ever sending to one dest, might as well use the simpler sendmsg syscall then
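For a single destination, the sendmsg form needs only one msghdr and one iovec. The helper below is a generic sketch using the POSIX sendmsg call on an already-connected socket; the function name is hypothetical and this is not the code from the http change.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Send one buffer on a connected socket with a single sendmsg call.
   Returns the number of bytes sent, or -1 with errno set on failure. */
ssize_t send_one( int fd, const void * buf, size_t len ) {
  struct iovec iov;
  iov.iov_base = (void *)buf; /* sendmsg only reads from the buffer */
  iov.iov_len  = len;

  struct msghdr msg;
  memset( &msg, 0, sizeof(msg) ); /* no address, no control data */
  msg.msg_iov    = &iov;
  msg.msg_iovlen = 1;

  return sendmsg( fd, &msg, 0 );
}
```

sendmmsg pays off when several messages are submitted in one syscall; with exactly one message per call, the extra mmsghdr wrapper adds nothing.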
no security/correctness impact, just a consistency nit
Move non-parametric tests to new scheduler unit test. Improves fuzzers exec/s by ~30%
This is algorithmically a fairly sloppy way to ingest /proc/interrupts, but it doesn't matter, this monitoring code barely uses any CPU.
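As an illustration of what ingesting /proc/interrupts involves, the sketch below sums the per-CPU counters on one row of the file. It is a simplified stand-in with a hypothetical function name, not the monitoring code from this commit, and it assumes the caller already knows the CPU count from the header row.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Sum up to cpu_cnt per-CPU counters on one /proc/interrupts row, e.g.
     " 24:        100        200  IR-PCI-MSI  eth0"
   Returns the number of counters parsed, or -1 if the row has no
   "label:" prefix (such as the CPU header row). */
int irq_row_sum( const char * row, int cpu_cnt, unsigned long long * total ) {
  const char * p = strchr( row, ':' );
  if( !p ) return -1;
  p++; /* skip the ':' */

  *total = 0ULL;
  int n = 0;
  while( n<cpu_cnt ) {
    while( *p==' ' || *p=='\t' ) p++;       /* skip column padding */
    if( !isdigit( (unsigned char)*p ) ) break; /* reached the description */
    char * end;
    *total += strtoull( p, &end, 10 );
    p = end;
    n++;
  }
  return n;
}
```

Bounding the loop by cpu_cnt keeps numbers that appear in the trailing description text from being counted as interrupt totals.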
find -exec returns 0 even if one of the commands failed. This would mask actual errors.
This commit adds two fuzzers:
- `gossvf_tile`: fuzzes the gossvf tile in isolation by mocking its in/out links in an 8-peer environment. Fast, ~1050 exec/s.
- `gossvf_gossip_pair`: tests the gossvf and gossip tiles together. Two peers talk to each other and "fault injection" is used to test with invalid inputs. In addition, the fuzzer can do things like skip ahead in time. This keeps fuzz throughput high, since we are not bound by a real network, and exercises the time-based events.
Summary
Merge upstream Firedancer changes into the Tickoni branch (17.6 v1 sync). This pull request brings 62 upstream commits across 354 files into the Tickoni tree, including: new/updated Firedancer infrastructure tiles (events, gui/guih split, RPC WebSocket, sspeer_selector, txncache, gossip, sched, alloc), seccomp policy updates, regenerated metrics and GUI assets, and a set of correctness/stability fixes across quic, tower, loader, repair, backtest, and config subsystems. No Tickoni-owned framework tiles (`src/tickoni/` or `src/app/tickoni/`) are changed.
Key upstream changes included:
- events: wire topology automatically, generate schema structs/serializers, add signed_vote and tower events, use `fd_clock_tile` for wallclock conversion, add `IN_KIND_EVENT` ingestion path, fix connection race
- gui/guih: split `disco/gui` and `discoh/guih`, regenerate seccomp policies, update metrics and GUI assets
- rpc: add WebSocket support (`voteSubscribe`, `slotSubscribe`), serve tpu/tpuForwards, fix double memcpy on WS send
- sspeer_selector: cap max slots behind
- txncache: support canceling fork trees
- fd_alloc: minor upgrades and sizeclass regeneration
- gossip: multi-tile fuzzers, use local IP for outbound source
- sched: accept empty shred batches, split fuzzer/unit-test
- tower: report confirmation event telemetry, fix incorrect `FD_TEST`
- repair: fix tile union member type confusion
- sock: fix no rserve & repair socket
- topo: fix `enable_block_production=false`
- http: use `sendmsg` instead of `sendmmsg`
- codeql: reduce noise in security and quality queries
- secp256r1: add missing curve point
- loader v3: fix non-conformant logs
- config: rename `repair_intake_listen_port` in TOML, add `source activate` script, reduce verbosity
- accdb: pipeline recovery under new accdb
- features: clean up `warp_timestamp_again`
- diag: add TLB shootdowns to metrics, add diag tile to backtest/snapshot-load topos
- seccomp: fix filter generation failure propagation
Type of change
Related work
Risk & impact
- `gui`, `guih`, `rpc`, and `metrics` tiles: mismatches could cause tile sandbox failures at startup.
- `fd_alloc` sizeclass config was regenerated; any workspace layout assumptions tied to allocator sizeclasses should be re-verified.
- `voteSubscribe`, `slotSubscribe`) expand the external API surface.
- `fd_clock_tile` for wallclock; replay timestamp behavior may differ from prior approach.
- `tkings`, `tknorm`, `tkdedu`, `tkpoly`, `tkaudt`, `tkrepl`, etc.) contracts are changed by this merge.
How to test
- `just build` or `make`: confirm the full build succeeds with no new errors or warnings.
- `just test-unit` / `zig build test`: confirm unit tests pass (note: README badges show unit tests currently failing; verify no new failures were introduced by this merge).
- `gui`, `guih`, `rpc`, and `event` tiles pass seccomp filter initialization.
- `just test-integration` or equivalent: confirm integration tests pass.
Runtime / contract changes (if applicable)
Generated code / artifacts (if applicable)
- `make -C src/disco/metrics metrics`)
- `cd src/flamenco/features && make generate`)
- `make -C src/flamenco/runtime/tests protobufs`)
Build / config / docs changes (if applicable)
- justfile/tooling updated
Firedancer scope (if applicable)
Firedancer notes
This PR is a periodic upstream sync of Firedancer infrastructure code into the Tickoni branch. All changes originate from the upstream Firedancer repository and were merged in bulk. The touched areas are:
`src/disco/` (events, gui, guih, metrics, quic, topo, net), `src/discof/` (gossip, replay, repair, restore, rpc, tower, txsend), `src/discoh/` (guih), `src/flamenco/` (accdb, features, runtime, progcache), `src/util/alloc/`, `src/vinyl/data/`, `src/waltz/http/`, build config, and contrib tooling. No upstream Firedancer issue/PR was created for this sync; changes were taken from upstream as-is. No Solana validator semantics were introduced into Tickoni framework code.
Checklist
Implementation
Tests
- `just tests-all` command executed successfully
If tests were not added, explain why
Upstream sync: tests are owned by upstream Firedancer and are included where upstream added them (e.g., `test_sspeer_selector.c`, `test_sched.c`, `test_tower_tile.c`, `test_rpc_tile.c`, `test_proc_interrupts.c`, `test_ghost.c`, `test_tower.c`). No new Tickoni-specific behavior was introduced.
Observability / operations (if applicable)
Security & privacy (if applicable)
Release notes
Release note (if needed)
Syncs 62 upstream Firedancer infrastructure commits into Tickoni (17.6 v1), including event topology auto-wiring, GUI/guih split, RPC WebSocket subscriptions, alloc and sspeer_selector fixes, regenerated metrics and seccomp policies, and a broad set of correctness fixes across gossip, tower, repair, quic, and runtime subsystems.