Add RPC ingestion load test driven by synthetic apply-load ledger bundles #741

Open: cjonas9 wants to merge 71 commits into main from apply-load

@cjonas9 cjonas9 commented May 15, 2026

Contributor

What

This PR implements a repeatable CI ingestion load test that runs against a full database holding 7 days of ledgers. The approximate design is shown in the attached screenshot:
[Image: Screenshot 2026-05-14 at 12 59 14 PM]

The GHA workflow for this test is currently triggered on pushes to this branch (apply-load), but will later be modified to trigger on any release or on PR comments stating "run load test".

The workflow benchmarks RPC ingestion end-to-end on an ephemeral c5.2xlarge: it launches the box, then pulls a mainnet-scale golden DB (~307GB, 1-week retention window), a BUILD_TESTS stellar-core, and three apply-load ledger bundles from S3 (sha-verified). After the box downloads and decompresses this data, its gp3 volume is throttled to 125 MiB/s. The test then ingests the bundles, and the workflow posts a per-profile results table to the run summary / PR.

Main Pieces:

  • integrationtest/ingest_loadtest_test.go::TestIngestSyntheticLedgers: byte-concatenates N bundles into one continuous stream (the backend rebases ledger seqs per ledger, so per-bundle seq resets are harmless), ingests onto the golden DB with retention trimming live, verifies exact classic/soroban op counts via parallel getTransactions walkers, and reports per-profile wall-clock, ledgers/sec, ms/ledger, and latency quantiles.
  • loadtest/testdata/apply-load-v27-*-cfg: config files specifying three O3 target tx profiles of 1,000 ledgers each: sac (1,000 soroban TPL), oz (900), soroswap (250). Each profile also includes 1,000 classic payments per ledger. stellar-core apply-load uses these configs to generate the ledger bundles offline (for local use or S3).
  • .github/workflows/load-test.yml: push-triggered orchestrator. OIDC-assumes into AWS, launches an ephemeral c5.2xlarge (Ubuntu 22.04, 500GB gp3) with the runner script as user-data (shipped verbatim, TARGET_SHA/RUN_ID passed via a two-line env preamble), waits for SSM registration, delegates polling to the script, writes the results table to the step summary (and PR comment when one exists), fails the job on a fail verdict or timeout, and always terminates the instance.
  • run-load-test.sh: both halves of the run in one self-contained script, coordinated by a /tmp marker protocol.
    • instance (user-data on the box): installs the toolchain, streams the golden DB + BUILD_TESTS core + all bundles from S3, builds the repo at the target SHA, throttles the root volume's throughput (MiB/s), runs the test, and writes results.md plus an ok/fail verdict.
    • orchestrate (on the GHA runner): polls the box over SSM, drives the gp3 downshift handshake (500 -> 125 MiB/s after downloads complete, so fetches are fast but the benchmark runs on throttled I/O), and relays verdict + results as step outputs.

Why

This provides CI testing and benchmarking of RPC ingestion performance. It also serves as an automated regression-testing framework, though future work should expand it to report a metric that lets a run's results be compared against historical results.
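One possible shape for such a comparison, sketched in Go under stated assumptions: the function `regressed`, the 5% tolerance, and the idea of comparing overall throughput against a stored baseline are all hypothetical and not part of this PR.

```go
package main

import "fmt"

// regressed (hypothetical helper) reports whether current throughput has
// fallen more than the given fractional tolerance below the baseline.
func regressed(baseline, current, tolerance float64) bool {
	return current < baseline*(1-tolerance)
}

func main() {
	const tolerance = 0.05 // hypothetical: allow a 5% drop before flagging

	// 0.94 and 0.93 ledgers/sec are overall throughputs reported by runs in
	// this thread; 0.80 is an invented value to show a flagged case.
	baseline := 0.94
	for _, current := range []float64{0.93, 0.80} {
		fmt.Printf("baseline=%.2f current=%.2f regressed=%v\n",
			baseline, current, regressed(baseline, current, tolerance))
	}
}
```

A fractional tolerance absorbs small run-to-run variation while still catching larger slowdowns.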

Known limitations

This is intended purely as a test of RPC's ingestion pipeline: it measures how the pipeline handles load in isolation (i.e. without captive core running). Future work should also automatically refresh the S3 DB + ledger bundles on some pre-determined cadence.

Copilot AI review requested due to automatic review settings May 15, 2026 22:40
@github-actions

Contributor

⏳ Load test launching on i-063ed1e3a29f001e3 (commit 241bdf833edbfc18d3c312a7109168d171324aa1).
Workflow run: https://github.com/stellar/stellar-rpc/actions/runs/25944865928
Posting results when the run finishes (~15 min).

@@ -0,0 +1,344 @@
#!/usr/bin/env bash

Contributor

We require Go to be installed to run RPC and the ingestion load test, so I wonder if most of the logic in this bash script could live in a Go file. I think a Go script would be easier to understand and maintain than a large bash script.

Contributor Author

Yes, definitely. There's some required shell, but cramming it all into one shell script is super excessive and messy (though I did like that it kept everything for the instance in one place as user data). I'll see about refactoring some of this out.

@github-actions

Contributor

⏳ Load test launching on i-000ee1a1274fa17a4 (commit 9bef1157b7073a5d472b8bdc297cdc4e9c5169d2).
Workflow run: https://github.com/stellar/stellar-rpc/actions/runs/27727364842
Posting results when the run finishes.

@socket-security

socket-security Bot commented Jun 17, 2026


@github-actions

Contributor

Ingest load test failed (run 27727364842 on 9bef1157b7073a5d472b8bdc297cdc4e9c5169d2)

make build-libs failed: exit status 2

@github-actions

Contributor

⏳ Load test launching on i-0e82b023245865ca1 (commit 8b39ed5024f66ac96a7798af5e031445c9d466cf).
Workflow run: https://github.com/stellar/stellar-rpc/actions/runs/27728512552
Posting results when the run finishes.

@github-actions

Contributor

Ingest load test failed (run 27728512552 on 8b39ed5024f66ac96a7798af5e031445c9d466cf)

volume throttle could not be confirmed

@github-actions

Contributor

⏳ Load test launching on i-0b40cd7da9dfe4d32 (commit 714b1fab586ddf7d68ea309062c8a28434520d09).
Workflow run: https://github.com/stellar/stellar-rpc/actions/runs/27732589868
Posting results when the run finishes.

@github-actions

Contributor

📈 Ingest load test — 714b1fa

| Profile | Ledgers | Wall-clock | Ledgers/sec | ms/ledger | p50 / p95 / p99 ms |
|---|---|---|---|---|---|
| apply-load-v27-oz | 1000 | 1250.049s | 0.80 | 1251.30 | 1174.999 / 1749.997 / 2025 |
| apply-load-v27-sac | 1000 | 1156.650s | 0.86 | 1156.65 | 1174.997 / 1250 / 1299.999 |
| apply-load-v27-soroswap | 1000 | 827.400s | 1.21 | 827.40 | 825.001 / 900.001 / 974.999 |

| Metric | Value |
|---|---|
| Ledgers replayed | 3000 |
| Initial DB ledger count | 120960 |
| Overall throughput | 0.93 ledgers/sec |
| Overall ingest wall-clock | 3234.099s |
| Per-ledger p50 / p95 / p99 | 1100 / 1449.999 / 1900 ms |
| Golden DB fetch+decompress | 1180s |

stellar-core v27.0.0
Workflow run #27732589868
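As a sanity check on the figures in this comment, the Ledgers/sec column and the overall throughput can be recomputed from the ledger counts and wall-clock times. The Go sketch below does that; the type and function names are illustrative and are not taken from the test code. The ms/ledger column is left out because the thread does not state how it is derived.

```go
package main

import "fmt"

// profileRun holds the two measured inputs for one profile row.
type profileRun struct {
	name      string
	ledgers   int
	wallClock float64 // seconds
}

// ledgersPerSec returns the ledger count divided by wall-clock seconds.
func ledgersPerSec(ledgers int, wallClockSec float64) float64 {
	return float64(ledgers) / wallClockSec
}

// overall sums ledgers and wall-clock time across profiles and returns the
// totals together with the combined throughput.
func overall(runs []profileRun) (int, float64, float64) {
	totalLedgers, totalSecs := 0, 0.0
	for _, r := range runs {
		totalLedgers += r.ledgers
		totalSecs += r.wallClock
	}
	return totalLedgers, totalSecs, ledgersPerSec(totalLedgers, totalSecs)
}

func main() {
	// Values copied from the 714b1fa results in this comment.
	runs := []profileRun{
		{"apply-load-v27-oz", 1000, 1250.049},
		{"apply-load-v27-sac", 1000, 1156.650},
		{"apply-load-v27-soroswap", 1000, 827.400},
	}
	for _, r := range runs {
		fmt.Printf("%-24s %.2f ledgers/sec\n", r.name, ledgersPerSec(r.ledgers, r.wallClock))
	}
	n, secs, tput := overall(runs)
	fmt.Printf("overall: %d ledgers in %.3fs = %.2f ledgers/sec\n", n, secs, tput)
}
```

The three per-profile wall-clock times sum to the reported overall ingest wall-clock of 3234.099s, and 3000 ledgers over that time rounds to the reported 0.93 ledgers/sec.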

@github-actions

Contributor

⏳ Load test launching on i-0ae39eb14de75bdbe (commit 5058ad3976a196cdcd99984762f2caa516a21508).
Workflow run: https://github.com/stellar/stellar-rpc/actions/runs/27769496808
Posting results when the run finishes.

@github-actions

Contributor

📈 Ingest load test — 5058ad3

| Profile | Ledgers | Wall-clock | Ledgers/sec | ms/ledger | p50 / p95 / p99 ms |
|---|---|---|---|---|---|
| apply-load-v27-oz | 1000 | 1240.774s | 0.81 | 1242.02 | 1150.001 / 1725 / 2050 |
| apply-load-v27-sac | 1000 | 1139.350s | 0.88 | 1139.35 | 1149.998 / 1225.001 / 1275.001 |
| apply-load-v27-soroswap | 1000 | 825.875s | 1.21 | 825.87 | 825.001 / 900.001 / 950 |

| Metric | Value |
|---|---|
| Ledgers replayed | 3000 |
| Initial DB ledger count | 120960 |
| Overall throughput | 0.94 ledgers/sec |
| Overall ingest wall-clock | 3205.999s |
| Per-ledger p50 / p95 / p99 | 1099.998 / 1449.999 / 1900 ms |
| Golden DB fetch+decompress | 1451s |

stellar-core v27.0.0
Workflow run #27769496808

@github-actions

Contributor

⏳ Load test launching on i-05d0c32462a3c7da3 (commit 3b33626afda7a66fce20d76ea7dc083c2492ad65).
Workflow run: https://github.com/stellar/stellar-rpc/actions/runs/27782520036
Posting results when the run finishes.

@github-actions

Contributor

⏳ Load test launching on i-01e1ab4d654e6efdf (commit fdb1926ffedb8ae4aa533faffeec6338ef973af7).
Workflow run: https://github.com/stellar/stellar-rpc/actions/runs/27789957778
Posting results when the run finishes.

@github-actions

Contributor

📈 Ingest load test — 3b33626

| Profile | Ledgers | Wall-clock | Ledgers/sec | ms/ledger | p50 / p95 / p99 ms |
|---|---|---|---|---|---|
| apply-load-v27-oz | 1000 | 1234.525s | 0.81 | 1235.76 | 1150 / 1700.001 / 1925 |
| apply-load-v27-sac | 1000 | 1138.699s | 0.88 | 1138.70 | 1149.999 / 1225 / 1275 |
| apply-load-v27-soroswap | 1000 | 829.349s | 1.21 | 829.35 | 825.002 / 900.001 / 974.999 |

| Metric | Value |
|---|---|
| Ledgers replayed | 3000 |
| Initial DB ledger count | 120960 |
| Overall throughput | 0.94 ledgers/sec |
| Overall ingest wall-clock | 3202.573s |
| Per-ledger p50 / p95 / p99 | 1099.998 / 1450.001 / 1824.999 ms |
| Golden DB fetch+decompress | 2440s |

stellar-core v27.0.0
Workflow run #27782520036

@github-actions

Contributor

📈 Ingest load test — fdb1926

| Profile | Ledgers | Wall-clock | Ledgers/sec | ms/ledger | p50 / p95 / p99 ms |
|---|---|---|---|---|---|
| apply-load-v27-oz | 1000 | 1234.300s | 0.81 | 1235.54 | 1150 / 1674.999 / 1925 |
| apply-load-v27-sac | 1000 | 1137.950s | 0.88 | 1137.95 | 1149.999 / 1225 / 1275 |
| apply-load-v27-soroswap | 1000 | 829.175s | 1.21 | 829.17 | 849.998 / 900.001 / 975 |

| Metric | Value |
|---|---|
| Ledgers replayed | 3000 |
| Initial DB ledger count | 120960 |
| Overall throughput | 0.94 ledgers/sec |
| Overall ingest wall-clock | 3201.424s |
| Per-ledger p50 / p95 / p99 | 1099.998 / 1450 / 1824.999 ms |
| Golden DB fetch+decompress | 2446s |

stellar-core v27.0.0
Workflow run #27789957778

Development

Successfully merging this pull request may close these issues.

Release eval: Add repeatable Core apply-load integration test

5 participants