internal/db2: Reduce temp-file I/O on account-filtered /trades queries#181

Open
tamirms wants to merge 2 commits into main from fix-trade-union-all-composite-index

Conversation

@tamirms tamirms commented Apr 20, 2026

Contributor

Summary

  • Change createTradesSQL to use UNION ALL + per-branch LIMIT on the account/offer/pool filter variants. The subqueries are disjoint by protocol invariant (self-matching prevention + single-sided LP trades), so UNION ALL is semantically equivalent and skips an expensive dedup sort.
  • Add composite indexes htrd_by_{base,counter}_account_op_order so each subquery satisfies WHERE + ORDER BY from an index range scan.
  • This was the dominant contributor to slow-query time and replication lag on the pubnet Horizon read replica.
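The query shape described in the first bullet can be sketched as follows. This is a minimal illustration, not Horizon's createTradesSQL: the helper name `unionAllTradesSQL`, the `htrd` alias, the `SELECT *` projection, and the ascending sort are assumptions, and real code should bind parameters instead of formatting values into the string.

```go
package main

import "fmt"

// unionAllTradesSQL is a hypothetical helper (not Horizon's createTradesSQL).
// It only illustrates the query shape: each branch filters on one side of
// the trade, orders by the pagination key, and carries its own LIMIT, so
// the outer query has at most 2*limit rows left to sort.
func unionAllTradesSQL(accountID int64, limit uint64) string {
	// branch builds one parenthesized subquery for the given account column.
	branch := func(column string) string {
		return fmt.Sprintf(
			`(SELECT * FROM history_trades WHERE %s = %d `+
				`ORDER BY history_operation_id ASC, "order" ASC LIMIT %d)`,
			column, accountID, limit)
	}
	// UNION ALL concatenates the branches without a dedup step; the outer
	// ORDER BY and LIMIT then pick the final page.
	return fmt.Sprintf(
		`SELECT * FROM (%s UNION ALL %s) AS htrd `+
			`ORDER BY history_operation_id ASC, "order" ASC LIMIT %d`,
		branch("base_account_id"), branch("counter_account_id"), limit)
}
```

Calling `unionAllTradesSQL(42, 10)` yields a statement containing three `LIMIT 10` clauses: one per branch and one on the outer query.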

Impact (measured on staging, 324M rows)

| Account class | Before (UNION + single-col indexes) | After (UNION ALL + composite indexes) |
| --- | --- | --- |
| Hot (~11% of trades) | 174,662 ms, ~107 GB temp written | 4.6 ms, 0 temp |
| Mid (~0.67%) | n/a | 6.75 ms, 0 temp |
| Sparse (~0.09%) | n/a | 3.4 ms, 0 temp |

Correctness: analytical disjointness proof

Disjointness was verified empirically on staging (three SELECT count(*) scans of history_trades all returned 0) and confirmed analytically against internal/ingest/processors/trades_processor.go:

  • base_account_id = counter_account_id: impossible. Orderbook trades have seller != buyer (stellar-core self-match prevention). LP trades have base_account_id = NULL (the counterparty is the pool).
  • base_offer_id = counter_offer_id: impossible. Orderbook trades match two distinct offers; the TOID-encoded synthetic counter offer id is in a disjoint encoding domain from raw offer ids. LP trades have base_offer_id = NULL.
  • base_liquidity_pool_id = counter_liquidity_pool_id: impossible. Orderbook trades have both NULL. LP trades have exactly one side set to the pool id.
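The three conditions above can be modeled in memory as a rough sanity check. The sketch below is an assumption-laden illustration, not code from the PR: the `trade` struct is a simplified stand-in for a history_trades row, nil pointers stand for SQL NULL, and `countOverlaps` plays the role of the count(*) scans.

```go
package main

// trade is a simplified, hypothetical model of a history_trades row.
// A nil pointer stands for SQL NULL.
type trade struct {
	baseAccountID, counterAccountID *int64
	baseOfferID, counterOfferID     *int64
	basePoolID, counterPoolID       *int64
}

// sqlEqual mirrors SQL "=": a comparison involving NULL is never true.
func sqlEqual(a, b *int64) bool {
	return a != nil && b != nil && *a == *b
}

// matchesBothBranches reports whether a row could be returned by both the
// base-side and the counter-side subquery for the same filter value.
func matchesBothBranches(t trade) bool {
	return sqlEqual(t.baseAccountID, t.counterAccountID) ||
		sqlEqual(t.baseOfferID, t.counterOfferID) ||
		sqlEqual(t.basePoolID, t.counterPoolID)
}

// countOverlaps counts rows that UNION would have deduplicated. If it is
// zero, UNION ALL returns the same rows as UNION.
func countOverlaps(trades []trade) int {
	n := 0
	for _, t := range trades {
		if matchesBothBranches(t) {
			n++
		}
	}
	return n
}
```

An orderbook trade (two distinct accounts and offers, both pool ids nil) and an LP trade (pool id on one side only) both leave the count at zero; only a row that violates one of the invariants would raise it.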

The existing htrd_by_base_account / htrd_by_counter_account single-column indexes are retained; a follow-up PR will drop them after confirming the composites cover their workload.

🤖 Generated with Claude Code

Change the subquery UNION in createTradesSQL to UNION ALL and push the
outer LIMIT into each subquery. The two branches are disjoint by protocol
invariant (an account cannot match its own offer, a trade matches two
distinct offers, and an LP trade has the pool on exactly one side), so
UNION ALL is semantically equivalent and skips the expensive dedup sort
that was generating roughly 100 GB of temp writes per execution at
pubnet scale.
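Pushing the LIMIT into each branch is safe because any row in the first n of the combined result must also be in the first n of its own branch. A small in-memory sketch of that argument, with hypothetical names (`key`, `firstN`, `limitPushdown`, `limitAfterUnion`) and assuming n is non-negative:

```go
package main

import "sort"

// key models the pagination key (history_operation_id, "order").
type key struct {
	opID  int64
	order int32
}

// less orders keys by opID first, then by order.
func less(a, b key) bool {
	if a.opID != b.opID {
		return a.opID < b.opID
	}
	return a.order < b.order
}

// firstN returns the n smallest keys in ascending order, like
// ORDER BY ... LIMIT n. It sorts a copy and leaves rows untouched.
func firstN(rows []key, n int) []key {
	sorted := append([]key(nil), rows...)
	sort.Slice(sorted, func(i, j int) bool { return less(sorted[i], sorted[j]) })
	if len(sorted) > n {
		sorted = sorted[:n]
	}
	return sorted
}

// limitPushdown applies the limit inside each branch, concatenates the
// results (UNION ALL), and applies the limit once more.
func limitPushdown(base, counter []key, n int) []key {
	merged := append(firstN(base, n), firstN(counter, n)...)
	return firstN(merged, n)
}

// limitAfterUnion is the reference: sort the whole union, then limit.
func limitAfterUnion(base, counter []key, n int) []key {
	all := append(append([]key(nil), base...), counter...)
	return firstN(all, n)
}
```

For disjoint branches the two functions return the same rows, but `limitPushdown` never sorts more than 2n rows in its final step, which is the property that removes the large sort.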

Add composite indexes on (base_account_id, history_operation_id, "order")
and (counter_account_id, ...) so each subquery can satisfy both the WHERE
filter and ORDER BY from an index range scan, giving a density-independent
spill-free plan. The existing single-column indexes are retained until
follow-up observation confirms the composites cover all usage.

Validated on staging (324M row history_trades): execution time drops
from 174 s (baseline) to 4.6 ms / 6.75 ms / 3.4 ms across hot, mid-tier,
and sparse account classes; zero temp-file I/O.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 20, 2026 23:23

Copilot AI left a comment

Contributor


Pull request overview

This PR optimizes account/offer/liquidity-pool filtered /trades queries by rewriting the generated SQL to avoid expensive dedup sorts and by adding supporting composite indexes, targeting a major source of PostgreSQL temp-file I/O and replica lag on pubnet-scale datasets.

Changes:

  • Update createTradesSQL to use UNION ALL (instead of UNION) for base/counter filter branches and apply per-branch LIMIT before the final order/limit.
  • Add composite indexes on (base_account_id, history_operation_id, "order") and (counter_account_id, history_operation_id, "order") using CREATE INDEX CONCURRENTLY.
  • Add a unit test to prevent regressions back to UNION, and update embedded migration bindata + changelog.
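For orientation, the index definitions listed above might look roughly like the statements this helper produces. The index names and column lists come from the PR description; the helper `compositeIndexDDL` and the exact statement text are assumptions, since the contents of migration 71 are not shown here.

```go
package main

import "fmt"

// compositeIndexDDL is a hypothetical helper that renders the DDL for one
// side ("base" or "counter"). The actual migration file may differ.
func compositeIndexDDL(side string) string {
	// The leading account column serves the WHERE filter; the trailing
	// (history_operation_id, "order") columns match the ORDER BY, so a
	// single index range scan can return rows already in page order.
	return fmt.Sprintf(
		`CREATE INDEX CONCURRENTLY htrd_by_%[1]s_account_op_order `+
			`ON history_trades (%[1]s_account_id, history_operation_id, "order");`,
		side)
}
```

Note that PostgreSQL does not allow CREATE INDEX CONCURRENTLY inside a transaction block, so a migration using it has to run outside one.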

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| internal/db2/schema/migrations/71_trades_account_composite_indexes.sql | Adds concurrent composite indexes to support efficient account-filtered trade pagination. |
| internal/db2/schema/bindata.go | Regenerates embedded schema assets to include migration 71. |
| internal/db2/history/trade_test.go | Adds a regression test asserting account/offer/pool variants use UNION ALL. |
| internal/db2/history/trade.go | Rewrites unioned account/offer/pool query generation to UNION ALL and pushes LIMIT into each branch. |
| CHANGELOG.md | Documents the query optimization and the new migration/indexes. |

The previous commit regenerated bindata.go with a go-bindata version that
emits explicit &bintree{x, y} struct literals. gofmt -s canonicalizes
those to the implicit {x, y} form, which is what the CHECK CI job
expects. No semantic change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@tamirms tamirms changed the title from "Reduce temp-file I/O on account-filtered /trades queries" to "internal/db2: Reduce temp-file I/O on account-filtered /trades queries" Apr 20, 2026
@tamirms tamirms requested a review from a team April 21, 2026 00:37
@urvisavla urvisavla moved this to Needs Review in Platform Scrum Apr 21, 2026
@urvisavla urvisavla added this to the platform sprint 70 milestone Apr 21, 2026
@tamirms tamirms moved this from Needs Review to Blocked in Platform Scrum May 5, 2026
Projects

Status: Blocked

4 participants