ryanata/ledger-go

ledger-go

A Postgres-backed double-entry ledger written in Go. Transfers across multiple assets, an idempotent HTTP API with byte-stable replay, and a transactional outbox with two relay strategies (poll and logical replication).

HTTP /transfers
   │
   │ Idempotency-Key
   ▼
┌──────────────────┐  TX1: reserve key (in_flight) + COMMIT
│  HTTP handler    │  TX2: post_transfer + Idem.Complete + COMMIT
│  (Go, pgx/v5)    │  TX3 (failure only): mark completed with error code
└──────────┬───────┘
           ▼
┌─────────────────────────────────────────────────────────┐
│  Postgres 17                                             │
│   ├── post_transfer(...)        ← lock-ordered SQL fn   │
│   ├── accounts, transfers,                              │
│   │   entries, idempotency_keys                         │
│   └── outbox  (LIST-partitioned by published_at IS NULL)│
└─────────────────────────────────────────────────────────┘
           │
           ▼
┌──────────────────┐    poll  : FOR UPDATE SKIP LOCKED
│  outbox relay    │    wal   : pglogrepl + pgoutput v1
└──────────────────┘

Design overview

  • Entries are append-only. Once posted, an entry cannot be updated or deleted; reversals are new transfers that net to zero. A BEFORE UPDATE/DELETE trigger enforces this.
  • Money is BIGINT minor units. No floats anywhere on the money path.
  • Transfers are lock-ordered. post_transfer locks the touched accounts in canonical UUID byte order before validating, so concurrent transfers on overlapping account sets cannot deadlock.
  • Idempotency takes two transactions on the success path. TX1 commits the Idempotency-Key reservation; TX2 runs post_transfer and marks the key completed in the same transaction. If TX2 fails, a third transaction (TX3) records the typed error code (422, 404, etc.) so retries get the same response.
  • Replay is byte-stable. Cached response bodies are stored as BYTEA, not JSONB. JSONB does not preserve whitespace, key order, or duplicate keys, so the bytes read back can differ from the bytes written, which breaks the byte equality that clients rely on when they hash responses for retry verification.
  • The outbox is LIST-partitioned by (published_at IS NULL). An UPDATE that publishes a row physically migrates it from outbox_unpublished to outbox_published, so the unpublished partition stays small (sized by the relay's lag, not by lifetime throughput). See docs/outbox.md.
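
The lock-ordering rule above is implemented inside the post_transfer SQL function. As a rough illustration of the idea only, the following Go sketch sorts the touched account IDs into ascending byte order before any lock would be taken. The AccountID type and lockOrder function are hypothetical names for this example and do not come from the repository.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// AccountID is a raw 16-byte UUID. The name is illustrative, not the repo's type.
type AccountID [16]byte

// lockOrder returns the distinct account IDs in ascending byte order.
// If every transfer acquires its row locks in this order, two transfers that
// touch overlapping accounts can never wait on each other in a cycle.
func lockOrder(ids []AccountID) []AccountID {
	seen := make(map[AccountID]struct{}, len(ids))
	out := make([]AccountID, 0, len(ids))
	for _, id := range ids {
		if _, dup := seen[id]; dup {
			continue // an account appearing in several legs is locked once
		}
		seen[id] = struct{}{}
		out = append(out, id)
	}
	sort.Slice(out, func(i, j int) bool {
		return bytes.Compare(out[i][:], out[j][:]) < 0
	})
	return out
}

func main() {
	a := AccountID{0x01}
	b := AccountID{0x02}
	// Transfers a→b and b→a both lock a first, then b.
	first := lockOrder([]AccountID{a, b})
	second := lockOrder([]AccountID{b, a})
	fmt.Println(first[0] == second[0], first[1] == second[1])
}
```

Deduplicating before sorting matters when the same account appears in more than one leg, since a transfer should take each row lock only once.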

Schema invariants

Six invariants are enforced by triggers, checks, and the schema itself, and re-verified post-hoc by cmd/reconcile:

  • I1 — every transfer's per-asset legs sum to zero.
  • I2 — entries are append-only once posted.
  • I3 — account.balance ≡ sum of posted entries for that account.
  • I3b — total posted amount per asset is zero (closed system).
  • I4 — accounts that disallow it never go negative.
  • I5 — posted transfers have only posted entries (and vice versa).
  • Plus a structural check: every transfer has ≥ 2 entries.

./bin/reconcile runs all six against the live database and exits non-zero on any violation.
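
To make I1 and the structural check concrete, here is a minimal in-memory sketch in Go. It is not the code in cmd/reconcile, which runs against the live database. The Entry type, its field names, and checkTransfer are assumptions made for this example, and the sketch ignores int64 overflow.

```go
package main

import (
	"errors"
	"fmt"
)

// Entry is one leg of a transfer. Field names are hypothetical.
type Entry struct {
	Asset  string
	Amount int64 // minor units, signed
}

var (
	ErrTooFewEntries = errors.New("transfer has fewer than 2 entries")
	ErrUnbalanced    = errors.New("per-asset legs do not sum to zero")
)

// checkTransfer applies the structural check (at least 2 entries) and
// I1 (legs sum to zero within each asset).
func checkTransfer(entries []Entry) error {
	if len(entries) < 2 {
		return ErrTooFewEntries
	}
	sums := make(map[string]int64)
	for _, e := range entries {
		sums[e.Asset] += e.Amount
	}
	for _, s := range sums {
		if s != 0 {
			return ErrUnbalanced
		}
	}
	return nil
}

func main() {
	balanced := []Entry{{"USD", -100}, {"USD", 100}}
	// Sums to zero overall, but not per asset, so it violates I1.
	mixed := []Entry{{"USD", -100}, {"EUR", 100}}
	fmt.Println(checkTransfer(balanced), checkTransfer(mixed))
}
```

The second case shows why the sum is taken per asset: a total of zero across different assets is not a balanced transfer.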

Tests

make test         # integration (real Postgres, with -race)
make prop         # gopter property tests, 200 sequences each
make prop-deep    # property tests, 50,000 sequences (slow)
make tla          # TLC on the idempotency spec (~13 s, requires Java 17)
make reconcile    # all six invariants

Property tests (test/property/property_test.go)

Three properties:

  • Conservation — sum of all balances per asset equals the world account's negative seed.
  • No deadlocks under contention — N goroutines, M shared accounts, zero deadlock errors observed.
  • Reversal restores pre-state — for every posted transfer, posting its reversal restores the snapshot taken before the original.
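
The reversal property can be illustrated without a database. The sketch below is a simplified in-memory model, not the gopter test in test/property and not the reverse_transfer SQL function. Leg, reverse, and apply are hypothetical names for this example.

```go
package main

import "fmt"

// Leg moves Amount minor units into (positive) or out of (negative) Account.
type Leg struct {
	Account string
	Amount  int64
}

// reverse builds the legs of a new transfer that undoes the given one
// by negating every amount.
func reverse(legs []Leg) []Leg {
	out := make([]Leg, len(legs))
	for i, l := range legs {
		out[i] = Leg{Account: l.Account, Amount: -l.Amount}
	}
	return out
}

// apply posts the legs against an in-memory balance map.
func apply(balances map[string]int64, legs []Leg) {
	for _, l := range legs {
		balances[l.Account] += l.Amount
	}
}

func main() {
	balances := map[string]int64{"alice": 500, "bob": 0}
	legs := []Leg{{"alice", -100}, {"bob", 100}}

	apply(balances, legs)          // original transfer
	apply(balances, reverse(legs)) // reversal posted as a new transfer

	fmt.Println(balances["alice"], balances["bob"]) // prints "500 0"
}
```

Because the reversal is a new transfer, the original entries stay in place, which is consistent with the append-only rule.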

Integration tests (internal/httpapi/server_test.go)

Seven real-Postgres tests including:

  • TestIdempotentReplayReturnsSameResponse — second call with the same key returns a byte-identical body.
  • TestInsufficientFundsReplayedReturnsSameError — failed transfers cache their 422 and replay it instead of re-running the business logic.
  • TestConcurrentRetriesProduceExactlyOneTransfer — N goroutines hit the same Idempotency-Key; exactly one HTTP 201, the rest are 200 replays or 409 in_flight, and the database has exactly one row in transfers.

TLA+ specification (specs/Idempotency.tla)

The 2-tx idempotency protocol is modelled abstractly: a key transitions through absent → in_flight → completed, four protocol actions (Begin, TryInsert, Lookup, Complete) are interleaved across multiple clients, and four invariants are checked:

| Invariant | Property |
| --- | --- |
| Inv_NoCrossBleed | A stored hash always belongs to the modeled set. |
| Inv_OnceCompletedSticks | Completed records never revert. |
| Inv_ConflictImpliesDiff | A conflict outcome implies stored hash ≠ offered hash. |
| Inv_OkImpliesCompleted | An ok outcome implies the store reached completed. |

TLC, with Clients={c1,c2,c3} Keys={k1} Hashes={h1,h2} MaxActions=2, exhaustively explores the reachable state graph:

7,832,011 states generated, 2,285,334 distinct states found, 0 left on queue.
The depth of the complete state graph search is 26.
Finished in 13s. No error has been found.

Full output is in specs/tlc_run.txt. Reproduce with make tla (requires Java 17, for example via brew install openjdk@17).
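
For readers who prefer code to TLA+, the absent → in_flight → completed lifecycle can be sketched as a small in-memory state machine in Go. This is an illustrative model only: the real store is Postgres-backed (internal/idempotency), and the Store, Begin, and Outcome names, along with the order in which conflict and in-flight are checked, are assumptions made for this example.

```go
package main

import (
	"fmt"
	"sync"
)

type Outcome int

const (
	Started       Outcome = iota // key was absent; caller must run the transfer
	Replay                       // key completed; return the cached response
	StillInFlight                // another request currently holds the key
	Conflict                     // same key, different request hash
)

type record struct {
	completed bool
	hash      string
	response  []byte
}

// Store is a toy in-memory stand-in for the idempotency_keys table.
type Store struct {
	mu   sync.Mutex
	keys map[string]*record
}

func NewStore() *Store { return &Store{keys: make(map[string]*record)} }

// Begin reserves the key if it is absent, otherwise reports what it found.
func (s *Store) Begin(key, hash string) (Outcome, []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	rec, ok := s.keys[key]
	switch {
	case !ok:
		s.keys[key] = &record{hash: hash}
		return Started, nil
	case rec.hash != hash:
		return Conflict, nil
	case !rec.completed:
		return StillInFlight, nil
	default:
		return Replay, rec.response
	}
}

// Complete moves an in-flight key to completed. A completed key is never
// overwritten, mirroring Inv_OnceCompletedSticks.
func (s *Store) Complete(key string, response []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if rec, ok := s.keys[key]; ok && !rec.completed {
		rec.completed = true
		rec.response = response
	}
}

func main() {
	s := NewStore()
	o1, _ := s.Begin("k1", "h1")
	o2, _ := s.Begin("k1", "h1")
	s.Complete("k1", []byte(`{"status":"posted"}`))
	o3, body := s.Begin("k1", "h1")
	fmt.Println(o1 == Started, o2 == StillInFlight, o3 == Replay, string(body))
}
```

In this model a Conflict is only ever returned when the stored hash differs from the offered one, which is the shape of Inv_ConflictImpliesDiff.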

Performance

See BENCHMARKS.md for methodology and full results. Single laptop (M2, 8 GB), Postgres 17 in Docker, synchronous_commit=on, 80-conn pool.

| Scenario | Achieved rps | p50 | p99 | p99.9 | Errors |
| --- | --- | --- | --- | --- | --- |
| Uniform 300 rps × 60 s | 299.5 | 2.5 ms | 37 ms | 158 ms | 0 |
| Zipfian s=1.2, 150 rps × 60 s | 149.8 | 3.3 ms | 20 ms | 79 ms | 0 |
| Storm 50 rps × 60 s, 1 hot account | 50.0 | 5.9 ms | 15 ms | 38 ms | 0 |
| Overload uniform 1000 rps × 30 s | 995.4 | 6.9 ms | 189 ms | 270 ms | 0 |
| Saturation, 2000 rps × 30 s | 1411.6 | 667 ms | 1.16 s | 1.19 s | 0 |

143,701 transfers, 100% success, all six reconciliation invariants pass after every run.

API

POST /transfers
Idempotency-Key: <client-supplied unique key>
{
  "asset_code": "USD",
  "legs": [
    { "account_id": "...", "amount": -100 },
    { "account_id": "...", "amount":  100 }
  ],
  "metadata": {"order_id": "abc"}
}

Response codes:

| Code | Meaning |
| --- | --- |
| 201 | Transfer posted. Body contains transfer_id and status. |
| 200 | Idempotent replay. Body is byte-identical to the original 201 body. Idempotent-Replay: true header set. |
| 409 | Idempotency-Key reused with a different request body, or a concurrent retry is still in flight. |
| 422 | Business rule violated (insufficient funds, asset mismatch, legs don't sum to zero). |
| 400 | Malformed request. |

GET /accounts/{id}/balance returns balance, asset, and version. GET /healthz returns ok if the pool is healthy. GET /metrics exposes Prometheus metrics.
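
A client call to POST /transfers might be built as in the Go sketch below. It only constructs the request and interprets status codes according to the table above; it does not send anything. The helper names newTransferRequest and classify are hypothetical, the Content-Type header is an assumption (the README does not state one), and the base URL matches the :8080 port used under Running.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// newTransferRequest builds a POST /transfers request carrying the
// client-supplied Idempotency-Key. Reuse the same key and the same body
// bytes when retrying.
func newTransferRequest(baseURL, key string, body []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, baseURL+"/transfers", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Idempotency-Key", key)
	req.Header.Set("Content-Type", "application/json") // assumed, not documented
	return req, nil
}

// classify maps a response status to the meaning documented in the table.
func classify(status int) string {
	switch status {
	case http.StatusCreated:
		return "posted"
	case http.StatusOK:
		return "replay"
	case http.StatusConflict:
		return "key conflict or in flight"
	case http.StatusUnprocessableEntity:
		return "business rule violated"
	case http.StatusBadRequest:
		return "malformed request"
	default:
		return "unexpected"
	}
}

func main() {
	body := []byte(`{"asset_code":"USD","legs":[{"account_id":"...","amount":-100},{"account_id":"...","amount":100}]}`)
	req, err := newTransferRequest("http://localhost:8080", "order-abc-1", body)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path, req.Header.Get("Idempotency-Key"))
	fmt.Println(classify(http.StatusCreated), "/", classify(http.StatusOK))
}
```

The account_id values are left elided as in the request example above; a real call needs actual account IDs.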

Layout

cmd/
  api/            HTTP server
  migrate/        applies migrations
  seed/           creates accounts + funds them via the world account
  loadgen/        fixed-rate, HDR-quantile load generator
  reconcile/      runs the six invariants, exits 1 on failure
  outboxrelay/    -mode poll | wal
internal/
  httpapi/        handlers, 2-tx idempotency, Prometheus metrics
  idempotency/    BeginOrLookup / Complete (Postgres-backed)
  ledger/         service layer, typed errors
  outbox/         poll.go + wal.go (logical replication)
migrations/
  001_core.sql         schema, invariants, append-only trigger, partitioned outbox
  002_transfer_fn.sql  post_transfer / reverse_transfer
specs/
  Idempotency.tla    TLA+ spec
  Idempotency.cfg    TLC config
  tlc_run.txt        cached run output
test/
  property/          gopter property tests
docs/
  outbox.md          LIST partition rationale, poll-vs-WAL tradeoffs
loadtest/
  results/           JSON HDR snapshots from each scenario

Running

Requires Docker (for Postgres), Go 1.23, and (for make tla) Java 17.

make up               # start Postgres on :5433
make migrate          # apply schema
make seed             # 10,000 accounts + world account
make api &            # HTTP API on :8080

make test             # unit + integration (with -race)
make prop             # property tests, 200 sequences each
make tla              # TLC on the idempotency spec (~13 s)

make reconcile        # all six invariants (run any time)

make load-uniform     # 300 rps × 60 s, uniform random
make load-zipfian     # 150 rps × 60 s, Zipfian s=1.2
make load-storm       # 50 rps × 60 s, single hot account
make load-overload    # 2000 rps × 30 s, intentional saturation

make ci runs the subset that GitHub Actions would run.

Non-goals

  • Not exactly-once. The outbox is at-least-once with durable ordering by id; downstream consumers dedupe by event id. Nothing that delivers messages to a remote system over an unreliable network can be exactly-once.
  • Not multi-region. Single-primary Postgres. A regional partition stalls the relay.
  • No floats for money. BIGINT minor units, full stop.
  • No ORM on the money path. pgx/v5 directly; the PL/pgSQL function is the abstraction layer.
  • No Redis as the source of truth for idempotency. Postgres is. Redis is fine as a cache layer in front, but the contract holds without it.
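
Since delivery is at-least-once, a downstream consumer has to tolerate duplicates. A minimal sketch of deduplication by event id, in Go, could look like the following. The Event and Consumer types are hypothetical, the int64 id type is an assumption, and a real consumer would persist the seen set durably, ideally in the same transaction as its side effect, rather than keep it in memory.

```go
package main

import "fmt"

// Event is a message relayed from the outbox. Field names are illustrative.
type Event struct {
	ID      int64
	Payload string
}

// Consumer applies each event at most once by remembering the ids it has seen.
type Consumer struct {
	seen    map[int64]struct{}
	Applied []string
}

func NewConsumer() *Consumer {
	return &Consumer{seen: make(map[int64]struct{})}
}

// Handle applies the event and returns true, or returns false if the
// event id was already processed (a redelivery).
func (c *Consumer) Handle(e Event) bool {
	if _, dup := c.seen[e.ID]; dup {
		return false
	}
	c.seen[e.ID] = struct{}{}
	c.Applied = append(c.Applied, e.Payload)
	return true
}

func main() {
	c := NewConsumer()
	deliveries := []Event{
		{1, "transfer.posted A"},
		{2, "transfer.posted B"},
		{1, "transfer.posted A"}, // redelivered after a relay restart
	}
	for _, e := range deliveries {
		c.Handle(e)
	}
	fmt.Println(len(c.Applied)) // prints 2
}
```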

License

MIT. See LICENSE.
