Skip to content

michaelmillar/pg-blast-radius

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

pg-blast-radius logo

pg-blast-radius

Forecast what your PostgreSQL migration will block, for how long, on your actual workload.

What it does · Quick start · How it compares · CI


demo.mp4

pg-blast-radius reads your migration SQL, connects read-only to your database, and reports:

  • Blocked query families from pg_stat_statements, showing which production queries queue up and how many
  • Duration forecasts as fast / slow / worst-case ranges, not single-point estimates
  • Rollout recipes with step-by-step safer SQL when a migration should be split
  • Confidence ledger stating what is known from docs, observed from catalog, inferred from workload, and assumed

Other tools tell you "this takes ACCESS EXCLUSIVE." This one tells you "this will queue 14,200 queries/min on your busiest table for an estimated 6..12 minutes."

What it does

$ pg-blast-radius analyse migration.sql --stats-file prod-stats.json

  orders (34.0 GB, ~892M rows)
    Lock: ACCESS EXCLUSIVE (blocks all reads and writes)
    Duration: 16m (fast)  37m (slow)  37m (worst)
    Blocked queries: 3 families, 14202 calls/min combined
      SELECT ... FROM orders WHERE customer_id = $1   8100/min  ~129255 queued (fast)
      INSERT INTO orders (...)                        4800/min  ~76596 queued (fast)
      UPDATE orders SET status = $1 WHERE id = $2     1302/min  ~20777 queued (fast)
    Confidence: query load MEASURED, lock hold INFERRED
    3 statements combined

  EXTREME  ALTER COLUMN TYPE on "orders"."total" to numeric triggers table rewrite
    Estimated: 6m (fast)  12m (slow)  12m (worst)
    Rollout recipe: Expand/migrate/contract for "orders"."total"
      1. [expand]    ADD COLUMN "total_new" numeric
      2. [backfill]  UPDATE in batches
      3. [validate]  CREATE TRIGGER to sync during migration
      4. [switch]    Application reads from new column
      5. [contract]  DROP old column, trigger, rename

  EXTREME  CREATE INDEX "idx_orders_status" without CONCURRENTLY
    Estimated: 7m (fast)  19m (slow)  19m (worst)
    Rollout recipe: CREATE INDEX CONCURRENTLY

  EXTREME  ADD FOREIGN KEY scans 34 GB table
    Estimated: 3m (fast)  6m (slow)  6m (worst)
    Rollout recipe: ADD ... NOT VALID + VALIDATE CONSTRAINT

  Overall: EXTREME RISK | Confidence: ESTIMATED
  3 statements, 3 safer alternatives suggested.

Without a database connection, it still analyses lock modes, rewrite risk, and generates recipes. With one, it tells you exactly what will hurt.

Quick start

cargo install pg-blast-radius

pg-blast-radius analyse migration.sql

pg-blast-radius analyse migration.sql --dsn postgres://readonly@prod-replica:5432/mydb

The --dsn connection is read-only. It queries pg_stat_user_tables for sizes, pg_stat_statements for query workload, and pg_stat_activity for transaction baseline. No writes, no superuser.

Offline stats

Export once, use in CI without database access:

pg-blast-radius collect-stats --dsn postgres://readonly@prod-replica/mydb > prod-stats.json

pg-blast-radius analyse migration.sql --stats-file prod-stats.json

Analysis modes

Capability Static (no DB) With catalog With workload
Lock mode prediction Yes Yes Yes
Table rewrite detection Yes Yes Yes
Risk level Conservative Size-aware Size-aware
Duration forecast No fast/slow/worst fast/slow/worst + lock delay
Blocked query families No No Yes
Queue depth estimates No No Yes
Rollout recipes Yes Yes Yes
Confidence STATIC ESTIMATED MEASURED

Rules

Operation What it detects
ADD COLUMN Default volatility, NOT NULL, PG version-dependent rewrite
ALTER COLUMN TYPE Binary format change, rewrite detection
SET NOT NULL Full scan risk, PG 12+ safe path
ADD CONSTRAINT CHECK/FK/UNIQUE/PK NOT VALID detection, lock analysis
VALIDATE CONSTRAINT Non-blocking lock confirmation
CREATE INDEX CONCURRENTLY detection, SHARE lock warning
DROP INDEX CONCURRENTLY detection
DROP COLUMN Lock warning, application breakage risk
DROP CONSTRAINT Lock warning, FK dual-table locking
RENAME COLUMN/TABLE Lock warning, application breakage risk
ATTACH PARTITION Scan risk, pre-validated CHECK optimisation
TRUNCATE Destructive operation, ACCESS EXCLUSIVE warning
VACUUM / VACUUM FULL Non-blocking vs ACCESS EXCLUSIVE rewrite
ANALYZE Non-blocking statistics collection
REINDEX CONCURRENTLY detection, lock warning
REFRESH MATERIALIZED VIEW CONCURRENTLY detection, lock warning

How it compares

Most migration tools lint syntax or execute changes safely. pg-blast-radius does neither. It forecasts operational impact from your actual production workload.

pg-blast-radius squawk Eugene pgfence Atlas lint pg-schema-diff
Rules 28 31 12+ 15+ 20+ N/A (generator)
Lock mode detection Full parser Syntax rules Parser + trace Parser Schema-aware Lock-minimised SQL
Workload-aware Yes (pg_stat_statements) No No Table size only No No
Duration forecast fast/slow/worst No No No No No
Blocked query families Yes No No No No No
Queue depth estimates Yes No No No No No
Safe rewrite recipes Yes Partial No Yes Yes Auto-generated
Confidence ledger Yes No No No No No

Strengths: Tells you which queries queue up, how many per minute, and for how long, on your actual workload. Explicit about what it knows vs what it assumes.

Weaknesses: Fewer lint rules than squawk. No trace/replay mode (yet). Requires pg_stat_statements for full workload analysis.

Use squawk in pre-commit for fast linting. Use pg-blast-radius in CI for operational risk assessment.

CLI

pg-blast-radius analyse <files...>
pg-blast-radius collect-stats --dsn <dsn>
Flag Default Purpose
--pg-version 16 PostgreSQL version to assume
--format terminal terminal or json
--fail-level high Exit non-zero if any finding meets this level
--dsn none Database connection (read-only) for catalog + workload
--stats-file none Pre-collected stats JSON (alternative to --dsn)
--io-throughput none IO throughput range in MB/s (e.g. 200:800)

Exit codes

Code Meaning
0 All findings below --fail-level
1 At least one finding meets --fail-level
2 Parse error or invalid input

CI integration

GitHub Actions

- name: Check migration risk
  run: |
    pg-blast-radius analyse migrations/*.sql \
      --stats-file .pg-stats.json \
      --format json \
      --fail-level high

JSON output

pg-blast-radius analyse migration.sql --format json --stats-file stats.json

Returns a JSON array with findings, blast_radius (including blocked_queries), overall_risk, and overall_confidence.

Building from source

git clone https://github.com/michaelmillar/pg-blast-radius.git
cd pg-blast-radius
cargo build --release

Requires Rust stable and a C compiler (for libpg_query).

Status

52 tests passing. Production-ready for static and catalog-aware analysis. Workload-aware forecasting validated against real Postgres 16 workloads.

Not yet implemented: trace/replay mode, custom rules.

Licence

MIT

About

Workload-aware blast radius forecaster for PostgreSQL migrations. Blocked query forecasting, duration ranges, rollout recipes, confidence accounting.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors