diff --git a/docs/runbook/postgres-backup-and-recovery.md b/docs/runbook/postgres-backup-and-recovery.md new file mode 100644 index 0000000..d7ea120 --- /dev/null +++ b/docs/runbook/postgres-backup-and-recovery.md @@ -0,0 +1,243 @@ +# Runbook: PostgreSQL Backup and Recovery + +## Purpose +Install PostgreSQL 16 on the Havenhold app host, configure daily automated backups with 7-day local retention, and provide a tested restore procedure. This runbook covers two phases: + +- **D0 phase (H-007):** PostgreSQL setup and cron configuration. Completed before app code is ever deployed. Steps A and C only. +- **Pre-launch phase (H-010):** First deployment of app code, schema migration, seed, and restore test. Completed during the H-010 initial deploy cycle, before the app process is started for the first time. Steps B, D, and E. + +The restore test (Step E) can only be run after `prisma migrate deploy` has created the schema. Do not attempt it at D0 time on an empty database. + +## Security Handling +- This runbook is git-tracked and must stay sanitized. +- Do not commit `DB_PASSWORD`, connection strings, `.pgpass` contents, or operator-only CIDRs. +- Store `DB_PASSWORD` in approved team secret storage (e.g. 1Password, AWS Secrets Manager). +- Backup files contain full database contents — `/var/backups/havenhold/` is restricted to `postgres:postgres` (mode 750). Do not scp backups over unencrypted channels. +- `.pgpass` files are mode 600; never commit them. + +## Prerequisites +- Host hardening baseline complete (H-005) — admin user, UFW, and fail2ban active. +- Nginx/TLS setup complete (H-006) — HTTPS serving the app. +- Lightsail firewall has no inbound rule for port 5432 (PostgreSQL must not be externally reachable). +- Repo checked out locally with `infra/postgres-setup.sh`, `infra/postgres-backup.sh`, and `infra/postgres-restore.sh` present. +- A strong, randomly generated `DB_PASSWORD` stored in team secret storage. + +## Deploy Steps + +> **Phase label key** +> - `[D0]` — run during H-007, before any app code is deployed +> - `[H-010]` — run during H-010 initial deploy, with app code present but app process not yet started + +### Step A `[D0]`: Install and configure PostgreSQL + +Copy and run the setup script on the host: + +```bash +KEY_PATH=/path/to/key.pem +STATIC_IP=203.0.113.10 +ADMIN_USER=adminuser + +scp -i "$KEY_PATH" infra/postgres-setup.sh "$ADMIN_USER"@"$STATIC_IP":/tmp/postgres-setup.sh +ssh -i "$KEY_PATH" "$ADMIN_USER"@"$STATIC_IP" \ + "DB_PASSWORD='' sudo -E bash /tmp/postgres-setup.sh" +``` + +Replace `` with the password from team secret storage. The script is idempotent — safe to re-run after fixing any failed step. + +### Step B `[H-010]`: Deploy app code, set DATABASE_URL, run migrations + +This step is performed during the H-010 initial deploy cycle. App code must be on the server before running these commands. The app process is **not started** yet. + +```bash +# On host — edit the app env file with the DB connection string +ssh -i "$KEY_PATH" "$ADMIN_USER"@"$STATIC_IP" +sudo nano /opt/havenhold/server/.env +# Set: DATABASE_URL="postgresql://havenhold:@127.0.0.1:5432/havenhold" +``` + +Run Prisma migrations to create the schema (no app process required): + +```bash +ssh -i "$KEY_PATH" "$ADMIN_USER"@"$STATIC_IP" \ + "cd /opt/havenhold/server && npx prisma migrate deploy" +``` + +Seed demo data so the backup/restore test has rows to validate: + +```bash +ssh -i "$KEY_PATH" "$ADMIN_USER"@"$STATIC_IP" \ + "cd /opt/havenhold/server && npx prisma db seed" +``` + +### Step C `[D0]`: Install daily backup cron job + +```bash +scp -i "$KEY_PATH" infra/postgres-backup.sh "$ADMIN_USER"@"$STATIC_IP":/tmp/postgres-backup.sh +ssh -i "$KEY_PATH" "$ADMIN_USER"@"$STATIC_IP" \ + "INSTALL_CRON=true sudo -E bash /tmp/postgres-backup.sh" +``` + +Default schedule: **02:30 UTC daily**. Override with `CRON_HOUR` and `CRON_MINUTE` env vars. + +### Step D `[H-010]`: Run first manual backup + +Run after Step B (migrations + seed) so the backup captures a schema with real rows: + +```bash +ssh -i "$KEY_PATH" "$ADMIN_USER"@"$STATIC_IP" \ + "sudo -u postgres /usr/local/bin/postgres-backup.sh" +``` + +Expected: `Backup written: /var/backups/havenhold/havenhold_.sql` + +### Step E `[H-010]`: Test restore (required AC — restore tested once) + +The app process has not been started yet at this point, so there are no active connections to terminate. Run restore directly: + +```bash +scp -i "$KEY_PATH" infra/postgres-restore.sh "$ADMIN_USER"@"$STATIC_IP":/tmp/postgres-restore.sh +ssh -i "$KEY_PATH" "$ADMIN_USER"@"$STATIC_IP" \ + "sudo bash /tmp/postgres-restore.sh" +``` + +The script auto-selects the most recent backup, prompts for confirmation, drops and recreates the database, restores, and prints row counts. Fill in the [Restore Tested Once](#restore-tested-once-h-010-pre-launch-confirmation) table below. + +After the restore test passes, the app can be started for the first time: + +```bash +ssh -i "$KEY_PATH" "$ADMIN_USER"@"$STATIC_IP" \ + "sudo systemctl start havenhold-app" +``` + +## D0 Verification Checklist (H-007 — infra only) + +Complete these before any app code is deployed: + +- [ ] PostgreSQL 16 installed and active (`systemctl status postgresql`). +- [ ] Database `havenhold` exists and app user `havenhold` can connect from `127.0.0.1`. +- [ ] PostgreSQL binds to localhost only — no external port 5432 exposure. +- [ ] Lightsail firewall has no inbound 5432 rule; UFW has no 5432 rule. +- [ ] `/etc/cron.d/havenhold-backup` installed with `CRON_TZ=UTC` and `postgres` as the run user. + +## Pre-Launch Checklist (H-010 — complete before starting the app process) + +Complete these during the H-010 initial deploy, after app code is on the server but before the app process is started: + +- [ ] `DATABASE_URL` set in `/opt/havenhold/server/.env`. +- [ ] `npx prisma migrate deploy` completed without errors. +- [ ] `npx prisma db seed` completed — demo patient/user rows exist. +- [ ] First manual backup written to `/var/backups/havenhold/` with non-zero size. +- [ ] `postgres-restore.sh` run against the most recent backup; row-count validation passed. +- [ ] [Restore Tested Once](#restore-tested-once-h-010-pre-launch-confirmation) table filled in and committed. + +## Evidence Commands + +Run on the host unless noted: + +```bash +# PostgreSQL service health +systemctl status postgresql --no-pager + +# Database and user exist +sudo -u postgres psql -c "\l" | grep havenhold +sudo -u postgres psql -c "\du" | grep havenhold + +# Localhost-only binding (expect 127.0.0.1:5432 only, no 0.0.0.0) +sudo ss -tulpen | grep 5432 + +# UFW has no 5432 rule (no output = correct) +sudo ufw status verbose | grep 5432 || echo "No 5432 rule — correct" + +# Lightsail firewall (run locally, not on host) +aws lightsail get-instance-port-states --instance-name havenhold-app-01 \ + | grep -i fromport | grep 5432 || echo "No 5432 Lightsail rule — correct" + +# Backup directory and files +ls -lh /var/backups/havenhold/ + +# Cron job +sudo cat /etc/cron.d/havenhold-backup + +# Backup log (after first run) +sudo tail -30 /var/log/havenhold-backup.log + +# Quick pg_dump smoke test (dumps to /dev/null — non-destructive) +sudo -u postgres pg_dump --no-password havenhold > /dev/null && echo "pg_dump OK" +``` + +## Restore Tested Once (H-010 Pre-Launch Confirmation) + +Fill in this table during the H-010 pre-launch restore test (Step E) and commit the change before starting the app process for the first time. + +| Field | Value | +|---|---| +| Date of test | _________________ | +| Operator | _________________ | +| Backup file used | _________________ | +| Backup file size | _________________ | +| Restore duration (approx) | _________________ | +| `Patient` row count post-restore | _________________ | +| `User` row count post-restore | _________________ | +| `Medication` row count post-restore | _________________ | +| Script exit code | _________________ | +| Pass / Fail | _________________ | + +## Operations: Periodic Backup Verification + +### Check last backup ran successfully + +```bash +sudo tail -20 /var/log/havenhold-backup.log +ls -lht /var/backups/havenhold/ | head -5 +``` + +### Check cron job is active + +```bash +sudo cat /etc/cron.d/havenhold-backup +systemctl status cron --no-pager +``` + +### Force a manual backup + +```bash +sudo -u postgres DB_NAME=havenhold BACKUP_DIR=/var/backups/havenhold \ + /usr/local/bin/postgres-backup.sh +``` + +### List all backups with sizes + +```bash +ls -lh /var/backups/havenhold/ +``` + +### Check what would be pruned (backups older than 7 days) + +```bash +find /var/backups/havenhold/ -name "havenhold_*.sql" -mtime +7 -ls +``` + +### Restore from a specific backup file + +```bash +BACKUP_FILE=/var/backups/havenhold/havenhold_20260528T023001Z.sql \ + sudo -E bash /tmp/postgres-restore.sh +``` + +## Rollback / Recovery + +- **`postgres-setup.sh` failed mid-way:** Re-run — the script is idempotent. +- **pg_dump fails in cron:** Check `/var/log/havenhold-backup.log`. Common causes: disk full (`df -h`), PostgreSQL service stopped (`systemctl status postgresql`). Resolve and run manually to verify. +- **`postgres-restore.sh` fails at DROP — active connections persist:** Stop the application first (`sudo systemctl stop havenhold-app`), then retry. +- **Restore SQL errors (`ON_ERROR_STOP` exits):** The backup may be corrupt or from an incompatible schema version. Try the previous day's backup from `/var/backups/havenhold/`. +- **`pg_hba.conf` conflict on a partially provisioned host:** The validation step at the end of `postgres-setup.sh` catches this. Inspect the file manually (`sudo cat /etc/postgresql/16/main/pg_hba.conf`) and remove duplicate or conflicting entries before re-running. +- **All local backups lost (disk failure):** No S3 offload exists at D0. Restore the Lightsail instance snapshot from the AWS console, then re-run `postgres-setup.sh` with the original `DB_PASSWORD`. Pending migrations can be replayed with `npx prisma migrate deploy`. +- **Disk pressure from backups:** Emergency prune: `find /var/backups/havenhold/ -name "*.sql" -mtime +2 -delete` (keeps last 2 days). Adjust `RETENTION_DAYS` in `/etc/cron.d/havenhold-backup` for the long term. + +## Notes + +- PostgreSQL is intentionally not exposed outside localhost. There is no pgAdmin or external connection. All DBA work is done via `sudo -u postgres psql` on the host. +- Backup files are plain SQL (`pg_dump -Fp`). They can be inspected with `less` or `grep`. To upgrade to compressed custom format (`-Fc`) in a future ticket, restore uses `pg_restore -d havenhold file.dump` instead of `psql -f file.sql`. +- S3 backup offload is deferred to a follow-up ticket. Until then, Lightsail instance snapshots are the only off-host recovery path. +- At typical Havenhold data volumes (< 10 MB/day), 7 days of backups costs under 100 MB of disk. Adjust via `RETENTION_DAYS` in `/etc/cron.d/havenhold-backup`. +- After any restore, run `npx prisma migrate deploy` to verify the schema migration state is current. diff --git a/infra/postgres-backup.sh b/infra/postgres-backup.sh new file mode 100755 index 0000000..f78f710 --- /dev/null +++ b/infra/postgres-backup.sh @@ -0,0 +1,123 @@ +#!/usr/bin/env bash +# infra/postgres-backup.sh — pg_dump daily backup for Havenhold +# +# BACKUP MODE (default — run as postgres OS user or via sudo -u postgres): +# sudo -u postgres DB_NAME=havenhold BACKUP_DIR=/var/backups/havenhold \ +# /usr/local/bin/postgres-backup.sh +# +# CRON INSTALL MODE (run once as root to schedule daily backups): +# Operator invocation (from repo root): +# KEY_PATH=/path/to/key.pem +# STATIC_IP=203.0.113.10 +# ADMIN_USER=adminuser +# scp -i "$KEY_PATH" infra/postgres-backup.sh "$ADMIN_USER"@"$STATIC_IP":/tmp/postgres-backup.sh +# ssh -i "$KEY_PATH" "$ADMIN_USER"@"$STATIC_IP" \ +# "INSTALL_CRON=true sudo -E bash /tmp/postgres-backup.sh" + +set -euo pipefail + +DB_NAME="${DB_NAME:-havenhold}" +BACKUP_DIR="${BACKUP_DIR:-/var/backups/havenhold}" +RETENTION_DAYS="${RETENTION_DAYS:-7}" +INSTALL_CRON="${INSTALL_CRON:-false}" +CRON_HOUR="${CRON_HOUR:-2}" +CRON_MINUTE="${CRON_MINUTE:-30}" + +# Validate inputs before any value is written into the cron file. +# A newline in DB_NAME or BACKUP_DIR would inject additional cron lines +# (executed by the cron daemon as separate commands, potentially as root). +[[ "${DB_NAME}" =~ ^[a-zA-Z0-9_]+$ ]] \ + || { echo "ERROR: DB_NAME '${DB_NAME}' contains invalid characters"; exit 1; } +[[ "${BACKUP_DIR}" =~ ^[a-zA-Z0-9/_-]+$ ]] \ + || { echo "ERROR: BACKUP_DIR '${BACKUP_DIR}' contains invalid characters"; exit 1; } +[[ "${RETENTION_DAYS}" =~ ^[0-9]+$ ]] \ + || { echo "ERROR: RETENTION_DAYS '${RETENTION_DAYS}' must be a number"; exit 1; } +[[ "${CRON_HOUR}" =~ ^([0-9]|1[0-9]|2[0-3])$ ]] \ + || { echo "ERROR: CRON_HOUR '${CRON_HOUR}' must be 0-23"; exit 1; } +[[ "${CRON_MINUTE}" =~ ^([0-9]|[1-5][0-9])$ ]] \ + || { echo "ERROR: CRON_MINUTE '${CRON_MINUTE}' must be 0-59"; exit 1; } + +log() { echo "[postgres-backup] $(date -u +'%H:%M:%S') $*"; } + +# --------------------------------------------------------------------------- +# CRON INSTALL MODE +# --------------------------------------------------------------------------- +if [[ "${INSTALL_CRON}" == "true" ]]; then + if [[ "$(id -u)" -ne 0 ]]; then + exec sudo -E bash "$0" "$@" + fi + + SCRIPT_DEST="/usr/local/bin/postgres-backup.sh" + log "Installing backup script to ${SCRIPT_DEST}" + cp "$(realpath "$0")" "${SCRIPT_DEST}" + chmod 750 "${SCRIPT_DEST}" + chown root:postgres "${SCRIPT_DEST}" + + CRON_FILE="/etc/cron.d/havenhold-backup" + # Build the desired content first so we can compare and only write on change. + # Path-presence check alone is not enough — CRON_HOUR, CRON_MINUTE, RETENTION_DAYS + # etc. may have changed since the last install. + NEW_CRON_CONTENT="$(cat <> /var/log/havenhold-backup.log 2>&1 +EOF + )" + + if [[ -f "${CRON_FILE}" ]] && [[ "$(cat "${CRON_FILE}")" == "${NEW_CRON_CONTENT}" ]]; then + log "Cron job already up to date — skipping" + else + printf '%s\n' "${NEW_CRON_CONTENT}" > "${CRON_FILE}" + chmod 644 "${CRON_FILE}" + log "Cron job written: ${CRON_FILE} (schedule: ${CRON_HOUR}:$(printf '%02d' "${CRON_MINUTE}") UTC daily)" + fi + + log "Cron install complete." + log "Test with: sudo -u postgres ${SCRIPT_DEST}" + log "Logs at: sudo tail -f /var/log/havenhold-backup.log" + exit 0 +fi + +# --------------------------------------------------------------------------- +# BACKUP MODE +# --------------------------------------------------------------------------- +log "Step 1/4: Verify backup directory" +if [[ ! -d "${BACKUP_DIR}" ]]; then + log "ERROR: ${BACKUP_DIR} does not exist. Run postgres-setup.sh first." >&2 + exit 1 +fi + +log "Step 2/4: Running pg_dump for '${DB_NAME}'" +TIMESTAMP="$(date -u +'%Y%m%dT%H%M%SZ')" +BACKUP_FILE="${BACKUP_DIR}/${DB_NAME}_${TIMESTAMP}.sql" + +# Peer auth: script runs as postgres OS user — no password or PGPASSWORD needed. +# --no-password ensures we fail fast rather than hang if auth is misconfigured. +pg_dump \ + --format=plain \ + --no-password \ + --file="${BACKUP_FILE}" \ + "${DB_NAME}" + +log "Step 3/4: Verify backup is non-empty" +if [[ ! -s "${BACKUP_FILE}" ]]; then + log "ERROR: Backup file is empty — removing and aborting." >&2 + rm -f "${BACKUP_FILE}" + exit 1 +fi +BACKUP_SIZE="$(du -sh "${BACKUP_FILE}" | cut -f1)" +log "Backup written: ${BACKUP_FILE} (${BACKUP_SIZE})" + +log "Step 4/4: Pruning backups older than ${RETENTION_DAYS} days" +PRUNED=0 +while IFS= read -r -d '' OLD_FILE; do + rm -f "${OLD_FILE}" + log "Pruned: ${OLD_FILE}" + PRUNED=$((PRUNED + 1)) +done < <(find "${BACKUP_DIR}" -maxdepth 1 -name "${DB_NAME}_*.sql" \ + -mtime "+${RETENTION_DAYS}" -print0) +log "Pruned ${PRUNED} old backup(s)" + +log "Backup complete: ${BACKUP_FILE}" diff --git a/infra/postgres-restore.sh b/infra/postgres-restore.sh new file mode 100755 index 0000000..dcd16a7 --- /dev/null +++ b/infra/postgres-restore.sh @@ -0,0 +1,143 @@ +#!/usr/bin/env bash +# infra/postgres-restore.sh — restore Havenhold database from a pg_dump backup +# +# Operator invocation (from repo root): +# KEY_PATH=/path/to/key.pem +# STATIC_IP=203.0.113.10 +# ADMIN_USER=adminuser +# scp -i "$KEY_PATH" infra/postgres-restore.sh "$ADMIN_USER"@"$STATIC_IP":/tmp/postgres-restore.sh +# +# # Restore from most recent backup (auto-select): +# ssh -i "$KEY_PATH" "$ADMIN_USER"@"$STATIC_IP" \ +# "sudo bash /tmp/postgres-restore.sh" +# +# # Restore from a specific backup file: +# ssh -i "$KEY_PATH" "$ADMIN_USER"@"$STATIC_IP" \ +# "BACKUP_FILE=/var/backups/havenhold/havenhold_20260528T023001Z.sql \ +# sudo -E bash /tmp/postgres-restore.sh" +# +# WARNING: This script DROPS and RECREATES the database. All current data is replaced. +# Stop the application before running: sudo systemctl stop havenhold-app + +set -euo pipefail + +DB_NAME="${DB_NAME:-havenhold}" +DB_USER="${DB_USER:-havenhold}" +BACKUP_DIR="${BACKUP_DIR:-/var/backups/havenhold}" +BACKUP_FILE="${BACKUP_FILE:-}" +SKIP_CONFIRM="${SKIP_CONFIRM:-false}" + +# Validate identifiers: values with -- prefixes would be interpreted as flags by +# dropdb/createdb; newlines would break psql variable passing or config writes. +[[ "${DB_NAME}" =~ ^[a-zA-Z0-9_]+$ ]] \ + || { echo "ERROR: DB_NAME '${DB_NAME}' contains invalid characters"; exit 1; } +[[ "${DB_USER}" =~ ^[a-zA-Z0-9_]+$ ]] \ + || { echo "ERROR: DB_USER '${DB_USER}' contains invalid characters"; exit 1; } + +if [[ "$(id -u)" -ne 0 ]]; then + exec sudo -E bash "$0" "$@" +fi + +log() { echo "[postgres-restore] $(date -u +'%H:%M:%S') $*"; } + +# --------------------------------------------------------------------------- +# Step 1/6: Locate backup file +# --------------------------------------------------------------------------- +log "Step 1/6: Locate backup file" +if [[ -n "${BACKUP_FILE}" ]]; then + if [[ ! -f "${BACKUP_FILE}" ]]; then + log "ERROR: BACKUP_FILE='${BACKUP_FILE}' not found." >&2 + exit 1 + fi + TARGET="${BACKUP_FILE}" + log "Using specified backup: ${TARGET}" +else + TARGET="$(find "${BACKUP_DIR}" -maxdepth 1 -name "${DB_NAME}_*.sql" \ + -printf '%T@ %p\n' 2>/dev/null \ + | sort -n | tail -1 | cut -d' ' -f2-)" + if [[ -z "${TARGET}" ]]; then + log "ERROR: No backups found in ${BACKUP_DIR} matching ${DB_NAME}_*.sql" >&2 + exit 1 + fi + log "Auto-selected most recent backup: ${TARGET}" +fi + +# --------------------------------------------------------------------------- +# Step 2/6: Verify backup is non-empty +# --------------------------------------------------------------------------- +log "Step 2/6: Verify backup file" +if [[ ! -s "${TARGET}" ]]; then + log "ERROR: Backup file is empty or unreadable: ${TARGET}" >&2 + exit 1 +fi +TARGET_SIZE="$(du -sh "${TARGET}" | cut -f1)" +log "Backup: ${TARGET} (${TARGET_SIZE})" + +# --------------------------------------------------------------------------- +# Step 3/6: Confirmation prompt +# --------------------------------------------------------------------------- +log "Step 3/6: Confirm restore" +if [[ "${SKIP_CONFIRM}" != "true" ]]; then + echo "" + echo " WARNING: This will DROP and RECREATE the '${DB_NAME}' database." + echo " All current data will be permanently replaced with the backup." + echo " Backup file: ${TARGET}" + echo "" + read -r -p " Type 'yes' to confirm restore: " CONFIRM + if [[ "${CONFIRM}" != "yes" ]]; then + log "Restore aborted by operator." + exit 0 + fi +fi + +# --------------------------------------------------------------------------- +# Step 4/6: Terminate connections, drop and recreate database +# --------------------------------------------------------------------------- +log "Step 4/6: Terminate connections, drop and recreate '${DB_NAME}'" +sudo -u postgres psql -v ON_ERROR_STOP=1 \ + -v "db_name=${DB_NAME}" \ + -d postgres \ + <<'ENDSQL' +SELECT pg_terminate_backend(pid) +FROM pg_stat_activity +WHERE datname = :'db_name' AND pid <> pg_backend_pid(); +ENDSQL + +sudo -u postgres dropdb --if-exists -- "${DB_NAME}" +sudo -u postgres createdb -O "${DB_USER}" -- "${DB_NAME}" +log "Database '${DB_NAME}' recreated with owner '${DB_USER}'" + +# --------------------------------------------------------------------------- +# Step 5/6: Restore from backup +# --------------------------------------------------------------------------- +log "Step 5/6: Restoring from ${TARGET}" +sudo -u postgres psql \ + --set ON_ERROR_STOP=on \ + -d "${DB_NAME}" \ + -f "${TARGET}" +log "Restore complete" + +# --------------------------------------------------------------------------- +# Step 6/6: Post-restore row-count validation +# --------------------------------------------------------------------------- +log "Step 6/6: Post-restore validation" +sudo -u postgres psql -d "${DB_NAME}" -t -A <<'SQL' +SELECT 'Patient rows: ' || COUNT(*) FROM "Patient"; +SELECT 'User rows: ' || COUNT(*) FROM "User"; +SELECT 'Medication rows: ' || COUNT(*) FROM "Medication"; +SQL + +PATIENT_COUNT="$(sudo -u postgres psql -d "${DB_NAME}" -t -A \ + -c 'SELECT COUNT(*) FROM "Patient";' | tr -d '[:space:]')" + +if [[ "${PATIENT_COUNT}" -lt 1 ]]; then + log "ERROR: Post-restore validation failed — Patient table is empty." >&2 + exit 1 +fi + +log "Validation passed: ${PATIENT_COUNT} patient row(s) found." +log "Restore successful from: ${TARGET}" +log "" +log "Next steps:" +log " 1. Restart the application: sudo systemctl start havenhold-app" +log " 2. Verify migration state: npx prisma migrate deploy" diff --git a/infra/postgres-setup.sh b/infra/postgres-setup.sh new file mode 100755 index 0000000..b3138ce --- /dev/null +++ b/infra/postgres-setup.sh @@ -0,0 +1,159 @@ +#!/usr/bin/env bash +# infra/postgres-setup.sh — idempotent PostgreSQL 16 setup for Havenhold +# Operator invocation (from repo root): +# KEY_PATH=/path/to/key.pem +# STATIC_IP=203.0.113.10 +# ADMIN_USER=adminuser +# scp -i "$KEY_PATH" infra/postgres-setup.sh "$ADMIN_USER"@"$STATIC_IP":/tmp/postgres-setup.sh +# ssh -i "$KEY_PATH" "$ADMIN_USER"@"$STATIC_IP" \ +# "DB_PASSWORD='' sudo -E bash /tmp/postgres-setup.sh" + +set -euo pipefail + +DB_PASSWORD="${DB_PASSWORD:?DB_PASSWORD env var required}" +DB_NAME="${DB_NAME:-havenhold}" +DB_USER="${DB_USER:-havenhold}" +BACKUP_DIR="${BACKUP_DIR:-/var/backups/havenhold}" +PG_VERSION="${PG_VERSION:-16}" + +# Validate identifiers before any value is written into config files. +# Newlines or shell metacharacters in DB_NAME/DB_USER would inject lines into +# pg_hba.conf; reject anything outside safe PostgreSQL identifier characters. +[[ "${DB_NAME}" =~ ^[a-zA-Z0-9_]+$ ]] \ + || { echo "ERROR: DB_NAME '${DB_NAME}' contains invalid characters"; exit 1; } +[[ "${DB_USER}" =~ ^[a-zA-Z0-9_]+$ ]] \ + || { echo "ERROR: DB_USER '${DB_USER}' contains invalid characters"; exit 1; } + +if [[ "$(id -u)" -ne 0 ]]; then + exec sudo -E bash "$0" "$@" +fi + +log() { echo "[postgres-setup] $(date -u +'%H:%M:%S') $*"; } + +# --------------------------------------------------------------------------- +# Step 1/7: Install PostgreSQL +# --------------------------------------------------------------------------- +log "Step 1/7: Install postgresql-${PG_VERSION}" +if dpkg -s "postgresql-${PG_VERSION}" &>/dev/null; then + log "postgresql-${PG_VERSION} already installed — skipping" +else + # Add pgdg repo if postgresql-16 is not available in default sources + if ! apt-cache show "postgresql-${PG_VERSION}" &>/dev/null; then + log "postgresql-${PG_VERSION} not in default sources — adding pgdg repo" + apt-get install -y gnupg curl lsb-release + curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc \ + | gpg --dearmor -o /usr/share/keyrings/postgresql.gpg + echo "deb [signed-by=/usr/share/keyrings/postgresql.gpg] \ +https://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" \ + > /etc/apt/sources.list.d/pgdg.list + log "pgdg repo added" + fi + apt-get update -q + DEBIAN_FRONTEND=noninteractive apt-get install -y \ + "postgresql-${PG_VERSION}" \ + "postgresql-client-${PG_VERSION}" +fi + +# --------------------------------------------------------------------------- +# Step 2/7: Ensure service is enabled and running +# --------------------------------------------------------------------------- +log "Step 2/7: Enable and start postgresql" +systemctl enable postgresql +systemctl start postgresql + +# --------------------------------------------------------------------------- +# Step 3/7: Lock to localhost only +# --------------------------------------------------------------------------- +log "Step 3/7: Set listen_addresses = 'localhost'" +PG_CONF="/etc/postgresql/${PG_VERSION}/main/postgresql.conf" +if grep -q "^listen_addresses = 'localhost'" "${PG_CONF}"; then + log "listen_addresses already set to localhost — skipping" +else + sed -i "s/^#\?listen_addresses\s*=.*/listen_addresses = 'localhost'/" "${PG_CONF}" + log "listen_addresses set to localhost" +fi + +# --------------------------------------------------------------------------- +# Step 4/7: Add pg_hba entries (IPv4 + IPv6 localhost) +# --------------------------------------------------------------------------- +log "Step 4/7: Configure pg_hba.conf for ${DB_USER}@127.0.0.1 and @::1" +PG_HBA="/etc/postgresql/${PG_VERSION}/main/pg_hba.conf" +# Always remove and rewrite the block so rerunning with changed DB_NAME/DB_USER +# takes effect. '/# havenhold-app/,+2d' deletes the marker plus the two host lines. +sed -i '/# havenhold-app/,+2d' "${PG_HBA}" +cat >> "${PG_HBA}" <