Merged
5 changes: 3 additions & 2 deletions SECURITY.md
@@ -49,6 +49,7 @@ This repo applies these by default:
| Defense | What it does |
|---|---|
| `SHA256SUMS` published with every release | Every `output/*.txt` file's hash is committed alongside the file. Consumers can verify before applying. |
| TF module verifies hashes by default | `verify_checksums = true` (default). Module fetches `SHA256SUMS` at plan time and fails the plan, which blocks the apply, on any hash mismatch. Defense-in-depth against single-file tampering between commits. |
| Weekly tagging (`v<YYYY.MM.DD>`) | Each publication gets an immutable tag. Pin to the tag. |
| GitHub Actions deps pinned by SHA | All third-party actions use commit SHA pins (not floating tags), preventing transitive action supply-chain attacks. |
| `permissions: contents: write` only | The weekly workflow has the minimum permissions needed; no secrets, no tokens beyond GITHUB_TOKEN. |
@@ -68,7 +69,7 @@ This neutralizes most attacks because tampering between publications doesn't rea
### Stricter (regulated / high-stakes)

- Pin `?ref=<sha>` to a specific commit SHA
- Verify `SHA256SUMS` at fetch time (planned in PR 4 — TF module hash verification)
- Keep `verify_checksums = true` in the Terraform module (the default): it fetches `SHA256SUMS` at plan time and fails the plan on a hash mismatch, so nothing is applied
- Bump the SHA via a dedicated review PR with explicit security-team approval
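
A minimal sketch of building the pinned module source for the first bullet. `pinned_source` is a hypothetical helper, not part of this repo; it only checks that its argument looks like a full 40-character lowercase commit SHA and prints the `?ref=<sha>` form of the module source used elsewhere in these docs.

```bash
# Hypothetical helper (not part of the repo).
# pinned_source <40-hex commit SHA>
# Prints the Terraform module source pinned to that commit, or returns 1
# if the argument is not a full lowercase commit SHA.
pinned_source() {
  local sha="$1"
  case "$sha" in
    *[!0-9a-f]*|"")
      echo "not a full lowercase commit SHA: $sha" >&2
      return 1
      ;;
  esac
  if [ "${#sha}" -ne 40 ]; then
    echo "expected 40 hex characters, got ${#sha}" >&2
    return 1
  fi
  printf 'github.com/bhavink/databricksIPranges//terraform?ref=%s\n' "$sha"
}
```

One way to obtain the SHA, assuming the reviewed commit is the current tip of `main`, is `git ls-remote https://github.com/bhavink/databricksIPranges refs/heads/main | cut -f1`; pass the result to `pinned_source` and paste the output into the module's `source` argument in the review PR.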

### Most paranoid (airgapped / nation-state threat model)
@@ -133,6 +134,6 @@ If you find a tampering pattern, a vulnerability in the publication pipeline, or

## Future work

- **PR 4 (planned):** TF module hash verification — fetch `SHA256SUMS` at plan time, compare against the hash of each fetched feed, fail the plan on mismatch. Defense-in-depth against single-file tampering between publications.
- **Optional:** SLSA provenance attestation via GitHub OIDC + Sigstore. Adds verifiable "this artifact was built from this workflow run on this commit." Not needed for v1.
- **Optional:** Independent watchdog repo on a separate identity that publishes the same files via the same logic. Defeats single-account compromise. Real defense, not yet implemented.
- **Optional:** `validate_against_upstream` mode for the TF module — at plan time, fetch the live Databricks JSON directly and cross-check our published feed against it. Catches our-feed tampering but adds runtime dependency on Databricks.
122 changes: 109 additions & 13 deletions terraform/README.md
@@ -36,6 +36,7 @@ resource "azurerm_storage_account_network_rules" "data" {
| `source_base_url` | string | `https://bhavink.github.io/databricksIPranges/output` | Base URL serving the per-region `.txt` feeds. Override for forks or self-hosted mirrors |
| `source_files` | list(string) | `[]` | Local CIDR-per-line file paths. Non-empty = airgapped/vendored mode (no network) |
| `min_cidr_count` | number | `1` | Refuse to apply below this. Guards against feed-empty lockouts. Set `0` to disable |
| `verify_checksums` | bool | `true` | Fetch `SHA256SUMS` from `source_base_url` and verify each feed's hash against it. No-op in `source_files` mode |

## Outputs

@@ -169,6 +170,8 @@ terraform output cidrs | head # spot-check first few
| `Feed contained non-CIDR lines` | URL serves HTML/JSON, not text | Verify `source_base_url` points at the `output/` directory, not the JSON endpoint |
| `cloud must be one of: aws, azure, gcp` | Typo on `cloud` input | Use lowercase, exact match |
| `regions must contain only lowercase letters, digits, and hyphens` | Region has spaces, uppercase, or other chars | Use the exact region name from `<source_base_url>/` |
| `SHA256 mismatch for ...` | Feed body's hash doesn't match `SHA256SUMS` entry. Possible tampering, possible stale manifest | Compare hashes manually (see [SECURITY.md](../SECURITY.md)). If your fork doesn't publish `SHA256SUMS`, set `verify_checksums = false` |
| `Failed to fetch SHA256SUMS at ...` | URL returns non-200 — fork without manifest, or temporarily down | Set `verify_checksums = false` if your source intentionally doesn't publish one |

### Deeper diagnostics

@@ -196,19 +199,112 @@ CI runs the same on every PR touching `terraform/` — see `.github/workflows/te

Coverage:

| Behaviour | Test |
|---|---|
| Single-file happy path | `happy_path_single_file` |
| Multi-file union | `multi_file_union` |
| Comment + blank line stripping | `strips_comments_and_blanks` |
| Deduplication | `deduplicates` |
| Cloud input validation | `rejects_invalid_cloud` |
| Region format validation | `rejects_invalid_region_format` |
| Lockout guard (`min_cidr_count`) | `rejects_below_min_cidr_count` |
| Lockout guard disabled | `min_cidr_count_zero_allows_empty` |
| Non-CIDR content detection | `rejects_non_cidr_content` |

Tests use `source_files` against committed fixtures — no network required, runs in seconds.
| Behaviour | Test file | Test |
|---|---|---|
| Single-file happy path | `module.tftest.hcl` | `happy_path_single_file` |
| Multi-file union | `module.tftest.hcl` | `multi_file_union` |
| Comment + blank line stripping | `module.tftest.hcl` | `strips_comments_and_blanks` |
| Deduplication | `module.tftest.hcl` | `deduplicates` |
| Cloud input validation | `module.tftest.hcl` | `rejects_invalid_cloud` |
| Region format validation | `module.tftest.hcl` | `rejects_invalid_region_format` |
| Lockout guard (`min_cidr_count`) | `module.tftest.hcl` | `rejects_below_min_cidr_count` |
| Lockout guard disabled | `module.tftest.hcl` | `min_cidr_count_zero_allows_empty` |
| Non-CIDR content detection | `module.tftest.hcl` | `rejects_non_cidr_content` |
| Hash matches → pass | `checksums.tftest.hcl` | `verify_passes_when_hash_matches` |
| Hash mismatch → fail | `checksums.tftest.hcl` | `verify_fails_on_hash_mismatch` |
| File not in manifest → fail | `checksums.tftest.hcl` | `verify_fails_when_file_missing_from_manifest` |
| `verify_checksums = false` skips fetch | `checksums.tftest.hcl` | `verify_disabled_skips_fetch` |
| Local mode silently skips verify | `checksums.tftest.hcl` | `verify_silently_skipped_in_local_mode` |
| Manifest with blanks/comments tolerated | `checksums.tftest.hcl` | `manifest_with_blank_and_comment_lines_is_tolerated` |

Parsing tests use local fixtures; checksum tests use `mock_provider` `override_data` — no network required either way, full suite runs in seconds.

---

## Validating locally

Three layers — pick what you need. All three exiting `0` proves the entire chain (publish → fetch → verify → parse → emit) works end-to-end.

### 1. Run the test suite (no network)

```bash
cd terraform
terraform fmt -check -recursive
terraform init -backend=false
terraform validate
terraform test
# Expect: Success! 15 passed, 0 failed.
```

> Requires Terraform `>= 1.6` (for the `test` framework). On Homebrew macOS, `homebrew/core` only ships 1.5.7 — switch to `hashicorp/tap/terraform`, install `opentofu` (drop-in, Apache-2.0), or grab a binary directly from `releases.hashicorp.com`.

### 2. Manual integrity check against the live URL

Verifies the published `SHA256SUMS` matches the published feeds — independent of the TF module:

```bash
mkdir -p /tmp/dbx-verify && cd /tmp/dbx-verify
curl -sO https://bhavink.github.io/databricksIPranges/output/SHA256SUMS
curl -sO https://bhavink.github.io/databricksIPranges/output/azure-eastus.txt
curl -sO https://bhavink.github.io/databricksIPranges/output/aws-us-east-1.txt
shasum -a 256 -c SHA256SUMS --ignore-missing
# Expect: azure-eastus.txt: OK
# aws-us-east-1.txt: OK
```
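
The same check can be approximated for a single file without relying on `shasum -c`. This is a sketch under stated assumptions, not the module's code: `sha256_of` and `verify_feed` are hypothetical helpers, the manifest is assumed to be in GNU format (`<64-hex>  <filename>`), and filenames are assumed to contain no spaces. Blank lines and `#` comments in the manifest are ignored, and a file that is missing from the manifest is treated as a failure.

```bash
# Hypothetical helpers (not part of the repo).

# Print the SHA-256 of a file using whichever tool is installed.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# verify_feed <manifest> <file>
# Returns 0 on match, 1 on mismatch, 2 if the file is not listed.
verify_feed() {
  local manifest="$1" file="$2" name expected actual
  name="$(basename "$file")"
  # Only well-formed entries can match; blanks and comments never do.
  expected="$(awk -v f="$name" '
    $1 ~ /^[0-9a-f]+$/ && length($1) == 64 && $2 == f { print $1; exit }
  ' "$manifest")"
  if [ -z "$expected" ]; then
    echo "$name: not present in manifest" >&2
    return 2
  fi
  actual="$(sha256_of "$file")"
  if [ "$actual" != "$expected" ]; then
    echo "$name: SHA256 mismatch (expected $expected, got $actual)" >&2
    return 1
  fi
  echo "$name: OK"
}
```

After the downloads above, `verify_feed SHA256SUMS azure-eastus.txt` checks one feed; editing a byte in the downloaded file and re-running it should produce the mismatch branch.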

### 3. End-to-end smoke test — real `terraform plan` against the live URL

Exercises the full module flow (fetch SHA256SUMS → fetch feed → verify hash → parse → emit). Use this when adopting the module to confirm your environment can reach the source and verify integrity.

```bash
mkdir -p /tmp/dbx-smoke && cat > /tmp/dbx-smoke/main.tf <<'EOF'
terraform {
  required_version = ">= 1.6"
}

module "dbx_ips" {
  source  = "github.com/bhavink/databricksIPranges//terraform?ref=main"
  cloud   = "azure"
  regions = ["eastus"]
  # verify_checksums defaults to true, which exercises the full path
}

output "cidr_count" { value = module.dbx_ips.cidr_count }
output "first_three_cidrs" { value = slice(module.dbx_ips.cidrs, 0, 3) }
output "source" { value = module.dbx_ips.source }
EOF

terraform -chdir=/tmp/dbx-smoke init
terraform -chdir=/tmp/dbx-smoke plan
```

A successful plan output looks like:

```
module.dbx_ips.data.http.checksums[0]: Read complete after 0s [id=...SHA256SUMS]
module.dbx_ips.data.http.feed["azure-eastus.txt"]: Read complete after 0s [id=...azure-eastus.txt]

Changes to Outputs:
  + cidr_count        = 144
  + first_three_cidrs = [
      + "128.203.118.160/28",
      + "128.203.119.128/25",
      + "128.203.119.16/28",
    ]
  + source            = ["https://bhavink.github.io/databricksIPranges/output/azure-eastus.txt"]
```

If you see this, every postcondition passed: HTTP 200 + hash matches manifest + every line is a valid CIDR + `cidr_count >= min_cidr_count`. The chain is sound.
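
For intuition, the parsing-side checks (comment and blank-line stripping, deduplication, non-CIDR detection, the `min_cidr_count` guard) can be sketched in shell. This is an illustration only: `parse_feed` is a hypothetical function, its CIDR pattern is a loose IPv4-only shape check, and the module's real validation may be stricter or accept other forms. The low-count message is made up for the sketch.

```bash
# Hypothetical helper (not part of the repo).
# parse_feed <file> [min_cidr_count]
# Prints the unique CIDRs in file order. Returns 3 on non-CIDR content,
# 4 when fewer than min_cidr_count (default 1) entries remain.
parse_feed() {
  local file="$1" min="${2:-1}" cidrs count
  cidrs="$(awk '
    { gsub(/^[ \t\r]+|[ \t\r]+$/, "") }   # trim surrounding whitespace
    $0 == "" || $0 ~ /^#/ { next }        # skip blank lines and comments
    $0 !~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\/[0-9]+$/ { bad = 1; exit }
    !seen[$0]++ { print }                 # deduplicate, keep first occurrence
    END { if (bad) exit 3 }
  ' "$file")" || { echo "Feed contained non-CIDR lines" >&2; return 3; }

  # Count non-empty output lines (0 when nothing was parsed).
  count="$(printf '%s\n' "$cidrs" | awk 'NF { n++ } END { print n + 0 }')"
  if [ "$count" -lt "$min" ]; then
    echo "Only $count CIDRs parsed; min_cidr_count is $min" >&2
    return 4
  fi
  if [ -n "$cidrs" ]; then
    printf '%s\n' "$cidrs"
  fi
  return 0
}
```

Running `parse_feed azure-eastus.txt | wc -l` against a downloaded feed gives a rough cross-check of the `cidr_count` output, subject to the simplifications noted above.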

### Bonus: prove the fail-closed path trips

To watch the fail-closed behaviour fire, point the module at a source that doesn't publish `SHA256SUMS`. The edit uses `perl -pi` because in-place flags and `\n` in replacements behave differently in macOS and GNU `sed`:

```bash
perl -pi -e 's|ref=main"|ref=main"\n  source_base_url = "https://example.com"|' /tmp/dbx-smoke/main.tf
terraform -chdir=/tmp/dbx-smoke plan
# Expect: clean failure with "Failed to fetch SHA256SUMS at https://example.com/SHA256SUMS — HTTP 404"
```

This exercises the missing-manifest failure. A feed whose bytes don't match the manifest fails the plan the same way, with `SHA256 mismatch for ...`.

---

40 changes: 40 additions & 0 deletions terraform/main.tf
@@ -6,6 +6,38 @@ locals {
  ] : [
    "${var.cloud}.txt"
  ]

  # Hash verification only meaningful in URL mode. In local mode the customer
  # controls the file bytes, so verifying against an upstream SHA256SUMS
  # would be confusing (different sources, different hashes).
  do_verify = var.verify_checksums && !local.use_local
}

data "http" "checksums" {
  count = local.do_verify ? 1 : 0
  url   = "${var.source_base_url}/SHA256SUMS"

  retry {
    attempts     = 3
    min_delay_ms = 500
  }

  lifecycle {
    postcondition {
      condition     = self.status_code == 200
      error_message = "Failed to fetch SHA256SUMS at ${self.url} — HTTP ${self.status_code}. If your source_base_url doesn't publish SHA256SUMS, set verify_checksums = false to disable hash verification."
    }
  }
}

locals {
  # Parse SHA256SUMS body (GNU format: "<64-hex>  <filename>") into a map
  # keyed by filename. Tolerant of blank lines and `#` comments.
  expected_hashes = local.do_verify ? {
    for line in split("\n", data.http.checksums[0].response_body) :
    regex("^[0-9a-f]{64}  (.+)$", trimspace(line))[0] => substr(trimspace(line), 0, 64)
    if can(regex("^[0-9a-f]{64}  ", trimspace(line)))
  } : {}
}

data "http" "feed" {
@@ -22,6 +54,14 @@ data "http" "feed" {
      condition     = self.status_code == 200
      error_message = "Failed to fetch ${self.url} — HTTP ${self.status_code}. Common causes: (1) misspelled region name, (2) region has no published feed (per-region files are emitted only when ≥1 CIDR exists), (3) wrong source_base_url. Browse available feeds at ${var.source_base_url}/."
    }

    postcondition {
      # Either verification is off, or the file's hash matches the manifest.
      # Using lookup() with empty default makes a missing-from-manifest
      # condition fail with a clear message rather than a TF map-lookup error.
      condition     = !local.do_verify || sha256(self.response_body) == lookup(local.expected_hashes, replace(self.url, "${var.source_base_url}/", ""), "")
      error_message = "SHA256 mismatch for ${self.url}. Expected ${lookup(local.expected_hashes, replace(self.url, "${var.source_base_url}/", ""), "(filename not present in SHA256SUMS)")}, got ${sha256(self.response_body)}. Possible tampering — refusing to apply. Verify SHA256SUMS at ${var.source_base_url}/SHA256SUMS or pin to a known-good ?ref=<sha>. If your source intentionally doesn't publish SHA256SUMS, set verify_checksums = false."
    }
  }
}
