diff --git a/SECURITY.md b/SECURITY.md
index c06adea..d3bcc61 100644
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -49,6 +49,7 @@ This repo applies these by default:
 | Defense | What it does |
 |---|---|
 | `SHA256SUMS` published with every release | Every `output/*.txt` file's hash is committed alongside the file. Consumers can verify before applying. |
+| TF module verifies hashes by default | `verify_checksums = true` (default). Module fetches `SHA256SUMS` at plan time and fails the apply on any hash mismatch. Defense-in-depth against single-file tampering between commits. |
 | Weekly tagging (`v`) | Each publication gets an immutable tag. Pin to the tag. |
 | GitHub Actions deps pinned by SHA | All third-party actions use commit SHA pins (not floating tags), preventing transitive action supply-chain attacks. |
 | `permissions: contents: write` only | The weekly workflow has the minimum permissions needed; no secrets, no tokens beyond GITHUB_TOKEN. |
@@ -68,7 +69,7 @@ This neutralizes most attacks because tampering between publications doesn't rea
 ### Stricter (regulated / high-stakes)
 
 - Pin `?ref=` to a specific commit SHA
-- Verify `SHA256SUMS` at fetch time (planned in PR 4 — TF module hash verification)
+- Keep `verify_checksums = true` in the Terraform module (default) — fetches `SHA256SUMS` at plan time, fails the apply on hash mismatch
 - Bump the SHA via a dedicated review PR with explicit security-team approval
 
 ### Most paranoid (airgapped / nation-state threat model)
@@ -133,6 +134,6 @@ If you find a tampering pattern, a vulnerability in the publication pipeline, or
 
 ## Future work
 
-- **PR 4 (planned):** TF module hash verification — fetch `SHA256SUMS` at plan time, compare against the hash of each fetched feed, fail the plan on mismatch. Defense-in-depth against single-file tampering between publications.
 - **Optional:** SLSA provenance attestation via GitHub OIDC + Sigstore. Adds verifiable "this artifact was built from this workflow run on this commit." Not needed for v1.
 - **Optional:** Independent watchdog repo on a separate identity that publishes the same files via the same logic. Defeats single-account compromise. Real defense, not yet implemented.
+- **Optional:** `validate_against_upstream` mode for the TF module — at plan time, fetch the live Databricks JSON directly and cross-check our published feed against it. Catches our-feed tampering but adds runtime dependency on Databricks.
diff --git a/terraform/README.md b/terraform/README.md
index 1681b7d..268128a 100644
--- a/terraform/README.md
+++ b/terraform/README.md
@@ -36,6 +36,7 @@ resource "azurerm_storage_account_network_rules" "data" {
 | `source_base_url` | string | `https://bhavink.github.io/databricksIPranges/output` | Base URL serving the per-region `.txt` feeds. Override for forks or self-hosted mirrors |
 | `source_files` | list(string) | `[]` | Local CIDR-per-line file paths. Non-empty = airgapped/vendored mode (no network) |
 | `min_cidr_count` | number | `1` | Refuse to apply below this. Guards against feed-empty lockouts. Set `0` to disable |
+| `verify_checksums` | bool | `true` | Fetch `SHA256SUMS` from `source_base_url` and verify each feed's hash against it. No-op in `source_files` mode |
 
 ## Outputs
 
@@ -169,6 +170,8 @@ terraform output cidrs | head # spot-check first few
 | `Feed contained non-CIDR lines` | URL serves HTML/JSON, not text | Verify `source_base_url` points at the `output/` directory, not the JSON endpoint |
 | `cloud must be one of: aws, azure, gcp` | Typo on `cloud` input | Use lowercase, exact match |
 | `regions must contain only lowercase letters, digits, and hyphens` | Region has spaces, uppercase, or other chars | Use the exact region name from `/` |
+| `SHA256 mismatch for ...` | Feed body's hash doesn't match `SHA256SUMS` entry. Possible tampering, possible stale manifest | Compare hashes manually (see [SECURITY.md](../SECURITY.md)). If your fork doesn't publish `SHA256SUMS`, set `verify_checksums = false` |
+| `Failed to fetch SHA256SUMS at ...` | URL returns non-200 — fork without manifest, or temporarily down | Set `verify_checksums = false` if your source intentionally doesn't publish one |
 
 ### Deeper diagnostics
 
@@ -196,19 +199,112 @@ CI runs the same on every PR touching `terraform/` — see `.github/workflows/te
 
 Coverage:
 
-| Behaviour | Test |
-|---|---|
-| Single-file happy path | `happy_path_single_file` |
-| Multi-file union | `multi_file_union` |
-| Comment + blank line stripping | `strips_comments_and_blanks` |
-| Deduplication | `deduplicates` |
-| Cloud input validation | `rejects_invalid_cloud` |
-| Region format validation | `rejects_invalid_region_format` |
-| Lockout guard (`min_cidr_count`) | `rejects_below_min_cidr_count` |
-| Lockout guard disabled | `min_cidr_count_zero_allows_empty` |
-| Non-CIDR content detection | `rejects_non_cidr_content` |
-
-Tests use `source_files` against committed fixtures — no network required, runs in seconds.
+| Behaviour | Test file | Test |
+|---|---|---|
+| Single-file happy path | `module.tftest.hcl` | `happy_path_single_file` |
+| Multi-file union | `module.tftest.hcl` | `multi_file_union` |
+| Comment + blank line stripping | `module.tftest.hcl` | `strips_comments_and_blanks` |
+| Deduplication | `module.tftest.hcl` | `deduplicates` |
+| Cloud input validation | `module.tftest.hcl` | `rejects_invalid_cloud` |
+| Region format validation | `module.tftest.hcl` | `rejects_invalid_region_format` |
+| Lockout guard (`min_cidr_count`) | `module.tftest.hcl` | `rejects_below_min_cidr_count` |
+| Lockout guard disabled | `module.tftest.hcl` | `min_cidr_count_zero_allows_empty` |
+| Non-CIDR content detection | `module.tftest.hcl` | `rejects_non_cidr_content` |
+| Hash matches → pass | `checksums.tftest.hcl` | `verify_passes_when_hash_matches` |
+| Hash mismatch → fail | `checksums.tftest.hcl` | `verify_fails_on_hash_mismatch` |
+| File not in manifest → fail | `checksums.tftest.hcl` | `verify_fails_when_file_missing_from_manifest` |
+| `verify_checksums = false` skips fetch | `checksums.tftest.hcl` | `verify_disabled_skips_fetch` |
+| Local mode silently skips verify | `checksums.tftest.hcl` | `verify_silently_skipped_in_local_mode` |
+| Manifest with blanks/comments tolerated | `checksums.tftest.hcl` | `manifest_with_blank_and_comment_lines_is_tolerated` |
+
+Parsing tests use local fixtures; checksum tests use `mock_provider` `override_data` — no network required either way, full suite runs in seconds.
+
+---
+
+## Validating locally
+
+Three layers — pick what you need. All three exiting `0` proves the entire chain (publish → fetch → verify → parse → emit) works end-to-end.
+
+### 1. Run the test suite (no network)
+
+```bash
+cd terraform
+terraform fmt -check -recursive
+terraform init -backend=false
+terraform validate
+terraform test
+# Expect: Success! 15 passed, 0 failed.
+```
+
+> Requires Terraform `>= 1.6` (for the `test` framework). On Homebrew macOS, `homebrew/core` only ships 1.5.7 — switch to `hashicorp/tap/terraform`, install `opentofu` (drop-in, Apache-2.0), or grab a binary directly from `releases.hashicorp.com`.
+
+### 2. Manual integrity check against the live URL
+
+Verifies the published `SHA256SUMS` matches the published feeds — independent of the TF module:
+
+```bash
+mkdir -p /tmp/dbx-verify && cd /tmp/dbx-verify
+curl -sO https://bhavink.github.io/databricksIPranges/output/SHA256SUMS
+curl -sO https://bhavink.github.io/databricksIPranges/output/azure-eastus.txt
+curl -sO https://bhavink.github.io/databricksIPranges/output/aws-us-east-1.txt
+shasum -a 256 -c SHA256SUMS --ignore-missing
+# Expect: azure-eastus.txt: OK
+#         aws-us-east-1.txt: OK
+```
+
+### 3. End-to-end smoke test — real `terraform plan` against the live URL
+
+Exercises the full module flow (fetch SHA256SUMS → fetch feed → verify hash → parse → emit). Use this when adopting the module to confirm your environment can reach the source and verify integrity.
+
+```bash
+mkdir -p /tmp/dbx-smoke && cat > /tmp/dbx-smoke/main.tf <<'EOF'
+terraform {
+  required_version = ">= 1.6"
+}
+
+module "dbx_ips" {
+  source  = "github.com/bhavink/databricksIPranges//terraform?ref=main"
+  cloud   = "azure"
+  regions = ["eastus"]
+  # verify_checksums defaults to true — exercises the full path
+}
+
+output "cidr_count" { value = module.dbx_ips.cidr_count }
+output "first_three_cidrs" { value = slice(module.dbx_ips.cidrs, 0, 3) }
+output "source" { value = module.dbx_ips.source }
+EOF
+
+terraform -chdir=/tmp/dbx-smoke init
+terraform -chdir=/tmp/dbx-smoke plan
+```
+
+A successful plan output looks like:
+
+```
+module.dbx_ips.data.http.checksums[0]: Read complete after 0s [id=...SHA256SUMS]
+module.dbx_ips.data.http.feed["azure-eastus.txt"]: Read complete after 0s [id=...azure-eastus.txt]
+
+Changes to Outputs:
+  + cidr_count        = 144
+  + first_three_cidrs = [
+      + "128.203.118.160/28",
+      + "128.203.119.128/25",
+      + "128.203.119.16/28",
+    ]
+  + source            = ["https://bhavink.github.io/databricksIPranges/output/azure-eastus.txt"]
+```
+
+If you see this, every postcondition passed: HTTP 200 + hash matches manifest + every line is a valid CIDR + `cidr_count >= min_cidr_count`. The chain is sound.
+
+### Bonus — prove tamper detection trips
+
+To watch the fail-closed behaviour fire, point the module at a source that doesn't publish a CIDR feed:
+
+```bash
+sed -i '' 's|github.com/bhavink/databricksIPranges//terraform?ref=main|github.com/bhavink/databricksIPranges//terraform?ref=main"\n  source_base_url = "https://example.com|' /tmp/dbx-smoke/main.tf
+terraform -chdir=/tmp/dbx-smoke plan
+# Expect: clean failure with "Failed to fetch SHA256SUMS at https://example.com/SHA256SUMS — HTTP 404"
+```
 
 ---
 
diff --git a/terraform/main.tf b/terraform/main.tf
index 62b42fd..a6f7ea7 100644
--- a/terraform/main.tf
+++ b/terraform/main.tf
@@ -6,6 +6,38 @@ locals {
   ] : [
     "${var.cloud}.txt"
   ]
+
+  # Hash verification only meaningful in URL mode. In local mode the customer
+  # controls the file bytes, so verifying against an upstream SHA256SUMS
+  # would be confusing (different sources, different hashes).
+  do_verify = var.verify_checksums && !local.use_local
+}
+
+data "http" "checksums" {
+  count = local.do_verify ? 1 : 0
+  url   = "${var.source_base_url}/SHA256SUMS"
+
+  retry {
+    attempts     = 3
+    min_delay_ms = 500
+  }
+
+  lifecycle {
+    postcondition {
+      condition     = self.status_code == 200
+      error_message = "Failed to fetch SHA256SUMS at ${self.url} — HTTP ${self.status_code}. If your source_base_url doesn't publish SHA256SUMS, set verify_checksums = false to disable hash verification."
+    }
+  }
+}
+
+locals {
+  # Parse SHA256SUMS body (GNU format: "<64-hex>  <filename>") into a map
+  # keyed by filename. Tolerant of blank lines and `#` comments.
+  expected_hashes = local.do_verify ? {
+    for line in split("\n", data.http.checksums[0].response_body) :
+    regex("^[0-9a-f]{64}  (.+)$", trimspace(line))[0] => substr(trimspace(line), 0, 64)
+    if can(regex("^[0-9a-f]{64}  ", trimspace(line)))
+  } : {}
 }
 
 data "http" "feed" {
@@ -22,6 +54,14 @@ data "http" "feed" {
       condition     = self.status_code == 200
       error_message = "Failed to fetch ${self.url} — HTTP ${self.status_code}. Common causes: (1) misspelled region name, (2) region has no published feed (per-region files are emitted only when ≥1 CIDR exists), (3) wrong source_base_url. Browse available feeds at ${var.source_base_url}/."
     }
+
+    postcondition {
+      # Either verification is off, or the file's hash matches the manifest.
+      # Using lookup() with empty default makes a missing-from-manifest
+      # condition fail with a clear message rather than a TF map-lookup error.
+      condition     = !local.do_verify || sha256(self.response_body) == lookup(local.expected_hashes, replace(self.url, "${var.source_base_url}/", ""), "")
+      error_message = "SHA256 mismatch for ${self.url}. Expected ${lookup(local.expected_hashes, replace(self.url, "${var.source_base_url}/", ""), "(filename not present in SHA256SUMS)")}, got ${sha256(self.response_body)}. Possible tampering — refusing to apply. Verify SHA256SUMS at ${var.source_base_url}/SHA256SUMS or pin to a known-good ?ref=. If your source intentionally doesn't publish SHA256SUMS, set verify_checksums = false."
+    }
   }
 }
 
diff --git a/terraform/tests/checksums.tftest.hcl b/terraform/tests/checksums.tftest.hcl
new file mode 100644
index 0000000..85e5853
--- /dev/null
+++ b/terraform/tests/checksums.tftest.hcl
@@ -0,0 +1,178 @@
+// Mock-based tests for SHA256 verification logic. Uses override_data on the
+// http data sources so we don't need network. The known hashes below were
+// computed with: python3 -c 'import hashlib; print(hashlib.sha256(body).hexdigest())'
+
+variables {
+  cloud           = "aws"
+  source_base_url = "https://example.test/output"
+}
+
+run "verify_passes_when_hash_matches" {
+  command = plan
+
+  variables {
+    regions = ["alpha"]
+  }
+
+  override_data {
+    target = data.http.checksums[0]
+    values = {
+      status_code   = 200
+      response_body = "7179d5c23a29625799df8a2e0e6409b78ac6bf6ad09e0dabf5db6b92b7d581b2  aws-alpha.txt\n"
+    }
+  }
+
+  override_data {
+    target = data.http.feed["aws-alpha.txt"]
+    values = {
+      status_code   = 200
+      response_body = "1.2.3.0/24\n"
+      url           = "https://example.test/output/aws-alpha.txt"
+    }
+  }
+
+  assert {
+    condition     = length(output.cidrs) == 1 && output.cidrs[0] == "1.2.3.0/24"
+    error_message = "Expected ['1.2.3.0/24'], got ${jsonencode(output.cidrs)}"
+  }
+}
+
+run "verify_fails_on_hash_mismatch" {
+  command = plan
+
+  variables {
+    regions = ["alpha"]
+  }
+
+  override_data {
+    target = data.http.checksums[0]
+    values = {
+      status_code   = 200
+      response_body = "0000000000000000000000000000000000000000000000000000000000000000  aws-alpha.txt\n"
+    }
+  }
+
+  override_data {
+    target = data.http.feed["aws-alpha.txt"]
+    values = {
+      status_code   = 200
+      response_body = "1.2.3.0/24\n"
+      url           = "https://example.test/output/aws-alpha.txt"
+    }
+  }
+
+  expect_failures = [data.http.feed["aws-alpha.txt"]]
+}
+
+run "verify_fails_when_file_missing_from_manifest" {
+  command = plan
+
+  variables {
+    regions = ["alpha"]
+  }
+
+  override_data {
+    target = data.http.checksums[0]
+    values = {
+      status_code   = 200
+      response_body = "deadbeef00000000000000000000000000000000000000000000000000000000  aws-other.txt\n"
+    }
+  }
+
+  override_data {
+    target = data.http.feed["aws-alpha.txt"]
+    values = {
+      status_code   = 200
+      response_body = "1.2.3.0/24\n"
+      url           = "https://example.test/output/aws-alpha.txt"
+    }
+  }
+
+  expect_failures = [data.http.feed["aws-alpha.txt"]]
+}
+
+run "verify_disabled_skips_fetch" {
+  command = plan
+
+  variables {
+    regions          = ["alpha"]
+    verify_checksums = false
+  }
+
+  // No checksums data resource should be created when verify_checksums = false,
+  // so we only override the feed.
+  override_data {
+    target = data.http.feed["aws-alpha.txt"]
+    values = {
+      status_code   = 200
+      response_body = "1.2.3.0/24\n"
+      url           = "https://example.test/output/aws-alpha.txt"
+    }
+  }
+
+  assert {
+    condition     = length(data.http.checksums) == 0
+    error_message = "When verify_checksums = false, the checksums data resource must not be instantiated."
+  }
+
+  assert {
+    condition     = length(output.cidrs) == 1 && output.cidrs[0] == "1.2.3.0/24"
+    error_message = "Module must still produce CIDRs when verification is off."
+  }
+}
+
+run "verify_silently_skipped_in_local_mode" {
+  // verify_checksums defaults to true, but local mode (source_files set)
+  // should suppress the SHA256SUMS fetch entirely.
+  command = plan
+
+  variables {
+    source_files = ["tests/fixtures/region-a.txt"]
+    // verify_checksums omitted — defaults to true
+  }
+
+  assert {
+    condition     = length(data.http.checksums) == 0
+    error_message = "Local mode must suppress the SHA256SUMS fetch even with verify_checksums = true (default)."
+  }
+
+  assert {
+    condition     = length(output.cidrs) == 3
+    error_message = "Local mode must still parse the local fixture."
+  }
+}
+
+run "manifest_with_blank_and_comment_lines_is_tolerated" {
+  command = plan
+
+  variables {
+    regions = ["alpha"]
+  }
+
+  override_data {
+    target = data.http.checksums[0]
+    values = {
+      status_code   = 200
+      response_body = <<-EOT
+        # Manifest header — should be ignored
+
+        7179d5c23a29625799df8a2e0e6409b78ac6bf6ad09e0dabf5db6b92b7d581b2  aws-alpha.txt
+
+      EOT
+    }
+  }
+
+  override_data {
+    target = data.http.feed["aws-alpha.txt"]
+    values = {
+      status_code   = 200
+      response_body = "1.2.3.0/24\n"
+      url           = "https://example.test/output/aws-alpha.txt"
+    }
+  }
+
+  assert {
+    condition     = length(output.cidrs) == 1 && output.cidrs[0] == "1.2.3.0/24"
+    error_message = "Manifest with blanks/comments must still verify and produce expected CIDRs."
+  }
+}
diff --git a/terraform/variables.tf b/terraform/variables.tf
index c0e7f6f..f4beb1d 100644
--- a/terraform/variables.tf
+++ b/terraform/variables.tf
@@ -46,3 +46,9 @@ variable "min_cidr_count" {
     error_message = "min_cidr_count must be >= 0."
   }
 }
+
+variable "verify_checksums" {
+  type        = bool
+  description = "When fetching feeds via source_base_url, also fetch SHA256SUMS and verify each feed's sha256 against it. Defense-in-depth against single-file tampering between commits. No-op in local source_files mode (the customer controls those files)."
+  default     = true
+}
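
Reviewer note: the manifest parsing and hash comparison added in `terraform/main.tf` can be reasoned about outside Terraform. The following is a minimal Python sketch of the same logic, not part of this change. The names `parse_sha256sums` and `verify_feed` are hypothetical, and the sketch assumes the GNU-style manifest line (64 lowercase hex characters, two spaces, filename) that `shasum -c` also expects.

```python
import hashlib
import re

# Assumed manifest line shape: 64 lowercase hex chars, two spaces, filename.
MANIFEST_LINE = re.compile(r"^([0-9a-f]{64})  (.+)$")


def parse_sha256sums(manifest: str) -> dict[str, str]:
    """Map filename -> expected hex digest.

    Lines that do not match (blank lines, '#' comments) are skipped,
    mirroring the `if can(regex(...))` filter in the module.
    """
    expected = {}
    for raw in manifest.split("\n"):
        match = MANIFEST_LINE.match(raw.strip())
        if match:
            expected[match.group(2)] = match.group(1)
    return expected


def verify_feed(filename: str, body: bytes, expected: dict[str, str]) -> bool:
    """True only if the file is listed and its SHA-256 matches.

    A filename missing from the manifest compares against "" and
    therefore fails, like lookup(..., "") in the postcondition.
    """
    return hashlib.sha256(body).hexdigest() == expected.get(filename, "")


# Usage: build a manifest for a known body, then check the three test cases.
body = b"1.2.3.0/24\n"
manifest = f"# header\n\n{hashlib.sha256(body).hexdigest()}  aws-alpha.txt\n"
expected = parse_sha256sums(manifest)

print(verify_feed("aws-alpha.txt", body, expected))         # True
print(verify_feed("aws-alpha.txt", b"tampered", expected))  # False
print(verify_feed("aws-other.txt", body, expected))         # False
```

The three calls correspond to the hash-match, hash-mismatch, and missing-from-manifest cases covered in `checksums.tftest.hcl`. Defaulting a missing entry to the empty string keeps the failure path uniform, since no valid SHA-256 digest is ever empty.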