test+ci(cms): RFC-traceable tests, named threat suite, mutation gate #16

Merged: jamestexas merged 1 commit into main from feat/audit-r4-completeness on Jun 17, 2026

@jamestexas
Collaborator

Summary

Round 4 of the audit-equivalent hardening shifts framing from "find the next mutant" to completeness as a property.

Stacked on PR #15 — review #15 first; this PR's diff vs main includes round 3 length-boundary tests. The mergeable diff vs #15's branch is the round-4 content only.

What "complete" actually means here

After rounds 1–3, we hit 82% mutation efficacy with three real bugs fixed. But three structural gaps remained:

  1. No regression gate — efficacy could silently regress in the next PR.
  2. No spec traceability — an auditor couldn't grep "tests for RFC 5652 §5.3" and find them.
  3. No documented threat coverage — the lib's defenses were exercised by tests but never named.

This PR fills those gaps.

What's added

1. RFC-traceable compliance tests (`pkg/cms/rfc_compliance_test.go`)

Each test is named for the RFC clause it covers. Coverage spans:

| RFC / Section | Test |
|---|---|
| RFC 5652 §5.1 | `TestRFC5652_5_1_SignedDataVersion_Whitelist` |
| RFC 5652 §5.3 | `TestRFC5652_5_3_SignerInfoVersion_PerSIDForm` |
| RFC 5652 §5.3 | `TestRFC5652_5_3_SignatureAlgorithm_Ed25519` |
| RFC 5652 §5.4 | `TestRFC5652_5_4_SignedAttributes_ContentTypeRequired` |
| RFC 5652 §5.4 | `TestRFC5652_5_4_SignedAttributes_MessageDigestRequired` |
| RFC 5652 §10.1 | `TestRFC5652_10_1_DERLength_NoLongFormForShortValues` |
| RFC 5652 §10.1 | `TestRFC5652_10_1_DERLength_NoLeadingZeroInLongForm` |
| RFC 5652 §11.1 | `TestRFC5652_11_1_eContentType_MustBeIdData_WhenAttrsAbsent` |
| RFC 8419 §3 | `TestRFC8419_3_Ed25519_RequiresSHA512_WithSignedAttrs` (passthrough) |
| RFC 8419 §3 | `TestRFC8419_3_Ed25519_AlgorithmParametersMustBeAbsent` (passthrough) |
| RFC 8032 | `TestRFC8032_EdDSA_DeterministicSignature` |

Most overlap with existing tests by design: the spec-mapping IS the contribution. If the library stops enforcing a clause, the test named for that clause goes red.
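
To illustrate the kind of rule the two §10.1 tests pin down, here is a minimal, self-contained sketch of a canonical-length check. `checkDERLength` is a hypothetical helper written for this example, not the library's API; it assumes the input starts at the first length octet (just after the tag byte) and, as a simplification, only handles long forms of up to three octets.

```go
package main

import (
	"errors"
	"fmt"
)

// checkDERLength is a hypothetical helper (not part of pkg/cms). It decodes
// the length octets at the start of b and rejects non-canonical encodings.
func checkDERLength(b []byte) (int, error) {
	if len(b) == 0 {
		return 0, errors.New("missing length octet")
	}
	first := b[0]
	if first < 0x80 {
		return int(first), nil // short form: the octet is the length
	}
	n := int(first & 0x7f) // number of length octets that follow
	if n == 0 {
		return 0, errors.New("indefinite length is not allowed in DER")
	}
	if n > 3 || len(b) < 1+n {
		return 0, errors.New("unsupported or truncated long-form length")
	}
	if b[1] == 0x00 {
		return 0, errors.New("leading zero in long-form length")
	}
	length := 0
	for _, octet := range b[1 : 1+n] {
		length = length<<8 | int(octet)
	}
	if length < 0x80 {
		return 0, errors.New("long form used for a length below 128")
	}
	return length, nil
}

func main() {
	cases := [][]byte{
		{0x05},             // short form
		{0x81, 0x05},       // long form for a value below 128
		{0x82, 0x00, 0x80}, // leading zero in long form
		{0x81, 0x80},       // minimal long form for 128
	}
	for _, c := range cases {
		n, err := checkDERLength(c)
		fmt.Printf("% x -> length=%d err=%v\n", c, n, err)
	}
}
```

A clause-named test would feed inputs like the second and third cases through the real parser and assert rejection.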

2. Named threat-class tests (`pkg/cms/attack_scenarios_test.go`)

| Attack class | Test |
|---|---|
| Cross-message signature replay | `TestAttack_SignerInfoCrossMessage_Replay` |
| Subject-name impersonation | `TestAttack_KeyConfusion_DifferentKey_SameSubject` |
| Trust-store bypass | `TestAttack_NoTrustedRoots_Denied` |
| Trailing-data smuggling | `TestAttack_TrailingDataInjection` |
| Digest-algorithm downgrade | `TestAttack_AlgorithmDowngrade_DigestVsActualBytes` |
| Attached/detached confusion | `TestAttack_AttachedEContent_RejectedForDetachedAPI` |
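
As one concrete illustration of a threat class from this list, trailing-data smuggling can be guarded against by refusing any bytes left over after the top-level ASN.1 element. The sketch below uses the standard library's `encoding/asn1`; `parseExactlyOne` is a hypothetical helper for this example and may not match how `pkg/cms` implements the check.

```go
package main

import (
	"encoding/asn1"
	"errors"
	"fmt"
)

// parseExactlyOne is a hypothetical helper (not the library's API). It parses
// a single top-level DER element and rejects input with leftover bytes.
func parseExactlyOne(der []byte) (asn1.RawValue, error) {
	var v asn1.RawValue
	rest, err := asn1.Unmarshal(der, &v)
	if err != nil {
		return v, fmt.Errorf("parse: %w", err)
	}
	if len(rest) != 0 {
		return v, errors.New("trailing data after top-level element")
	}
	return v, nil
}

func main() {
	clean := []byte{0x30, 0x03, 0x02, 0x01, 0x01} // SEQUENCE { INTEGER 1 }
	// Copy before appending so the clean slice is not aliased.
	smuggled := append(append([]byte{}, clean...), 0xde, 0xad)

	_, err := parseExactlyOne(clean)
	fmt.Println("clean:", err)
	_, err = parseExactlyOne(smuggled)
	fmt.Println("with trailing bytes:", err)
}
```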

3. Mutation testing as a CI gate

New `mutation` job in CI runs `gremlins` with `--threshold-efficacy 80`. PRs that drop test efficacy below 80% fail this gate.
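
The gate's arithmetic can be sketched as follows, assuming (per my understanding of the gremlins documentation) that test efficacy is killed / (killed + lived) and mutator coverage is (killed + lived) / (killed + lived + not covered). The type, function names, and mutant counts here are hypothetical and chosen only to land on either side of the threshold; they are not this repository's actual numbers.

```go
package main

import "fmt"

// mutationCounts holds hypothetical per-run mutant tallies.
type mutationCounts struct {
	Killed, Lived, NotCovered int
}

// efficacy returns the percentage of executed mutants that the tests killed.
func (c mutationCounts) efficacy() float64 {
	run := c.Killed + c.Lived
	if run == 0 {
		return 0
	}
	return 100 * float64(c.Killed) / float64(run)
}

// mutatorCoverage returns the percentage of mutants that tests reached at all.
func (c mutationCounts) mutatorCoverage() float64 {
	total := c.Killed + c.Lived + c.NotCovered
	if total == 0 {
		return 0
	}
	return 100 * float64(c.Killed+c.Lived) / float64(total)
}

// passesGate reports whether efficacy meets the threshold (in percent).
func passesGate(c mutationCounts, threshold float64) bool {
	return c.efficacy() >= threshold
}

func main() {
	const threshold = 80.0
	for _, c := range []mutationCounts{
		{Killed: 82, Lived: 18, NotCovered: 15}, // illustrative counts only
		{Killed: 79, Lived: 21, NotCovered: 15}, // illustrative counts only
	} {
		fmt.Printf("efficacy=%.2f%% coverage=%.2f%% pass=%v\n",
			c.efficacy(), c.mutatorCoverage(), passesGate(c, threshold))
	}
}
```

Because uncovered mutants do not enter the efficacy ratio, an efficacy-only gate will not by itself catch new code that has no tests at all; mutator coverage is the number that moves in that case.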

4. Builder cleanup

`cms_builder_test.go` gets `SIVersionExplicit` / `SDVersionExplicit` booleans so tests can force literal-0 versions instead of the "0 = use default" sentinel. Needed for the §5.1 v0 rejection test.
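
The sentinel problem and its fix can be sketched like this. Only the `SDVersionExplicit` name and the {1, 3, 4, 5} whitelist come from this PR; the struct, the `SDVersion` field, the default of 1, and both functions are assumptions made for illustration.

```go
package main

import "fmt"

// defaultSDVersion is an assumed default for this sketch.
const defaultSDVersion = 1

// builderOpts is a hypothetical stand-in for the test builder's options.
type builderOpts struct {
	SDVersion         int  // 0 normally means "use the default"
	SDVersionExplicit bool // when true, SDVersion is taken literally, even if 0
}

// effectiveSDVersion resolves the version the builder would encode.
func (o builderOpts) effectiveSDVersion() int {
	if o.SDVersion == 0 && !o.SDVersionExplicit {
		return defaultSDVersion
	}
	return o.SDVersion
}

// allowedSDVersion mirrors the §5.1 whitelist {1, 3, 4, 5}.
func allowedSDVersion(v int) bool {
	switch v {
	case 1, 3, 4, 5:
		return true
	}
	return false
}

func main() {
	implicit := builderOpts{}                        // zero value falls back to the default
	explicit := builderOpts{SDVersionExplicit: true} // forces a literal version 0

	for _, o := range []builderOpts{implicit, explicit} {
		v := o.effectiveSDVersion()
		fmt.Printf("version=%d allowed=%v\n", v, allowedSDVersion(v))
	}
}
```

Without the boolean, a test could not distinguish "unset" from "version 0", so the v0 rejection path would be unreachable from the builder.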

Validation

| Metric | Round 3 | Round 4 | Δ |
|---|---|---|---|
| Test efficacy (gremlins) | 82.19% | 82.19% | |
| Mutator coverage | 85.76% | 85.76% | |
| Statement coverage | 79.2% | 79.2% | |
| Spec clauses with named test | 0 | 11 | +11 |
| Named attack-class tests | 0 | 6 | +6 |
| CI mutation gate | none | 80% threshold | new |

Mutation efficacy unchanged is expected — round 4 adds traceability and gating, not new code-path coverage. The earlier rounds did the bug-finding; this round documents and locks it in.

Test plan

  • CI green across all matrix entries
  • New `mutation` job passes the 80% gate
  • `go test ./pkg/cms -run 'TestRFC|TestAttack' -v` enumerates all 17 named tests
  • Future PRs that drop efficacy below 80% fail CI

🤖 Generated with Claude Code

Base automatically changed from feat/audit-r3-length-boundaries to main June 17, 2026 18:17

Round 4 of the audit-equivalent hardening shifts from per-mutant
interventions to *completeness as a property*. Three pieces:

1. RFC-traceable compliance tests (pkg/cms/rfc_compliance_test.go)

   Every test is named after the RFC and clause it covers, so an
   auditor can answer "what tests does this library have for RFC 5652
   §5.3?" with a single grep. Coverage spans:

     RFC 5652 §5.1  SignedData.Version whitelist {1, 3, 4, 5}
     RFC 5652 §5.3  SignerInfo.Version per SID form (IAS=1, SKI=3)
     RFC 5652 §5.3  SignatureAlgorithm must be Ed25519 OID
     RFC 5652 §5.4  Required signed attributes: contentType,
                    messageDigest
     RFC 5652 §10.1 DER canonical length encoding (no long-form for
                    values < 128; no leading zero in long form)
     RFC 5652 §11.1 eContentType must be id-data when signedAttrs
                    are absent (Case 2)
     RFC 8419 §3   Ed25519 + SHA-512 mandate; algorithm parameters
                   MUST be absent (passthrough delegates to existing
                   load-bearing tests)
     RFC 8032      EdDSA deterministic-signature property propagates
                   through the CMS encoder

   Most tests duplicate coverage that exists elsewhere — that's the
   point: the spec-mapping IS the contribution. Drift is now visible.

2. Named threat-class tests (pkg/cms/attack_scenarios_test.go)

   One test per documented CMS attack class, named for the attack it
   prevents. The library's defensive posture is now documented in the
   test suite itself, not just implied:

     - TestAttack_SignerInfoCrossMessage_Replay
     - TestAttack_KeyConfusion_DifferentKey_SameSubject
     - TestAttack_NoTrustedRoots_Denied
     - TestAttack_TrailingDataInjection
     - TestAttack_AlgorithmDowngrade_DigestVsActualBytes
     - TestAttack_AttachedEContent_RejectedForDetachedAPI

3. Mutation testing as a CI gate

   New 'mutation' job in .github/workflows/ci.yml runs gremlins with
   --threshold-efficacy 80 on every PR. The 80% floor sits just below
   the post-round-4 baseline of 82.19%; a drop below it means enough
   mutants survive to indicate test gaps that regress this branch's
   hardening work. New PRs that drop efficacy below 80% will fail this
   gate.

Builder cleanup:

   cms_builder_test.go now has SIVersionExplicit / SDVersionExplicit
   booleans so tests can force literal-0 versions instead of the "0 =
   use default" sentinel. Was needed for the §5.1 v0 rejection test.

Validation:

   Full suite (-race) passes at 79.2% statement coverage. Gremlins
   reports 82.19% test efficacy, unchanged from round 3: round 4 adds
   traceability and gating, not new code-path coverage.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@jamestexas jamestexas force-pushed the feat/audit-r4-completeness branch from 38505d8 to 4de63bd Compare June 17, 2026 18:28
@jamestexas jamestexas merged commit 1b122d0 into main Jun 17, 2026
8 checks passed
@jamestexas jamestexas deleted the feat/audit-r4-completeness branch June 17, 2026 18:35