test+ci(cms): RFC-traceable tests, named threat suite, mutation gate by jamestexas · Pull Request #16 · agentic-research/go-cms

jamestexas · 2026-06-17T18:03:22Z

Summary

Round 4 of the audit-equivalent hardening shifts framing from "find the next mutant" to completeness as a property.

Stacked on PR #15 — review #15 first; this PR's diff vs main includes round 3 length-boundary tests. The mergeable diff vs #15's branch is the round-4 content only.

What "complete" actually means here

After rounds 1–3, we hit 82% mutation efficacy with three real bugs fixed. But three structural gaps remained:

- No regression gate: efficacy could silently regress in the next PR.
- No spec traceability: an auditor couldn't grep "tests for RFC 5652 §5.3" and find them.
- No documented threat coverage: the lib's defenses were exercised by tests but never named.

This PR fills those gaps.

What's added

1. RFC-traceable compliance tests (`pkg/cms/rfc_compliance_test.go`)

Each test is named for the RFC clause it covers. Coverage spans:

RFC / Section	Test
RFC 5652 §5.1	`TestRFC5652_5_1_SignedDataVersion_Whitelist`
RFC 5652 §5.3	`TestRFC5652_5_3_SignerInfoVersion_PerSIDForm`
RFC 5652 §5.3	`TestRFC5652_5_3_SignatureAlgorithm_Ed25519`
RFC 5652 §5.4	`TestRFC5652_5_4_SignedAttributes_ContentTypeRequired`
RFC 5652 §5.4	`TestRFC5652_5_4_SignedAttributes_MessageDigestRequired`
RFC 5652 §10.1	`TestRFC5652_10_1_DERLength_NoLongFormForShortValues`
RFC 5652 §10.1	`TestRFC5652_10_1_DERLength_NoLeadingZeroInLongForm`
RFC 5652 §11.1	`TestRFC5652_11_1_eContentType_MustBeIdData_WhenAttrsAbsent`
RFC 8419 §3	`TestRFC8419_3_Ed25519_RequiresSHA512_WithSignedAttrs` (passthrough)
RFC 8419 §3	`TestRFC8419_3_Ed25519_AlgorithmParametersMustBeAbsent` (passthrough)
RFC 8032	`TestRFC8032_EdDSA_DeterministicSignature`

Most of these overlap with existing tests by design: the spec-mapping IS the contribution. If the behavior a clause requires regresses, the test named for that clause goes red.
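To make the two DER length rules named in the table concrete, here is a minimal, self-contained sketch of the kind of check those tests exercise. It is not the library's parser: `checkDERLength` is a hypothetical helper, and the cap of three length octets is an assumption made to keep the example small.

```go
package main

import (
	"errors"
	"fmt"
)

// checkDERLength is a hypothetical helper (not part of go-cms). It validates
// the DER length octets starting at b[0] and returns the decoded length and
// the number of octets consumed.
func checkDERLength(b []byte) (length, consumed int, err error) {
	if len(b) == 0 {
		return 0, 0, errors.New("empty length field")
	}
	first := b[0]
	if first < 0x80 {
		// Short form: one octet encodes lengths 0..127.
		return int(first), 1, nil
	}
	n := int(first & 0x7f) // number of length octets that follow
	if n == 0 {
		return 0, 0, errors.New("indefinite length is not allowed in DER")
	}
	// Assumption for this sketch: accept at most 3 length octets.
	if n > 3 || len(b) < 1+n {
		return 0, 0, errors.New("length field too long or truncated")
	}
	if b[1] == 0x00 {
		return 0, 0, errors.New("leading zero in long-form length")
	}
	for _, octet := range b[1 : 1+n] {
		length = length<<8 | int(octet)
	}
	if length < 0x80 {
		return 0, 0, errors.New("long form used for a value below 128")
	}
	return length, 1 + n, nil
}

func main() {
	inputs := [][]byte{
		{0x05},             // short form, valid
		{0x81, 0x80},       // long form for 128, valid
		{0x81, 0x05},       // long form for a short value, invalid
		{0x82, 0x00, 0x80}, // leading zero in long form, invalid
	}
	for _, in := range inputs {
		_, _, err := checkDERLength(in)
		fmt.Printf("% x -> %v\n", in, err)
	}
}
```

A test named for the clause would feed one non-canonical encoding per rule and assert that verification fails.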

2. Named threat-class tests (`pkg/cms/attack_scenarios_test.go`)

Attack class	Test
Cross-message signature replay	`TestAttack_SignerInfoCrossMessage_Replay`
Subject-name impersonation	`TestAttack_KeyConfusion_DifferentKey_SameSubject`
Trust-store bypass	`TestAttack_NoTrustedRoots_Denied`
Trailing-data smuggling	`TestAttack_TrailingDataInjection`
Digest-algorithm downgrade	`TestAttack_AlgorithmDowngrade_DigestVsActualBytes`
Attached/detached confusion	`TestAttack_AttachedEContent_RejectedForDetachedAPI`

3. Mutation testing as a CI gate

New `mutation` job in CI runs `gremlins` with `--threshold-efficacy 80`. PRs that drop test efficacy below 80% fail this gate.
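The arithmetic behind the gate can be sketched as follows, assuming gremlins' documented definition of test efficacy as killed mutants over killed plus lived mutants. The function names and the mutant counts are made up for illustration, and the exact comparison gremlins performs at the boundary is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// efficacy returns killed / (killed + lived) as a percentage.
// Assumption: this mirrors how gremlins defines test efficacy.
func efficacy(killed, lived int) float64 {
	total := killed + lived
	if total == 0 {
		return 0
	}
	return 100 * float64(killed) / float64(total)
}

// passesGate reports whether a run meets the threshold (inclusive here,
// which is an assumption about the boundary case).
func passesGate(killed, lived int, threshold float64) bool {
	return efficacy(killed, lived) >= threshold
}

func main() {
	const threshold = 80.0
	killed, lived := 41, 9 // hypothetical counts, not taken from this PR
	fmt.Printf("efficacy: %.2f%%\n", efficacy(killed, lived))
	if !passesGate(killed, lived, threshold) {
		fmt.Println("mutation gate failed")
		os.Exit(1)
	}
	fmt.Println("mutation gate passed")
}
```

With a floor of 80 and a baseline near 82%, a PR can introduce only a small number of additional surviving mutants before the job fails.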

4. Builder cleanup

`cms_builder_test.go` gets `SIVersionExplicit` / `SDVersionExplicit` booleans so tests can force literal-0 versions instead of the "0 = use default" sentinel. Needed for the §5.1 v0 rejection test.
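The sentinel problem and the fix can be sketched like this. Only the `SDVersionExplicit` name comes from this PR; the struct, the `SDVersion` field, the `sdVersion` method, and the default value of 1 are assumptions for illustration and may not match the real builder.

```go
package main

import "fmt"

// defaultSDVersion is an assumed default, chosen only for this sketch.
const defaultSDVersion = 1

// builderOpts is a hypothetical stand-in for the test builder's options.
type builderOpts struct {
	SDVersion         int  // 0 normally means "use the default"
	SDVersionExplicit bool // when true, SDVersion is used literally, even if 0
}

// sdVersion resolves the version the builder would encode.
func (o builderOpts) sdVersion() int {
	if o.SDVersion == 0 && !o.SDVersionExplicit {
		return defaultSDVersion
	}
	return o.SDVersion
}

func main() {
	fmt.Println(builderOpts{}.sdVersion())                        // default applies
	fmt.Println(builderOpts{SDVersionExplicit: true}.sdVersion()) // literal 0
	fmt.Println(builderOpts{SDVersion: 3}.sdVersion())            // explicit non-zero
}
```

Without the boolean, a test cannot distinguish "caller did not set a version" from "caller wants version 0", so a v0 rejection test could never produce a v0 message.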

Validation

Metric	Round 3	Round 4	Δ
Test efficacy (gremlins)	82.19%	82.19%	—
Mutator coverage	85.76%	85.76%	—
Statement coverage	79.2%	79.2%	—
Spec clauses with named test	0	11	+11
Named attack-class tests	0	6	+6
CI mutation gate	none	80% threshold	new

Unchanged mutation efficacy is expected: round 4 adds traceability and gating, not new code-path coverage. The earlier rounds did the bug-finding; this round documents and locks it in.

Test plan

CI green across all matrix entries
New `mutation` job passes the 80% gate
`go test ./pkg/cms -run 'TestRFC|TestAttack' -v` enumerates all 17 named tests
Future PRs that drop efficacy below 80% fail CI

🤖 Generated with Claude Code

Round 4 of the audit-equivalent hardening shifts from per-mutant interventions to *completeness as a property*. Three pieces:

1. RFC-traceable compliance tests (pkg/cms/rfc_compliance_test.go)

Every test is named after the RFC and clause it covers, so an auditor can answer "what tests does this library have for RFC 5652 §5.3?" with a single grep. Coverage spans:

- RFC 5652 §5.1: SignedData.Version whitelist {1, 3, 4, 5}
- RFC 5652 §5.3: SignerInfo.Version per SID form (IAS=1, SKI=3)
- RFC 5652 §5.3: SignatureAlgorithm must be Ed25519 OID
- RFC 5652 §5.4: Required signed attributes: contentType, messageDigest
- RFC 5652 §10.1: DER canonical length encoding (no long-form for values < 128; no leading zero in long form)
- RFC 5652 §11.1: eContentType must be id-data when signedAttrs are absent (Case 2)
- RFC 8419 §3: Ed25519 + SHA-512 mandate; algorithm parameters MUST be absent (passthrough delegates to existing load-bearing tests)
- RFC 8032: EdDSA deterministic-signature property propagates through the CMS encoder

Most tests duplicate coverage that exists elsewhere. That's the point: the spec-mapping IS the contribution. Drift is now visible.

2. Named threat-class tests (pkg/cms/attack_scenarios_test.go)

One test per documented CMS attack class, named for the attack it prevents. The library's defensive posture is now documented in the test suite itself, not just implied:

- TestAttack_SignerInfoCrossMessage_Replay
- TestAttack_KeyConfusion_DifferentKey_SameSubject
- TestAttack_NoTrustedRoots_Denied
- TestAttack_TrailingDataInjection
- TestAttack_AlgorithmDowngrade_DigestVsActualBytes
- TestAttack_AttachedEContent_RejectedForDetachedAPI

3. Mutation testing as a CI gate

New 'mutation' job in .github/workflows/ci.yml runs gremlins with --threshold-efficacy 80 on every PR. The 80% floor reflects the post-round-4 baseline; surviving mutants past this threshold indicate test gaps that regress this branch's hardening work. New PRs that drop efficacy below 80% will fail this gate.

Builder cleanup: cms_builder_test.go now has SIVersionExplicit / SDVersionExplicit booleans so tests can force literal-0 versions instead of the "0 = use default" sentinel. Was needed for the §5.1 v0 rejection test.

Validation: Full suite (-race) passes at 79.2% statement coverage. Gremlins reports 82.19% test efficacy, unchanged from round 3; round 4 adds traceability and gating, not new code-path coverage.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Base automatically changed from feat/audit-r3-length-boundaries to main June 17, 2026 18:17

jamestexas force-pushed the feat/audit-r4-completeness branch from 38505d8 to 4de63bd Compare June 17, 2026 18:28

jamestexas mentioned this pull request Jun 17, 2026

test+docs(cms): internal pkg tests, mutation baseline, honest README #17

Merged

4 tasks

jamestexas merged commit 1b122d0 into main Jun 17, 2026
8 checks passed

jamestexas deleted the feat/audit-r4-completeness branch June 17, 2026 18:35

jamestexas merged 1 commit into main from feat/audit-r4-completeness