docs(skills): add Gherkin scenarios to every SKILL.md requirement #58
Merged
The spec-coverage audit walks `### Requirement:` blocks in vendor/openspec/skills/ same as it walks vendor/openspec/specs/. The spec format docs (and the new spec-review-checklist landed in PR #56) treat `#### Scenario:` blocks as a load-bearing part of the requirement shape: GIVEN/WHEN/THEN is the test author's blueprint. Without it, the requirement is too vague to pin.

Before this commit, every SKILL.md had `### Requirement:` blocks WITHOUT matching scenarios: 36 requirements with no Gherkin shape. Tests still tagged them (the audit was green), but the spec format was inconsistent: spec.md files have scenarios, SKILL.md files didn't.

This commit closes the inconsistency. Per-app skills (outline-admin, penpot-admin, plane-admin, surfsense-admin, twenty-admin) gain 16 scenarios across 16 requirements; security-hardening gains 20 scenarios across 20 requirements.

Each scenario follows the same discipline:

- Observable from outside (no DB introspection)
- Names the failure mode the assertion catches
- References the suite's existing constants and conventions (COOKIE_DOMAIN, PLATFORM_DOMAIN, NORMAL_USER, FOSS_USER, SURFSENSE_SEARCH_SPACE_ID, etc.)
- Maps cleanly onto the existing test files that already pin the requirement (verified via existing @SPEC tags)

Counts per file:

| File | Requirements | Scenarios |
| --- | --- | --- |
| outline-admin/SKILL.md | 3 | 3 |
| penpot-admin/SKILL.md | 3 | 3 |
| plane-admin/SKILL.md | 4 | 4 |
| surfsense-admin/SKILL.md | 3 | 3 |
| twenty-admin/SKILL.md | 3 | 3 |
| security-hardening/SKILL.md | 20 | 20 |
| TOTAL | 36 | 36 |

app-rules/SKILL.md untouched: it has no `### Requirement:` blocks (it's a reference doc, not a contract).

No code or test changes. Pure spec-shape harmonisation. Audit remains 88 covered/deferred/missing total.

# Why this matters

1. **Future test authors have a blueprint.** When someone adds a new test against a SKILL requirement, the scenario block is the test's design doc: GIVEN this state, WHEN this trigger, THEN this observable.
2. **The spec-review checklist now applies uniformly.** PR #56's "Part A: every requirement needs a Gherkin scenario" item is enforceable against SKILL.md files as well as spec.md files.
3. **Drift detection has more surface.** A scenario whose observables stop matching reality (because the app's behaviour drifted) becomes spec-edit-shaped, not silent.

# Out of scope (follow-ups)

- Extending the audit script to parse and validate scenario shape (today the audit only counts `### Requirement:` blocks; a future pass could enforce "every requirement has at least one scenario" via the same grep + count).
- Promoting some "outside the spec-driven loop" allowlist entries to tagged tests now that the scenario blueprints exist for them.
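The first follow-up (enforcing "every requirement has at least one scenario") could look roughly like the sketch below. It is not the repository's audit script. The function name `requirements_without_scenarios` and the sample requirement text are hypothetical, and the only thing assumed about the spec format is the `### Requirement:` / `#### Scenario:` heading convention described above.

```python
import re

# Heading patterns taken from the convention described in this PR.
REQ_RE = re.compile(r"^### Requirement:\s*(.+)$")
SCN_RE = re.compile(r"^#### Scenario:")


def requirements_without_scenarios(markdown: str) -> list[str]:
    """Return the titles of requirements that have no scenario block.

    A scenario is attributed to the closest preceding requirement heading.
    Limitation of this sketch: other headings do not end a requirement.
    """
    missing: list[str] = []
    current = None          # title of the requirement being scanned
    has_scenario = False    # whether `current` has seen a scenario yet

    for line in markdown.splitlines():
        req = REQ_RE.match(line)
        if req:
            # A new requirement starts: close out the previous one.
            if current is not None and not has_scenario:
                missing.append(current)
            current = req.group(1).strip()
            has_scenario = False
        elif current is not None and SCN_RE.match(line):
            has_scenario = True

    # Close out the last requirement in the file.
    if current is not None and not has_scenario:
        missing.append(current)
    return missing


# Hypothetical SKILL.md fragment, for illustration only.
SAMPLE = """\
### Requirement: Session cookie is scoped to the platform domain
#### Scenario: Cookie domain after login
- GIVEN a logged-in NORMAL_USER
- WHEN the session cookie is inspected from the browser
- THEN its domain equals COOKIE_DOMAIN

### Requirement: Logout clears the session
"""

print(requirements_without_scenarios(SAMPLE))
```

On the sample, only the second requirement is reported, since it has no `#### Scenario:` heading beneath it. An audit step could fail the build whenever the returned list is non-empty for any SKILL.md or spec.md file.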
SSO Spec Coverage Audit
Contract source:
All 88 spec requirements are accounted for.