docs(skills): add Gherkin scenarios to every SKILL.md requirement #58

Merged: awais786 merged 1 commit into main from skill-scenarios on Jun 4, 2026

awais786 (Collaborator) commented on Jun 4, 2026

The spec-coverage audit walks `### Requirement:` blocks in vendor/openspec/skills/ the same way it walks vendor/openspec/specs/. The spec format docs (and the new spec-review checklist that landed in PR #56) treat `#### Scenario:` blocks as a load-bearing part of the requirement shape: GIVEN/WHEN/THEN is the test author's blueprint. Without it, the requirement is too vague to pin.
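To make the shape concrete, here is a minimal Python sketch that pulls the GIVEN/WHEN/THEN steps out of a requirement block. The requirement title, the scenario wording, and the bulleted step style in `SAMPLE` are illustrative assumptions, not text copied from any SKILL.md; only the `### Requirement:` and `#### Scenario:` heading markers and the constant names come from this PR.

```python
import re

# Hypothetical requirement block, written only to illustrate the shape.
SAMPLE = """\
### Requirement: Session cookie is scoped to the platform domain

#### Scenario: Login sets a domain-scoped cookie
- GIVEN a signed-out NORMAL_USER
- WHEN the user completes login
- THEN the response sets a cookie whose Domain equals COOKIE_DOMAIN
"""

# A step is an optional list dash followed by a Gherkin keyword.
STEP_RE = re.compile(r"^\s*-?\s*(GIVEN|WHEN|THEN|AND)\b\s*(.*)$")


def scenario_steps(markdown: str) -> list[tuple[str, str]]:
    """Return (keyword, text) pairs for every Gherkin step line found."""
    steps = []
    for line in markdown.splitlines():
        match = STEP_RE.match(line)
        if match:
            steps.append((match.group(1), match.group(2).strip()))
    return steps


steps = scenario_steps(SAMPLE)
for keyword, text in steps:
    print(f"{keyword}: {text}")
```

Each extracted step maps onto one part of a test: setup, trigger, and the externally observable assertion.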

Before this commit, every SKILL.md that defines requirements had `### Requirement:` blocks without matching scenarios: 36 requirements with no Gherkin shape. Tests still tagged them (the audit was green), but the spec format was inconsistent: spec.md files had scenarios, SKILL.md files didn't.

This commit closes the inconsistency. Per-app skills (outline-admin, penpot-admin, plane-admin, surfsense-admin, twenty-admin) gain 16 scenarios across 16 requirements; security-hardening gains 20 scenarios across 20 requirements.

Each scenario follows the same discipline:

  • Observable from outside (no DB introspection)
  • Names the failure mode the assertion catches
  • References the suite's existing constants and conventions (COOKIE_DOMAIN, PLATFORM_DOMAIN, NORMAL_USER, FOSS_USER, SURFSENSE_SEARCH_SPACE_ID, etc.)
  • Maps cleanly onto the existing test files that already pin the requirement (verified via existing @SPEC tags)

Counts per file:

| File | Requirements | Scenarios |
|---|---|---|
| outline-admin/SKILL.md | 3 | 3 |
| penpot-admin/SKILL.md | 3 | 3 |
| plane-admin/SKILL.md | 4 | 4 |
| surfsense-admin/SKILL.md | 3 | 3 |
| twenty-admin/SKILL.md | 3 | 3 |
| security-hardening/SKILL.md | 20 | 20 |
| TOTAL | 36 | 36 |
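As a quick arithmetic check on the figures above, the per-file counts can be summed in a few lines of Python. The dictionary simply restates the counts listed in this PR; nothing is read from the repository.

```python
# Requirement counts per file, as listed in this PR (scenarios match 1:1).
counts = {
    "outline-admin/SKILL.md": 3,
    "penpot-admin/SKILL.md": 3,
    "plane-admin/SKILL.md": 4,
    "surfsense-admin/SKILL.md": 3,
    "twenty-admin/SKILL.md": 3,
    "security-hardening/SKILL.md": 20,
}

# Per-app skills are everything except security-hardening.
per_app_total = sum(
    n for path, n in counts.items() if not path.startswith("security-hardening/")
)
grand_total = sum(counts.values())

print(per_app_total, grand_total)  # 16 36
```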

app-rules/SKILL.md is untouched: it has no `### Requirement:` blocks (it's a reference doc, not a contract).

No code or test changes; this is pure spec-shape harmonisation. The audit total remains 88 requirements across the covered, deferred, and missing buckets.

Why this matters

  1. Future test authors have a blueprint. When someone adds a new test against a SKILL requirement, the scenario block is the test's design doc: GIVEN this state, WHEN this trigger, THEN this observable.

  2. The spec-review checklist now applies uniformly. The "Part A: every requirement needs a Gherkin scenario" item from PR #56 ("docs: add spec & test shape-review checklist (presence vs shape gap)") is enforceable against SKILL.md files as well as spec.md files.

  3. Drift detection has more surface. When a scenario's observables stop matching reality (because the app's behaviour drifted), the mismatch now calls for a visible spec edit instead of going unnoticed.

Out of scope (follow-ups)

  • Extending the audit script to parse and validate scenario shape (today the audit only counts `### Requirement:` blocks; a future pass could enforce "every requirement has at least one scenario" via the same grep + count).
  • Promoting some "outside the spec-driven loop" allowlist entries to tagged tests now that the scenario blueprints exist for them.
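The first follow-up could look roughly like the Python sketch below. It is not the existing audit script: the function name and the sample document are hypothetical, and the only assumption about the real files is that requirements start with `### Requirement:` and scenarios with `#### Scenario:`.

```python
import re

REQ_RE = re.compile(r"^### Requirement:\s*(.+)$")
SCN_RE = re.compile(r"^#### Scenario:")


def requirements_without_scenarios(markdown: str) -> list[str]:
    """Return the titles of requirements with no scenario before the next requirement."""
    missing = []
    current = None        # title of the requirement being scanned
    has_scenario = False  # whether a scenario was seen under it
    for line in markdown.splitlines():
        req = REQ_RE.match(line)
        if req:
            # Close out the previous requirement before starting a new one.
            if current is not None and not has_scenario:
                missing.append(current)
            current = req.group(1).strip()
            has_scenario = False
        elif current is not None and SCN_RE.match(line):
            has_scenario = True
    if current is not None and not has_scenario:
        missing.append(current)
    return missing


# Hypothetical document: the second requirement has no scenario.
DOC = """\
### Requirement: First
#### Scenario: Happy path
- GIVEN a state
- WHEN a trigger
- THEN an observable

### Requirement: Second
Some prose, but no scenario.
"""

print(requirements_without_scenarios(DOC))  # ['Second']
```

A real implementation would also need to decide whether other headings end a requirement block; this sketch only treats the next `### Requirement:` line as the boundary.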

github-actions (bot) commented on Jun 4, 2026

SSO Spec Coverage Audit

Contract source: vendor/openspec/specs/ (snapshot vendored in this repo)

| Status | Count |
|---|---|
| ✅ Covered (test tag found) | 63 |
| ⚠️ Deferred (in deferred doc) | 25 |
| ❌ Missing (no tag, no defer) | 0 |
| Total requirements | 88 |

All 88 spec requirements are accounted for.

awais786 merged commit d7177ef into main on Jun 4, 2026 (3 checks passed)