Thanks for wanting to add or improve a skill. Here's what to know.
Each skill lives in its own folder:
```
skills/
  <skill-name>/
    SKILL.md        # Required — the skill content
    FORMS.md        # Optional — structured input collection for agents
    assets/         # Optional — literal templates and static files
    scripts/        # Optional — executable helpers
    references/     # Optional — deeper docs loaded on demand
    evals/          # Required for repo-managed skills — test prompts for validation
      evals.json
```
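If you are creating a new skill by hand, a minimal scaffold sketch in PowerShell (the `my-skill` name is a placeholder, not a repo convention):

```powershell
# Create the minimal required layout for a new repo-managed skill
$name = 'my-skill'  # placeholder skill name
New-Item -ItemType Directory -Path "skills/$name/evals" -Force | Out-Null
New-Item -ItemType File -Path "skills/$name/SKILL.md" -Force | Out-Null
New-Item -ItemType File -Path "skills/$name/evals/evals.json" -Force | Out-Null
```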
Repo-managed skills should be mirrored across all three locations:
- `skills/<name>/` in this repo
- `~/.claude/skills/<name>/`
- `~/.agents/skills/<name>/`
If you edit a local install copy first, copy the changed files back into the repo and into the other local install so every agent sees the same skill version.
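One way to do that sync from the repo root, assuming the repo checkout is the canonical copy (`$skill` is a placeholder):

```powershell
# Copy the repo version of a skill into both local install locations
$skill = '<skill-name>'
foreach ($dest in "$HOME/.claude/skills/$skill", "$HOME/.agents/skills/$skill") {
    New-Item -ItemType Directory -Path $dest -Force | Out-Null
    Copy-Item -Path "skills/$skill/*" -Destination $dest -Recurse -Force
}
```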
Every `SKILL.md` must start with a YAML front matter block:

```yaml
---
name: your-skill-name
description: >
  One or two sentences describing what this skill does and when the AI
  should automatically invoke it. Be specific about trigger phrases and
  use cases — this description is what the AI reads to decide whether
  to load the skill.
---
```

The rest of the file is free-form Markdown. Include:
- When to use — what scenarios or requests trigger this skill
- Rules / conventions — the core content the AI should follow
- Examples — good and bad, so the AI can calibrate
- Prerequisites — anything the human needs to set up first (tools, config, etc.)
- Skill folder and `name` field: kebab-case
- Be specific — `git-bot-commits` is better than `git` or `commits`
- Avoid version numbers in names; use the description to note maturity
The description is the most important field — it's how the AI decides to load the skill. Include:
- What the skill enables
- Specific trigger phrases (e.g. "Use when user says 'commit this' or 'stage changes'")
- What it enforces or prevents
Evals let you verify the skill works and measure improvement over a baseline. Every repo-managed skill in this repository must include `evals/evals.json`:

```json
{
  "skill_name": "your-skill-name",
  "evals": [
    {
      "id": 0,
      "prompt": "The user message to test against",
      "expected_output": "What a correct response looks like — used for manual or automated grading",
      "files": ["evals/files/example.md"]
    }
  ]
}
```

`files` is optional. When present, list one or more fixture files relative to `skills/<name>/`. A common pattern is to store those fixtures under `evals/files/` so benchmark runners can copy or attach the same source inputs for both `with_skill` and `without_skill` runs.
Aim for 3–5 evals that cover distinct scenarios: happy path, edge cases, and cases where the skill should not do something.
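For a quick sanity check before opening a pull request, a hedged ad-hoc snippet (not a repo script) that verifies `skill_name` and any `files` entries:

```powershell
# Ad-hoc check: skill_name matches the folder and listed fixtures exist
$skillDir = 'skills/<skill-name>'  # placeholder path
$evals = Get-Content (Join-Path $skillDir 'evals/evals.json') -Raw | ConvertFrom-Json
if ($evals.skill_name -ne (Split-Path -Leaf $skillDir)) { throw 'skill_name mismatch' }
foreach ($eval in $evals.evals) {
    foreach ($file in @($eval.files)) {
        if ($file -and -not (Test-Path (Join-Path $skillDir $file))) {
            throw "missing fixture: $file"
        }
    }
}
```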
Run evals from a temp workspace, not from this repository:
```powershell
$workspace = Join-Path $env:TEMP '<skill-name>-workspace'
```

When creating or modifying a repo-managed skill, the eval workflow must include a paired comparison:
- Resolve the installed Anthropic `skill-creator` path first, usually under `~/.agents/skills/skill-creator/` or `~/.claude/skills/skill-creator/`, then run its benchmark scripts from there (see the sketch after this list)
- Run each eval as `with_skill`
- Run the baseline as `without_skill` for new skills
- For an existing skill, use either `without_skill` or the previous/original skill version as the baseline, following the `skill-creator` benchmark model
- Aggregate the results into `benchmark.json`
- Launch `eval-viewer/generate_review.py` from that installed `skill-creator` copy so a human can review both `Outputs` and `Benchmark`
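A minimal sketch of the setup half of that workflow, assuming `skill-creator` is installed in one of the usual locations (variable names here are illustrative, not part of any repo script):

```powershell
# Resolve the installed skill-creator copy; prefer ~/.agents, fall back to ~/.claude
$candidates = @(
    (Join-Path $HOME '.agents/skills/skill-creator'),
    (Join-Path $HOME '.claude/skills/skill-creator')
)
$skillCreator = $candidates | Where-Object { Test-Path $_ } | Select-Object -First 1
if (-not $skillCreator) { throw 'skill-creator is not installed locally' }

# Create the temp eval workspace so benchmark artifacts stay out of the repo
$workspace = Join-Path $env:TEMP '<skill-name>-workspace'
New-Item -ItemType Directory -Path $workspace -Force | Out-Null
```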
This repo treats that paired `with_skill` / `without_skill` comparison as a required part of the development workflow for skill changes. The benchmark artifacts live in the temp workspace; do not commit them to this repository unless the change explicitly calls for checked-in examples.
For scaffold/template skills, keep deterministic validators alongside evals. In this repo, `evals/evals.json` is mandatory, and validators like `scripts/validate-skill-templates.ps1` are additional protection.
When a skill needs defaults for versions, paths, repository names, or support windows, prefer deriving them from a reliable source instead of baking in values that will drift.
- Good sources: git metadata, repo folder names, environment values, official JSON feeds, vendor docs APIs
- Use hardcoded examples as examples only — not as the real defaulting mechanism — when the value can be computed (see the sketch below)
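A hedged illustration of deriving such defaults; it assumes a git checkout and uses placeholder variable names:

```powershell
# Derive defaults from git metadata rather than baking in values that drift
$repoRoot   = git rev-parse --show-toplevel      # absolute path of the checkout
$repoName   = Split-Path -Leaf $repoRoot         # repository/folder name
$currentRef = git rev-parse --abbrev-ref HEAD    # current branch name
```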
Use the repo validation harness before submitting scaffold or template changes:
```powershell
powershell -NoProfile -ExecutionPolicy Bypass -File .\scripts\validate-skill-templates.ps1
```

Run the validator locally first for the fastest feedback loop. GitHub Actions also runs the same script on pull requests, but CI is the backstop, not the primary authoring loop.
To compare a change against the initial imported version, run the same harness against a git ref:
```powershell
powershell -NoProfile -ExecutionPolicy Bypass -File .\scripts\validate-skill-templates.ps1 -Ref HEAD
```
Before submitting, check that:

- `SKILL.md` has valid front matter with `name` and `description`
- Skill is stack-agnostic (or clearly scoped to a specific tech in the name/description)
- Examples are generic — no personal emails, usernames, or project-specific identifiers
- At least one eval in `evals/evals.json`
- The skill's `evals/evals.json` exists and its `skill_name` matches the folder/front matter name
- Any optional `files` entries in `evals/evals.json` point to real fixture files under the same skill folder
- Skill changes were benchmarked from a temp workspace with both `with_skill` and `without_skill` runs
- `benchmark.json` and `eval-viewer/generate_review.py` from the installed Anthropic `skill-creator` copy were used so a human could compare `Outputs` and `Benchmark`
- `scripts/validate-skill-templates.ps1` passes for the current working tree when changing scaffold or template behavior
- If CI is enabled for the branch, the GitHub Actions validation job passes too
- Skill evals are intended to run from `$env:TEMP/<skill-name>-workspace/`, not from inside the repo
- Changed skill files are synced across `skills/<name>/`, `~/.claude/skills/<name>/`, and `~/.agents/skills/<name>/`
- Skill added to the table in `README.md`