Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 8 additions & 2 deletions .claude-plugin/marketplace.json
Original file line number Diff line number Diff line change
Expand Up @@ -437,8 +437,14 @@
{
"name": "skillopt",
"source": "./commands/skillopt",
"description": "0.4.0 (2026-06-07) BACKGROUND TRAIN + HARDER EXAMPLE — `/skillopt train --bg` detaches the run (nohup → log under ~/.sidecar/skillopt/), returns immediately; `/skillopt status` shows running-state + score/step progress, `/skillopt log` tails. The bundled `examples/toyqa` dataset swapped to 6 format-sensitive QA items (chemical symbols, ISO codes, rounding…) that an EMPTY skill answers in a full sentence → STRICT exact-match fails → a real learning gradient (the optimizer learns a 'reply with only the value' rule, then the held-out gate rises); train_size 6 · batch 3 · sel 5 for stronger signal. 0.3.0 (2026-06-07) EXECUTABLE CLI — `/skillopt` now runs directly via a `!`-exec dispatcher (`bin/skillopt.sh`, prefs-style: resolves its own cached install dir, so the user never types a long path). `/skillopt train` runs the loop in one token; `/skillopt doctor|ckpts|consume <skill.md>|help` all dispatch. Honest 0-edit runs reported as such; claude -p clarified = subscription (NOT metered). 0.2.0 (2026-06-07) sidecar-OWNED env adapter — the only domain code (run a task + score it) now ships IN the plugin at `commands/skillopt/examples/<domain>/`, NOT a clone of the upstream package; `bin/skillopt_run.py` injects sidecar adapters into the upstream hard-coded `_ENV_REGISTRY` at runtime (additive · survives `_register_builtins`) then runs the upstream trainer. Bundled reference `examples/toyqa/` = 4-item exact-match QA proving the loop runs end-to-end on local `claude -p` (no API key, no external data — target+optimizer both shell out to the Claude Code CLI = subscription, NOT metered); `examples/_base/default.yaml` vendors the upstream base config so examples are self-contained (plain `pip install skillopt` ships no `configs/`). Verified: baseline→rollout→reflect(on failures)→gate→test execute against real claude calls; an edit lands only when the optimizer judges a failure generalizable AND the held-out gate improves (no forced edit). 0.1.0 (2026-06-07) initial — /skillopt drives SkillOpt (microsoft/SkillOpt · `pip install skillopt`, arXiv:2605.23904), the text-space optimizer that trains a natural-language SKILL DOCUMENT for a frozen Claude Code agent via rollout → reflect → edit → held-out gate (DL analogy: skill.md = weights · rollout = forward · reflect = backprop · gate = validation early-stop). Subverbs — doctor (pip pkg + claude CLI + harness wiring readiness) · ckpts (list the package's bundled pretrained skill.md artifacts) · consume <skill.md> (load a trained skill into THIS session as additive system guidance) · train <config.yaml> (run the loop with Claude Code as the target harness — env TARGET_BACKEND=claude_code_exec · CLAUDE_SETTING_SOURCES=user,project so the sidecar tapes ride along as FIXED scaffolding; refuses to run without a real scoring env adapter — no fabricated scores) · help. HARD INVARIANT: only the skill.md is optimized — the model, the governance tapes (= SkillOpt's fixed prompts/*_system.md), and every *-guard safety hook stay UNCHANGED; the trained skill is a SEPARATE per-domain file injected via --append-system-prompt; /skillopt never edits a .tape or a guard (a held-out UTILITY gate is not a SAFETY check — kept orthogonal). Wraps the upstream pip CLI; does not vendor or fork it. Companion: microsoft/SkillLens (arXiv:2605.23899).",
"version": "0.4.0"
"description": "0.5.0 (2026-06-07) AUTO-USE (activate) — `/skillopt activate <skill.md>` copies a trained skill to the SSOT ~/.sidecar/skillopt/active-skill.md so the companion `skillopt-hook` auto-injects it at every SessionStart (no command needed); `/skillopt deactivate` removes it, `/skillopt agent-active on|off` toggles a one-line nudge that tells the agent to PROPOSE (never auto-run) `/skillopt train` for repeatable scored tasks. The USE-vs-TRAIN split: USE is automatic + opt-in + cheap; TRAIN stays a command/agent decision (cost-bearing). 0.4.0 (2026-06-07) BACKGROUND TRAIN + HARDER EXAMPLE — `/skillopt train --bg` detaches the run (nohup → log under ~/.sidecar/skillopt/), returns immediately; `/skillopt status` shows running-state + score/step progress, `/skillopt log` tails. The bundled `examples/toyqa` dataset swapped to 6 format-sensitive QA items (chemical symbols, ISO codes, rounding…) that an EMPTY skill answers in a full sentence → STRICT exact-match fails → a real learning gradient (the optimizer learns a 'reply with only the value' rule, then the held-out gate rises); train_size 6 · batch 3 · sel 5 for stronger signal. 0.3.0 (2026-06-07) EXECUTABLE CLI — `/skillopt` now runs directly via a `!`-exec dispatcher (`bin/skillopt.sh`, prefs-style: resolves its own cached install dir, so the user never types a long path). `/skillopt train` runs the loop in one token; `/skillopt doctor|ckpts|consume <skill.md>|help` all dispatch. Honest 0-edit runs reported as such; claude -p clarified = subscription (NOT metered). 0.2.0 (2026-06-07) sidecar-OWNED env adapter — the only domain code (run a task + score it) now ships IN the plugin at `commands/skillopt/examples/<domain>/`, NOT a clone of the upstream package; `bin/skillopt_run.py` injects sidecar adapters into the upstream hard-coded `_ENV_REGISTRY` at runtime (additive · survives `_register_builtins`) then runs the upstream trainer. Bundled reference `examples/toyqa/` = 4-item exact-match QA proving the loop runs end-to-end on local `claude -p` (no API key, no external data — target+optimizer both shell out to the Claude Code CLI = subscription, NOT metered); `examples/_base/default.yaml` vendors the upstream base config so examples are self-contained (plain `pip install skillopt` ships no `configs/`). Verified: baseline→rollout→reflect(on failures)→gate→test execute against real claude calls; an edit lands only when the optimizer judges a failure generalizable AND the held-out gate improves (no forced edit). 0.1.0 (2026-06-07) initial — /skillopt drives SkillOpt (microsoft/SkillOpt · `pip install skillopt`, arXiv:2605.23904), the text-space optimizer that trains a natural-language SKILL DOCUMENT for a frozen Claude Code agent via rollout → reflect → edit → held-out gate (DL analogy: skill.md = weights · rollout = forward · reflect = backprop · gate = validation early-stop). Subverbs — doctor (pip pkg + claude CLI + harness wiring readiness) · ckpts (list the package's bundled pretrained skill.md artifacts) · consume <skill.md> (load a trained skill into THIS session as additive system guidance) · train <config.yaml> (run the loop with Claude Code as the target harness — env TARGET_BACKEND=claude_code_exec · CLAUDE_SETTING_SOURCES=user,project so the sidecar tapes ride along as FIXED scaffolding; refuses to run without a real scoring env adapter — no fabricated scores) · help. HARD INVARIANT: only the skill.md is optimized — the model, the governance tapes (= SkillOpt's fixed prompts/*_system.md), and every *-guard safety hook stay UNCHANGED; the trained skill is a SEPARATE per-domain file injected via --append-system-prompt; /skillopt never edits a .tape or a guard (a held-out UTILITY gate is not a SAFETY check — kept orthogonal). Wraps the upstream pip CLI; does not vendor or fork it. Companion: microsoft/SkillLens (arXiv:2605.23899).",
"version": "0.5.0"
},
{
"name": "skillopt-hook",
"source": "./hooks/skillopt-hook",
"description": "0.1.0 (2026-06-07) SessionStart hook — auto-USES a trained skill without a command. If the user activated a learned skill (`/skillopt activate <skill.md>` → ~/.sidecar/skillopt/active-skill.md), injects it as additionalContext at session start so the agent applies it automatically (prefs-hook split: the command writes the SSOT, the hook auto-injects). OPT-IN + safe: silent when nothing is activated; NEVER trains (training is cost-bearing — stays a command/agent decision); always exits 0 (fail-open). Optional `~/.sidecar/skillopt/agent-active` marker adds a one-line nudge to PROPOSE (never auto-run) `/skillopt train` for repeatable auto-scorable tasks. Hook half of the skillopt USE-vs-TRAIN split; the `skillopt` command plugin is the TRAIN/activate half. NO env opt-out — the SSOT files are the switch.",
"version": "0.1.0"
},
{
"name": "sidecar",
Expand Down
15 changes: 15 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,21 @@ For the full audit trail, see `git log`.

---

## 2026-06-07 — 🎓 skillopt 0.5.0 + skillopt-hook 0.1.0 — 자동 사용(하이브리드)

"명령어 안 치고 자동으로" + "에이전트 적극 활용"을 USE-vs-TRAIN 분리로 구현.
**쓰기(USE)는 자동·opt-in·저렴 / 배우기(TRAIN)는 명령·에이전트 판단(비용)**.

- 🪝 **skillopt-hook 0.1.0** (신규 hook) — SessionStart 에서 활성 스킬을 자동 주입.
`~/.sidecar/skillopt/active-skill.md` 있으면 additionalContext 로 주입(명령 불필요),
없으면 침묵. 학습은 절대 안 함(비용) · 항상 exit 0(fail-open). 선택 마커
`~/.sidecar/skillopt/agent-active` 면 "반복+채점가능 작업엔 /skillopt train 을 제안
(자동실행 금지)" 한 줄 nudge 추가. prefs 의 command/hook 분리 패턴과 동형.
- 🎓 **skillopt 0.5.0** — `/skillopt activate <skill.md>`(SSOT 기록 → 자동 사용 ON) ·
`/skillopt deactivate`(OFF) · `/skillopt agent-active on|off`(nudge 토글) 추가.
- 검증: activate→훅 JSON(hookEventName=SessionStart + 스킬 본문) 정상 · agent-active
nudge 추가 · deactivate 후 출력 길이 0(침묵). g22 lockstep(plugin+marketplace) + CHANGELOG.

## 2026-06-07 — 🎓 skillopt 0.4.0 — 백그라운드 학습 + 더 어려운 예제 (SKILLOPT.easy §8 구현)

SKILLOPT.easy.md 의 "더 다듬을 거리" 2개를 구현.
Expand Down
2 changes: 1 addition & 1 deletion commands/skillopt/.claude-plugin/plugin.json
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
{
"name": "skillopt",
"description": "/skillopt — drive SkillOpt (microsoft/SkillOpt · `pip install skillopt`), the text-space optimizer that trains a natural-language SKILL DOCUMENT for a frozen Claude Code agent via rollout → reflect → edit → held-out gate (the DL analogy: the skill.md is the 'weights', rollout=forward, reflect=backprop, gate=validation early-stop). Subverbs — doctor (install + readiness check: pip pkg + claude CLI) · ckpts (list the bundled pretrained skill.md artifacts) · consume <skill.md> (load a trained skill into THIS session as system guidance) · train <config.yaml> (run the optimization loop with Claude Code as the target harness — env TARGET_BACKEND=claude_code_exec, CLAUDE_SETTING_SOURCES=user,project; needs a scoring env adapter) · help. ONLY the skill.md changes — the model, the sidecar governance tapes (= SkillOpt's fixed prompts/*_system.md scaffolding), and the *-guard safety hooks all stay fixed. Never auto-edits governance; it produces a SEPARATE per-domain skill.md the harness injects via --append-system-prompt. Wraps the upstream pip CLI; does not vendor it.",
"version": "0.4.0",
"version": "0.5.0",
"author": {
"name": "dancinlab"
},
Expand Down
30 changes: 30 additions & 0 deletions commands/skillopt/bin/skillopt.sh
Original file line number Diff line number Diff line change
Expand Up @@ -80,6 +80,30 @@ log_cmd() {
[ -n "$log" ] && tail -40 "$log" || echo "no train log yet — run: /skillopt train --bg"
}

SKDIR="${SKILLOPT_HOME:-$HOME/.sidecar/skillopt}"

activate() {
local f="${1:-}"
[ -n "$f" ] && [ -f "$f" ] || { echo "usage: /skillopt activate <path/to/skill.md>"; return 1; }
mkdir -p "$SKDIR"; cp "$f" "$SKDIR/active-skill.md"
echo "✅ activated — skillopt-hook will auto-inject this skill at session start."
echo " source : $f"
echo " active : $SKDIR/active-skill.md · off: /skillopt deactivate"
}

deactivate() {
rm -f "$SKDIR/active-skill.md"
echo "✅ deactivated — no skill is auto-injected (active-skill.md removed)."
}

agent_active() {
case "${1:-}" in
on) mkdir -p "$SKDIR"; : > "$SKDIR/agent-active"; echo "✅ agent-active ON — sessions nudge the agent to PROPOSE /skillopt train for scored tasks." ;;
off) rm -f "$SKDIR/agent-active"; echo "✅ agent-active OFF — no train-proposal nudge." ;;
*) [ -f "$SKDIR/agent-active" ] && echo "agent-active: ON" || echo "agent-active: OFF"; echo "usage: /skillopt agent-active on|off" ;;
esac
}

usage() {
cat <<'EOF'
/skillopt — train a skill document for a frozen Claude Code agent (SkillOpt driver)
Expand All @@ -88,6 +112,9 @@ usage() {
/skillopt doctor pip pkg + claude CLI + wiring
/skillopt ckpts list bundled pretrained skill.md
/skillopt consume <skill.md> print a trained skill (agent adopts it this session)
/skillopt activate <skill.md> auto-USE a skill every session (skillopt-hook injects it)
/skillopt deactivate stop auto-using (remove the active skill)
/skillopt agent-active on|off toggle the "propose train" agent nudge
/skillopt train [--config X] run rollout→reflect→edit→gate (default: examples/toyqa)
/skillopt train --bg run in the background (returns immediately)
/skillopt status background run state + score/step progress
Expand All @@ -106,6 +133,9 @@ case "$sub" in
train) train "$@" ;;
status) status ;;
log) log_cmd ;;
activate) activate "$@" ;;
deactivate) deactivate ;;
agent-active) agent_active "$@" ;;
help|-h|--help) usage ;;
*.md) consume "$sub" ;; # bare path → consume
*) echo "unknown subverb: $sub"; echo; usage; exit 2 ;;
Expand Down
9 changes: 9 additions & 0 deletions hooks/skillopt-hook/.claude-plugin/plugin.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
{
"name": "skillopt-hook",
"description": "0.1.0 (2026-06-07) SessionStart hook — auto-USES a trained skill without a command. If the user has activated a learned skill (`/skillopt activate <skill.md>` → ~/.sidecar/skillopt/active-skill.md), this injects it as additionalContext at session start so the agent applies it automatically (the prefs-hook split: command writes the SSOT, hook auto-injects it). OPT-IN + safe: silent when nothing is activated; NEVER trains (training is cost-bearing — stays a command/agent decision); always exits 0 (fail-open, never blocks a session). An optional `~/.sidecar/skillopt/agent-active` marker adds a one-line nudge telling the agent to PROPOSE (never auto-run) `/skillopt train` for repeatable auto-scorable tasks. Hook half of the skillopt USE-vs-TRAIN split; the `/skillopt` command plugin is the TRAIN/activate half.",
"version": "0.1.0",
"author": { "name": "dancinlab" },
"repository": "https://github.com/dancinlab/sidecar",
"license": "MIT",
"keywords": ["claude-code", "hook", "session-start", "skillopt", "skill-injection", "auto-consume"]
}
39 changes: 39 additions & 0 deletions hooks/skillopt-hook/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
# skillopt-hook

SessionStart hook — **auto-USES a trained skill without typing a command.**

The skillopt USE-vs-TRAIN split (same shape as the prefs command/hook split):

```
[ /skillopt activate <skill.md> ] ──▶ ~/.sidecar/skillopt/active-skill.md (SSOT)
[ SessionStart ] ──▶ skillopt-hook ──▶ inject active-skill.md as additionalContext
agent applies it automatically (no command)
```

## Behavior

- If `~/.sidecar/skillopt/active-skill.md` exists → inject it as session context.
- If `~/.sidecar/skillopt/agent-active` (opt-in marker) exists → add a one-line nudge
telling the agent to **propose** (never auto-run) `/skillopt train` for repeatable
auto-scorable tasks.
- Otherwise → **silent** (no injection).

## Safety

- **Never trains.** Training is cost-bearing (many `claude -p` calls) → it stays a
command (`/skillopt train`) or an agent proposal, never an automatic hook action.
- **Fail-open.** Always exits 0; a hiccup never blocks a session.
- **Opt-in.** Nothing is injected until the user activates a skill — no global noise.

## Manage (via the `skillopt` command plugin)

```
/skillopt activate <skill.md> turn auto-use ON (copy → active-skill.md)
/skillopt deactivate turn auto-use OFF (remove active-skill.md)
/skillopt agent-active on|off toggle the train-proposal nudge
```

Hook half of the split; `/skillopt` (the `skillopt` command plugin) is the
TRAIN/activate half. `NO env opt-out` — the SSOT files themselves are the switch.
37 changes: 37 additions & 0 deletions hooks/skillopt-hook/bin/skillopt_inject.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
#!/bin/sh
# skillopt_inject — SessionStart hook. Emits the ACTIVE learned skill (if the user
# activated one via `/skillopt activate`) as additionalContext, so a trained skill
# is auto-used WITHOUT typing a command. NEVER trains (cost) and NEVER fails the
# session (always exit 0). Silent when nothing is activated.
set -u
cat >/dev/null 2>&1 # drain the hook payload on stdin

DIR="${SKILLOPT_HOME:-$HOME/.sidecar/skillopt}"
SKILL="$DIR/active-skill.md"
NUDGE="$DIR/agent-active" # opt-in marker for the train-proposal nudge

ctx=""
if [ -f "$SKILL" ]; then
body=$(cat "$SKILL" 2>/dev/null)
ctx="# 🎓 Active learned skill (skillopt) — apply this as task guidance when it fits:

$body"
fi
if [ -f "$NUDGE" ]; then
line="🎓 skillopt active-use: when you see a repeatable, auto-scorable task with no learned skill yet, PROPOSE \`/skillopt train\` (never auto-run it)."
if [ -n "$ctx" ]; then ctx="$ctx

$line"; else ctx="$line"; fi
fi

[ -n "$ctx" ] || exit 0 # nothing activated → stay silent

# Emit additionalContext as JSON (hookEventName must match the firing event).
if command -v python3 >/dev/null 2>&1; then
python3 - "$ctx" <<'PY'
import json, sys
print(json.dumps({"hookSpecificOutput": {"hookEventName": "SessionStart",
"additionalContext": sys.argv[1]}}))
PY
fi
exit 0
7 changes: 7 additions & 0 deletions hooks/skillopt-hook/hooks/hooks.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
{
"hooks": {
"SessionStart": [
{ "hooks": [ { "type": "command", "command": "sh \"${CLAUDE_PLUGIN_ROOT}/bin/skillopt_inject.sh\"" } ] }
]
}
}