From 669ea327234b24e94beb4f63ad3e579f4868de8d Mon Sep 17 00:00:00 2001
From: dancinlife
Date: Sun, 7 Jun 2026 05:51:58 +0900
Subject: [PATCH] =?UTF-8?q?feat(skillopt=200.5.0=20+=20skillopt-hook=200.1?=
 =?UTF-8?q?.0):=20auto-use=20(hybrid)=20=E2=80=94=20activate=20+=20Session?=
 =?UTF-8?q?Start=20inject?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

"Automatically" + "agent actively uses" via a USE-vs-TRAIN split: USE is
automatic + opt-in + cheap; TRAIN stays a command/agent decision (cost-bearing).

- 🪝 skillopt-hook 0.1.0 (new hook): SessionStart injects the active learned
  skill (~/.sidecar/skillopt/active-skill.md) as additionalContext, so a trained
  skill is applied with NO command. Silent when nothing is activated; never
  trains (cost); always exits 0 (fail-open). An optional
  ~/.sidecar/skillopt/agent-active marker adds a one-line nudge to PROPOSE
  (never auto-run) /skillopt train for repeatable scored tasks. Same
  command/hook split shape as prefs.
- 🎓 skillopt 0.5.0: /skillopt activate (write SSOT → auto-use ON) ·
  deactivate (OFF) · agent-active on|off (toggle nudge).

Verified: activate → hook emits correct JSON (hookEventName=SessionStart +
skill body) · agent-active appends nudge · deactivate → empty output (silent).
g22 lockstep both plugins + CHANGELOG.
---
 .claude-plugin/marketplace.json               | 10 ++++-
 CHANGELOG.md                                  | 15 +++++++
 commands/skillopt/.claude-plugin/plugin.json  |  2 +-
 commands/skillopt/bin/skillopt.sh             | 30 ++++++++++++++
 .../skillopt-hook/.claude-plugin/plugin.json  |  9 +++++
 hooks/skillopt-hook/README.md                 | 39 +++++++++++++++++++
 hooks/skillopt-hook/bin/skillopt_inject.sh    | 37 ++++++++++++++++++
 hooks/skillopt-hook/hooks/hooks.json          |  7 ++++
 8 files changed, 146 insertions(+), 3 deletions(-)
 create mode 100644 hooks/skillopt-hook/.claude-plugin/plugin.json
 create mode 100644 hooks/skillopt-hook/README.md
 create mode 100755 hooks/skillopt-hook/bin/skillopt_inject.sh
 create mode 100644 hooks/skillopt-hook/hooks/hooks.json

diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
index 6f578cc..5512a86 100644
--- a/.claude-plugin/marketplace.json
+++ b/.claude-plugin/marketplace.json
@@ -437,8 +437,14 @@
     {
       "name": "skillopt",
       "source": "./commands/skillopt",
-      "description": "0.4.0 (2026-06-07) BACKGROUND TRAIN + HARDER EXAMPLE — `/skillopt train --bg` detaches the run (nohup → log under ~/.sidecar/skillopt/), returns immediately; `/skillopt status` shows running-state + score/step progress, `/skillopt log` tails. The bundled `examples/toyqa` dataset swapped to 6 format-sensitive QA items (chemical symbols, ISO codes, rounding…) that an EMPTY skill answers in a full sentence → STRICT exact-match fails → a real learning gradient (the optimizer learns a 'reply with only the value' rule, then the held-out gate rises); train_size 6 · batch 3 · sel 5 for stronger signal. 0.3.0 (2026-06-07) EXECUTABLE CLI — `/skillopt` now runs directly via a `!`-exec dispatcher (`bin/skillopt.sh`, prefs-style: resolves its own cached install dir, so the user never types a long path). `/skillopt train` runs the loop in one token; `/skillopt doctor|ckpts|consume |help` all dispatch. Honest 0-edit runs reported as such; claude -p clarified = subscription (NOT metered).
0.2.0 (2026-06-07) sidecar-OWNED env adapter — the only domain code (run a task + score it) now ships IN the plugin at `commands/skillopt/examples//`, NOT a clone of the upstream package; `bin/skillopt_run.py` injects sidecar adapters into the upstream hard-coded `_ENV_REGISTRY` at runtime (additive · survives `_register_builtins`) then runs the upstream trainer. Bundled reference `examples/toyqa/` = 4-item exact-match QA proving the loop runs end-to-end on local `claude -p` (no API key, no external data — target+optimizer both shell out to the Claude Code CLI = subscription, NOT metered); `examples/_base/default.yaml` vendors the upstream base config so examples are self-contained (plain `pip install skillopt` ships no `configs/`). Verified: baseline→rollout→reflect(on failures)→gate→test execute against real claude calls; an edit lands only when the optimizer judges a failure generalizable AND the held-out gate improves (no forced edit). 0.1.0 (2026-06-07) initial — /skillopt drives SkillOpt (microsoft/SkillOpt · `pip install skillopt`, arXiv:2605.23904), the text-space optimizer that trains a natural-language SKILL DOCUMENT for a frozen Claude Code agent via rollout → reflect → edit → held-out gate (DL analogy: skill.md = weights · rollout = forward · reflect = backprop · gate = validation early-stop). Subverbs — doctor (pip pkg + claude CLI + harness wiring readiness) · ckpts (list the package's bundled pretrained skill.md artifacts) · consume (load a trained skill into THIS session as additive system guidance) · train (run the loop with Claude Code as the target harness — env TARGET_BACKEND=claude_code_exec · CLAUDE_SETTING_SOURCES=user,project so the sidecar tapes ride along as FIXED scaffolding; refuses to run without a real scoring env adapter — no fabricated scores) · help. 
HARD INVARIANT: only the skill.md is optimized — the model, the governance tapes (= SkillOpt's fixed prompts/*_system.md), and every *-guard safety hook stay UNCHANGED; the trained skill is a SEPARATE per-domain file injected via --append-system-prompt; /skillopt never edits a .tape or a guard (a held-out UTILITY gate is not a SAFETY check — kept orthogonal). Wraps the upstream pip CLI; does not vendor or fork it. Companion: microsoft/SkillLens (arXiv:2605.23899).",
-      "version": "0.4.0"
+      "description": "0.5.0 (2026-06-07) AUTO-USE (activate) — `/skillopt activate ` copies a trained skill to the SSOT ~/.sidecar/skillopt/active-skill.md so the companion `skillopt-hook` auto-injects it at every SessionStart (no command needed); `/skillopt deactivate` removes it, `/skillopt agent-active on|off` toggles a one-line nudge that tells the agent to PROPOSE (never auto-run) `/skillopt train` for repeatable scored tasks. The USE-vs-TRAIN split: USE is automatic + opt-in + cheap; TRAIN stays a command/agent decision (cost-bearing). 0.4.0 (2026-06-07) BACKGROUND TRAIN + HARDER EXAMPLE — `/skillopt train --bg` detaches the run (nohup → log under ~/.sidecar/skillopt/), returns immediately; `/skillopt status` shows running-state + score/step progress, `/skillopt log` tails. The bundled `examples/toyqa` dataset swapped to 6 format-sensitive QA items (chemical symbols, ISO codes, rounding…) that an EMPTY skill answers in a full sentence → STRICT exact-match fails → a real learning gradient (the optimizer learns a 'reply with only the value' rule, then the held-out gate rises); train_size 6 · batch 3 · sel 5 for stronger signal. 0.3.0 (2026-06-07) EXECUTABLE CLI — `/skillopt` now runs directly via a `!`-exec dispatcher (`bin/skillopt.sh`, prefs-style: resolves its own cached install dir, so the user never types a long path). `/skillopt train` runs the loop in one token; `/skillopt doctor|ckpts|consume |help` all dispatch.
Honest 0-edit runs reported as such; claude -p clarified = subscription (NOT metered). 0.2.0 (2026-06-07) sidecar-OWNED env adapter — the only domain code (run a task + score it) now ships IN the plugin at `commands/skillopt/examples//`, NOT a clone of the upstream package; `bin/skillopt_run.py` injects sidecar adapters into the upstream hard-coded `_ENV_REGISTRY` at runtime (additive · survives `_register_builtins`) then runs the upstream trainer. Bundled reference `examples/toyqa/` = 4-item exact-match QA proving the loop runs end-to-end on local `claude -p` (no API key, no external data — target+optimizer both shell out to the Claude Code CLI = subscription, NOT metered); `examples/_base/default.yaml` vendors the upstream base config so examples are self-contained (plain `pip install skillopt` ships no `configs/`). Verified: baseline→rollout→reflect(on failures)→gate→test execute against real claude calls; an edit lands only when the optimizer judges a failure generalizable AND the held-out gate improves (no forced edit). 0.1.0 (2026-06-07) initial — /skillopt drives SkillOpt (microsoft/SkillOpt · `pip install skillopt`, arXiv:2605.23904), the text-space optimizer that trains a natural-language SKILL DOCUMENT for a frozen Claude Code agent via rollout → reflect → edit → held-out gate (DL analogy: skill.md = weights · rollout = forward · reflect = backprop · gate = validation early-stop). Subverbs — doctor (pip pkg + claude CLI + harness wiring readiness) · ckpts (list the package's bundled pretrained skill.md artifacts) · consume (load a trained skill into THIS session as additive system guidance) · train (run the loop with Claude Code as the target harness — env TARGET_BACKEND=claude_code_exec · CLAUDE_SETTING_SOURCES=user,project so the sidecar tapes ride along as FIXED scaffolding; refuses to run without a real scoring env adapter — no fabricated scores) · help. 
HARD INVARIANT: only the skill.md is optimized — the model, the governance tapes (= SkillOpt's fixed prompts/*_system.md), and every *-guard safety hook stay UNCHANGED; the trained skill is a SEPARATE per-domain file injected via --append-system-prompt; /skillopt never edits a .tape or a guard (a held-out UTILITY gate is not a SAFETY check — kept orthogonal). Wraps the upstream pip CLI; does not vendor or fork it. Companion: microsoft/SkillLens (arXiv:2605.23899).",
+      "version": "0.5.0"
+    },
+    {
+      "name": "skillopt-hook",
+      "source": "./hooks/skillopt-hook",
+      "description": "0.1.0 (2026-06-07) SessionStart hook — auto-USES a trained skill without a command. If the user activated a learned skill (`/skillopt activate ` → ~/.sidecar/skillopt/active-skill.md), injects it as additionalContext at session start so the agent applies it automatically (prefs-hook split: the command writes the SSOT, the hook auto-injects). OPT-IN + safe: silent when nothing is activated; NEVER trains (training is cost-bearing — stays a command/agent decision); always exits 0 (fail-open). Optional `~/.sidecar/skillopt/agent-active` marker adds a one-line nudge to PROPOSE (never auto-run) `/skillopt train` for repeatable auto-scorable tasks. Hook half of the skillopt USE-vs-TRAIN split; the `skillopt` command plugin is the TRAIN/activate half. NO env opt-out — the SSOT files are the switch.",
+      "version": "0.1.0"
     },
     {
       "name": "sidecar",
diff --git a/CHANGELOG.md b/CHANGELOG.md
index eb625cd..f09f11a 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -6,6 +6,21 @@ For the full audit trail, see `git log`.
 
 ---
 
+## 2026-06-07 — 🎓 skillopt 0.5.0 + skillopt-hook 0.1.0 — auto-use (hybrid)
+
+Implements "automatic, no command typed" + "agent actively uses" via a USE-vs-TRAIN split.
+**USE is automatic, opt-in, and cheap / TRAIN stays a command or agent decision (cost-bearing)**.
+
+- 🪝 **skillopt-hook 0.1.0** (new hook): auto-injects the active skill at SessionStart.
+  If `~/.sidecar/skillopt/active-skill.md` exists, it is injected as additionalContext (no command needed);
+  otherwise the hook stays silent. It never trains (cost) and always exits 0 (fail-open). If the optional marker
+  `~/.sidecar/skillopt/agent-active` exists, it adds a one-line nudge: "for repeatable, scorable tasks, propose
+  /skillopt train (never auto-run it)". Same shape as the prefs command/hook split.
+- 🎓 **skillopt 0.5.0**: adds `/skillopt activate ` (write the SSOT → auto-use ON) ·
+  `/skillopt deactivate` (OFF) · `/skillopt agent-active on|off` (toggle the nudge).
+- Verified: activate → hook JSON (hookEventName=SessionStart + skill body) is correct · agent-active
+  adds the nudge · output length is 0 after deactivate (silent). g22 lockstep (plugin+marketplace) + CHANGELOG.
+
 ## 2026-06-07 — 🎓 skillopt 0.4.0 — background training + a harder example (implements SKILLOPT.easy §8)
 
 Implements the two "further polish" items from SKILLOPT.easy.md.
diff --git a/commands/skillopt/.claude-plugin/plugin.json b/commands/skillopt/.claude-plugin/plugin.json
index fcd9954..a976d69 100644
--- a/commands/skillopt/.claude-plugin/plugin.json
+++ b/commands/skillopt/.claude-plugin/plugin.json
@@ -1,7 +1,7 @@
 {
   "name": "skillopt",
   "description": "/skillopt — drive SkillOpt (microsoft/SkillOpt · `pip install skillopt`), the text-space optimizer that trains a natural-language SKILL DOCUMENT for a frozen Claude Code agent via rollout → reflect → edit → held-out gate (the DL analogy: the skill.md is the 'weights', rollout=forward, reflect=backprop, gate=validation early-stop). Subverbs — doctor (install + readiness check: pip pkg + claude CLI) · ckpts (list the bundled pretrained skill.md artifacts) · consume (load a trained skill into THIS session as system guidance) · train (run the optimization loop with Claude Code as the target harness — env TARGET_BACKEND=claude_code_exec, CLAUDE_SETTING_SOURCES=user,project; needs a scoring env adapter) · help. ONLY the skill.md changes — the model, the sidecar governance tapes (= SkillOpt's fixed prompts/*_system.md scaffolding), and the *-guard safety hooks all stay fixed. Never auto-edits governance; it produces a SEPARATE per-domain skill.md the harness injects via --append-system-prompt.
Wraps the upstream pip CLI; does not vendor it.",
-  "version": "0.4.0",
+  "version": "0.5.0",
   "author": {
     "name": "dancinlab"
   },
diff --git a/commands/skillopt/bin/skillopt.sh b/commands/skillopt/bin/skillopt.sh
index 9698159..588ca35 100755
--- a/commands/skillopt/bin/skillopt.sh
+++ b/commands/skillopt/bin/skillopt.sh
@@ -80,6 +80,30 @@ log_cmd() {
   [ -n "$log" ] && tail -40 "$log" || echo "no train log yet — run: /skillopt train --bg"
 }
 
+SKDIR="${SKILLOPT_HOME:-$HOME/.sidecar/skillopt}"
+
+activate() {
+  local f="${1:-}"
+  [ -n "$f" ] && [ -f "$f" ] || { echo "usage: /skillopt activate "; return 1; }
+  mkdir -p "$SKDIR"; cp "$f" "$SKDIR/active-skill.md"
+  echo "✅ activated — skillopt-hook will auto-inject this skill at session start."
+  echo " source : $f"
+  echo " active : $SKDIR/active-skill.md · off: /skillopt deactivate"
+}
+
+deactivate() {
+  rm -f "$SKDIR/active-skill.md"
+  echo "✅ deactivated — no skill is auto-injected (active-skill.md removed)."
+}
+
+agent_active() {
+  case "${1:-}" in
+    on) mkdir -p "$SKDIR"; : > "$SKDIR/agent-active"; echo "✅ agent-active ON — sessions nudge the agent to PROPOSE /skillopt train for scored tasks." ;;
+    off) rm -f "$SKDIR/agent-active"; echo "✅ agent-active OFF — no train-proposal nudge." ;;
+    *) [ -f "$SKDIR/agent-active" ] && echo "agent-active: ON" || echo "agent-active: OFF"; echo "usage: /skillopt agent-active on|off" ;;
+  esac
+}
+
 usage() {
   cat <<'EOF'
 /skillopt — train a skill document for a frozen Claude Code agent (SkillOpt driver)
@@ -88,6 +112,9 @@ usage() {
   /skillopt doctor                 pip pkg + claude CLI + wiring
   /skillopt ckpts                  list bundled pretrained skill.md
   /skillopt consume                print a trained skill (agent adopts it this session)
+  /skillopt activate               auto-USE a skill every session (skillopt-hook injects it)
+  /skillopt deactivate             stop auto-using (remove the active skill)
+  /skillopt agent-active on|off    toggle the "propose train" agent nudge
   /skillopt train [--config X]     run rollout→reflect→edit→gate (default: examples/toyqa)
   /skillopt train --bg             run in the background (returns immediately)
   /skillopt status                 background run state + score/step progress
@@ -106,6 +133,9 @@ case "$sub" in
   train) train "$@" ;;
   status) status ;;
   log) log_cmd ;;
+  activate) activate "$@" ;;
+  deactivate) deactivate ;;
+  agent-active) agent_active "$@" ;;
   help|-h|--help) usage ;;
   *.md) consume "$sub" ;; # bare path → consume
   *) echo "unknown subverb: $sub"; echo; usage; exit 2 ;;
diff --git a/hooks/skillopt-hook/.claude-plugin/plugin.json b/hooks/skillopt-hook/.claude-plugin/plugin.json
new file mode 100644
index 0000000..51f9de0
--- /dev/null
+++ b/hooks/skillopt-hook/.claude-plugin/plugin.json
@@ -0,0 +1,9 @@
+{
+  "name": "skillopt-hook",
+  "description": "0.1.0 (2026-06-07) SessionStart hook — auto-USES a trained skill without a command. If the user has activated a learned skill (`/skillopt activate ` → ~/.sidecar/skillopt/active-skill.md), this injects it as additionalContext at session start so the agent applies it automatically (the prefs-hook split: command writes the SSOT, hook auto-injects it).
OPT-IN + safe: silent when nothing is activated; NEVER trains (training is cost-bearing — stays a command/agent decision); always exits 0 (fail-open, never blocks a session). An optional `~/.sidecar/skillopt/agent-active` marker adds a one-line nudge telling the agent to PROPOSE (never auto-run) `/skillopt train` for repeatable auto-scorable tasks. Hook half of the skillopt USE-vs-TRAIN split; the `/skillopt` command plugin is the TRAIN/activate half.",
+  "version": "0.1.0",
+  "author": { "name": "dancinlab" },
+  "repository": "https://github.com/dancinlab/sidecar",
+  "license": "MIT",
+  "keywords": ["claude-code", "hook", "session-start", "skillopt", "skill-injection", "auto-consume"]
+}
diff --git a/hooks/skillopt-hook/README.md b/hooks/skillopt-hook/README.md
new file mode 100644
index 0000000..7599d5f
--- /dev/null
+++ b/hooks/skillopt-hook/README.md
@@ -0,0 +1,39 @@
+# skillopt-hook
+
+SessionStart hook — **auto-USES a trained skill without typing a command.**
+
+The skillopt USE-vs-TRAIN split (same shape as the prefs command/hook split):
+
+```
+[ /skillopt activate ] ──▶ ~/.sidecar/skillopt/active-skill.md (SSOT)
+                                          │
+[ SessionStart ] ──▶ skillopt-hook ──▶ inject active-skill.md as additionalContext
+                                          │
+                         agent applies it automatically (no command)
+```
+
+## Behavior
+
+- If `~/.sidecar/skillopt/active-skill.md` exists → inject it as session context.
+- If `~/.sidecar/skillopt/agent-active` (opt-in marker) exists → add a one-line nudge
+  telling the agent to **propose** (never auto-run) `/skillopt train` for repeatable
+  auto-scorable tasks.
+- Otherwise → **silent** (no injection).
+
+## Safety
+
+- **Never trains.** Training is cost-bearing (many `claude -p` calls) → it stays a
+  command (`/skillopt train`) or an agent proposal, never an automatic hook action.
+- **Fail-open.** Always exits 0; a hiccup never blocks a session.
+- **Opt-in.** Nothing is injected until the user activates a skill — no global noise.
+
+## Manage (via the `skillopt` command plugin)
+
+```
+/skillopt activate               turn auto-use ON (copy → active-skill.md)
+/skillopt deactivate             turn auto-use OFF (remove active-skill.md)
+/skillopt agent-active on|off    toggle the train-proposal nudge
+```
+
+Hook half of the split; `/skillopt` (the `skillopt` command plugin) is the
+TRAIN/activate half. `NO env opt-out` — the SSOT files themselves are the switch.
diff --git a/hooks/skillopt-hook/bin/skillopt_inject.sh b/hooks/skillopt-hook/bin/skillopt_inject.sh
new file mode 100755
index 0000000..3db9b5a
--- /dev/null
+++ b/hooks/skillopt-hook/bin/skillopt_inject.sh
@@ -0,0 +1,37 @@
+#!/bin/sh
+# skillopt_inject — SessionStart hook. Emits the ACTIVE learned skill (if the user
+# activated one via `/skillopt activate`) as additionalContext, so a trained skill
+# is auto-used WITHOUT typing a command. NEVER trains (cost) and NEVER fails the
+# session (always exit 0). Silent when nothing is activated.
+set -u
+cat >/dev/null 2>&1 # drain the hook payload on stdin
+
+DIR="${SKILLOPT_HOME:-$HOME/.sidecar/skillopt}"
+SKILL="$DIR/active-skill.md"
+NUDGE="$DIR/agent-active" # opt-in marker for the train-proposal nudge
+
+ctx=""
+if [ -f "$SKILL" ]; then
+  body=$(cat "$SKILL" 2>/dev/null)
+  ctx="# 🎓 Active learned skill (skillopt) — apply this as task guidance when it fits:
+
+$body"
+fi
+if [ -f "$NUDGE" ]; then
+  line="🎓 skillopt active-use: when you see a repeatable, auto-scorable task with no learned skill yet, PROPOSE \`/skillopt train\` (never auto-run it)."
+  if [ -n "$ctx" ]; then ctx="$ctx
+
+$line"; else ctx="$line"; fi
+fi
+
+[ -n "$ctx" ] || exit 0 # nothing activated → stay silent
+
+# Emit additionalContext as JSON (hookEventName must match the firing event).
+if command -v python3 >/dev/null 2>&1; then
+  python3 - "$ctx" <<'PY'
+import json, sys
+print(json.dumps({"hookSpecificOutput": {"hookEventName": "SessionStart",
+                                         "additionalContext": sys.argv[1]}}))
+PY
+fi
+exit 0
diff --git a/hooks/skillopt-hook/hooks/hooks.json b/hooks/skillopt-hook/hooks/hooks.json
new file mode 100644
index 0000000..7be545c
--- /dev/null
+++ b/hooks/skillopt-hook/hooks/hooks.json
@@ -0,0 +1,7 @@
+{
+  "hooks": {
+    "SessionStart": [
+      { "hooks": [ { "type": "command", "command": "sh \"${CLAUDE_PLUGIN_ROOT}/bin/skillopt_inject.sh\"" } ] }
+    ]
+  }
+}
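
Reviewer note (not part of the patch): the commit message lists three verified states (activate emits JSON, agent-active appends the nudge, deactivate is silent) but does not say how they were checked. One way to repeat that check is sketched below. The helper name `check_hook_output` is hypothetical; it only inspects the two JSON fields the hook script writes and assumes `python3` is available, as the hook itself does.

```shell
#!/bin/sh
# Hypothetical smoke check, not shipped by this patch.
# Reads hook output on stdin and prints which documented state it matches:
#   silent  -> the hook printed nothing (no active skill, no nudge marker)
#   injects -> valid JSON with hookEventName=SessionStart and a non-empty additionalContext
#   invalid -> anything else (also returns a non-zero status)
check_hook_output() {
  out=$(cat)
  if [ -z "$out" ]; then
    echo "silent"
    return 0
  fi
  # python3 does the JSON parsing; the program text is passed via -c so stdin stays free.
  printf '%s' "$out" | python3 -c '
import json, sys
try:
    payload = json.load(sys.stdin)["hookSpecificOutput"]
    ok = payload["hookEventName"] == "SessionStart" and bool(payload["additionalContext"])
except (ValueError, KeyError, TypeError):
    ok = False
print("injects" if ok else "invalid")
sys.exit(0 if ok else 1)
'
}
```

A possible use, assuming the repository layout from this patch: `SKILLOPT_HOME=$(mktemp -d) sh hooks/skillopt-hook/bin/skillopt_inject.sh </dev/null | check_hook_output`. Pointing `SKILLOPT_HOME` at a scratch directory keeps the check away from the real `~/.sidecar/skillopt` state, since both scripts in the patch read that variable.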