feat: learning system Wave 2 quality improvements #162
…ng, naming cleanup
- Move learning from Stop → SessionEnd hook with 3-session batching
(adaptive: 5-session batch at 15+ observations)
- Raise procedural thresholds to 3 observations + 24h temporal spread
(aligned with workflows; initial confidence 0.33 for both types)
- Fix transcript extraction for string-typed message content
- Eliminate empty-array loop noise in process_observations/create_artifacts
- Add reinforcement mechanism: local grep updates last_seen for loaded
self-learning artifacts on each session end (no LLM cost)
- Improve skill template quality: Iron Law section, activation triggers,
proper frontmatter with user-invocable/allowed-tools
- Rename artifact paths: commands from learned/ → self-learning/,
skills from learned-{slug}/ → {slug}/
- Add backwards-compatible legacy Stop hook cleanup in removeLearningHook
- Deprecate stop-update-learning (stub that exits immediately)
- Lower default max_daily_runs from 10 → 5
- Update tests (444 pass), docs, and CLI strings
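The adaptive batching rule above (3-session batches, widened to 5 at 15+ observations) can be sketched as follows; the function name and observation-count input are illustrative, not the real implementation:

```shell
# Minimal sketch of the adaptive batch-size rule: default 3-session
# batches, widened to 5 once the learning log holds 15+ observations.
batch_size_for() {
  # $1: number of accumulated observations in the learning log
  if [ "${1:-0}" -ge 15 ]; then
    echo 5   # heavier logs amortize the LLM cost over larger batches
  else
    echo 3   # default batch size
  fi
}

batch_size_for 4    # prints 3
batch_size_for 20   # prints 5
```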
…hook removal, extract artifact name helper

- Fix daily cap counter format mismatch: session-end-learning wrote two-line format but background-learning reads tab-separated (same file)
- Standardize env var naming: BG_LEARNER -> DEVFLOW_BG_LEARNER (matches existing DEVFLOW_BG_UPDATER convention used by memory hooks)
- Use shared log-paths helper in session-end-learning (was computing its own slug with different separator, writing to different log directory)
- Use json_update_field from json-parse library instead of inline jq/sed fallback for artifact reinforcement
- Extract artifactName() helper in json-helper.cjs to deduplicate path parsing across learning-created and learning-new operations
- Extract removeFromEvent() in learn.ts to deduplicate SessionEnd/Stop hook removal logic
- Remove dead extract_user_messages() from background-learning (superseded by batch mode, referenced undefined SESSION_ID)
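The tab-separated daily-cap format that both hooks converge on might look like the following; the file name and the `<date>\t<count>` layout are assumptions from the commit text, not the real implementation:

```shell
# Hypothetical sketch of a shared tab-separated daily-cap counter:
# one line, "<YYYY-MM-DD>\t<count>", read and written by both hooks.
CAP_FILE="${CAP_FILE:-$(mktemp)}"

bump_daily_cap() {
  today=$(date +%Y-%m-%d)
  if read -r cap_date cap_count < "$CAP_FILE" 2>/dev/null \
     && [ "$cap_date" = "$today" ]; then
    count=$((cap_count + 1))
  else
    count=1   # first run of the day (or missing/stale file)
  fi
  printf '%s\t%s\n' "$today" "$count" > "$CAP_FILE"
  echo "$count"
}
```

Because both readers split on the default IFS (which includes tab), a single writer format avoids the two-line/tab-separated mismatch described above.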
- P0-Functionality: session-end-learning must read hook JSON from stdin (like all other hooks) instead of expecting positional args. Without this fix, CWD is always empty and the hook silently exits on every invocation, making the entire learning system non-functional.
- P1-Error Handling: add `|| true` to json_field calls inside the reinforcement while-loop so a single malformed JSONL line does not crash the script under `set -euo pipefail`.
- P1-Functionality: extract session_id from hook JSON (preferred) with `ls -t` fallback, instead of relying solely on `ls -t`, which could pick a different session's transcript under concurrent session endings.
- P1-Functionality: remove duplicate daily counter increment from background-learning (session-end-learning already increments before spawning), preventing the effective daily cap from being halved.
- P2-Consistency: fix configure wizard max_daily_runs default (10 -> 5) to match the new code default.
- P2-Naming: remove stale $SESSION_ID references from batch-mode log messages in background-learning.
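The P0 stdin fix could look like the following sketch. The `.cwd` and `.session_id` field names are assumptions about the hook JSON shape, and jq is assumed available:

```shell
# Sketch: read the hook payload from stdin like the other hooks do,
# instead of expecting positional args.
parse_hook_input() {
  # Reads hook JSON on stdin, prints "<cwd>\t<session_id>"
  jq -r '[.cwd // "", .session_id // ""] | @tsv'
}

# Usage at the top of the hook (consumes stdin once):
#   HOOK_FIELDS=$(parse_hook_input)
```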
Replace stale .learning-last-trigger reference with .learning-session-count and .learning-batch-ids to match CLAUDE.md and the new SessionEnd batching implementation.
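The handoff between these two files can be sketched as below. The file names come from the text; the state directory and threshold handling are illustrative assumptions:

```shell
# Sketch of SessionEnd batching: accumulate session IDs in
# .learning-session-count, then hand the full batch to background-learning
# via .learning-batch-ids.
STATE_DIR="${STATE_DIR:-$(mktemp -d)}"
COUNT_FILE="$STATE_DIR/.learning-session-count"
BATCH_FILE="$STATE_DIR/.learning-batch-ids"
BATCH_SIZE=3

record_session() {
  printf '%s\n' "$1" >> "$COUNT_FILE"
  if [ "$(wc -l < "$COUNT_FILE")" -ge "$BATCH_SIZE" ]; then
    # Atomic on the same filesystem: background-learning sees either the
    # complete batch file or nothing.
    mv "$COUNT_FILE" "$BATCH_FILE"
  fi
}
```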
scripts/hooks/session-end-learning
```shell
fi

# Write batch IDs file for background-learning to consume
cp "$SESSION_COUNT_FILE" "$BATCH_IDS_FILE"
```
Race Condition: Batch File Handoff Not Atomic
Lines 190-192 use cp + rm to move the session count file to the batch file. This is not atomic, creating a race window if two concurrent sessions trigger the hook simultaneously. Between the cp and rm, the second invocation could read a stale or partially-written batch file, resulting in duplicate LLM invocations or lost session IDs.
Flagged by: Security (85%), Architecture (85%)
Fix: Use mv instead:

```shell
mv "$SESSION_COUNT_FILE" "$BATCH_IDS_FILE"
```

This is atomic on the same filesystem and eliminates the race window.
scripts/hooks/session-end-learning
```shell
# --- Find transcript ---
# Encode CWD for Claude's project path
ENCODED_CWD=$(echo "$CWD" | sed 's|/|-|g')
```
CWD Encoding Inconsistency
Line 63 uses `sed 's|/|-|g'` to encode the CWD, but this diverges from the established pattern in background-learning and background-memory-update, which use `sed 's|^/||' | tr / -`. While both produce similar results for typical paths, the inconsistency is fragile and could break with path variations.
Flagged by: Consistency (95%), Architecture (85%)
Fix: Align with the existing pattern:

```shell
ENCODED_CWD=$(echo "$CWD" | sed 's|^/||' | tr / -)
PROJECTS_DIR="$HOME/.claude/projects/-${ENCODED_CWD}"
```

This ensures consistency across all hook scripts that need to encode paths.
scripts/hooks/session-end-learning
```shell
local updated=false
local temp_log="${learning_log}.tmp"

while IFS= read -r line; do
```
Per-Line Subprocess Spawning in reinforce_loaded_artifacts
Lines 108-131 iterate through every line of learning-log.jsonl in a while read loop, spawning json_field (jq or node subprocess) 2-3 times per line. With 50+ observations, this creates 100-300+ subprocesses in the synchronous SessionEnd hook path, adding measurable latency (0.5-2s) to every session end.
Flagged by: Performance (92%), Complexity (85%)
Fix: Replace with a single-pass jq operation:

```shell
if [ "$_HAS_JQ" = "true" ]; then
  # Build an alternation regex from the loaded slugs, e.g. "slug-a|slug-b"
  local slugs_regex="$(echo "$loaded" | paste -s -d '|' -)"
  jq -c --arg now "$now_iso" --arg slugs "$slugs_regex" '
    if .status == "created" and .artifact_path != "" then
      (if (.artifact_path | test("/commands/"))
       then (.artifact_path | split("/") | .[-1] | rtrimstr(".md"))
       else (.artifact_path | split("/") | .[-2]) end) as $slug
      | if ($slug | test($slugs)) then .last_seen = $now else . end
    else . end
  ' "$learning_log" > "$temp_log"
fi
```

This reduces from N spawns to 1 and eliminates the blocking I/O.
scripts/hooks/background-learning
```diff
  # Check temporal spread (applies to BOTH workflow and procedural)
  STATUS=$(echo "$EXISTING_LINE" | json_field "status" "")
- if [ "$OBS_TYPE" = "workflow" ] && [ "$STATUS" != "created" ]; then
+ if [ "$STATUS" != "created" ]; then
```
Duplicate Temporal Spread Calculation in process_observations
Lines 508-517 and 521-530 both compute FIRST_EPOCH and NOW_EPOCH identically for the same observation. This duplicates date parsing overhead and violates DRY - any fix to the date parsing logic must be applied twice, risking divergence.
Flagged by: Complexity (95%)
Fix: Extract a single check_temporal_spread() function:
```shell
check_temporal_spread() {
  local first_seen="$1"
  # BSD date first, GNU date fallback
  FIRST_EPOCH=$(date -j -f "%Y-%m-%dT%H:%M:%SZ" "$first_seen" +%s 2>/dev/null \
    || date -d "$first_seen" +%s 2>/dev/null \
    || echo "0")
  NOW_EPOCH=$(date +%s)
  SPREAD=$((NOW_EPOCH - FIRST_EPOCH))
}
```

Then call once and reuse SPREAD in both blocks.
src/cli/commands/learn.ts
```diff
@@ -127,11 +131,11 @@ export function removeLearningHook(settingsJson: string): string {
 export function hasLearningHook(settingsJson: string): boolean {
```
hasLearningHook Returns False for Legacy Stop Hook
This function only checks settings.hooks.SessionEnd for the new marker. Users who installed learning before this PR have the hook registered under settings.hooks.Stop with the legacy stop-update-learning marker. After upgrading the CLI (before running --disable && --enable), hasLearningHook() returns false, causing --status to show learning as disabled even though the (now-deprecated) Stop hook is still executing.
Flagged by: Regression (85%), Architecture (72%)
Fix: Either (a) have hasLearningHook also detect the legacy marker and show a "needs upgrade" state, or (b) have --enable auto-detect and upgrade the legacy hook in-place. Option (a) is clearer:
```typescript
export function hasLearningHook(settingsJson: string): boolean {
  const settings: Settings = JSON.parse(settingsJson);
  const hasSessionEnd = settings.hooks?.SessionEnd?.some(h => h.includes(HOOK_MARKER)) ?? false;
  const hasLegacyStop = settings.hooks?.Stop?.some(h => h.includes(LEGACY_HOOK_MARKER)) ?? false;
  return hasSessionEnd || hasLegacyStop; // true for either
}

export function getLearningStatus(settingsJson: string): string {
  const settings: Settings = JSON.parse(settingsJson);
  if (settings.hooks?.SessionEnd?.some(h => h.includes(HOOK_MARKER))) return "enabled";
  if (settings.hooks?.Stop?.some(h => h.includes(LEGACY_HOOK_MARKER))) {
    return "needs upgrade (legacy Stop hook). Run: devflow learn --disable && devflow learn --enable";
  }
  return "disabled";
}
```

src/cli/commands/learn.ts

```typescript
/**
 * Add the learning SessionEnd hook to settings JSON.
 * Idempotent — returns unchanged JSON if hook already exists.
 */
export function addLearningHook(settingsJson: string, devflowDir: string): string {
```
addLearningHook Does Not Clean Up Legacy Stop Hook
When users run devflow learn --enable, this function only adds the new SessionEnd hook. It does not remove the legacy Stop hook that may still exist from pre-Wave-2 installations. Users who upgrade by running --enable will end up with both hooks registered, wasting a hook invocation on every session stop (the legacy Stop hook now just exits immediately).
Flagged by: Architecture (83%), Consistency (85%)
Fix: Have addLearningHook clean up the legacy hook first:
```typescript
export function addLearningHook(settingsJson: string, devflowDir: string): string {
  // First, clean up any legacy Stop hook
  const cleaned = removeLearningHook(settingsJson);
  const settings: Settings = JSON.parse(cleaned);
  // Now add the new SessionEnd hook
  if (!settings.hooks) settings.hooks = {};
  if (!settings.hooks.SessionEnd) settings.hooks.SessionEnd = [];
  const hookPath = path.join(devflowDir, "hooks", HOOK_FILE);
  if (!settings.hooks.SessionEnd.includes(hookPath)) {
    settings.hooks.SessionEnd.push(hookPath);
  }
  return JSON.stringify(settings);
}
```

This makes --enable self-upgrading for existing users.
Code Review Summary: Wave 2 Quality Issues (60-79% Confidence)

This PR introduces several consistency and style issues beyond the inline comments already posted. While individually minor, they should be addressed to maintain code quality:

Consistency Issues (80-85% confidence)

Documentation Issues (90% confidence): file-organization.md Not Updated (

Security/Robustness Issues (80-82% confidence)

Performance Issue (90% confidence): Per-Line Subprocess Spawning in

Overall Assessment

Blocking Issues: 8 fixed via inline comments (race condition, CWD encoding, subprocess spawning, temporal spread duplication, hasLearningHook legacy detection, addLearningHook cleanup)

Remaining Issues: The 12 items above should be addressed before merge. Most are quick fixes.

Recommendation: CHANGES_REQUESTED with focus on:
Claude Code Review | 2026-03-25
- Replace set -euo pipefail with set -e (consistency with other hooks)
- Change . to source for json-parse/log-paths sourcing (consistency)
- Fix CWD encoding to match background-learning (strip leading slash)
- Use ISO 8601 UTC timestamps in log() (consistency with other hooks)
- Remove DEBUG guard from log() (align with unconditional logging in other hooks)
- Validate session ID format before appending to batch file
- Replace per-line subprocess spawning in reinforce_loaded_artifacts with single-pass jq/node operation
- Replace non-atomic cp+rm with atomic mv for batch file handoff
- Add disown after background process spawn (consistency with stop-update-memory)
- Extract run_batch_check() function from top-level procedural code

Co-Authored-By: Claude <noreply@anthropic.com>
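The session-ID validation item above might look like the following sketch; the UUID-ish character class is an assumption about the session-ID format, not the real check:

```shell
# Sketch: reject session IDs that are empty or contain characters outside
# a conservative [0-9a-zA-Z-] allowlist before appending to the batch file.
valid_session_id() {
  case "$1" in
    ""|*[!0-9a-zA-Z-]*) return 1 ;;  # empty, or has unexpected characters
    *) return 0 ;;
  esac
}

# Only append IDs that pass, e.g.:
#   valid_session_id "$SESSION_ID" && printf '%s\n' "$SESSION_ID" >> "$BATCH_IDS_FILE"
```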
…g enable, batch_size config

- hasLearningHook now returns 'current' | 'legacy' | false to detect pre-Wave-2 users with Stop hook containing stop-update-learning
- addLearningHook is self-upgrading: calls removeLearningHook first to clean up legacy Stop hooks before adding SessionEnd hook
- formatLearningStatus shows legacy upgrade instructions when detected
- Added batch_size to LearningConfig interface, defaults (3), and applyConfigLayer; added to --configure wizard
- Updated tests: 59 total including new legacy detection, self-upgrading enable, batch_size config, and type guard tests

Co-Authored-By: Claude <noreply@anthropic.com>
- Replace per-line subprocess spawning in extract_batch_messages with single-pass jq/node processing (issue #1)
- Decompose process_observations into validate_observation, calculate_confidence, and check_temporal_spread helpers (issue #2)
- Fix duplicate temporal spread calculation by computing epoch once in check_temporal_spread (issue #3)
- Escape double quotes in ART_DESC for YAML frontmatter safety (issue #4)
- Strengthen ART_NAME sanitization with strict kebab-case allowlist (issue #5)
- Replace per-line subprocess in apply_temporal_decay with single-pass jq operation and node fallback (issue #6)
- Replace per-line subprocess in create_artifacts status update with single-pass jq/node operation (issue #7)
- Remove dead increment_daily_counter function (issue #8)
- Extract write_command_artifact and write_skill_artifact helpers from create_artifacts (issue #9)
- Change flat 30k char truncation to per-session 8k char cap for proportional session contribution (issue #10)
- Add section comment markers to build_sonnet_prompt heredoc for navigability (issue #11)
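The per-session cap (issue #10) can be sketched as below: each session in a batch contributes at most 8k chars, so one verbose session can no longer consume the whole (previously flat 30k) budget. Names are illustrative:

```shell
# Sketch: cap each session's extracted text to a fixed per-session budget
# so every session in the batch contributes proportionally.
MAX_PER_SESSION=8000

cap_session_text() {
  # Truncates stdin to the per-session byte budget
  head -c "$MAX_PER_SESSION"
}
```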
- Add loadAndCountObservations tests (mixed valid/invalid, all-valid, empty input, invalidCount calculation)
- Add extract-text-messages string content path test
- Add learning-new operation test with self-learning prefix verification
- Update learning-created fixture paths to production-realistic values
- Add session-end-learning structural checks (syntax list, shebang, json-parse sourcing)
… in learn.ts

- Extract readObservations() to deduplicate try/catch + loadAndCountObservations pattern
- Extract warnIfInvalid() to deduplicate invalidCount > 0 warning message
- Hoist logPath computation once instead of repeating in 4 branches
- Remove unnecessary String() and !! casts on already-typed prompt values
- file-organization.md: session-end-learning replaces stop-update-learning
- CHANGELOG.md: add Wave 2 Changed + Fixed entries under [Unreleased]
- CLAUDE.md: include deprecated stop-update-learning in hooks list
…er.cjs

Move 4 operations (temporal-decay, process-observations, create-artifacts, filter-observations) from shell to Node, reducing background-learning from 819 to 496 lines. Remove 10 shell functions, 4 dead json-parse wrappers, and 1 dead json-helper.cjs operation. Add 27 new tests covering all paths. Addresses PF-004 (background hook god script).
Summary
Wave 2 learning system quality improvements addressing 8 issues: SessionEnd migration, procedural thresholds, transcript extraction, empty-array noise elimination, reinforcement mechanism, skill template quality, artifact path renaming, and legacy hook cleanup.
Changes
Learning Pipeline
Transcript Extraction & Parsing
content[0].text, not direct fields)

Reinforcement Mechanism
Skill Template Quality
Artifact Path Migration
- learned/ → self-learning/ (consistency with plugin structure)
- learned/ paths during cleanup

Legacy Cleanup
Breaking Changes
None
Testing
learned/ path detection

Testing gaps: End-to-end multi-session batching (requires >3 sessions in test environment). Integration testing with confidence persistence across session restarts (defer to Wave 3 if needed).
Related Issues
Closes Wave 2 quality epic (related: #160, #161)
Co-Authored-By: Claude <noreply@anthropic.com>