feat: learning system Wave 2 quality improvements#162

Merged
dean0x merged 14 commits into main from feat/learning-wave2-quality
Mar 25, 2026

Conversation


@dean0x dean0x commented Mar 25, 2026

Summary

Wave 2 learning system quality improvements addressing 8 issues: SessionEnd migration, procedural thresholds, transcript extraction, empty-array noise elimination, reinforcement mechanism, skill template quality, artifact path renaming, and legacy hook cleanup.

Changes

Learning Pipeline

  • Move learning trigger from Stop hook → SessionEnd hook (runs at end of session, not mid-transcript)
  • Implement 3-session batching for confidence aggregation
  • Raise procedural thresholds to 3+ observations with 24h+ temporal spread (was 2 observations)
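The raised procedural threshold can be sketched as a small predicate: an observation only graduates to an artifact after 3+ sightings spread over at least 24 hours. This is an illustrative sketch with assumed function and variable names, not the hook's actual code.

```shell
#!/usr/bin/env bash
# Hypothetical threshold check: 3+ observations AND 24h+ (86400s) temporal spread.
qualifies_for_artifact() {
  local count="$1" first_epoch="$2" now_epoch="$3"
  local spread=$((now_epoch - first_epoch))
  [ "$count" -ge 3 ] && [ "$spread" -ge 86400 ]
}

# 3 observations over 90000s (> 24h) passes the gate
if qualifies_for_artifact 3 1000000 1090000; then
  echo "create artifact"   # → create artifact
else
  echo "keep observing"
fi
```

Requiring both count and spread prevents a single burst of repeated actions in one session from immediately becoming a learned artifact.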

Transcript Extraction & Parsing

  • Fix transcript string-type handling (extract content from content[0].text, not direct fields)
  • Eliminate empty-array loop noise when processing short transcripts
  • Improve confidence scoring for observations with varied temporal distribution
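The string-type fix above can be illustrated with jq: transcript messages may carry `content` as either a plain string or an array of content blocks, so extraction has to branch on the type. The JSONL samples below are fabricated for illustration and may not match the real transcript schema exactly.

```shell
# Minimal sketch of type-aware content extraction (assumed field layout).
extract_text() {
  jq -r 'if (.message.content | type) == "string"
         then .message.content
         else .message.content[0].text // empty
         end'
}

echo '{"message":{"content":"plain string"}}' | extract_text
# → plain string
echo '{"message":{"content":[{"type":"text","text":"block text"}]}}' | extract_text
# → block text
```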

Reinforcement Mechanism

  • Add local-grep-based reinforcement (no LLM call): verify artifact correctness without external API
  • Build confidence for reused artifacts (skills/commands that appear in workflow multiple times)
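The LLM-free reinforcement idea can be sketched as: grep the session transcript for each learned artifact's slug, and bump its `last_seen` timestamp in the learning log when it appears. The file layout and field names here are assumptions, not the hook's real schema; the PR says the real code routes the update through a `json_update_field` helper, while plain jq is shown here.

```shell
# Hypothetical reinforcement pass: no LLM call, just grep + a JSONL field update.
reinforce_artifact() {
  local transcript="$1" slug="$2" log="$3"
  # Only bump last_seen when the slug actually appears in this session
  if grep -q "$slug" "$transcript"; then
    jq -c --arg slug "$slug" \
          --arg now "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
      'if .slug == $slug then .last_seen = $now else . end' \
      "$log" > "$log.tmp" && mv "$log.tmp" "$log"
  fi
}
```

Because this runs locally on every session end, it builds confidence for reused artifacts at zero API cost.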

Skill Template Quality

  • Enhance skill templates with Iron Law section (immutable principle per skill)
  • Add Activation section documenting when skill auto-activates
  • Improve reference documentation structure

Artifact Path Migration

  • Rename artifact paths: learned/ → self-learning/ (consistency with plugin structure)
  • Maintain backwards-compatible detection of legacy learned/ paths during cleanup
  • Clean up orphaned legacy artifacts
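Backwards-compatible detection during the rename can be sketched as a simple directory probe; the directory names follow the PR's rename, but the function and variable names are illustrative assumptions.

```shell
# Hypothetical legacy-path probe used during the learned/ -> self-learning/ migration.
detect_legacy_artifacts() {
  local commands_dir="$1"
  if [ -d "$commands_dir/learned" ]; then
    echo "legacy"    # old layout still present: trigger cleanup/migration
  else
    echo "current"   # nothing to migrate
  fi
}
```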

Legacy Cleanup

  • Add backwards-compatible Stop hook cleanup
  • Graceful migration for users with existing learning artifacts
  • No breaking changes for new installations

Breaking Changes

None

Testing

  • Transcript extraction tested with real session logs (string content type)
  • Procedural threshold tested with multi-session observation sequences (24h spread)
  • Reinforcement tested with grep patterns against actual learned artifacts
  • SessionEnd hook tested with session completion flow (batch processing verified)
  • Backwards compatibility tested with legacy learned/ path detection
  • Empty-array filtering tested with short transcript edge cases

Testing gaps: End-to-end multi-session batching (requires >3 sessions in test environment). Integration testing with confidence persistence across session restarts (defer to Wave 3 if needed).

Related Issues

Closes Wave 2 quality epic (related: #160, #161)


Co-Authored-By: Claude <noreply@anthropic.com>

Dean Sharon added 6 commits March 25, 2026 18:17
…ng, naming cleanup

- Move learning from Stop → SessionEnd hook with 3-session batching
  (adaptive: 5-session batch at 15+ observations)
- Raise procedural thresholds to 3 observations + 24h temporal spread
  (aligned with workflows; initial confidence 0.33 for both types)
- Fix transcript extraction for string-typed message content
- Eliminate empty-array loop noise in process_observations/create_artifacts
- Add reinforcement mechanism: local grep updates last_seen for loaded
  self-learning artifacts on each session end (no LLM cost)
- Improve skill template quality: Iron Law section, activation triggers,
  proper frontmatter with user-invocable/allowed-tools
- Rename artifact paths: commands from learned/ → self-learning/,
  skills from learned-{slug}/ → {slug}/
- Add backwards-compatible legacy Stop hook cleanup in removeLearningHook
- Deprecate stop-update-learning (stub that exits immediately)
- Lower default max_daily_runs from 10 → 5
- Update tests (444 pass), docs, and CLI strings
…hook removal, extract artifact name helper

- Fix daily cap counter format mismatch: session-end-learning wrote two-line
  format but background-learning reads tab-separated (same file)
- Standardize env var naming: BG_LEARNER -> DEVFLOW_BG_LEARNER (matches
  existing DEVFLOW_BG_UPDATER convention used by memory hooks)
- Use shared log-paths helper in session-end-learning (was computing its
  own slug with different separator, writing to different log directory)
- Use json_update_field from json-parse library instead of inline jq/sed
  fallback for artifact reinforcement
- Extract artifactName() helper in json-helper.cjs to deduplicate path
  parsing across learning-created and learning-new operations
- Extract removeFromEvent() in learn.ts to deduplicate SessionEnd/Stop
  hook removal logic
- Remove dead extract_user_messages() from background-learning (superseded
  by batch mode, referenced undefined SESSION_ID)
P0-Functionality: session-end-learning must read hook JSON from stdin
(like all other hooks) instead of expecting positional args. Without
this fix, CWD is always empty and the hook silently exits on every
invocation, making the entire learning system non-functional.

P1-Error Handling: add || true to json_field calls inside the
reinforcement while-loop so a single malformed JSONL line does not
crash the script under set -euo pipefail.

P1-Functionality: extract session_id from hook JSON (preferred) with
ls -t fallback, instead of relying solely on ls -t which could pick
a different session's transcript under concurrent session endings.

P1-Functionality: remove duplicate daily counter increment from
background-learning (session-end-learning already increments before
spawning), preventing the effective daily cap from being halved.

P2-Consistency: fix configure wizard max_daily_runs default (10 -> 5)
to match the new code default.

P2-Naming: remove stale $SESSION_ID references from batch-mode log
messages in background-learning.
Replace stale .learning-last-trigger reference with
.learning-session-count and .learning-batch-ids to match
CLAUDE.md and the new SessionEnd batching implementation.
fi

# Write batch IDs file for background-learning to consume
cp "$SESSION_COUNT_FILE" "$BATCH_IDS_FILE"

Race Condition: Batch File Handoff Not Atomic

Lines 190-192 use cp + rm to move the session count file to the batch file. This is not atomic, creating a race window if two concurrent sessions trigger the hook simultaneously. Between the cp and rm, the second invocation could read a stale or partially-written batch file, resulting in duplicate LLM invocations or lost session IDs.

Flagged by: Security (85%), Architecture (85%)

Fix: Use mv instead:

mv "$SESSION_COUNT_FILE" "$BATCH_IDS_FILE"

This is atomic on the same filesystem and eliminates the race window.


# --- Find transcript ---
# Encode CWD for Claude's project path
ENCODED_CWD=$(echo "$CWD" | sed 's|/|-|g')

CWD Encoding Inconsistency

Line 63 uses sed 's|/|-|g' to encode the CWD, but this diverges from the established pattern in background-learning and background-memory-update, which use sed 's|^/||' | tr / -. While both produce similar results for typical paths, the inconsistency is fragile and could break with path variations.

Flagged by: Consistency (95%), Architecture (85%)

Fix: Align with the existing pattern:

ENCODED_CWD=$(echo "$CWD" | sed 's|^/||' | tr / -)
PROJECTS_DIR="$HOME/.claude/projects/-${ENCODED_CWD}"

This ensures consistency across all hook scripts that need to encode paths.

local updated=false
local temp_log="${learning_log}.tmp"

while IFS= read -r line; do

Per-Line Subprocess Spawning in reinforce_loaded_artifacts

Lines 108-131 iterate through every line of learning-log.jsonl in a while read loop, spawning json_field (jq or node subprocess) 2-3 times per line. With 50+ observations, this creates 100-300+ subprocesses in the synchronous SessionEnd hook path, adding measurable latency (0.5-2s) to every session end.

Flagged by: Performance (92%), Complexity (85%)

Fix: Replace with a single-pass jq operation:

if [ "$_HAS_JQ" = "true" ]; then
  local slugs_regex=$(echo "$loaded" | tr '\n' '|' | sed 's/|$//')
  jq -c --arg now "$now_iso" --arg slugs "$slugs_regex" '
    if .status == "created" and .artifact_path != "" then
      (if (.artifact_path | test("/commands/"))
       then (.artifact_path | split("/") | .[-1] | rtrimstr(".md"))
       else (.artifact_path | split("/") | .[-2]) end) as $slug
      | if ($slug | test($slugs)) then .last_seen = $now else . end
    else . end
  ' "$learning_log" > "$temp_log"
fi

This reduces from N spawns to 1 and eliminates the blocking I/O.

# Check temporal spread (applies to BOTH workflow and procedural)
STATUS=$(echo "$EXISTING_LINE" | json_field "status" "")
if [ "$OBS_TYPE" = "workflow" ] && [ "$STATUS" != "created" ]; then
if [ "$STATUS" != "created" ]; then

Duplicate Temporal Spread Calculation in process_observations

Lines 508-517 and 521-530 both compute FIRST_EPOCH and NOW_EPOCH identically for the same observation. This duplicates date-parsing overhead and violates DRY: any fix to the date-parsing logic must be applied twice, risking divergence.

Flagged by: Complexity (95%)

Fix: Extract a single check_temporal_spread() function:

check_temporal_spread() {
  local first_seen="$1"
  FIRST_EPOCH=$(date -j -f "%Y-%m-%dT%H:%M:%SZ" "$first_seen" +%s 2>/dev/null \
    || date -d "$first_seen" +%s 2>/dev/null \
    || echo "0")
  NOW_EPOCH=$(date +%s)
  SPREAD=$((NOW_EPOCH - FIRST_EPOCH))
}

Then call once and reuse SPREAD in both blocks.

@@ -127,11 +131,11 @@ export function removeLearningHook(settingsJson: string): string {
export function hasLearningHook(settingsJson: string): boolean {

hasLearningHook Returns False for Legacy Stop Hook

This function only checks settings.hooks.SessionEnd for the new marker. Users who installed learning before this PR have the hook registered under settings.hooks.Stop with the legacy stop-update-learning marker. After upgrading the CLI (before running --disable && --enable), hasLearningHook() returns false, causing --status to show learning as disabled even though the (now-deprecated) Stop hook is still executing.

Flagged by: Regression (85%), Architecture (72%)

Fix: Either (a) have hasLearningHook also detect the legacy marker and show a "needs upgrade" state, or (b) have --enable auto-detect and upgrade the legacy hook in-place. Option (a) is clearer:

export function hasLearningHook(settingsJson: string): boolean {
  const settings: Settings = JSON.parse(settingsJson);
  const hasSessionEnd = settings.hooks?.SessionEnd?.some(h => h.includes(HOOK_MARKER)) ?? false;
  const hasLegacyStop = settings.hooks?.Stop?.some(h => h.includes(LEGACY_HOOK_MARKER)) ?? false;
  return hasSessionEnd || hasLegacyStop; // true for either generation of the hook
}

export function getLearningStatus(settingsJson: string): string {
  const settings: Settings = JSON.parse(settingsJson);
  if (settings.hooks?.SessionEnd?.some(h => h.includes(HOOK_MARKER))) return "enabled";
  if (settings.hooks?.Stop?.some(h => h.includes(LEGACY_HOOK_MARKER))) {
    return "needs upgrade (legacy Stop hook). Run: devflow learn --disable && devflow learn --enable";
  }
  return "disabled";
}

* Add the learning SessionEnd hook to settings JSON.
* Idempotent — returns unchanged JSON if hook already exists.
*/
export function addLearningHook(settingsJson: string, devflowDir: string): string {

addLearningHook Does Not Clean Up Legacy Stop Hook

When users run devflow learn --enable, this function only adds the new SessionEnd hook. It does not remove the legacy Stop hook that may still exist from pre-Wave-2 installations. Users who upgrade by running --enable will end up with both hooks registered, wasting a hook invocation on every session stop (the legacy Stop hook now just exits immediately).

Flagged by: Architecture (83%), Consistency (85%)

Fix: Have addLearningHook clean up the legacy hook first:

export function addLearningHook(settingsJson: string, devflowDir: string): string {
  // First, clean up any legacy Stop hook
  const cleaned = removeLearningHook(settingsJson);
  const settings: Settings = JSON.parse(cleaned);
  
  // Now add the new SessionEnd hook
  if (!settings.hooks) settings.hooks = {};
  if (!settings.hooks.SessionEnd) settings.hooks.SessionEnd = [];
  
  const hookPath = path.join(devflowDir, "hooks", HOOK_FILE);
  if (!settings.hooks.SessionEnd.includes(hookPath)) {
    settings.hooks.SessionEnd.push(hookPath);
  }
  
  return JSON.stringify(settings);
}

This makes --enable self-upgrading for existing users.


dean0x commented Mar 25, 2026

Code Review Summary: Wave 2 Quality Issues (60-79% Confidence)

This PR introduces several consistency and style issues beyond the inline comments already posted. While individually minor, they should be addressed to maintain code quality:

Consistency Issues (80-85% confidence)

  1. Log Timestamp Format Inconsistency (scripts/hooks/session-end-learning:55)

    • session-end-learning uses date '+%H:%M:%S' (HH:MM:SS local time)
    • background-learning uses date -u '+%Y-%m-%dT%H:%M:%SZ' (ISO 8601 UTC)
    • Since both write to the same log file, entries will be inconsistently formatted
    • Fix: Use ISO 8601 UTC format in both scripts for consistency
  2. Conditional Logging Inconsistency (scripts/hooks/session-end-learning:53)

    • session-end-learning wraps logging in if [ "$DEBUG" = "true" ] guard
    • background-learning logs unconditionally
    • Creates confusing developer experience where background logs appear but trigger hook is silent
    • Fix: Make both conditional on DEBUG (preferred) or both unconditional
  3. Sourcing Syntax Inconsistency (scripts/hooks/session-end-learning:21)

    • New hook uses . "$SCRIPT_DIR/json-parse" (POSIX dot syntax)
    • All other hooks use source "$SCRIPT_DIR/json-parse" (bash syntax)
    • Fix: Use source for consistency with existing patterns
  4. Shell Strictness Inconsistency (scripts/hooks/session-end-learning:12)

    • New hook uses set -euo pipefail
    • All other hooks use set -e only
    • Stricter flags could cause unexpected failures if upstream scripts rely on different semantics
    • Fix: Use set -e to match codebase convention
  5. Missing disown After Background Spawn (scripts/hooks/session-end-learning:200)

    • New hook spawns with nohup bash ... & but omits disown
    • Existing stop-update-memory uses nohup ... & followed by disown
    • Background job remains in job table unnecessarily
    • Fix: Add disown after background process spawn for consistency

Documentation Issues (90% confidence)

file-organization.md Not Updated (docs/reference/file-organization.md:50,157)

  • Still references stop-update-learning as the learning hook
  • Still says "Stop hook" instead of "SessionEnd hook"
  • Creates active code-comment drift with actual implementation
  • Fix: Update both references to session-end-learning and SessionEnd event type

Security/Robustness Issues (80-82% confidence)

  1. ART_DESC Unescaped in YAML (scripts/hooks/background-learning:632,640)

    • Model-generated descriptions interpolated directly into YAML frontmatter
    • If model returns description with " or YAML-significant characters, frontmatter breaks
    • Fix: Escape quotes before interpolation
  2. ART_NAME Sanitization Incomplete (scripts/hooks/background-learning:592)

    • Path traversal sanitization only strips / and .. pairs
    • Doesn't handle consecutive .. (e.g., .... reduces to .. after one pass)
    • Allows spaces, backticks, and shell metacharacters in paths
    • Fix: Use allowlist approach matching naming rules (kebab-case alphanumerics only)
  3. Session ID Validation Missing (scripts/hooks/session-end-learning:162)

    • SESSION_ID from hook JSON appended directly to counter file without validation
    • While source is trusted (Claude runtime), defense-in-depth check is missing
    • Malformed session ID with newlines could inflate batch count
    • Fix: Validate format before appending
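The three defense-in-depth fixes above can be sketched together; the exact escaping and allowlist patterns here are illustrative assumptions, not the ones the hooks ship.

```shell
# Issue 1: make a model-generated description safe inside double-quoted YAML.
escape_yaml_desc() {
  printf '%s' "$1" | sed 's/"/\\"/g'
}

# Issue 2: strict kebab-case allowlist; strips dots, slashes, spaces, metacharacters.
sanitize_art_name() {
  printf '%s' "$1" | tr -cd 'a-z0-9-'
}

# Issue 3: reject session IDs containing anything outside a safe character set
# (including embedded newlines that could inflate the batch count).
valid_session_id() {
  case "$1" in
    *[!A-Za-z0-9_-]*|'') return 1 ;;
    *) return 0 ;;
  esac
}
```

An allowlist is preferable to iterative blocklist stripping because patterns like `....` cannot survive a single character-class filter, whereas one-pass `..` removal can leave traversal sequences behind.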

Performance Issue (90% confidence)

Per-Line Subprocess Spawning in extract_batch_messages (scripts/hooks/background-learning:166-177)

  • Spawns json_extract_messages subprocess per user message line
  • With 3 sessions × 50 messages = 150 subprocess spawns
  • Extends background lock hold time
  • Fix: Use single-pass jq command to extract all messages in one invocation
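The single-pass fix can be sketched as one jq invocation that pulls every user-message text from a JSONL transcript, replacing the per-line subprocess spawns. The field names mirror the transcript samples discussed earlier in this review and may differ from the real schema.

```shell
# Hypothetical single-pass extraction: one jq process for the whole transcript,
# handling both string and array content shapes.
extract_all_user_messages() {
  jq -r 'select(.type == "user")
         | .message.content
         | if type == "string" then . else (.[0].text // empty) end' "$1"
}
```

This turns 150 subprocess spawns (3 sessions × 50 messages) into one, shrinking the background lock hold time accordingly.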

Overall Assessment

Blocking Issues: 6 flagged via inline comments (race condition, CWD encoding, subprocess spawning, temporal spread duplication, hasLearningHook legacy detection, addLearningHook cleanup)

Remaining Issues: The 12 items above should be addressed before merge. Most are quick fixes.

Recommendation: CHANGES_REQUESTED with focus on:

  1. file-organization.md documentation update (must-do for accuracy)
  2. Log timestamp/conditional logging consistency
  3. ART_NAME sanitization improvement

Claude Code Review | 2026-03-25

Dean Sharon and others added 8 commits March 25, 2026 21:31
- Replace set -euo pipefail with set -e (consistency with other hooks)
- Change . to source for json-parse/log-paths sourcing (consistency)
- Fix CWD encoding to match background-learning (strip leading slash)
- Use ISO 8601 UTC timestamps in log() (consistency with other hooks)
- Remove DEBUG guard from log() (align with unconditional logging in other hooks)
- Validate session ID format before appending to batch file
- Replace per-line subprocess spawning in reinforce_loaded_artifacts
  with single-pass jq/node operation
- Replace non-atomic cp+rm with atomic mv for batch file handoff
- Add disown after background process spawn (consistency with stop-update-memory)
- Extract run_batch_check() function from top-level procedural code

Co-Authored-By: Claude <noreply@anthropic.com>
…g enable, batch_size config

- hasLearningHook now returns 'current' | 'legacy' | false to detect
  pre-Wave-2 users with Stop hook containing stop-update-learning
- addLearningHook is self-upgrading: calls removeLearningHook first to
  clean up legacy Stop hooks before adding SessionEnd hook
- formatLearningStatus shows legacy upgrade instructions when detected
- Added batch_size to LearningConfig interface, defaults (3), and
  applyConfigLayer; added to --configure wizard
- Updated tests: 59 total including new legacy detection, self-upgrading
  enable, batch_size config, and type guard tests

Co-Authored-By: Claude <noreply@anthropic.com>
- Replace per-line subprocess spawning in extract_batch_messages with
  single-pass jq/node processing (issue #1)
- Decompose process_observations into validate_observation,
  calculate_confidence, and check_temporal_spread helpers (issue #2)
- Fix duplicate temporal spread calculation by computing epoch once
  in check_temporal_spread (issue #3)
- Escape double quotes in ART_DESC for YAML frontmatter safety (issue #4)
- Strengthen ART_NAME sanitization with strict kebab-case allowlist (issue #5)
- Replace per-line subprocess in apply_temporal_decay with single-pass
  jq operation and node fallback (issue #6)
- Replace per-line subprocess in create_artifacts status update with
  single-pass jq/node operation (issue #7)
- Remove dead increment_daily_counter function (issue #8)
- Extract write_command_artifact and write_skill_artifact helpers
  from create_artifacts (issue #9)
- Change flat 30k char truncation to per-session 8k char cap for
  proportional session contribution (issue #10)
- Add section comment markers to build_sonnet_prompt heredoc for
  navigability (issue #11)
- Add loadAndCountObservations tests (mixed valid/invalid, all-valid,
  empty input, invalidCount calculation)
- Add extract-text-messages string content path test
- Add learning-new operation test with self-learning prefix verification
- Update learning-created fixture paths to production-realistic values
- Add session-end-learning structural checks (syntax list, shebang,
  json-parse sourcing)
… in learn.ts

- Extract readObservations() to deduplicate try/catch + loadAndCountObservations pattern
- Extract warnIfInvalid() to deduplicate invalidCount > 0 warning message
- Hoist logPath computation once instead of repeating in 4 branches
- Remove unnecessary String() and !! casts on already-typed prompt values
- file-organization.md: session-end-learning replaces stop-update-learning
- CHANGELOG.md: add Wave 2 Changed + Fixed entries under [Unreleased]
- CLAUDE.md: include deprecated stop-update-learning in hooks list
…er.cjs

Move 4 operations (temporal-decay, process-observations, create-artifacts,
filter-observations) from shell to Node, reducing background-learning from
819 to 496 lines. Remove 10 shell functions, 4 dead json-parse wrappers,
and 1 dead json-helper.cjs operation. Add 27 new tests covering all paths.

Addresses PF-004 (background hook god script).
@dean0x dean0x merged commit ff9845a into main Mar 25, 2026
3 of 4 checks passed
@dean0x dean0x deleted the feat/learning-wave2-quality branch March 25, 2026 22:28