This document provides a high-level view of the ICR (Intent-Check-Receipt) plugin architecture. It's designed to help you understand the "big picture" before diving into implementation details.
Acknowledgement: This project is inspired by the YouTube video "The AI Failure Mode Nobody Warned You About (And how to prevent it from happening)". The video provides an excellent explanation of the "intent problem" that ICR addresses. We highly recommend watching it before reading this document.
- The Problem We're Solving
- The Solution: Three Phases
- System Architecture
- Component Overview
- Data Flow
- Key Algorithms
- File Organization
- Runtime Data
- Integration Points
- Design Decisions
When you ask an AI assistant to do something, there's a gap between:
- What you said: "Clean up old files"
- What you meant: Delete files older than 1 year in /tmp
- What the AI understood: Delete files older than 30 days everywhere
This gap is the Intent Problem. It leads to:
- Actions that don't match expectations
- Irreversible mistakes (deleted wrong files)
- Loss of trust in AI assistants
- No accountability when things go wrong
| Approach | Problem |
|---|---|
| Always ask for confirmation | Too many prompts, user fatigue |
| Never ask | Risky, no safety net |
| Simple permission rules | Don't understand context |
| Post-action logging | Too late to prevent mistakes |
ICR adds transparency before action and accountability after action:
- Make interpretation visible before acting
- Calibrate verification to risk level
- Log everything for audit and learning
┌──────────────────────────────────────────────────────────────────────────┐
│                                                                          │
│  ┌────────────────────┐  ┌────────────────────┐  ┌────────────────────┐  │
│  │                    │  │                    │  │                    │  │
│  │       INTENT       │  │       CHECK        │  │      RECEIPT       │  │
│  │                    │  │                    │  │                    │  │
│  │ ┌────────────────┐ │  │ ┌────────────────┐ │  │ ┌────────────────┐ │  │
│  │ │    Generate    │ │  │ │    Evaluate    │ │  │ │      Log       │ │  │
│  │ │   structured   │ │  │ │    against     │ │  │ │   everything   │ │  │
│  │ │ interpretation │ │  │ │   thresholds   │ │  │ │      that      │ │  │
│  │ │                │ │  │ │                │ │  │ │    happened    │ │  │
│  │ └────────────────┘ │  │ └────────────────┘ │  │ └────────────────┘ │  │
│  │         │          │  │         │          │  │         │          │  │
│  │         ▼          │  │         ▼          │  │         ▼          │  │
│  │ ┌────────────────┐ │  │ ┌────────────────┐ │  │ ┌────────────────┐ │  │
│  │ │     Shows      │ │  │ │     Routes     │ │  │ │    Enables     │ │  │
│  │ │      WHAT      │ │  │ │       to       │ │  │ │     audit      │ │  │
│  │ │      and       │ │  │ │     AUTO,      │ │  │ │      and       │ │  │
│  │ │      WHY       │ │  │ │     AI, or     │ │  │ │    learning    │ │  │
│  │ │                │ │  │ │     HUMAN      │ │  │ │                │ │  │
│  │ └────────────────┘ │  │ └────────────────┘ │  │ └────────────────┘ │  │
│  │                    │  │                    │  │                    │  │
│  └────────────────────┘  └────────────────────┘  └────────────────────┘  │
│                                                                          │
│                 Before Action                  │      After Action       │
│                                                │                         │
└──────────────────────────────────────────────────────────────────────────┘
Generate a structured document that makes the AI's interpretation explicit:
- Task: What exactly will happen
- Who/What: What will and won't be affected
- Boundaries: Explicit limits on scope
- Reversibility: Can it be undone? How?
- Alternatives: Other interpretations considered
Evaluate whether verification is needed:
- Severity: How risky is this action type?
- Confidence: How certain are we about the interpretation?
- Decision: Route to AUTO, AI review, or HUMAN review
Log everything for accountability:
- Timestamp and session info
- The intent document
- The decision made
- User response (if asked)
- Execution outcome
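As an illustration of the Receipt phase, the fields listed above could be captured in one JSON record per action. The sketch below is written in Python for readability (the plugin itself is shell-based), and the key names `sessionId`, `userResponse`, and `outcome` are assumptions rather than the plugin's actual receipt schema.

```python
import json
from datetime import datetime, timezone


def build_receipt(session_id, intent, decision, user_response=None, outcome=None):
    """Assemble one receipt record. All key names here are hypothetical."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "sessionId": session_id,
        "intent": intent,               # the intent document
        "decision": decision,           # the routing decision that was made
        "userResponse": user_response,  # stays None when the user was not asked
        "outcome": outcome,             # filled in after execution
    }


receipt = build_receipt(
    session_id="session-abc123",
    intent={"task": "Delete files older than 1 year in /tmp"},
    decision="HUMAN_REVIEW",
    user_response="PROCEED",
    outcome="proceeded",
)
print(json.dumps(receipt, indent=2))
```

Keeping the user response and outcome as explicit `None` values, rather than omitting them, makes it easy to tell "not asked" apart from "field missing" when auditing.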
┌──────────────────────────────────────────────────────────────────────────┐
│                               CLAUDE CODE                                │
│                                                                          │
│  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐   │
│  │    User     │   │   Claude    │   │    Tool     │   │   Plugin    │   │
│  │    Input    │───│    Brain    │───│  Executor   │───│   System    │   │
│  └─────────────┘   └─────────────┘   └─────────────┘   └──────┬──────┘   │
│                                                               │          │
└───────────────────────────────────────────────────────────────┼──────────┘
                                                                │
                                     ┌──────────────────────────┘
                                     │
                                     ▼
┌──────────────────────────────────────────────────────────────────────────┐
│                                ICR PLUGIN                                │
│                                                                          │
│  ┌────────────────────────────────────────────────────────────────────┐  │
│  │                            ENTRY POINTS                            │  │
│  │                                                                    │  │
│  │    ┌───────────────┐    ┌───────────────┐    ┌───────────────┐     │  │
│  │    │     HOOKS     │    │    SKILLS     │    │   COMMANDS    │     │  │
│  │    │               │    │               │    │               │     │  │
│  │    │  PreToolUse   │    │    Intent     │    │    /icr:*     │     │  │
│  │    │  Permission   │    │   Analysis    │    │               │     │  │
│  │    │ SubagentStop  │    │               │    │               │     │  │
│  │    └───────┬───────┘    └───────────────┘    └───────────────┘     │  │
│  │            │                                                       │  │
│  └────────────┼───────────────────────────────────────────────────────┘  │
│               │                                                          │
│               ▼                                                          │
│  ┌────────────────────────────────────────────────────────────────────┐  │
│  │                            CORE ENGINE                             │  │
│  │                                                                    │  │
│  │   ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌────────────┐   │  │
│  │   │  Severity  │  │ Confidence │  │  Decision  │  │   Intent   │   │  │
│  │   │ Classifier │  │ Calculator │  │    Tree    │  │ Generator  │   │  │
│  │   └─────┬──────┘  └─────┬──────┘  └─────┬──────┘  └─────┬──────┘   │  │
│  │         │               │               │               │          │  │
│  │         └───────────────┴───────┬───────┴───────────────┘          │  │
│  │                                 │                                  │  │
│  │                                 ▼                                  │  │
│  │    ┌──────────────────────────────────────────────────────────┐    │  │
│  │    │                      STORAGE LAYER                       │    │  │
│  │    │                                                          │    │  │
│  │    │    ┌────────────┐    ┌────────────┐    ┌────────────┐    │    │  │
│  │    │    │  Receipts  │    │  Session   │    │   Config   │    │    │  │
│  │    │    │            │    │   State    │    │            │    │    │  │
│  │    │    └────────────┘    └────────────┘    └────────────┘    │    │  │
│  │    │                                                          │    │  │
│  │    └──────────────────────────────────────────────────────────┘    │  │
│  │                                                                    │  │
│  └────────────────────────────────────────────────────────────────────┘  │
│                                                                          │
└──────────────────────────────────────────────────────────────────────────┘
| Layer | Purpose | Components |
|---|---|---|
| Entry Points | Interface with Claude Code | Hooks, Skills, Commands |
| Core Engine | Business logic | Severity, Confidence, Decision, Intent |
| Storage | Persistence | Receipts, Session State, Config |
Automatically triggered by Claude Code events:
| Hook | Trigger | Purpose |
|---|---|---|
| PreToolUse | Before any tool runs | Main ICR check |
| PermissionRequest | When asking user | Add ICR context |
| SubagentStop | After subagent finishes | Review aggregate work |
User-invocable slash commands:
| Command | Purpose |
|---|---|
| /icr:receipts | View audit trail |
| /icr:config | Manage settings |
| /icr:check | Force manual check |
| /icr:trust | Toggle trust mode |
| /icr:audit | Pattern analysis |
| /icr:export | Export receipts |
| /icr:stats | View statistics |
| /icr:debug | Debug confidence |
| /icr:simulate | Dry-run check |
AI-invocable capabilities:
| Skill | Purpose |
|---|---|
| intent-analysis | Generate intent explanations |
Classifies actions into risk levels:
┌────────────────────────────────────────────────┐
│              SEVERITY CLASSIFIER               │
│                                                │
│  Input: Tool name + Arguments                  │
│                                                │
│  ┌──────────────────────────────────────────┐  │
│  │ Layer 1: Static Rules                    │  │
│  │   Bash → HIGH                            │  │
│  │   Read → LOW                             │  │
│  └────────────────────┬─────────────────────┘  │
│                       ▼                        │
│  ┌──────────────────────────────────────────┐  │
│  │ Layer 2: Metadata Inference              │  │
│  │   "rm -rf" → CRITICAL                    │  │
│  │   "delete" → CRITICAL                    │  │
│  └────────────────────┬─────────────────────┘  │
│                       ▼                        │
│  ┌──────────────────────────────────────────┐  │
│  │ Layer 3: User Rules                      │  │
│  │   Custom patterns override               │  │
│  └────────────────────┬─────────────────────┘  │
│                       ▼                        │
│  Output: LOW | MEDIUM | HIGH | CRITICAL        │
│                                                │
└────────────────────────────────────────────────┘
Calculates a certainty score from 0.0 to 1.0:
┌────────────────────────────────────────────────┐
│             CONFIDENCE CALCULATOR              │
│                                                │
│  ┌─────────────┐  Weight: 30%                  │
│  │  Ambiguity  │  How many interpretations?    │
│  └──────┬──────┘                               │
│         │                                      │
│  ┌──────▼──────┐  Weight: 25%                  │
│  │  Distance   │  Semantic gap?                │
│  └──────┬──────┘                               │
│         │                                      │
│  ┌──────▼──────┐  Weight: 20%                  │
│  │ Historical  │  Seen before?                 │
│  └──────┬──────┘                               │
│         │                                      │
│  ┌──────▼──────┐  Weight: 25%                  │
│  │ Uncertainty │  Hedging language?            │
│  └──────┬──────┘                               │
│         │                                      │
│         ▼                                      │
│  Weighted Average → 0.0 to 1.0                 │
│                                                │
└────────────────────────────────────────────────┘
Routes to appropriate verification:
┌────────────────────────────────────────────────┐
│                 DECISION TREE                  │
│                                                │
│  Inputs:                                       │
│  - Severity (LOW/MEDIUM/HIGH/CRITICAL)         │
│  - Confidence (0.0 - 1.0)                      │
│  - Trust Mode (on/off)                         │
│  - Manual Check Flag                           │
│                                                │
│  ┌──────────────────────────────────────────┐  │
│  │ Manual check requested?                  │  │
│  │   YES → HUMAN_REVIEW                     │  │
│  └────────────────────┬─────────────────────┘  │
│                       │ NO                     │
│  ┌────────────────────▼─────────────────────┐  │
│  │ Trust mode on AND not CRITICAL?          │  │
│  │   YES → AUTO_APPROVE_TRUST_MODE          │  │
│  └────────────────────┬─────────────────────┘  │
│                       │ NO                     │
│  ┌────────────────────▼─────────────────────┐  │
│  │ Confidence >= autoApprove threshold?     │  │
│  │   YES → AUTO_APPROVE                     │  │
│  └────────────────────┬─────────────────────┘  │
│                       │ NO                     │
│  ┌────────────────────▼─────────────────────┐  │
│  │ Confidence >= aiReview threshold?        │  │
│  │   YES → AI_REVIEW                        │  │
│  └────────────────────┬─────────────────────┘  │
│                       │ NO                     │
│                       ▼                        │
│                  HUMAN_REVIEW                  │
│                                                │
└────────────────────────────────────────────────┘
Creates structured intent documents:
┌────────────────────────────────────────────────┐
│                INTENT GENERATOR                │
│                                                │
│  Inputs:                                       │
│  - User prompt                                 │
│  - Tool name + arguments                       │
│  - Confidence breakdown                        │
│                                                │
│  Outputs:                                      │
│  {                                             │
│    "task": "...",                              │
│    "whoWhat": {                                │
│      "affected": [...],                        │
│      "excluded": [...]                         │
│    },                                          │
│    "boundaries": [...],                        │
│    "ifUncertain": "...",                       │
│    "reversibility": {...},                     │
│    "alternatives": [...],                      │
│    "confidenceBreakdown": {...}                │
│  }                                             │
│                                                │
└────────────────────────────────────────────────┘
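To make the output shape concrete, here is a small Python sketch that fills in the top-level keys shown above for the "Clean up old files" example and checks that none are missing. The string values are illustrative only, and the inner structure of `reversibility` and `confidenceBreakdown` is left empty because this document does not specify it. The helper name `missing_keys` is hypothetical.

```python
import json

# Top-level keys taken from the Outputs listing above.
REQUIRED_KEYS = {
    "task", "whoWhat", "boundaries", "ifUncertain",
    "reversibility", "alternatives", "confidenceBreakdown",
}


def missing_keys(intent):
    """Return the documented top-level keys that are absent, sorted by name."""
    return sorted(REQUIRED_KEYS - intent.keys())


intent = {
    "task": "Delete files older than 1 year in /tmp",
    "whoWhat": {
        "affected": ["files in /tmp older than 1 year"],  # illustrative value
        "excluded": ["everything outside /tmp"],          # illustrative value
    },
    "boundaries": ["only /tmp"],                          # illustrative value
    "ifUncertain": "ask the user before deleting",        # illustrative value
    "reversibility": {},        # inner shape not specified in this document
    "alternatives": ["Delete files older than 30 days everywhere"],
    "confidenceBreakdown": {},  # inner shape not specified in this document
}

print(missing_keys(intent))            # []
print(missing_keys({"task": "demo"}))  # the six remaining keys
print(json.dumps(intent, indent=2))
```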
User Request: "Clean up old files"
            │
            ▼
┌───────────────────────┐
│      CLAUDE CODE      │
│    Interprets as:     │
│   Bash: rm -rf ...    │
└───────────┬───────────┘
            │ PreToolUse event
            ▼
┌───────────────────────┐
│   pre-tool-check.sh   │
│                       │
│ 1. Check exclusions   │───▶ If excluded: proceed
│ 2. Load config        │
│ 3. Classify severity  │───▶ severity.sh → CRITICAL
│ 4. Generate intent    │───▶ intent.sh → document
│ 5. Calc confidence    │───▶ confidence.sh → 0.62
│ 6. Make decision      │───▶ decision.sh → HUMAN_REVIEW
│ 7. Create receipt     │───▶ receipt.sh → saved
│ 8. Return decision    │
└───────────┬───────────┘
            │
            ▼
┌───────────────────────┐
│      USER PROMPT      │
│                       │
│ Task: Delete files    │
│ Severity: CRITICAL    │
│ Confidence: 0.62      │
│                       │
│ [1] Proceed           │
│ [2] Edit              │
│ [3] Abort             │
└───────────┬───────────┘
            │ User selects [1]
            ▼
┌───────────────────────┐
│    RECEIPT UPDATED    │
│ - response: PROCEED   │
│ - outcome: proceeded  │
└───────────┬───────────┘
            │
            ▼
┌───────────────────────┐
│     TOOL EXECUTES     │
│      rm -rf runs      │
└───────────────────────┘
function calculate_confidence(user_prompt, tool, args, config):
    # Get individual scores
    ambiguity = analyze_ambiguity(user_prompt)
    distance = calculate_distance(user_prompt, tool, args)
    historical = analyze_historical(tool, args)
    uncertainty = analyze_uncertainty(user_prompt)

    # Get weights from config
    weights = config.confidence.weights

    # Weighted average
    confidence = (
        ambiguity * weights.ambiguityAnalysis +
        distance * weights.intentToActionDistance +
        historical * weights.historicalPatterns +
        uncertainty * weights.uncertaintyMarkers
    )

    return clamp(confidence, 0.0, 1.0)
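The weighted combination can be tried out with a short, runnable Python sketch. It uses the weight names from the pseudocode and the 30/25/20/25 split from the Confidence Calculator diagram. The four factor scores are invented for illustration, chosen so that the result matches the 0.62 used in the data-flow example; the function name `combine_scores` is hypothetical.

```python
# Weights from the Confidence Calculator diagram; they sum to 1.0, so the
# weighted sum below is also a weighted average.
DEFAULT_WEIGHTS = {
    "ambiguityAnalysis": 0.30,
    "intentToActionDistance": 0.25,
    "historicalPatterns": 0.20,
    "uncertaintyMarkers": 0.25,
}


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def combine_scores(scores, weights=DEFAULT_WEIGHTS):
    """Combine per-factor scores (each 0.0 to 1.0) into one confidence value."""
    total = sum(scores[name] * weight for name, weight in weights.items())
    return clamp(total, 0.0, 1.0)


# Invented factor scores for the "Clean up old files" request.
scores = {
    "ambiguityAnalysis": 0.40,       # several plausible interpretations
    "intentToActionDistance": 0.60,
    "historicalPatterns": 0.80,
    "uncertaintyMarkers": 0.76,
}
confidence = combine_scores(scores)
print(round(confidence, 2))  # 0.62
```

If user-supplied weights do not sum to 1.0, the result is no longer an average and the clamp silently hides the overshoot, so validating or normalising the weights when the config is loaded would be a reasonable safeguard.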
function make_decision(severity, confidence, config, session_state):
    # Manual check overrides everything
    if session_state.manualCheckNext:
        return HUMAN_REVIEW

    # Trust mode check
    if session_state.trustModeEnabled and severity != CRITICAL:
        return AUTO_APPROVE_TRUST_MODE

    # Get thresholds for this severity
    thresholds = config.thresholds[severity]

    # Check against thresholds
    if confidence >= thresholds.autoApprove:
        return AUTO_APPROVE

    if confidence >= thresholds.aiReview:
        return AI_REVIEW

    return HUMAN_REVIEW
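A runnable Python version of the same routing logic is sketched below. The threshold numbers are hypothetical, since this document does not list the real defaults; only the key names `autoApprove` and `aiReview` and the order of the checks come from the pseudocode.

```python
from enum import Enum


class Decision(Enum):
    AUTO_APPROVE = "AUTO_APPROVE"
    AUTO_APPROVE_TRUST_MODE = "AUTO_APPROVE_TRUST_MODE"
    AI_REVIEW = "AI_REVIEW"
    HUMAN_REVIEW = "HUMAN_REVIEW"


# Hypothetical per-severity thresholds; the real defaults are not given here.
THRESHOLDS = {
    "LOW": {"autoApprove": 0.50, "aiReview": 0.30},
    "MEDIUM": {"autoApprove": 0.70, "aiReview": 0.50},
    "HIGH": {"autoApprove": 0.85, "aiReview": 0.65},
    "CRITICAL": {"autoApprove": 0.95, "aiReview": 0.80},
}


def make_decision(severity, confidence, thresholds=THRESHOLDS,
                  trust_mode=False, manual_check=False):
    # A requested manual check overrides everything else.
    if manual_check:
        return Decision.HUMAN_REVIEW
    # Trust mode skips verification, except for CRITICAL actions.
    if trust_mode and severity != "CRITICAL":
        return Decision.AUTO_APPROVE_TRUST_MODE
    limits = thresholds[severity]
    if confidence >= limits["autoApprove"]:
        return Decision.AUTO_APPROVE
    if confidence >= limits["aiReview"]:
        return Decision.AI_REVIEW
    return Decision.HUMAN_REVIEW


print(make_decision("CRITICAL", 0.62).name)                   # HUMAN_REVIEW
print(make_decision("CRITICAL", 0.62, trust_mode=True).name)  # HUMAN_REVIEW
print(make_decision("HIGH", 0.62, trust_mode=True).name)      # AUTO_APPROVE_TRUST_MODE
print(make_decision("LOW", 0.62).name)                        # AUTO_APPROVE
```

The second call shows why the order of the checks matters: trust mode is consulted before the thresholds, but a CRITICAL action still falls through to them.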
function classify_severity(tool, args, config):
    # Layer 1: Static rules
    if tool in config.severity.staticRules:
        base_severity = config.severity.staticRules[tool]
    else:
        base_severity = MEDIUM

    # Layer 2: Metadata inference
    inferred = infer_from_metadata(tool, args)
    if inferred.severity > base_severity:
        base_severity = inferred.severity

    # Layer 3: User rules (highest priority)
    for rule in config.severity.userRules:
        if matches(rule.pattern, tool) and evaluates(rule.condition, args):
            return rule.severity

    return base_severity
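The three layers can be exercised with the Python sketch below. The static rules and the `rm -rf` / `delete` patterns come from the Severity Classifier diagram. Treating `args` as a single string and treating a user rule's `pattern` and `condition` as regular expressions are assumptions made for this example, not the plugin's actual rule format.

```python
import re
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Layer 1: static per-tool rules (the two entries shown in the diagram).
STATIC_RULES = {"Bash": Severity.HIGH, "Read": Severity.LOW}

# Layer 2: argument patterns that raise severity (also from the diagram).
METADATA_PATTERNS = [
    (re.compile(r"rm\s+-rf"), Severity.CRITICAL),
    (re.compile(r"delete", re.IGNORECASE), Severity.CRITICAL),
]


def classify_severity(tool, args, user_rules=()):
    # Unknown tools default to MEDIUM, as in the pseudocode.
    severity = STATIC_RULES.get(tool, Severity.MEDIUM)

    # Metadata inference can only raise the base severity, never lower it.
    for pattern, inferred in METADATA_PATTERNS:
        if pattern.search(args):
            severity = max(severity, inferred)

    # User rules have the highest priority and may raise or lower the result.
    for rule in user_rules:
        if re.fullmatch(rule["pattern"], tool) and re.search(rule["condition"], args):
            return rule["severity"]

    return severity


rules = [{"pattern": "Bash", "condition": r"^git status", "severity": Severity.LOW}]
print(classify_severity("Read", "README.md").name)          # LOW
print(classify_severity("Bash", "ls -la").name)             # HIGH
print(classify_severity("Bash", "rm -rf /tmp/cache").name)  # CRITICAL
print(classify_severity("Write", "notes.txt").name)         # MEDIUM
print(classify_severity("Bash", "git status", rules).name)  # LOW
```

Because a matching user rule returns immediately, it can also lower a severity that layers 1 and 2 would have raised, which is worth keeping in mind when writing broad patterns.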
icr/
├── .claude-plugin/
│   └── plugin.json           # Plugin manifest - START HERE
│
├── scripts/
│   ├── pre-tool-check.sh     # Main entry point
│   ├── permission-check.sh   # Permission dialog hook
│   ├── subagent-check.sh     # Subagent completion hook
│   ├── statusline.sh         # Status line display
│   └── lib/
│       ├── common.sh         # Shared utilities
│       ├── severity.sh       # Severity classification
│       ├── confidence.sh     # Confidence calculation
│       ├── decision.sh       # Decision tree
│       ├── intent.sh         # Intent generation
│       ├── receipt.sh        # Receipt logging
│       └── checkpoint.sh     # Checkpoint management
│
├── commands/                 # Slash commands (9 files)
├── skills/                   # AI skills (1 directory)
├── config/                   # Default configuration
├── schemas/                  # JSON validation schemas
├── prompts/                  # LLM prompt templates
└── docs/                     # Documentation
plugin.json
    │
    ├──▶ hooks/hooks.json
    │        │
    │        └──▶ scripts/pre-tool-check.sh
    │                 │
    │                 ├──▶ lib/common.sh
    │                 ├──▶ lib/severity.sh ───▶ lib/common.sh
    │                 ├──▶ lib/confidence.sh ─▶ lib/common.sh
    │                 ├──▶ lib/decision.sh ───▶ lib/common.sh
    │                 ├──▶ lib/intent.sh ─────▶ lib/common.sh
    │                 ├──▶ lib/receipt.sh ────▶ lib/common.sh
    │                 └──▶ lib/checkpoint.sh ─▶ lib/common.sh
    │
    ├──▶ commands/*.md
    │
    └──▶ skills/intent-analysis/SKILL.md
All runtime data is stored in .claude/icr/:
.claude/icr/
├── config.json               # User configuration overrides
├── session-state.json        # Current session state
├── errors.log                # Error log
│
├── receipts/                 # Audit trail
│   ├── index.json            # Fast lookup index
│   ├── 2026-01-01/
│   │   └── session-abc123/
│   │       ├── session-meta.json
│   │       ├── 001-receipt.json
│   │       └── 002-receipt.json
│   └── 2026-01-02/
│       └── ...
│
├── exports/                  # Exported receipts
│   └── 2026-01-02T103000Z.json
│
└── checkpoints/              # Checkpoint metadata
    └── chk-xyz789.json
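The receipt layout above implies a simple naming scheme: a date directory, a session directory, and a zero-padded sequence number. A minimal Python sketch of that scheme follows; the function name `receipt_path` is hypothetical, and the three-digit padding is inferred from the `001-receipt.json` example rather than from a stated rule.

```python
from datetime import date
from pathlib import Path


def receipt_path(root, day, session_id, sequence):
    """Build <root>/receipts/<YYYY-MM-DD>/session-<id>/<NNN>-receipt.json."""
    return (
        Path(root)
        / "receipts"
        / day.isoformat()
        / f"session-{session_id}"
        / f"{sequence:03d}-receipt.json"
    )


path = receipt_path(".claude/icr", date(2026, 1, 1), "abc123", 2)
print(path.as_posix())
# .claude/icr/receipts/2026-01-01/session-abc123/002-receipt.json
```

Grouping by date first keeps retention clean-up cheap, because whole day directories can be removed at once.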
| Data Type | Default Retention | Configurable |
|---|---|---|
| Receipts | 90 days | Yes |
| Exports | Permanent | N/A |
| Session state | Current session | N/A |
| Error logs | 30 days | No |
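Because receipts are grouped into date-named directories, applying the 90-day default can be reduced to comparing directory names against a cutoff date. The Python sketch below shows one way to do that. Whether the real plugin prunes this way, and whether the boundary day itself is kept, are assumptions; the function name `expired_days` is hypothetical.

```python
from datetime import date, timedelta


def expired_days(entry_names, today, retention_days=90):
    """Return the date-named entries that are older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in entry_names:
        try:
            day = date.fromisoformat(name)
        except ValueError:
            continue  # not a date directory, e.g. index.json
        if day < cutoff:  # assumption: the cutoff day itself is still kept
            expired.append(name)
    return sorted(expired)


entries = ["2026-01-01", "2026-01-02", "index.json", "2025-09-01"]
print(expired_days(entries, today=date(2026, 1, 2)))  # ['2025-09-01']
```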
ICR integrates with Claude Code through:
- Plugin System: Discovered via plugin.json
- Hook Events: Receives PreToolUse, PermissionRequest, SubagentStop
- Commands: Registered slash commands
- Skills: AI-invocable capabilities
- Status Line: Optional status display
ICR configuration is merged from multiple sources:
Priority (lowest to highest):
1. Built-in defaults (config/defaults.json)
2. User global config (~/.claude/icr/config.json)
3. Project config (.claude/icr/config.json)
4. Runtime overrides (/icr:config set)
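The priority order can be pictured as a chain of merges in which each later layer overrides the one before it. The Python sketch below assumes a recursive (deep) merge, so that overriding one threshold does not discard its siblings; the document does not say whether the real merge is deep or shallow. The key `trustMode` and all numeric values are hypothetical.

```python
def deep_merge(base, override):
    """Return a new dict where override wins, merging nested dicts recursively."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_config(*layers):
    """Merge config layers given in priority order, lowest first."""
    config = {}
    for layer in layers:
        config = deep_merge(config, layer)
    return config


defaults = {"thresholds": {"HIGH": {"autoApprove": 0.85, "aiReview": 0.65}}}
user_global = {"thresholds": {"HIGH": {"autoApprove": 0.90}}}
project = {"trustMode": False}
runtime = {"thresholds": {"HIGH": {"aiReview": 0.70}}}

config = resolve_config(defaults, user_global, project, runtime)
print(config)
# {'thresholds': {'HIGH': {'autoApprove': 0.9, 'aiReview': 0.7}}, 'trustMode': False}
```

Each merge builds new dictionaries instead of mutating its inputs, so the built-in defaults stay intact however many override layers are applied.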
ICR integrates with Claude Code checkpoints:
- Creates checkpoints before HIGH/CRITICAL actions
- Links checkpoints to receipts
- Suggests rollback when problems are detected
| Reason | Benefit |
|---|---|
| Claude Code hooks execute shell | Native integration |
| Cross-platform (macOS, Linux, WSL) | Maximum compatibility |
| No build step required | Faster iteration |
| Easy to modify and debug | Lower barrier |
| Reason | Benefit |
|---|---|
| Standard JSON tool | Available everywhere |
| Powerful query language | Complex transformations |
| Single dependency | Minimal requirements |
| Phase | Purpose |
|---|---|
| Intent | Transparency - show what AI thinks |
| Check | Control - let user/AI verify |
| Receipt | Accountability - log everything |
Different users have different risk tolerances:
- Security-focused: Higher thresholds (more checks)
- Productivity-focused: Lower thresholds (fewer interruptions)
- Mixed: Different thresholds per severity level
Constant verification causes fatigue. Trust mode provides:
- Escape hatch for known-safe work
- Still logs everything
- Never bypasses CRITICAL by default
- User Guide: USER_GUIDE.md
- Developer Guide: DEVELOPER_GUIDE.md
- Contributing: CONTRIBUTING.md
- Full Specification: ICR_PLUGIN_SPEC.md