ICR Architecture Overview

This document provides a high-level view of the ICR (Intent-Check-Receipt) plugin architecture. It's designed to help you understand the "big picture" before diving into implementation details.

Acknowledgement: This project is inspired by the YouTube video "The AI Failure Mode Nobody Warned You About (And how to prevent it from happening)". The video provides an excellent explanation of the "intent problem" that ICR addresses. We highly recommend watching it before reading this document.

Table of Contents

The Problem We're Solving
The Solution: Three Phases
System Architecture
Component Overview
Data Flow
Key Algorithms
File Organization
Runtime Data
Integration Points
Design Decisions

The Problem We're Solving

The Intent Problem

When you ask an AI assistant to do something, there's a gap between:

What you said: "Clean up old files"
What you meant: Delete files older than 1 year in /tmp
What the AI understood: Delete files older than 30 days everywhere

This gap is the Intent Problem. It leads to:

Actions that don't match expectations
Irreversible mistakes (deleted wrong files)
Loss of trust in AI assistants
No accountability when things go wrong

Why Existing Solutions Fall Short

Approach	Problem
Always ask for confirmation	Too many prompts, user fatigue
Never ask	Risky, no safety net
Simple permission rules	Don't understand context
Post-action logging	Too late to prevent mistakes

The ICR Approach

ICR adds transparency before action and accountability after action:

Make interpretation visible before acting
Calibrate verification to risk level
Log everything for audit and learning

The Solution: Three Phases

┌─────────────────────────────────────────────────────────────────────────┐
│                                                                          │
│  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐       │
│  │                  │  │                  │  │                  │       │
│  │     INTENT       │  │      CHECK       │  │     RECEIPT      │       │
│  │                  │  │                  │  │                  │       │
│  │  ┌────────────┐  │  │  ┌────────────┐  │  │  ┌────────────┐  │       │
│  │  │  Generate  │  │  │  │  Evaluate  │  │  │  │    Log     │  │       │
│  │  │ structured │  │  │  │  against   │  │  │  │everything  │  │       │
│  │  │ interpret- │  │  │  │ thresholds │  │  │  │that        │  │       │
│  │  │   ation    │  │  │  │            │  │  │  │happened    │  │       │
│  │  └────────────┘  │  │  └────────────┘  │  │  └────────────┘  │       │
│  │        │         │  │        │         │  │        │         │       │
│  │        ▼         │  │        ▼         │  │        ▼         │       │
│  │  ┌────────────┐  │  │  ┌────────────┐  │  │  ┌────────────┐  │       │
│  │  │   Shows    │  │  │  │  Routes    │  │  │  │  Enables   │  │       │
│  │  │   WHAT     │  │  │  │    to      │  │  │  │   audit    │  │       │
│  │  │   and      │  │  │  │  AUTO,     │  │  │  │   and      │  │       │
│  │  │   WHY      │  │  │  │  AI, or    │  │  │  │  learning  │  │       │
│  │  │            │  │  │  │  HUMAN     │  │  │  │            │  │       │
│  │  └────────────┘  │  │  └────────────┘  │  │  └────────────┘  │       │
│  │                  │  │                  │  │                  │       │
│  └──────────────────┘  └──────────────────┘  └──────────────────┘       │
│                                                                          │
│                    Before Action    │    After Action                    │
│                                     │                                    │
└─────────────────────────────────────────────────────────────────────────┘

Phase 1: Intent

Generate a structured document that makes the AI's interpretation explicit:

Task: What exactly will happen
Who/What: What will and won't be affected
Boundaries: Explicit limits on scope
Reversibility: Can it be undone? How?
Alternatives: Other interpretations considered

Phase 2: Check

Evaluate whether verification is needed:

Severity: How risky is this action type?
Confidence: How certain are we about the interpretation?
Decision: Route to AUTO, AI review, or HUMAN review

Phase 3: Receipt

Log everything for accountability:

Timestamp and session info
The intent document
The decision made
User response (if asked)
Execution outcome
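
As a rough illustration of what a receipt might hold, the sketch below assembles the five items listed above into one JSON record. The field names (`timestamp`, `sessionId`, `intent`, `decision`, `userResponse`, `outcome`) and the `build_receipt` helper are assumptions made for this example, not ICR's actual schema.

```python
import json
from datetime import datetime, timezone


def build_receipt(session_id, intent, decision, user_response=None, outcome=None):
    """Assemble one receipt record (hypothetical field names)."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "sessionId": session_id,
        "intent": intent,                # the intent document from Phase 1
        "decision": decision,            # the routing result from Phase 2
        "userResponse": user_response,   # None when the user was not asked
        "outcome": outcome,              # filled in once execution finishes
    }


receipt = build_receipt(
    session_id="session-abc123",
    intent={"task": "Delete files older than 1 year in /tmp"},
    decision="HUMAN_REVIEW",
    user_response="PROCEED",
    outcome="proceeded",
)
print(json.dumps(receipt, indent=2))
```

Keeping the whole intent document inside the receipt means an auditor can later compare what the AI said it would do with what actually happened.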

System Architecture

High-Level Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                           CLAUDE CODE                                    │
│                                                                          │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐    │
│  │    User     │  │   Claude    │  │    Tool     │  │   Plugin    │    │
│  │   Input     │──│   Brain     │──│  Executor   │──│   System    │    │
│  │             │  │             │  │             │  │             │    │
│  └─────────────┘  └─────────────┘  └─────────────┘  └──────┬──────┘    │
│                                                             │           │
└─────────────────────────────────────────────────────────────┼───────────┘
                                                              │
                              ┌────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                            ICR PLUGIN                                    │
│                                                                          │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │                         ENTRY POINTS                             │    │
│  │                                                                  │    │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐              │    │
│  │  │    HOOKS    │  │   SKILLS    │  │  COMMANDS   │              │    │
│  │  │             │  │             │  │             │              │    │
│  │  │ PreToolUse  │  │   Intent    │  │ /icr:*      │              │    │
│  │  │ Permission  │  │  Analysis   │  │             │              │    │
│  │  │ SubagentStp │  │             │  │             │              │    │
│  │  └──────┬──────┘  └─────────────┘  └─────────────┘              │    │
│  │         │                                                        │    │
│  └─────────┼────────────────────────────────────────────────────────┘    │
│            │                                                             │
│            ▼                                                             │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │                       CORE ENGINE                                │    │
│  │                                                                  │    │
│  │  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐            │    │
│  │  │ Severity │ │Confidence│ │ Decision │ │  Intent  │            │    │
│  │  │Classifier│ │Calculator│ │   Tree   │ │Generator │            │    │
│  │  └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘            │    │
│  │       │            │            │            │                   │    │
│  │       └────────────┴────────────┴────────────┘                   │    │
│  │                           │                                      │    │
│  │                           ▼                                      │    │
│  │  ┌──────────────────────────────────────────────────────────┐   │    │
│  │  │                   STORAGE LAYER                           │   │    │
│  │  │                                                           │   │    │
│  │  │  ┌──────────┐  ┌──────────┐  ┌──────────┐               │   │    │
│  │  │  │ Receipts │  │ Session  │  │  Config  │               │   │    │
│  │  │  │          │  │  State   │  │          │               │   │    │
│  │  │  └──────────┘  └──────────┘  └──────────┘               │   │    │
│  │  │                                                           │   │    │
│  │  └──────────────────────────────────────────────────────────┘   │    │
│  │                                                                  │    │
│  └──────────────────────────────────────────────────────────────────┘    │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

Component Layers

Layer	Purpose	Components
Entry Points	Interface with Claude Code	Hooks, Skills, Commands
Core Engine	Business logic	Severity, Confidence, Decision, Intent
Storage	Persistence	Receipts, Session State, Config

Component Overview

Entry Point Components

Hooks

Automatically triggered by Claude Code events:

Hook	Trigger	Purpose
`PreToolUse`	Before any tool runs	Main ICR check
`PermissionRequest`	When asking user	Add ICR context
`SubagentStop`	After subagent finishes	Review aggregate work

Commands

User-invocable slash commands:

Command	Purpose
`/icr:receipts`	View audit trail
`/icr:config`	Manage settings
`/icr:check`	Force manual check
`/icr:trust`	Toggle trust mode
`/icr:audit`	Pattern analysis
`/icr:export`	Export receipts
`/icr:stats`	View statistics
`/icr:debug`	Debug confidence
`/icr:simulate`	Dry-run check

Skills

AI-invocable capabilities:

Skill	Purpose
`intent-analysis`	Generate intent explanations

Core Engine Components

Severity Classifier

Classifies actions into risk levels:

┌────────────────────────────────────────────────┐
│              SEVERITY CLASSIFIER                │
│                                                 │
│   Input: Tool name + Arguments                  │
│                                                 │
│   ┌─────────────────────────────────────────┐  │
│   │ Layer 1: Static Rules                    │  │
│   │   Bash → HIGH                           │  │
│   │   Read → LOW                            │  │
│   └──────────────────┬──────────────────────┘  │
│                      ▼                          │
│   ┌─────────────────────────────────────────┐  │
│   │ Layer 2: Metadata Inference              │  │
│   │   "rm -rf" → CRITICAL                   │  │
│   │   "delete" → CRITICAL                   │  │
│   └──────────────────┬──────────────────────┘  │
│                      ▼                          │
│   ┌─────────────────────────────────────────┐  │
│   │ Layer 3: User Rules                      │  │
│   │   Custom patterns override              │  │
│   └──────────────────┬──────────────────────┘  │
│                      ▼                          │
│   Output: LOW | MEDIUM | HIGH | CRITICAL       │
│                                                 │
└────────────────────────────────────────────────┘

Confidence Calculator

Calculates a certainty score (0.0 to 1.0):

┌────────────────────────────────────────────────┐
│            CONFIDENCE CALCULATOR                │
│                                                 │
│   ┌─────────────┐  Weight: 30%                 │
│   │  Ambiguity  │  How many interpretations?   │
│   └──────┬──────┘                              │
│          │                                      │
│   ┌──────▼──────┐  Weight: 25%                 │
│   │  Distance   │  Semantic gap?               │
│   └──────┬──────┘                              │
│          │                                      │
│   ┌──────▼──────┐  Weight: 20%                 │
│   │ Historical  │  Seen before?                │
│   └──────┬──────┘                              │
│          │                                      │
│   ┌──────▼──────┐  Weight: 25%                 │
│   │ Uncertainty │  Hedging language?           │
│   └──────┬──────┘                              │
│          │                                      │
│          ▼                                      │
│   Weighted Average → 0.0 to 1.0                │
│                                                 │
└────────────────────────────────────────────────┘

Decision Tree

Routes to appropriate verification:

┌────────────────────────────────────────────────┐
│               DECISION TREE                     │
│                                                 │
│   Inputs:                                       │
│   - Severity (LOW/MEDIUM/HIGH/CRITICAL)        │
│   - Confidence (0.0 - 1.0)                     │
│   - Trust Mode (on/off)                        │
│   - Manual Check Flag                          │
│                                                 │
│   ┌─────────────────────────────────────────┐  │
│   │ Manual check requested?                  │  │
│   │   YES → HUMAN_REVIEW                    │  │
│   └──────────────────┬──────────────────────┘  │
│                      │ NO                       │
│   ┌──────────────────▼──────────────────────┐  │
│   │ Trust mode on AND not CRITICAL?          │  │
│   │   YES → AUTO_APPROVE_TRUST_MODE         │  │
│   └──────────────────┬──────────────────────┘  │
│                      │ NO                       │
│   ┌──────────────────▼──────────────────────┐  │
│   │ Confidence >= autoApprove threshold?     │  │
│   │   YES → AUTO_APPROVE                    │  │
│   └──────────────────┬──────────────────────┘  │
│                      │ NO                       │
│   ┌──────────────────▼──────────────────────┐  │
│   │ Confidence >= aiReview threshold?        │  │
│   │   YES → AI_REVIEW                       │  │
│   └──────────────────┬──────────────────────┘  │
│                      │ NO                       │
│                      ▼                          │
│   HUMAN_REVIEW                                  │
│                                                 │
└────────────────────────────────────────────────┘

Intent Generator

Creates structured intent documents:

┌────────────────────────────────────────────────┐
│             INTENT GENERATOR                    │
│                                                 │
│   Inputs:                                       │
│   - User prompt                                │
│   - Tool name + arguments                      │
│   - Confidence breakdown                       │
│                                                 │
│   Outputs:                                      │
│   {                                            │
│     "task": "...",                             │
│     "whoWhat": {                               │
│       "affected": [...],                       │
│       "excluded": [...]                        │
│     },                                         │
│     "boundaries": [...],                       │
│     "ifUncertain": "...",                      │
│     "reversibility": {...},                    │
│     "alternatives": [...],                     │
│     "confidenceBreakdown": {...}               │
│   }                                            │
│                                                 │
└────────────────────────────────────────────────┘
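
To make the output shape above concrete, here is a minimal sketch of a structural check on an intent document. The top-level keys and their JSON types are read off the diagram; the `missing_or_mistyped` helper and the inner fields of `reversibility` and `confidenceBreakdown` in the example are hypothetical, since the diagram elides them.

```python
# Top-level keys and JSON types, as shown in the Intent Generator diagram.
REQUIRED_KEYS = {
    "task": str,
    "whoWhat": dict,
    "boundaries": list,
    "ifUncertain": str,
    "reversibility": dict,
    "alternatives": list,
    "confidenceBreakdown": dict,
}


def missing_or_mistyped(doc):
    """Return the keys that are absent or hold the wrong JSON type."""
    problems = [k for k, t in REQUIRED_KEYS.items() if not isinstance(doc.get(k), t)]
    who_what = doc.get("whoWhat")
    if isinstance(who_what, dict):
        problems += [
            f"whoWhat.{k}"
            for k in ("affected", "excluded")
            if not isinstance(who_what.get(k), list)
        ]
    return problems


example = {
    "task": "Delete files older than 1 year in /tmp",
    "whoWhat": {"affected": ["/tmp"], "excluded": ["everything outside /tmp"]},
    "boundaries": ["Only files older than 1 year"],
    "ifUncertain": "Ask before deleting",
    "reversibility": {"reversible": False},      # inner fields are illustrative
    "alternatives": ["Delete files older than 30 days everywhere"],
    "confidenceBreakdown": {"ambiguity": 0.5},   # inner fields are illustrative
}
print(missing_or_mistyped(example))
print(missing_or_mistyped({"task": "x"}))
```

The repository's `schemas/` directory is described as holding JSON validation schemas, so a real check would more likely be driven by those files than by a hand-written table like this one.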

Data Flow

Complete Request Flow

User Request: "Clean up old files"
        │
        ▼
┌───────────────────────┐
│    CLAUDE CODE        │
│    Interprets as:     │
│    Bash: rm -rf ...   │
└───────────┬───────────┘
            │ PreToolUse event
            ▼
┌───────────────────────┐
│  pre-tool-check.sh    │
│                       │
│  1. Check exclusions  │──▶ If excluded: proceed
│  2. Load config       │
│  3. Classify severity │──▶ severity.sh → CRITICAL
│  4. Generate intent   │──▶ intent.sh → document
│  5. Calc confidence   │──▶ confidence.sh → 0.62
│  6. Make decision     │──▶ decision.sh → HUMAN_REVIEW
│  7. Create receipt    │──▶ receipt.sh → saved
│  8. Return decision   │
└───────────┬───────────┘
            │
            ▼
┌───────────────────────┐
│     USER PROMPT       │
│                       │
│  Task: Delete files   │
│  Severity: CRITICAL   │
│  Confidence: 0.62     │
│                       │
│  [1] Proceed          │
│  [2] Edit             │
│  [3] Abort            │
└───────────┬───────────┘
            │ User selects [1]
            ▼
┌───────────────────────┐
│   RECEIPT UPDATED     │
│   - response: PROCEED │
│   - outcome: proceeded│
└───────────┬───────────┘
            │
            ▼
┌───────────────────────┐
│   TOOL EXECUTES       │
│   rm -rf runs         │
└───────────────────────┘

Key Algorithms

Confidence Calculation

function calculate_confidence(user_prompt, tool, args, config):
    # Get individual scores
    ambiguity = analyze_ambiguity(user_prompt)
    distance = calculate_distance(user_prompt, tool, args)
    historical = analyze_historical(tool, args)
    uncertainty = analyze_uncertainty(user_prompt)

    # Get weights from config
    weights = config.confidence.weights

    # Weighted average
    confidence = (
        ambiguity * weights.ambiguityAnalysis +
        distance * weights.intentToActionDistance +
        historical * weights.historicalPatterns +
        uncertainty * weights.uncertaintyMarkers
    )

    return clamp(confidence, 0.0, 1.0)
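
The pseudocode above can be turned into a small runnable sketch. The weights come from the Confidence Calculator diagram (30/25/20/25); the mapping of each diagram label onto a config key is an assumption, and the factor scores are made-up inputs chosen so the result matches the 0.62 used in the data-flow example. The analyzers themselves are not reproduced here.

```python
def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


# Weights from the Confidence Calculator diagram; key names follow the pseudocode.
DEFAULT_WEIGHTS = {
    "ambiguityAnalysis": 0.30,
    "intentToActionDistance": 0.25,
    "historicalPatterns": 0.20,
    "uncertaintyMarkers": 0.25,
}


def combine_scores(scores, weights=DEFAULT_WEIGHTS):
    """Weighted sum of per-factor scores, clamped to [0.0, 1.0]."""
    total = sum(scores[name] * weight for name, weight in weights.items())
    return clamp(total, 0.0, 1.0)


# Illustrative factor scores; real values would come from the analyzers.
scores = {
    "ambiguityAnalysis": 0.5,
    "intentToActionDistance": 0.6,
    "historicalPatterns": 0.85,
    "uncertaintyMarkers": 0.6,
}
print(round(combine_scores(scores), 2))
```

Because the default weights sum to 1.0, the weighted sum is already a weighted average. The clamp only matters if a user configures weights that sum to more than 1.0.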

Decision Routing

function make_decision(severity, confidence, config, session_state):
    # Manual check overrides everything
    if session_state.manualCheckNext:
        return HUMAN_REVIEW

    # Trust mode check
    if session_state.trustModeEnabled and severity != CRITICAL:
        return AUTO_APPROVE_TRUST_MODE

    # Get thresholds for this severity
    thresholds = config.thresholds[severity]

    # Check against thresholds
    if confidence >= thresholds.autoApprove:
        return AUTO_APPROVE

    if confidence >= thresholds.aiReview:
        return AI_REVIEW

    return HUMAN_REVIEW
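
A runnable version of the routing logic might look like the sketch below. The control flow mirrors the pseudocode; the threshold values in `THRESHOLDS` are hypothetical numbers picked for illustration and are not ICR's shipped defaults.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Hypothetical per-severity thresholds, for illustration only.
THRESHOLDS = {
    Severity.LOW: {"autoApprove": 0.50, "aiReview": 0.30},
    Severity.MEDIUM: {"autoApprove": 0.70, "aiReview": 0.50},
    Severity.HIGH: {"autoApprove": 0.85, "aiReview": 0.70},
    Severity.CRITICAL: {"autoApprove": 0.95, "aiReview": 0.85},
}


def make_decision(severity, confidence, thresholds, trust_mode=False, manual_check=False):
    # A manual check request overrides everything else.
    if manual_check:
        return "HUMAN_REVIEW"
    # Trust mode skips verification, except for CRITICAL actions.
    if trust_mode and severity != Severity.CRITICAL:
        return "AUTO_APPROVE_TRUST_MODE"
    limits = thresholds[severity]
    if confidence >= limits["autoApprove"]:
        return "AUTO_APPROVE"
    if confidence >= limits["aiReview"]:
        return "AI_REVIEW"
    return "HUMAN_REVIEW"


print(make_decision(Severity.CRITICAL, 0.62, THRESHOLDS))
print(make_decision(Severity.CRITICAL, 0.62, THRESHOLDS, trust_mode=True))
print(make_decision(Severity.LOW, 0.62, THRESHOLDS))
```

With these example thresholds, the same confidence of 0.62 is auto-approved for a LOW action but sent to a human for a CRITICAL one, even with trust mode on.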

Severity Classification

function classify_severity(tool, args, config):
    # Layer 1: Static rules
    if tool in config.severity.staticRules:
        base_severity = config.severity.staticRules[tool]
    else:
        base_severity = MEDIUM

    # Layer 2: Metadata inference
    inferred = infer_from_metadata(tool, args)
    if inferred.severity > base_severity:
        base_severity = inferred.severity

    # Layer 3: User rules (highest priority)
    for rule in config.severity.userRules:
        if matches(rule.pattern, tool) and evaluates(rule.condition, args):
            return rule.severity

    return base_severity
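
The three layers can be sketched in runnable form as follows. The static rules and metadata patterns are the examples from the Severity Classifier diagram; treating `args` as a single string and representing a user rule as a `(tool_regex, args_regex, severity)` tuple are assumptions made for this example.

```python
import re
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Layer 1 and Layer 2 examples taken from the diagram.
STATIC_RULES = {"Bash": Severity.HIGH, "Read": Severity.LOW}
METADATA_PATTERNS = [
    (re.compile(r"rm\s+-rf"), Severity.CRITICAL),
    (re.compile(r"delete"), Severity.CRITICAL),
]


def classify_severity(tool, args, user_rules=()):
    # Layer 1: static per-tool rules, defaulting to MEDIUM for unknown tools.
    severity = STATIC_RULES.get(tool, Severity.MEDIUM)
    # Layer 2: metadata inference can only raise the level, never lower it.
    for pattern, level in METADATA_PATTERNS:
        if pattern.search(args):
            severity = max(severity, level)
    # Layer 3: the first matching user rule wins outright.
    for tool_regex, args_regex, level in user_rules:
        if re.fullmatch(tool_regex, tool) and re.search(args_regex, args):
            return level
    return severity


print(classify_severity("Bash", "rm -rf /tmp/old").name)
print(classify_severity("Bash", "ls -la", [(r"Bash", r"^ls\b", Severity.LOW)]).name)
```

Note the asymmetry: inference only escalates, while a user rule can also lower the level, which is what lets a user mark a known-safe command as LOW.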

File Organization

Source Files

icr/
├── .claude-plugin/
│   └── plugin.json         # Plugin manifest - START HERE
│
├── scripts/
│   ├── pre-tool-check.sh   # Main entry point
│   ├── permission-check.sh # Permission dialog hook
│   ├── subagent-check.sh   # Subagent completion hook
│   ├── statusline.sh       # Status line display
│   └── lib/
│       ├── common.sh       # Shared utilities
│       ├── severity.sh     # Severity classification
│       ├── confidence.sh   # Confidence calculation
│       ├── decision.sh     # Decision tree
│       ├── intent.sh       # Intent generation
│       ├── receipt.sh      # Receipt logging
│       └── checkpoint.sh   # Checkpoint management
│
├── commands/               # Slash commands (9 files)
├── skills/                 # AI skills (1 directory)
├── config/                 # Default configuration
├── schemas/                # JSON validation schemas
├── prompts/                # LLM prompt templates
└── docs/                   # Documentation

Dependency Graph

plugin.json
    │
    ├──▶ hooks/hooks.json
    │        │
    │        └──▶ scripts/pre-tool-check.sh
    │                    │
    │                    ├──▶ lib/common.sh
    │                    ├──▶ lib/severity.sh ───▶ lib/common.sh
    │                    ├──▶ lib/confidence.sh ─▶ lib/common.sh
    │                    ├──▶ lib/decision.sh ──▶ lib/common.sh
    │                    ├──▶ lib/intent.sh ────▶ lib/common.sh
    │                    ├──▶ lib/receipt.sh ───▶ lib/common.sh
    │                    └──▶ lib/checkpoint.sh ▶ lib/common.sh
    │
    ├──▶ commands/*.md
    │
    └──▶ skills/intent-analysis/SKILL.md

Runtime Data

Storage Location

All runtime data is stored in .claude/icr/:

.claude/icr/
├── config.json             # User configuration overrides
├── session-state.json      # Current session state
├── errors.log              # Error log
│
├── receipts/               # Audit trail
│   ├── index.json          # Fast lookup index
│   ├── 2026-01-01/
│   │   └── session-abc123/
│   │       ├── session-meta.json
│   │       ├── 001-receipt.json
│   │       └── 002-receipt.json
│   └── 2026-01-02/
│       └── ...
│
├── exports/                # Exported receipts
│   └── 2026-01-02T103000Z.json
│
└── checkpoints/            # Checkpoint metadata
    └── chk-xyz789.json
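
The receipt layout above implies a simple path scheme: one directory per day, then one per session, then numbered receipt files. A minimal sketch, assuming the three-digit zero-padded sequence number seen in the example file names (`receipt_path` is a hypothetical helper):

```python
from datetime import date
from pathlib import PurePosixPath


def receipt_path(day, session_id, sequence, root=".claude/icr"):
    """Build the path of one receipt file under the receipts/ tree."""
    return (
        PurePosixPath(root)
        / "receipts"
        / day.isoformat()              # e.g. 2026-01-01
        / session_id                   # e.g. session-abc123
        / f"{sequence:03d}-receipt.json"
    )


print(receipt_path(date(2026, 1, 1), "session-abc123", 2))
# .claude/icr/receipts/2026-01-01/session-abc123/002-receipt.json
```

Grouping by date first keeps retention cleanup cheap, since expiring old receipts only requires looking at directory names.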

Data Retention

Data Type	Default Retention	Configurable
Receipts	90 days	Yes
Exports	Permanent	N/A
Session state	Current session	N/A
Error logs	30 days	No
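
One way to apply the 90-day receipt retention is to compare the date-named directories against a cutoff. The sketch below only selects candidates and deletes nothing; the `expired_receipt_days` helper is hypothetical, and whether a directory exactly 90 days old is kept is an assumption.

```python
from datetime import date, timedelta


def expired_receipt_days(day_dirs, today, retention_days=90):
    """Return the date-named directories that fall outside the retention window."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in day_dirs:
        try:
            day = date.fromisoformat(name)
        except ValueError:
            continue  # skip index.json and anything else not named as a date
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


entries = ["2026-01-01", "2026-03-25", "index.json"]
print(expired_receipt_days(entries, today=date(2026, 4, 15)))
```

Here the cutoff is 2026-01-15, so only the 2026-01-01 directory is selected.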

Integration Points

Claude Code Integration

ICR integrates with Claude Code through:

Plugin System: Discovered via plugin.json
Hook Events: Receives PreToolUse, PermissionRequest, SubagentStop
Commands: Registered slash commands
Skills: AI-invocable capabilities
Status Line: Optional status display

Configuration Integration

ICR configuration is merged from multiple sources:

Priority (lowest to highest):
1. Built-in defaults (config/defaults.json)
2. User global config (~/.claude/icr/config.json)
3. Project config (.claude/icr/config.json)
4. Runtime overrides (/icr:config set)
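
The priority order above can be read as a left-to-right merge in which later sources win. The sketch below assumes a recursive (deep) merge, so that overriding one nested key leaves its siblings intact; whether ICR merges deeply or shallowly is not stated here, and the config keys shown are illustrative.

```python
from functools import reduce


def deep_merge(base, override):
    """Return a new dict where values in override win over values in base."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_config(*layers):
    """Merge config layers given in order from lowest to highest priority."""
    return reduce(deep_merge, layers, {})


defaults = {"thresholds": {"HIGH": {"autoApprove": 0.9, "aiReview": 0.7}}}
user_global = {}
project = {"thresholds": {"HIGH": {"autoApprove": 0.95}}}
runtime = {"trustMode": True}

config = resolve_config(defaults, user_global, project, runtime)
print(config)
# {'thresholds': {'HIGH': {'autoApprove': 0.95, 'aiReview': 0.7}}, 'trustMode': True}
```

The project layer raises `autoApprove` without having to restate `aiReview`, which is the main practical benefit of a deep merge over a shallow one.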

Checkpoint Integration

ICR integrates with Claude Code checkpoints:

Creates checkpoints before HIGH/CRITICAL actions
Links checkpoints to receipts
Suggests rollback when problems are detected

Design Decisions

Why Bash Scripts?

Consideration	Benefit
Claude Code hooks execute shell	Native integration
Cross-platform (macOS, Linux, WSL)	Maximum compatibility
No build step required	Faster iteration
Easy to modify and debug	Lower barrier

Why jq for JSON?

Consideration	Benefit
Standard JSON tool	Widely available
Powerful query language	Complex transformations
Single dependency	Minimal requirements

Why Three-Phase Design?

Phase	Purpose
Intent	Transparency - show what AI thinks
Check	Control - let user/AI verify
Receipt	Accountability - log everything

Why Configurable Thresholds?

Different users have different risk tolerances:

Security-focused: Higher thresholds (more checks)
Productivity-focused: Lower thresholds (fewer interruptions)
Mixed: Different thresholds per severity level

Why Trust Mode?

Constant verification causes fatigue. Trust mode provides:

An escape hatch for known-safe work
Full logging of every action, even when checks are skipped
No bypass of CRITICAL actions by default
