awslabs · fanhongy · Jun 16, 2026 · Jun 16, 2026 · Jun 18, 2026 · Jun 18, 2026
diff --git a/docs/memory.md b/docs/memory.md
@@ -2,6 +2,24 @@
 
 CAO's memory system gives agents persistent, cross-session storage. Agents store facts, decisions, and preferences during a session; CAO injects relevant memories back as context when the agent starts its next session.
 
+> For how it works under the hood — the three storage layers, the SQLite schema, and the read/write paths — see [`memory-internals.md`](memory-internals.md).
+
+## Capabilities
+
+- **Four scopes** (`global`, `project`, `session`, `agent`) and four type labels.
+- **Store / recall / forget** via MCP tools and `cao memory` CLI commands.
+- **Markdown wiki storage** with a SQLite metadata index.
+- **Search**: keyword (BM25), recency, or hybrid; results ranked by recency, a
+  composite 3-factor score (BM25 + recency + usage), or usage.
+- **Cross-references** between related memories, expandable on recall.
+- **Auto-injection** into each provider's config file on terminal creation, plus a
+  `<cao-memory>` block prepended to the agent's first message.
+- **LLM wiki compaction** (`cao memory compact`) that rewrites topic articles.
+- **Linting** (`cao memory lint`) for orphans, contradictions, stale claims, and more.
+- **Self-healing** (`cao memory heal`) that turns lint findings into fixes — dry-run
+  by default, `--apply` to mutate, with a full audit trail.
+- **Tiered retention / cleanup**, a daily audit log, and a memory Web UI.
+
 ## How It Works
 
 1. **Agent stores a memory** via `memory_store` MCP tool during a session
@@ -20,7 +38,10 @@ Scope controls where a memory is stored and who can read it back.
 | `session` | `memory/global/wiki/session/` | Ephemeral: notes for current session only |
 | `agent` | `memory/global/wiki/agent/` | Role-specific: patterns the agent role always applies |
 
-`project` is the default scope. The project hash is `sha256(realpath(cwd))[:12]`.
+`project` is the default scope. Project identity resolves via a precedence chain — a
+`CAO_PROJECT_ID` / `memory.project_id` override, then the normalized git remote URL, then
+`sha256(realpath(cwd))[:12]` as a fallback — so a project stays recallable across renames
+and moves.
 
 > **Note:** `session` and `agent` scopes are stored under the global container, not in their own top-level directories. Only `project` scope gets a dedicated directory keyed by project hash.
 
@@ -59,14 +80,19 @@ Search memories by keyword query and optional filters.
 
 ```
 memory_recall(
-  query="testing",     # optional, searches content
-  scope="project",     # optional, filter by scope
-  memory_type=None,    # optional, filter by type
-  limit=10             # optional, default 10, max 100
+  query="testing",         # optional, searches content
+  scope="project",         # optional, filter by scope
+  memory_type=None,        # optional, filter by type
+  limit=10,                # optional, default 10, max 100
+  search_mode="hybrid",    # optional: "hybrid" (default), "bm25", "metadata"
+  sort_by="recency",       # optional: "recency" (default), "score", "usage"
+  include_related=False    # optional, expand cross-referenced memories
 )
 ```
 
-Results are returned sorted by recency, with scope precedence: `session` > `project` > `global`.
+`sort_by` controls ranking: `recency` (newest first), `score` (composite 3-factor —
+BM25 relevance + recency + usage), or `usage` (most accessed). When no `scope` is
+given, results follow scope precedence: `session` > `project` > `global`.
 
 ### `memory_forget`
 
@@ -98,11 +124,39 @@ cao memory delete <key> --scope project --yes
 
 # Clear all memories for a scope
 cao memory clear --scope session --yes
+
+# Lint the wiki for orphans, contradictions, stale claims, etc.
+cao memory lint
+cao memory lint --scope project --format json
+
+# Compact wiki topics with the LLM compiler (repair sweep)
+cao memory compact --scope global
+cao memory compact --key testing-framework
+
+# Repair lint findings — dry-run by default, --apply to mutate
+cao memory heal --scope project              # dry-run plan
+cao memory heal --scope project --apply
+cao memory heal --scope project --apply --aggressive   # also heal poison_frequency
 ```
 
+`cao memory heal` consumes the findings from `cao memory lint` and applies one fix per
+issue type: it deletes orphan pages, resolves contradictions (keeping the newer article),
+strips stale claims, and — only under `--aggressive` — zeroes poisoned access counts. It
+is dry-run by default; pass `--apply` to mutate. Every applied mutation is written to the
+daily audit log.
+
 ## Context Injection
 
-When an agent receives its first message in a session, CAO prepends a `<cao-memory>` block containing relevant memories (up to 3000 characters). The block format:
+CAO injects relevant memories into a new session two ways:
+
+1. **First-message block** — when an agent receives its first message in a session,
+   CAO prepends a `<cao-memory>` block containing relevant memories.
+2. **Provider config file** — built-in plugins for Claude Code, Codex, and Kiro CLI
+   write the same block into each provider's per-project config file (e.g.
+   `.claude/CLAUDE.md`) on terminal creation, delimited by `cao-memory` markers so
+   repeated runs overwrite the same section.
+
+The block format:
 
 ```
 <cao-memory>
@@ -115,11 +169,16 @@ When an agent receives its first message in a session, CAO prepends a `<cao-memo
 <original user message>
 ```
 
-Memories are selected in scope precedence order: `session` > `project` > `global`.
+Memories are selected in scope precedence order: `session` > `project` > `global`. Each
+scope is independently capped — at most `MEMORY_MAX_PER_SCOPE` (10) entries and
+`MEMORY_SCOPE_BUDGET_CHARS` (1000) characters per scope — so one scope cannot monopolize
+the injection budget.
 
-## Auto-Save
+## Saving Memories
 
-In Phase 1 there is no automatic save hook. Agents must call `memory_store` explicitly via MCP when they want to persist a fact. Agent profiles include guidance on when to store. Hook-driven auto-save is shipped via per-provider plugins in a subsequent PR.
+Agents call `memory_store` explicitly via MCP when they want to persist a fact; agent
+profiles include guidance on when to store (see below). Hook- and plugin-driven injection
+surfaces stored memories back into later sessions automatically.
 
 ## Storage Layout
 

diff --git a/src/cli_agent_orchestrator/cli/commands/memory.py b/src/cli_agent_orchestrator/cli/commands/memory.py
@@ -359,3 +359,133 @@ def compact_cmd(scope, key):
     for topic_key, status in sorted(results.items()):
         click.echo(f"{status:<22} {topic_key}")
     click.echo(f"\nSummary: {summary}")
+
+
+@memory.command(name="heal")
+@click.option(
+    "--scope",
+    # Only global/project are resolvable from cwd context here; session/agent
+    # need a session_name/agent_profile that this CLI path cannot derive, so
+    # offering them would only ever raise "could not resolve scope_id".
+    type=click.Choice([MemoryScope.GLOBAL.value, MemoryScope.PROJECT.value], case_sensitive=False),
+    default="project",
+    show_default=True,
+    help="Scope to heal (global or project).",
+)
+@click.option(
+    "--apply",
+    "do_apply",
+    is_flag=True,
+    default=False,
+    help="Apply mutations. Without this flag, prints a dry-run plan only.",
+)
+@click.option(
+    "--aggressive",
+    is_flag=True,
+    default=False,
+    help="Enable destructive poison_frequency healing (requires --apply too).",
+)
+@click.option(
+    "--issue-type",
+    "issue_type",
+    type=click.Choice(
+        ["orphan_page", "contradiction", "stale_claim", "poison_frequency"],
+        case_sensitive=False,
+    ),
+    default=None,
+    help="Restrict healing to a single issue type.",
+)
+@click.option(
+    "--format",
+    "out_format",
+    type=click.Choice(["table", "json"], case_sensitive=False),
+    default="table",
+    show_default=True,
+    help="Output format.",
+)
+def heal_cmd(scope, do_apply, aggressive, issue_type, out_format):
+    """Repair wiki lint findings (orphan pages, contradictions, stale claims).
+
+    Dry-run by DEFAULT — prints what would change. Pass --apply to mutate.
+    poison_frequency healing additionally requires --aggressive.
+    graph_density is flag-only and never mutated.
+    """
+    import json as _json
+
+    from cli_agent_orchestrator.services import wiki_healer
+    from cli_agent_orchestrator.services.wiki_lint import run_lint
+
+    svc = _get_memory_service()
+    ctx = _cwd_context()
+
+    scope_id = None
+    if scope != MemoryScope.GLOBAL.value:
+        scope_id = svc.resolve_scope_id(scope, ctx)
+        if scope_id is None:
+            raise click.ClickException(f"could not resolve scope_id for scope '{scope}'")
+
+    project_hash = scope_id or "unknown"
+
+    try:
+        issues = _run_async(run_lint(project_hash, scope=scope))
+    except Exception as e:
+        raise click.ClickException(f"lint run failed: {e}")
+
+    if issue_type is not None:
+        issues = [i for i in issues if i.issue_type == issue_type]
+
+    try:
+        report = _run_async(
+            wiki_healer.heal(
+                issues,
+                scope=scope,
+                scope_id=scope_id,
+                apply=do_apply,
+                aggressive=aggressive,
+            )
+        )
+    except wiki_healer.HealConflictError as e:
+        raise click.ClickException(str(e))
+    except Exception as e:
+        raise click.ClickException(f"heal failed: {e}")
+
+    is_json = out_format.lower() == "json"
+    if is_json:
+        payload = {
+            "scope": report.scope,
+            "scope_id": report.scope_id,
+            "apply": report.apply,
+            "aggressive": report.aggressive,
+            "dry_run_summary": report.dry_run_summary,
+            "truncated_by_type": report.truncated_by_type,
+            "truncated_run_level": report.truncated_run_level,
+            "total_suppressed": report.total_suppressed,
+            "actions": [
+                {
+                    "issue_type": a.issue_type,
+                    "key": a.key,
+                    "related_key": a.related_key,
+                    "description": a.description,
+                    "status": a.status,
+                }
+                for a in report.actions
+            ],
+        }
+        click.echo(_json.dumps(payload, indent=2))
+        return
+
+    if report.dry_run_summary:
+        click.echo(report.dry_run_summary)
+    if not report.actions:
+        click.echo("Nothing to heal.")
+        return
+    header = f"{'STATUS':<10} {'TYPE':<22} {'KEY':<30} DESCRIPTION"
+    click.echo(header)
+    click.echo("-" * len(header))
+    for a in report.actions:
+        click.echo(f"{a.status:<10} {a.issue_type:<22} {a.key:<30} {a.description}")
+    if report.total_suppressed:
+        click.echo(
+            f"\n{report.total_suppressed} action(s) suppressed by caps "
+            f"(by type: {report.truncated_by_type}, run-level: {report.truncated_run_level})."
+        )
diff --git a/src/cli_agent_orchestrator/services/audit_log.py b/src/cli_agent_orchestrator/services/audit_log.py
@@ -59,6 +59,13 @@
 SYNC_AUDIT_EVENTS: frozenset = frozenset(
     {
         "lint_run_completed",
+        # Phase 4 U1 — wiki self-healing. Each applied mutation emits an
+        # AWAITED audit event; ``heal_run_completed`` summarises the run.
+        "orphan_pruned",
+        "contradiction_resolved",
+        "stale_claim_pruned",
+        "poison_access_zeroed",
+        "heal_run_completed",
     }
 )
 NOWAIT_AUDIT_EVENTS: frozenset = frozenset(