diff --git a/.gitignore b/.gitignore
index 93c0dad..44f3297 100644
--- a/.gitignore
+++ b/.gitignore
@@ -41,3 +41,12 @@ test-offload-sessions.sh
 # npm pack / release tarballs (never commit packaged outputs)
 *.tgz
 *.tar.gz
+
+# Local development notes (contributor-only, not shipped)
+docs/superpowers/
+
+# Per-developer Claude Code project instructions (contributor-only)
+CLAUDE.md
+
+# Plugin build output
+claude-code-plugin/dist/
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 8f2d346..38c2f80 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -4,6 +4,48 @@
 
 ---
 
+## [Unreleased]
+
+### 📦 New features
+
+- **Claude Code + Codex CLI plugin** (`claude-code-plugin/`): enable it in one step via Claude Code `/plugin install tdai-memory` or the Codex CLI marketplace, without modifying the user's `~/.claude/settings.json` or `~/.codex/config.toml`. Provides 3 hooks (`SessionStart` async warm-up, `UserPromptSubmit` synchronous recall injected via `additionalContext`, `Stop` async capture), 3 slash skills (`/memory-search`, `/memory-status`, `/memory-clear-session`), and one overview skill, `tdai-memory`. The daemon is bound to the parent process lifecycle through the `gateway-entry.ts` wrapper. The plugin ships dual manifests (`.claude-plugin/plugin.json` and `.codex-plugin/plugin.json`) that share the same `hooks/hooks.json` and `skills/`. Claude Code (v2026.4+) is the current first-class host and works end to end; Codex CLI (v0.130+) is aligned at the schema layer (hook event names, handler config fields, the `${CLAUDE_PLUGIN_ROOT}` environment variable) but is currently partially blocked. The three layered blockers are detailed in `claude-code-plugin/README.md` (discovery layer: [openai/codex#22078](https://github.com/openai/codex/issues/22078); `async` behavior layer: not implemented by Codex; transcript parsing layer: only the cc format is supported).
+
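+### 🔧 Compatibility / security hardening
+
+- **Optional Bearer token auth for the Gateway**: when the `TDAI_GATEWAY_TOKEN` environment variable is set, the Gateway requires every non-OPTIONS request to carry `Authorization: Bearer <token>`. When it is unset, behavior is unchanged and fully backward compatible with Hermes. The Claude Code plugin generates a random 256-bit token on every daemon spawn and writes it to a file with mode 0600. Bearer string comparison is upgraded to `crypto.timingSafeEqual`; the scheme keyword is matched case-insensitively per RFC 6750 §2.1 (`Bearer`/`bearer`/`BEARER` are all accepted); 401 responses carry `WWW-Authenticate: Bearer realm="tdai-gateway"`.
+- **The token is passed to the daemon child process by file path (`TDAI_TOKEN_PATH`)** and is no longer injected into the `TDAI_GATEWAY_TOKEN` environment variable. The env-var form is written into the child's initial environment block by execve(), exposing the token via `/proc/<pid>/environ` and `ps -E`; with file passing, the only remaining surface is the 0o600 token file, and the daemon also verifies the file's owner uid when loading it.
+- **Daemon host binding hardening**: on startup, cli.ts refuses a non-loopback `TDAI_GATEWAY_HOST` unless the switch is explicitly turned on with `TDAI_GATEWAY_ALLOW_REMOTE=1`, preventing the memory port from being accidentally exposed to the LAN or the public internet.
+- **New `tdai-memory-gateway` bin** (`./dist/src/gateway/cli.mjs`): a standalone executable Gateway entry point that supports graceful shutdown on `SIGTERM/SIGINT` and optional parent-process PID liveness probing (`TDAI_CC_PID` environment variable, 15s polling interval). The Claude Code / Codex CLI plugin invokes it via `npx tdai-memory-gateway`, so npm dependencies do not need to be bundled into the plugin.
+- **Daemon process management rewritten**: a `spawn.lock` mutex based on `O_CREAT|O_EXCL` ensures that among concurrently triggered SessionStart / UserPromptSubmit / Stop hooks only one actually spawns and the rest reuse the result, fixing the double-daemon and port/token mismatch problems at the root; `state.json` is now written atomically via tmp + rename; before reusing an old daemon, `ensureRunning` verifies that `state.ccPid` matches the current cc, avoiding reuse of a stale daemon across users or sessions; spawn explicitly sets `cwd` and injects `TDAI_DATA_DIR`, so the data directory is not affected by drift in the hook process's cwd; the token file permission check skips the `0o077` bit test on Windows (Node `fs` returns fixed mode bits on Windows, which would cause false positives) and relies on the NTFS ACL instead.
+- **`$ARGUMENTS` command injection surface narrowed**: cc currently performs a literal `replaceAll` of `$ARGUMENTS` inside ``!`...` `` blocks in SKILL.md, so user input such as `foo"; curl evil; "` can be injected into the shell (see anthropics/claude-code#16163). `memory-search/SKILL.md` was rewritten to drop the ``!`...` `` bash block and instead guide Claude to feed the query to the stdin of `hook.mjs search-stdin` through the Bash tool using a heredoc, so user input no longer goes through shell lexical parsing.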
+
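+### 🐛 Fixes
+
+- **Stop hook repeatedly rewriting L0**: previously every Stop sent the latest 10 turns in full to `/capture`, while on the Gateway side both the `originalUserMessageCount` positional slice and the `afterTimestamp` cursor were missing (`CaptureRequest` does not carry these two fields), so the first N turns of a long session were rewritten into L0 on every Stop, polluting the FTS5 and vector indexes. Stop now sends only the increment, based on a `lastSentIndex` persisted at `$CLAUDE_PLUGIN_DATA/cursors/<sessionId>.json`; the first send is capped at 50 turns, and the cursor file is written atomically via tmp + rename.
+- **CJK recall degradation**: the underlying 2-gram stop-word list previously included ordinary two-character content words such as `我们/你们/他们/这个/那个/可以/有没/没有/就是/不是`; "我们的部署方案" was split into `[们的, 的部, 部署, 署方, 方案]`, losing the "我们" anchor token and hurting recall for Chinese queries. The stop-word list has been trimmed to genuinely low-information interrogative and connective fragments.
+- **Transcript wait logic**: the Stop hook's wait for cc to flush the transcript to disk changed from a hard sleep(800ms) to `waitForTranscriptStable(2s)`: it polls `stat().size` every 100ms and treats two consecutive identical byte counts as flush complete, which is more robust on slow disks.
+- **Memory pressure in direct L0 jsonl queries**: `searchL0JsonlDirect` changed from loading whole files with `readFile` to streaming scans with `readline + createReadStream`, avoiding OOM on long-session jsonl files; file traversal changed from string sort + reverse (which relied on `YYYY-MM-DD.jsonl` naming) to descending mtime, which also works with cc's UUID naming.
+- **GatewayClient silent failures made observable**: every catch block now appends the failure to `logPath`; handleStatus prints the `hook.log` / `daemon.log` paths in `/memory-status`; the daemon spawn's stdio stderr is redirected to `daemon.log` instead of being silently discarded.
+- **Codex CLI plugin hooks registration completed**: `.codex-plugin/plugin.json` previously declared only `"skills": "./skills/"` and lacked `"hooks": "./hooks/hooks.json"`. Unlike Claude Code, Codex CLI does not resolve plugin-local hooks from a conventional path but requires reading them from the manifest's `hooks` field (see `codex-rs/core-plugins/src/manifest.rs::RawPluginManifest`). With the field added, the schema layer is fully aligned: the `SessionStart`/`UserPromptSubmit`/`Stop` event names, the `command`/`timeout`/`statusMessage` handler fields, and the `${CLAUDE_PLUGIN_ROOT}` environment variable all resolve on the Codex side (discovery.rs injects `CLAUDE_PLUGIN_ROOT` as a backcompat alias alongside the new name `PLUGIN_ROOT`). **Note that schema-layer compatibility ≠ runtime behavior parity**: Codex parses the `async` field but actually hardcodes sync execution (`HookExecutionMode::Sync`; neither `core/src/hook_runtime.rs` nor `hooks/src/engine/` has code that consumes the `r#async` field), unlike cc's truly asynchronous behavior; see the Codex status notes in the README.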
+
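+### ✅ Tests
+
+- `auth.test.ts`: expanded from 5 cases to 14, covering the auth matrix across all POST business endpoints, Bearer scheme case variants, mangled Authorization headers, and the `WWW-Authenticate` response.
+- `hook.test.ts`: adds 3 cases (cursor increments, skipping captureTurn when there are no new turns, and the `MAX_CAPTURE_TURNS=50` boundary), and stubs `CLAUDE_PLUGIN_DATA` to an mkdtemp directory for the whole stop describe block to isolate cursor state.
+- `daemon.test.ts`: adds a regression test that `ensureRunning` rejects an old state whose ccPid does not match.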
+
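+### 📚 Docs
+
+- `claude-code-plugin/README.md` and `README_CN.md`: complete coverage of installation, configuration, data layout, troubleshooting, and the security model, with new notes on `TDAI_TOKEN_PATH` / `TDAI_GATEWAY_ALLOW_REMOTE` / `TDAI_GATEWAY_CORS_ORIGIN` / Windows compatibility.
+- `claude-code-plugin/README.md` and `README_CN.md`: rewrote the "Codex CLI status: partially blocked" subsection under the Codex CLI install section, disclosing the three layered blockers that stand before parity with cc:
+  1. **Discovery layer (upstream blocker)**: `source_type = "local"` installs are affected by upstream [openai/codex#22078](https://github.com/openai/codex/issues/22078); the manifest parses correctly, but `skills/` and `hooks/hooks.json` are silently dropped at runtime, so hooks never fire;
+  2. **`async` behavior layer (not implemented by Codex)**: Codex parses the `async` field, but `HookRunSummary` hardcodes `HookExecutionMode::Sync`, so the cc-side `async: true + timeout: 30` on `SessionStart`/`Stop` will become a synchronous block of up to 30s once Codex fixes #22078; the plan is a separate `hooks/codex-hooks.json` with differentiated timeouts (TODO);
+  3. **Transcript parsing layer (not yet adapted on the plugin side)**: the Codex rollout jsonl schema `{timestamp, type, payload}` is completely different from the cc transcript `{type, message, sessionId, parentUuid, …}`; `lib/transcript.ts` currently parses only the cc format, so even if Stop fires on Codex it silently produces an empty capture; the Codex parser is follow-up work, to be implemented against real Codex sessions once #22078 is fixed.
+  
+  What actually works on Codex today: manifest parsing, the plugin being visible and toggleable in `/plugin`, and the host-agnostic daemon spawn in `lib/daemon.ts`, which shares the same code as cc. The overly optimistic "dual-host parity" wording at the top of the README and CHANGELOG has been toned down accordingly.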
+
+---
+
 ## [0.3.4] - 2026-05-12
 
 ### 🐛 修复
diff --git a/claude-code-plugin/.claude-plugin/plugin.json b/claude-code-plugin/.claude-plugin/plugin.json
new file mode 100644
index 0000000..adb5a4d
--- /dev/null
+++ b/claude-code-plugin/.claude-plugin/plugin.json
@@ -0,0 +1,11 @@
+{
+  "name": "tdai-memory",
+  "version": "0.1.0",
+  "description": "Long-term + symbolic short-term memory for Claude Code, powered by TencentDB Agent Memory.",
+  "homepage": "https://github.com/Tencent/TencentDB-Agent-Memory",
+  "license": "MIT",
+  "author": {
+    "name": "李冠辰",
+    "email": "liguanchen@xiaomi.com"
+  }
+}
diff --git a/claude-code-plugin/.codex-plugin/plugin.json b/claude-code-plugin/.codex-plugin/plugin.json
new file mode 100644
index 0000000..6527bc6
--- /dev/null
+++ b/claude-code-plugin/.codex-plugin/plugin.json
@@ -0,0 +1,41 @@
+{
+  "name": "tdai-memory",
+  "version": "0.1.0",
+  "description": "Long-term + symbolic short-term memory for AI coding agents, powered by TencentDB Agent Memory.",
+  "author": {
+    "name": "李冠辰",
+    "email": "liguanchen@xiaomi.com"
+  },
+  "homepage": "https://github.com/Tencent/TencentDB-Agent-Memory",
+  "repository": "https://github.com/Tencent/TencentDB-Agent-Memory",
+  "license": "MIT",
+  "keywords": [
+    "memory",
+    "long-term-memory",
+    "short-term-memory",
+    "ai-memory",
+    "vector-search",
+    "sqlite-vec",
+    "persona",
+    "scene-extraction"
+  ],
+  "skills": "./skills/",
+  "hooks": "./hooks/hooks.json",
+  "interface": {
+    "displayName": "TDAI Memory",
+    "shortDescription": "Long-term + short-term memory for AI coding agents",
+    "longDescription": "Adds long-term memory and symbolic short-term memory to Codex CLI: automatic recall before every prompt (relevant past memories injected via additionalContext), automatic capture after every turn (L0 conversation written, L1/L2/L3 atoms/scenarios/persona extracted in the background), and manual control via skills (memory-search, memory-status, memory-clear-session). Memory is partitioned per project (hash of cwd) by default. The daemon runs locally on 127.0.0.1 with a Bearer-token-protected HTTP API (file mode 0600).",
+    "developerName": "TencentDB Agent Memory contributors",
+    "category": "Productivity",
+    "capabilities": [
+      "Read",
+      "Write"
+    ],
+    "brandColor": "#3B82F6",
+    "defaultPrompt": [
+      "Do you remember what we discussed about this project?",
+      "Search my memory for the migration plan we made last week",
+      "What were my preferences for the API design?"
+    ]
+  }
+}
diff --git a/claude-code-plugin/README.md b/claude-code-plugin/README.md
new file mode 100644
index 0000000..6af134e
--- /dev/null
+++ b/claude-code-plugin/README.md
@@ -0,0 +1,125 @@
+# TencentDB Agent Memory — Coding Agent Plugin
+
+Long-term + symbolic short-term memory for [Claude Code](https://claude.com/claude-code) and [OpenAI Codex CLI](https://developers.openai.com/codex/cli), powered by [TencentDB Agent Memory](https://github.com/Tencent/TencentDB-Agent-Memory).
+
+The plugin ships dual manifests (`.claude-plugin/plugin.json` and `.codex-plugin/plugin.json`) and reuses the same `hooks/hooks.json` and `skills/`. Claude Code (v2026.4+) and Codex CLI (v0.130+) share the hook protocol at the schema layer (event names, handler config fields, `${CLAUDE_PLUGIN_ROOT}` env var). **Claude Code is the first-class target today; Codex CLI is partially blocked** — see the [Codex CLI](#codex-cli) install section below for current status.
+
+[中文版](./README_CN.md)
+
+## What this gives you
+
+- **Automatic recall** before every prompt — relevant past memories injected into context
+- **Automatic capture** after every turn — L0 conversation written, L1/L2/L3 extracted in the background
+- **Manual control** via slash skills: `/memory-search`, `/memory-status`, `/memory-clear-session`
+- **Project-level isolation** by default (sessionKey = hash of cwd) — your `react-app` memories don't leak into your `golang-svc` work
+- **Bearer-secured local daemon** (no unauthenticated localhost API)
+
+## Installation
+
+### Prerequisite
+
+Install the gateway runtime (the `tdai-memory-gateway` bin) globally — the plugin spawns the daemon via `npx tdai-memory-gateway`:
+
+```bash
+npm install -g @tencentdb-agent-memory/memory-tencentdb
+```
+
+This npm package contains the actual `TdaiGateway` (SQLite + sqlite-vec + LLM pipeline). The plugin itself is a thin shim that owns hooks, skills, and the per-session sessionKey — it does NOT bundle the heavy deps.
+
+### Claude Code
+
+```bash
+/plugin install tdai-memory
+```
+
+### Codex CLI
+
+```bash
+codex plugin marketplace add <marketplace-url>
+# then enable in the TUI: /plugin → toggle tdai-memory
+```
+
+(Once published to the Codex marketplace, this becomes a one-liner.)
+
+> **Codex CLI status (≤ v0.130): partially blocked.** Three layered blockers separate the current Codex experience from Claude Code parity:
+>
+> 1. **Plugin discovery (upstream blocker).** `source_type = "local"` marketplace installs are affected by Codex issue [openai/codex#22078](https://github.com/openai/codex/issues/22078): the manifest parses, the plugin appears in `/plugin` and is toggleable, but the declared `skills/` and `hooks/hooks.json` are silently dropped at runtime. Hooks never fire on Codex today.
+>
+> 2. **`async` hook field is parsed but not honored.** Codex deserializes the `async` field on hook commands (`codex-rs/config/src/hook_config.rs::HookHandlerConfig::Command`), but no code in `core/src/hook_runtime.rs` or `hooks/src/engine/` consumes it — `HookRunSummary` is hardcoded to `HookExecutionMode::Sync`. Our `SessionStart` and `Stop` hooks declare `async: true, timeout: 30`. Once #22078 ships, this means a Codex session start will block synchronously on first-run daemon spawn, and every Stop will block on capture. Planned mitigation: a separate `hooks/codex-hooks.json` referenced from `.codex-plugin/plugin.json` with shorter timeouts.
+>
+> 3. **`lib/transcript.ts` only parses the Claude Code transcript format.** Codex records sessions to `~/.codex/sessions/<yyyy>/<mm>/<dd>/rollout-*.jsonl` with shape `{timestamp, type: "session_meta" | …, payload: {…}}`, completely different from Claude Code's `{type, message, sessionId, parentUuid, …}`. Even if Stop fired on Codex, capture would silently produce empty turns. A Codex JSONL parser is planned once #22078 lets us validate end-to-end against a live Codex session.
+>
+> **What works today on Codex:** `.codex-plugin/plugin.json` is parsed correctly, the plugin is visible and toggleable in `/plugin`, and the shared daemon spawn / discovery logic in `lib/daemon.ts` is host-agnostic (same code path as Claude Code). Use Claude Code for actual memory functionality; track #22078 for the upstream fix.
+
+---
+
+No `~/.claude/settings.json` or `~/.codex/config.toml` mutation. The first time a session starts after installation, the plugin spawns the local daemon (via `npx tdai-memory-gateway`) on port 8421–8430 with a randomly generated Bearer token. State persists under `${CLAUDE_PLUGIN_DATA}`.
+
+## Configuration
+
+The plugin reads these optional environment variables:
+
+| Variable | Default | Purpose |
+|---|---|---|
+| `TDAI_SESSION_KEY` | `hash(cwd)` | Override the per-project memory partition |
+| `TDAI_TOKEN_PATH` | auto-generated 0o600 file | Path to a file the daemon reads the Bearer token from (preferred over `TDAI_GATEWAY_TOKEN`; the env-var form puts the token into `/proc/<pid>/environ` and `ps -E`) |
+| `TDAI_GATEWAY_TOKEN` | unset | Bearer token via env (fallback for the Hermes sidecar mode) |
+| `TDAI_GATEWAY_HOST` | `127.0.0.1` | Daemon bind host. Non-loopback values are refused unless `TDAI_GATEWAY_ALLOW_REMOTE=1` is set, to avoid exposing the memory port to the LAN. |
+| `TDAI_GATEWAY_ALLOW_REMOTE` | unset | Opt-in switch required to bind a non-loopback `TDAI_GATEWAY_HOST` |
+| `TDAI_GATEWAY_CORS_ORIGIN` | unset | When set, enables CORS with the given Origin; the default disables CORS so cross-origin pages cannot probe the daemon's port. |
+| `TDAI_GATEWAY_COMMAND` | `npx` | Override daemon spawn command (advanced; e.g. `node /path/to/cli.mjs` for development) |
+
+Most users never need to set any of these. `TDAI_SESSION_KEY=shared-with-other-project` is the most common power-user override.
+
+## Data location
+
+- `${CLAUDE_PLUGIN_DATA}/state.json` — daemon PID + port (tmp+rename atomic)
+- `${CLAUDE_PLUGIN_DATA}/token` — Bearer token (chmod 600, owner-uid checked)
+- `${CLAUDE_PLUGIN_DATA}/spawn.lock` — O_CREAT|O_EXCL daemon-spawn mutex (stale after 60s)
+- `${CLAUDE_PLUGIN_DATA}/cursors/<sessionId>.json` — per-cc-session `lastSentIndex` so Stop only POSTs new turns
+- `${CLAUDE_PLUGIN_DATA}/memory-tdai/` — SQLite + sqlite-vec database, scene blocks, persona snapshots
+- `${CLAUDE_PLUGIN_DATA}/hook.log` — hook diagnostic log (gateway-client failures, etc.)
+- `${CLAUDE_PLUGIN_DATA}/daemon.log` — daemon stderr/stdout (cold-start crashes, etc.)
+
+## How it works
+
+```
+User prompt → UserPromptSubmit hook → POST /recall → cc injects context
+cc replies   → Stop hook            → POST /capture → L0 + L1/L2/L3 pipeline
+Session end  → daemon detects parent cc exit → graceful shutdown
+```
+
+All hook handlers fail silently (writing to `hook.log`) — memory is never on the critical path of your conversation.
+
+## Troubleshooting
+
+**`/memory-status` says "unreachable"**:
+- Check `${CLAUDE_PLUGIN_DATA}/hook.log` (gateway-client request failures) and `${CLAUDE_PLUGIN_DATA}/daemon.log` (daemon cold-start crashes)
+- Restart your cc session — the SessionStart hook re-probes and re-spawns the daemon
+
+**Multiple cc terminals on the same project**:
+- All terminals share one daemon. The first to launch spawns it; subsequent terminals discover and reuse it via `state.json`.
+
+**Memory doesn't recall what I expect**:
+- Run `/memory-search <topic>` directly to see what's stored
+- Note that L1/L2/L3 extraction runs asynchronously — fresh conversations may need a few minutes before they appear in recall
+
+## Security model
+
+- The daemon listens only on `127.0.0.1` by default. Non-loopback `TDAI_GATEWAY_HOST` is refused unless `TDAI_GATEWAY_ALLOW_REMOTE=1` is also set.
+- Every non-OPTIONS request requires `Authorization: Bearer <token>`. Comparison is timing-safe; the scheme keyword is matched case-insensitively per RFC 6750 §2.1; 401 responses include `WWW-Authenticate: Bearer realm="tdai-gateway"`.
+- The token is generated freshly at each daemon spawn, written to `${CLAUDE_PLUGIN_DATA}/token` (chmod 600), and passed to the daemon child process **by file path** (`TDAI_TOKEN_PATH`) rather than as an env var, so the token does not surface via `/proc/<pid>/environ` or `ps -E`. Token-file owner is checked against the current uid on read.
+- The `memory-search` skill passes the user query to the daemon over **stdin** via a heredoc, never as a shell argv element — this avoids the literal-`replaceAll` `$ARGUMENTS` injection surface in cc (anthropics/claude-code#16163).
+- On Windows the 0o077 mode check is skipped (Node's `fs` returns fixed mode bits there); the OS-provided NTFS ACL on the token file is relied on instead.
+
+## Building from source
+
+```bash
+pnpm install
+pnpm build:cc-plugin
+pnpm test:cc-plugin
+```
+
+## License
+
+MIT — see [LICENSE](../LICENSE).
diff --git a/claude-code-plugin/README_CN.md b/claude-code-plugin/README_CN.md
new file mode 100644
index 0000000..5d5c761
--- /dev/null
+++ b/claude-code-plugin/README_CN.md
@@ -0,0 +1,125 @@
+# TencentDB Agent Memory — Coding Agent 插件
+
+为 [Claude Code](https://claude.com/claude-code) 与 [OpenAI Codex CLI](https://developers.openai.com/codex/cli) 提供长期记忆 + 符号化短期记忆，由 [TencentDB Agent Memory](https://github.com/Tencent/TencentDB-Agent-Memory) 驱动。
+
+插件携带双 manifest（`.claude-plugin/plugin.json` 与 `.codex-plugin/plugin.json`），共享同一份 `hooks/hooks.json` 与 `skills/`。Claude Code（v2026.4+）与 Codex CLI（v0.130+）在 **schema 层**对齐 hook 协议（事件名、handler 配置字段、`${CLAUDE_PLUGIN_ROOT}` 环境变量）。**Claude Code 是当前的一等宿主，Codex CLI 部分阻塞** —— 当前状态详见下方 [Codex CLI](#codex-cli) 安装段。
+
+[English version](./README.md)
+
+## 能给你什么
+
+- **自动召回**：每次提问前，相关过往记忆自动注入到上下文
+- **自动捕获**：每轮对话结束后，L0 落盘、L1/L2/L3 后台抽取
+- **手动控制**：通过 slash 技能 `/memory-search`、`/memory-status`、`/memory-clear-session`
+- **项目级隔离**：默认按 cwd hash 分区，`react-app` 的记忆不会泄漏到 `golang-svc`
+- **Bearer Token 鉴权**：本地 daemon 不裸奔，所有请求需带 token
+
+## 安装
+
+### 前置条件
+
+先全局安装 gateway 运行时（提供 `tdai-memory-gateway` 命令）—— 插件通过 `npx tdai-memory-gateway` 启动 daemon：
+
+```bash
+npm install -g @tencentdb-agent-memory/memory-tencentdb
+```
+
+该 npm 包含真正的 `TdaiGateway`（SQLite + sqlite-vec + LLM pipeline）。插件本身只是一层薄壳，提供 hook、skill 和 sessionKey 等绑定逻辑，不携带任何重型依赖。
+
+### Claude Code
+
+```bash
+/plugin install tdai-memory
+```
+
+### Codex CLI
+
+```bash
+codex plugin marketplace add <marketplace-url>
+# 在 TUI 中启用：/plugin → 切换 tdai-memory
+```
+
+（一旦发布到 Codex marketplace，将变为一条命令安装。）
+
+> **Codex CLI 当前状态（≤ v0.130）：部分阻塞。** 距离与 Claude Code 完全对等还有三层阻塞：
+>
+> 1. **Plugin discovery（上游阻塞）。** `source_type = "local"` marketplace 安装受 Codex issue [openai/codex#22078](https://github.com/openai/codex/issues/22078) 影响：manifest 能被解析、`/plugin` 中可见并可切换，但声明的 `skills/` 与 `hooks/hooks.json` 在运行时被静默丢弃。今天 Codex 上 hook 根本不会触发。
+>
+> 2. **`async` hook 字段被解析但未实际生效。** Codex 在 `codex-rs/config/src/hook_config.rs::HookHandlerConfig::Command` 中反序列化 `async` 字段，但 `core/src/hook_runtime.rs` 与 `hooks/src/engine/` 没有消费它的代码——`HookRunSummary` 硬编码为 `HookExecutionMode::Sync`。我们的 `SessionStart` 与 `Stop` hook 标了 `async: true, timeout: 30`。等 #22078 修复后，这意味着：Codex session 首次启动会同步阻塞等 daemon spawn，每次 Stop 会同步阻塞等 capture。计划绕过办法：单独拷一份 `hooks/codex-hooks.json` 给 `.codex-plugin/plugin.json` 引用，配较短 timeout。
+>
+> 3. **`lib/transcript.ts` 只解析 Claude Code transcript 格式。** Codex 把 session 录到 `~/.codex/sessions/<yyyy>/<mm>/<dd>/rollout-*.jsonl`，形态是 `{timestamp, type: "session_meta" | …, payload: {…}}`，跟 cc 的 `{type, message, sessionId, parentUuid, …}` 完全是两套 schema。即使 Stop 在 Codex 上触发了，capture 也只会静默生成空 turn。Codex JSONL parser 是后续工作，等 #22078 修复让我们能基于真实 Codex session 做端到端验证后再实现。
+>
+> **当前真正能用的部分：** `.codex-plugin/plugin.json` 解析正常、插件在 `/plugin` 中可见可切换、`lib/daemon.ts` 的 daemon spawn / discovery 逻辑是宿主无关的（与 cc 共用同一段代码）。当前阶段记忆功能请用 Claude Code，Codex 那边追 #22078 上游修复。
+
+---
+
+不需要改 `~/.claude/settings.json` 或 `~/.codex/config.toml`。第一次启动 session 时，插件通过 `npx tdai-memory-gateway` 在 8421–8430 端口拉起 daemon，并生成随机 Bearer token。状态保存在 `${CLAUDE_PLUGIN_DATA}`。
+
+## 配置
+
+插件读取以下可选环境变量：
+
+| 变量 | 默认值 | 作用 |
+|---|---|---|
+| `TDAI_SESSION_KEY` | `hash(cwd)` | 覆盖项目级记忆分区 |
+| `TDAI_TOKEN_PATH` | 自动生成的 0o600 文件 | daemon 从该文件读取 Bearer token（优于 `TDAI_GATEWAY_TOKEN`，后者会把 token 写进 `/proc/<pid>/environ` 与 `ps -E` 可见的环境块） |
+| `TDAI_GATEWAY_TOKEN` | 未设置 | 通过环境变量传 Bearer token（Hermes sidecar 模式的兼容方式） |
+| `TDAI_GATEWAY_HOST` | `127.0.0.1` | daemon 绑定地址。非 loopback 值需同时设置 `TDAI_GATEWAY_ALLOW_REMOTE=1`，否则启动被拒，防止误把记忆端口曝露到 LAN。 |
+| `TDAI_GATEWAY_ALLOW_REMOTE` | 未设置 | 显式开关，允许 daemon 绑定非 loopback host |
+| `TDAI_GATEWAY_CORS_ORIGIN` | 未设置 | 设置时按给定 Origin 启用 CORS；默认不启用，避免跨源页面探测 daemon 端口。 |
+| `TDAI_GATEWAY_COMMAND` | `npx` | 覆盖 daemon 启动命令（高级用法；如 `node /path/to/cli.mjs` 用于本地开发） |
+
+大多数用户都不需要设置任何变量。`TDAI_SESSION_KEY=shared-with-other-project` 是最常用的高级用法。
+
+## 数据位置
+
+- `${CLAUDE_PLUGIN_DATA}/state.json` — daemon PID + 端口（tmp+rename 原子写）
+- `${CLAUDE_PLUGIN_DATA}/token` — Bearer token（chmod 600，读取时校验 owner uid）
+- `${CLAUDE_PLUGIN_DATA}/spawn.lock` — O_CREAT|O_EXCL daemon 启动互斥锁（60s 后视为陈旧）
+- `${CLAUDE_PLUGIN_DATA}/cursors/<sessionId>.json` — 每个 cc 会话的 `lastSentIndex`，Stop hook 增量发送依赖
+- `${CLAUDE_PLUGIN_DATA}/memory-tdai/` — SQLite + sqlite-vec 数据、场景块、画像快照
+- `${CLAUDE_PLUGIN_DATA}/hook.log` — hook 排障日志（gateway-client 请求失败等）
+- `${CLAUDE_PLUGIN_DATA}/daemon.log` — daemon stderr/stdout（冷启动 crash 等）
+
+## 工作原理
+
+```
+用户输入  → UserPromptSubmit hook → POST /recall   → cc 注入上下文
+cc 回复  → Stop hook              → POST /capture  → L0 + L1/L2/L3 流水线
+会话退出 → daemon 检测父 cc 退出   → 优雅关闭
+```
+
+所有 hook 都是"失败静默"——日志写 `hook.log`，记忆系统永远不在对话的关键路径上。
+
+## 排障
+
+**`/memory-status` 显示 "unreachable"**：
+- 看 `${CLAUDE_PLUGIN_DATA}/hook.log`（gateway-client 请求失败）与 `${CLAUDE_PLUGIN_DATA}/daemon.log`（daemon 冷启动 crash）
+- 重启 cc 会话——SessionStart hook 会重新探活并 spawn daemon
+
+**多个 cc 终端开同一个项目**：
+- 共享一个 daemon。第一个启动的 cc 拉起它，后续 cc 通过 `state.json` 发现并复用。
+
+**记忆召回不准**：
+- 直接跑 `/memory-search <topic>` 看存了什么
+- L1/L2/L3 抽取是异步的，新对话需要几分钟才能被召回到
+
+## 安全模型
+
+- Daemon 默认仅监听 `127.0.0.1`。非 loopback `TDAI_GATEWAY_HOST` 必须同时设置 `TDAI_GATEWAY_ALLOW_REMOTE=1` 才允许绑定。
+- 每个请求都需要 `Authorization: Bearer <token>`。比较使用 `crypto.timingSafeEqual`，scheme 关键字按 RFC 6750 §2.1 大小写不敏感；401 响应携带 `WWW-Authenticate: Bearer realm="tdai-gateway"`。
+- Token 在每次 daemon spawn 时新生成，写入 `${CLAUDE_PLUGIN_DATA}/token`（chmod 600），通过 **文件路径** `TDAI_TOKEN_PATH` 传给 daemon 子进程，而不是注入到子进程环境变量——避免 token 出现在 `/proc/<pid>/environ` 与 `ps -E`。daemon 读取 token 时还会校验文件 owner uid 与当前进程一致。
+- `memory-search` skill 通过 heredoc 把用户 query 喂到 daemon stdin，而不是作为 shell argv 元素拼接——绕开 cc 当前对 `$ARGUMENTS` 的字面 `replaceAll` 注入面（anthropics/claude-code#16163）。
+- Windows 下跳过 0o077 mode 位校验（Node `fs` 在 Win 下返回固定 mode 位会误报），改为依赖 OS 给 token 文件的 NTFS ACL。
+
+## 源码构建
+
+```bash
+pnpm install
+pnpm build:cc-plugin
+pnpm test:cc-plugin
+```
+
+## License
+
+MIT — 见 [LICENSE](../LICENSE)。
diff --git a/claude-code-plugin/hooks/hooks.json b/claude-code-plugin/hooks/hooks.json
new file mode 100644
index 0000000..e5588ce
--- /dev/null
+++ b/claude-code-plugin/hooks/hooks.json
@@ -0,0 +1,41 @@
+{
+  "hooks": {
+    "SessionStart": [
+      {
+        "hooks": [
+          {
+            "type": "command",
+            "command": "node \"${CLAUDE_PLUGIN_ROOT}/dist/lib/hook.mjs\" session-start",
+            "async": true,
+            "timeout": 30,
+            "statusMessage": "Initializing memory..."
+          }
+        ]
+      }
+    ],
+    "UserPromptSubmit": [
+      {
+        "hooks": [
+          {
+            "type": "command",
+            "command": "node \"${CLAUDE_PLUGIN_ROOT}/dist/lib/hook.mjs\" user-prompt-submit",
+            "timeout": 5,
+            "statusMessage": "Recalling memories..."
+          }
+        ]
+      }
+    ],
+    "Stop": [
+      {
+        "hooks": [
+          {
+            "type": "command",
+            "command": "node \"${CLAUDE_PLUGIN_ROOT}/dist/lib/hook.mjs\" stop",
+            "async": true,
+            "timeout": 30
+          }
+        ]
+      }
+    ]
+  }
+}
diff --git a/claude-code-plugin/lib/daemon.ts b/claude-code-plugin/lib/daemon.ts
new file mode 100644
index 0000000..3711bf2
--- /dev/null
+++ b/claude-code-plugin/lib/daemon.ts
@@ -0,0 +1,343 @@
+/**
+ * Daemon manager — spawns the TdaiGateway as a long-lived sidecar bound
+ * to the parent cc process. Mirrors the supervisor.py pattern from
+ * hermes-plugin/.
+ */
+
+import { spawn, type ChildProcess } from "node:child_process";
+import { randomBytes } from "node:crypto";
+import { mkdir, writeFile, readFile, stat, unlink, open, rename } from "node:fs/promises";
+import { existsSync, openSync } from "node:fs";
+import { join } from "node:path";
+import http from "node:http";
+import net from "node:net";
+
+export interface DaemonState {
+  pid: number;
+  port: number;
+  ccPid: number;
+  startedAt: string;
+  tokenPath: string;
+}
+
+export interface DaemonManagerConfig {
+  dataDir: string;
+  portStart?: number;
+  portEnd?: number;
+}
+
+const DEFAULT_PORT_START = 8421;
+const DEFAULT_PORT_END = 8430;
+const STATE_FILE = "state.json";
+
+export async function readDaemonState(dataDir: string): Promise<DaemonState | null> {
+  const path = join(dataDir, STATE_FILE);
+  if (!existsSync(path)) return null;
+  try {
+    const raw = await readFile(path, "utf-8");
+    return JSON.parse(raw) as DaemonState;
+  } catch {
+    return null;
+  }
+}
+
+export async function writeDaemonState(dataDir: string, state: DaemonState): Promise<void> {
+  await mkdir(dataDir, { recursive: true });
+  // Atomic write: a concurrent reader never observes a half-written JSON.
+  const tmp = join(dataDir, `${STATE_FILE}.tmp`);
+  const final = join(dataDir, STATE_FILE);
+  await writeFile(tmp, JSON.stringify(state, null, 2), { mode: 0o600 });
+  await rename(tmp, final);
+}
+
+export async function clearDaemonState(dataDir: string): Promise<void> {
+  const path = join(dataDir, STATE_FILE);
+  try {
+    await unlink(path);
+  } catch {
+    // ignore
+  }
+}
+
+export class DaemonManager {
+  private dataDir: string;
+  private portStart: number;
+  private portEnd: number;
+
+  constructor(config: DaemonManagerConfig) {
+    this.dataDir = config.dataDir;
+    this.portStart = config.portStart ?? DEFAULT_PORT_START;
+    this.portEnd = config.portEnd ?? DEFAULT_PORT_END;
+  }
+
+  async generateToken(): Promise<string> {
+    await mkdir(this.dataDir, { recursive: true });
+    const token = randomBytes(32).toString("base64url");
+    const tokenPath = join(this.dataDir, "token");
+    await writeFile(tokenPath, token, { mode: 0o600 });
+    return tokenPath;
+  }
+
+  async readToken(tokenPath: string): Promise<string> {
+    const st = await stat(tokenPath);
+    // Windows' Node fs reports mode bits that don't map to POSIX rwx, so
+    // the 0o077 check would always fire and block Windows users entirely.
+    // Skip the bit-level check there and rely on the NTFS ACL the OS gave
+    // the file at create time.
+    if (process.platform !== "win32" && (st.mode & 0o077) !== 0) {
+      throw new Error(`Token file permission too loose: ${tokenPath}`);
+    }
+    // Owner check: refuse to read a token file we don't own. Guards the
+    // multi-user case where ~/.tdai-memory is on a shared FS and a peer
+    // UID could pre-create the file to phish the daemon.
+    if (process.platform !== "win32" && typeof process.getuid === "function") {
+      const uid = process.getuid();
+      if (st.uid !== uid) {
+        throw new Error(
+          `Token file owner mismatch: expected uid=${uid}, got uid=${st.uid} for ${tokenPath}`,
+        );
+      }
+    }
+    const raw = await readFile(tokenPath, "utf-8");
+    return raw.trim();
+  }
+
+  async findFreePort(
+    start = this.portStart,
+    end = this.portEnd,
+  ): Promise<number> {
+    for (let p = start; p <= end; p++) {
+      if (await this.isPortFree(p)) return p;
+    }
+    throw new Error(`No free port in ${start}..${end}`);
+  }
+
+  private isPortFree(port: number): Promise<boolean> {
+    return new Promise((resolve) => {
+      const tester = net.createServer();
+      tester.once("error", () => resolve(false));
+      tester.once("listening", () => {
+        tester.close(() => resolve(true));
+      });
+      tester.listen(port, "127.0.0.1");
+    });
+  }
+
+  async probe(): Promise<boolean> {
+    const state = await readDaemonState(this.dataDir);
+    if (!state) return false;
+    let token: string;
+    try {
+      token = await this.readToken(state.tokenPath);
+    } catch {
+      return false;
+    }
+    return this.healthCheck(state.port, token);
+  }
+
+  private healthCheck(port: number, token: string, timeoutMs = 2000): Promise<boolean> {
+    return new Promise((resolve) => {
+      const req = http.request(
+        {
+          host: "127.0.0.1",
+          port,
+          path: "/health",
+          method: "GET",
+          headers: { Authorization: `Bearer ${token}` },
+        },
+        (res) => resolve(res.statusCode === 200),
+      );
+      req.setTimeout(timeoutMs, () => {
+        req.destroy();
+        resolve(false);
+      });
+      req.on("error", () => resolve(false));
+      req.end();
+    });
+  }
+
+  async ensureRunning(ccPid: number): Promise<DaemonState> {
+    const reuseExisting = async (): Promise<DaemonState | null> => {
+      const existing = await readDaemonState(this.dataDir);
+      if (!existing) return null;
+      if (existing.ccPid !== ccPid) return null;
+      let token = "";
+      try {
+        token = await this.readToken(existing.tokenPath);
+      } catch {
+        return null;
+      }
+      if (!token) return null;
+      if (await this.healthCheck(existing.port, token)) return existing;
+      // Daemon may still be coming up (another hook just spawned it).
+      const deadline = Date.now() + 10_000;
+      while (Date.now() < deadline) {
+        await sleep(500);
+        if (await this.healthCheck(existing.port, token)) return existing;
+      }
+      return null;
+    };
+
+    const reused = await reuseExisting();
+    if (reused) return reused;
+
+    // O_CREAT|O_EXCL spawn lock: only one concurrent hook actually invokes
+    // spawn(). Losing hooks poll reuseExisting() until the winner's daemon is up.
+    const lock = await this.acquireSpawnLock();
+    if (!lock) {
+      // Lock held by a peer hook. Wait up to 35s for it to write state.json
+      // and bring the daemon up.
+      const deadline = Date.now() + 35_000;
+      while (Date.now() < deadline) {
+        await sleep(500);
+        const r = await reuseExisting();
+        if (r) return r;
+      }
+      throw new Error("daemon spawn lock contention timed out");
+    }
+    try {
+      // Re-check inside the lock — a peer might have finished between our
+      // first reuseExisting and acquireSpawnLock.
+      const r = await reuseExisting();
+      if (r) return r;
+      return await this.spawn(ccPid);
+    } finally {
+      await lock.release();
+    }
+  }
+
+  /**
+   * Returns a held lock handle, or null if another process owns the lock.
+   * Stale locks (>60s old) are forcibly broken so a crashed hook never wedges
+   * the daemon-up path.
+   */
+  private async acquireSpawnLock(): Promise<{ release(): Promise<void> } | null> {
+    await mkdir(this.dataDir, { recursive: true });
+    const lockPath = join(this.dataDir, "spawn.lock");
+    const tryCreate = async (): Promise<{ release(): Promise<void> } | null> => {
+      try {
+        const fh = await open(lockPath, "wx"); // O_CREAT|O_EXCL|O_WRONLY
+        await fh.write(`${process.pid}\n`);
+        await fh.close();
+        return {
+          release: async () => {
+            try {
+              await unlink(lockPath);
+            } catch {
+              // already gone
+            }
+          },
+        };
+      } catch (err) {
+        if ((err as NodeJS.ErrnoException).code === "EEXIST") return null;
+        throw err;
+      }
+    };
+    const first = await tryCreate();
+    if (first) return first;
+    try {
+      const st = await stat(lockPath);
+      if (Date.now() - st.mtimeMs > 60_000) {
+        await unlink(lockPath).catch(() => {});
+        return tryCreate();
+      }
+    } catch {
+      // race: lock disappeared, retry once
+      return tryCreate();
+    }
+    return null;
+  }
+
+  /**
+   * Spawn the Gateway daemon via `npx --yes tdai-memory-gateway`, or via the
+   * command in `TDAI_GATEWAY_COMMAND` (run with no arguments) when it is set.
+   *
+   * The default path needs `@tencentdb-agent-memory/memory-tencentdb` installed
+   * globally (`npm install -g`) or in the current project, which exposes the bin.
+   */
+  async spawn(ccPid: number): Promise<DaemonState> {
+    const port = await this.findFreePort();
+    const tokenPath = await this.generateToken();
+    const token = await this.readToken(tokenPath);
+
+    const command = process.env.TDAI_GATEWAY_COMMAND ?? "npx";
+    const args = process.env.TDAI_GATEWAY_COMMAND
+      ? []
+      : ["--yes", "tdai-memory-gateway"];
+
+    // Pass the token by FILE PATH, not as an env var. execve() snapshots the
+    // initial environment block and exposes it via /proc/<pid>/environ /
+    // `ps -E` to any peer process with the same UID — a token file gated by
+    // 0600 + owner check is a smaller attack surface.
+    //
+    // Also pin TDAI_DATA_DIR explicitly: without it the gateway resolves its
+    // data dir against process.cwd() of the spawning hook, which can be any
+    // arbitrary user directory and would split data across cwds.
+    const childEnv = {
+      ...process.env,
+      TDAI_GATEWAY_PORT: String(port),
+      TDAI_CC_PID: String(ccPid),
+      TDAI_TOKEN_PATH: tokenPath,
+      TDAI_DATA_DIR: process.env.TDAI_DATA_DIR ?? this.dataDir,
+    } as NodeJS.ProcessEnv;
+    delete childEnv.TDAI_GATEWAY_TOKEN;
+
+    // Redirect stderr (and stdout) into daemon.log so cold-start crashes are
+    // not swallowed silently. detached + unref keeps the daemon alive past
+    // the hook process exit; the log fds are independent of our stdio.
+    await mkdir(this.dataDir, { recursive: true });
+    const logPath = join(this.dataDir, "daemon.log");
+    let logFd: number | "ignore" = "ignore";
+    try {
+      logFd = openSync(logPath, "a");
+    } catch {
+      // fall back to discarding stderr if we can't open the log
+    }
+
+    const child: ChildProcess = spawn(command, args, {
+      env: childEnv,
+      cwd: this.dataDir,
+      detached: true,
+      stdio: ["ignore", logFd, logFd],
+    });
+    child.unref();
+
+    if (!child.pid) {
+      throw new Error("Failed to spawn daemon: child has no pid");
+    }
+
+    // Write state.json IMMEDIATELY so concurrent hooks (e.g. Stop firing
+    // before SessionStart's spawn finishes its health probe) see that a
+    // daemon is being brought up and can wait for it via ensureRunning's
+    // health-retry loop, instead of treating it as "no daemon".
+    const pendingState: DaemonState = {
+      pid: child.pid,
+      port,
+      ccPid,
+      startedAt: new Date().toISOString(),
+      tokenPath,
+    };
+    await writeDaemonState(this.dataDir, pendingState);
+
+    // Gateway cold-start needs to init SQLite + sqlite-vec + BM25 encoder +
+    // pipeline + LLM runner. On slower machines this can exceed 10s, so give
+    // it 30s. The hook is async (cc doesn't block on it) so the longer
+    // budget doesn't impact UX.
+    const deadline = Date.now() + 30_000;
+    while (Date.now() < deadline) {
+      if (await this.healthCheck(port, token, 500)) {
+        return pendingState;
+      }
+      await sleep(200);
+    }
+
+    // Health probe timed out. Remove the pending state so subsequent hooks
+    // don't keep waiting on a daemon that never came up.
+    await clearDaemonState(this.dataDir);
+    throw new Error(`Daemon did not become healthy on port ${port} within 30s`);
+  }
+}
+
+function sleep(ms: number): Promise<void> {
+  return new Promise((r) => setTimeout(r, ms));
+}
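
The spawn lock in `acquireSpawnLock` above relies on exclusive file creation plus a staleness check. The following is a minimal synchronous sketch of that idea, not the patch's implementation: `tryAcquireLock`, `Release`, and the temp-directory demo are hypothetical names introduced for illustration, and the 60s default simply mirrors the threshold used above.

```typescript
import { closeSync, mkdtempSync, openSync, statSync, unlinkSync, writeSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

type Release = () => void;

// Hypothetical helper: returns a release function, or null when another
// process holds a lock that is still fresh.
function tryAcquireLock(lockPath: string, staleMs = 60_000): Release | null {
  const create = (): Release | null => {
    try {
      const fd = openSync(lockPath, "wx"); // exclusive create: EEXIST if the file exists
      writeSync(fd, `${process.pid}\n`);
      closeSync(fd);
      return () => {
        try {
          unlinkSync(lockPath);
        } catch {
          // already gone
        }
      };
    } catch (err) {
      if ((err as NodeJS.ErrnoException).code === "EEXIST") return null;
      throw err;
    }
  };

  const first = create();
  if (first) return first;
  try {
    const ageMs = Date.now() - statSync(lockPath).mtimeMs;
    if (ageMs <= staleMs) return null; // fresh lock: someone else is spawning
    unlinkSync(lockPath); // stale lock: break it
  } catch {
    // lock vanished between the failed create and stat/unlink; retry once
  }
  return create();
}

// Demo in a throwaway temp directory.
const demoLock = join(mkdtempSync(join(tmpdir(), "spawn-lock-")), "spawn.lock");
const held = tryAcquireLock(demoLock); // first caller wins
const contender = tryAcquireLock(demoLock); // lock is fresh and held: null
held?.(); // release by unlinking the file
```

Because the lock is just a file, a holder that crashes never releases it, which is why the age check matters. The trade-off is that a legitimately slow holder can have its lock broken once the threshold passes.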
diff --git a/claude-code-plugin/lib/gateway-client.ts b/claude-code-plugin/lib/gateway-client.ts
new file mode 100644
index 0000000..403e954
--- /dev/null
+++ b/claude-code-plugin/lib/gateway-client.ts
@@ -0,0 +1,222 @@
+/**
+ * HTTP client for the TDAI Gateway, with Bearer token authentication and
+ * silent-failure semantics suitable for cc hook handlers (any error returns
+ * an empty / no-op response rather than throwing). Failures are also
+ * appended to an optional log file so the daemon's health can be diagnosed
+ * via /memory-status without re-attaching a debugger.
+ */
+
+import http from "node:http";
+import { appendFile } from "node:fs/promises";
+import { URL } from "node:url";
+
+export interface GatewayClientConfig {
+  baseUrl: string;
+  token: string;
+  timeoutMs?: number;
+  /** If set, every fallthrough error is appended here as one line. */
+  logPath?: string;
+}
+
+export interface RecallResult {
+  context: string;
+  strategy?: string;
+  memory_count?: number;
+}
+
+export interface CaptureTurnPayload {
+  user_content: string;
+  assistant_content: string;
+  session_key: string;
+  session_id?: string;
+  messages?: Array<{ role: string; content: string }>;
+}
+
+export interface CaptureTurnResult {
+  l0_recorded: number;
+  scheduler_notified: boolean;
+}
+
+export interface SearchResult {
+  results: string;
+  total: number;
+  strategy?: string;
+}
+
+export class GatewayClient {
+  private baseUrl: URL;
+  private token: string;
+  private timeoutMs: number;
+  private logPath?: string;
+
+  constructor(config: GatewayClientConfig) {
+    this.baseUrl = new URL(config.baseUrl);
+    this.token = config.token;
+    this.timeoutMs = config.timeoutMs ?? 5_000;
+    this.logPath = config.logPath;
+  }
+
+  private async logFailure(method: string, path: string, detail: string): Promise<void> {
+    if (!this.logPath) return;
+    try {
+      await appendFile(
+        this.logPath,
+        `[${new Date().toISOString()}] gateway-client ${method} ${path}: ${detail}\n`,
+      );
+    } catch {
+      // unable to log — nothing else we can do from a hook handler
+    }
+  }
+
+  private describeStatus(status: number, body: string): string {
+    const trimmed = body.length > 200 ? body.slice(0, 200) + "…" : body;
+    return `HTTP ${status} ${trimmed}`;
+  }
+
+  async health(): Promise<boolean> {
+    try {
+      const { status, body } = await this.request("GET", "/health");
+      if (status === 200) return true;
+      await this.logFailure("GET", "/health", this.describeStatus(status, body));
+      return false;
+    } catch (err) {
+      await this.logFailure("GET", "/health", err instanceof Error ? err.message : String(err));
+      return false;
+    }
+  }
+
+  async recall(query: string, sessionKey: string): Promise<RecallResult> {
+    try {
+      const { status, body } = await this.request("POST", "/recall", {
+        query,
+        session_key: sessionKey,
+      });
+      if (status !== 200) {
+        await this.logFailure("POST", "/recall", this.describeStatus(status, body));
+        return { context: "" };
+      }
+      const parsed = JSON.parse(body) as RecallResult;
+      return {
+        context: parsed.context ?? "",
+        strategy: parsed.strategy,
+        memory_count: parsed.memory_count,
+      };
+    } catch (err) {
+      await this.logFailure("POST", "/recall", err instanceof Error ? err.message : String(err));
+      return { context: "" };
+    }
+  }
+
+  async captureTurn(payload: CaptureTurnPayload): Promise<CaptureTurnResult | null> {
+    try {
+      const { status, body } = await this.request("POST", "/capture", payload);
+      if (status !== 200) {
+        await this.logFailure("POST", "/capture", this.describeStatus(status, body));
+        return null;
+      }
+      return JSON.parse(body) as CaptureTurnResult;
+    } catch (err) {
+      await this.logFailure("POST", "/capture", err instanceof Error ? err.message : String(err));
+      return null;
+    }
+  }
+
+  async searchMemories(
+    query: string,
+    opts?: { limit?: number; type?: string; scene?: string },
+  ): Promise<SearchResult> {
+    try {
+      const { status, body } = await this.request("POST", "/search/memories", {
+        query,
+        limit: opts?.limit,
+        type: opts?.type,
+        scene: opts?.scene,
+      });
+      if (status !== 200) {
+        await this.logFailure("POST", "/search/memories", this.describeStatus(status, body));
+        return { results: "", total: 0 };
+      }
+      return JSON.parse(body) as SearchResult;
+    } catch (err) {
+      await this.logFailure("POST", "/search/memories", err instanceof Error ? err.message : String(err));
+      return { results: "", total: 0 };
+    }
+  }
+
+  async searchConversations(
+    query: string,
+    opts?: { limit?: number; sessionKey?: string },
+  ): Promise<SearchResult> {
+    try {
+      const { status, body } = await this.request("POST", "/search/conversations", {
+        query,
+        limit: opts?.limit,
+        session_key: opts?.sessionKey,
+      });
+      if (status !== 200) {
+        await this.logFailure("POST", "/search/conversations", this.describeStatus(status, body));
+        return { results: "", total: 0 };
+      }
+      return JSON.parse(body) as SearchResult;
+    } catch (err) {
+      await this.logFailure("POST", "/search/conversations", err instanceof Error ? err.message : String(err));
+      return { results: "", total: 0 };
+    }
+  }
+
+  async sessionEnd(sessionKey: string): Promise<void> {
+    try {
+      const { status, body } = await this.request("POST", "/session/end", { session_key: sessionKey });
+      if (status !== 200) {
+        await this.logFailure("POST", "/session/end", this.describeStatus(status, body));
+      }
+    } catch (err) {
+      await this.logFailure("POST", "/session/end", err instanceof Error ? err.message : String(err));
+    }
+  }
+
+  private request(
+    method: string,
+    path: string,
+    bodyObj?: unknown,
+  ): Promise<{ status: number; body: string }> {
+    return new Promise((resolve, reject) => {
+      const bodyStr = bodyObj ? JSON.stringify(bodyObj) : undefined;
+      const opts: http.RequestOptions = {
+        protocol: this.baseUrl.protocol,
+        hostname: this.baseUrl.hostname,
+        port: this.baseUrl.port,
+        method,
+        path,
+        headers: {
+          Authorization: `Bearer ${this.token}`,
+          ...(bodyStr
+            ? {
+                "Content-Type": "application/json",
+                "Content-Length": Buffer.byteLength(bodyStr).toString(),
+              }
+            : {}),
+        },
+      };
+
+      const req = http.request(opts, (res) => {
+        const chunks: Buffer[] = [];
+        res.on("error", reject).on("data", (c) => chunks.push(c));
+        res.on("end", () =>
+          resolve({
+            status: res.statusCode ?? 0,
+            body: Buffer.concat(chunks).toString("utf-8"),
+          }),
+        );
+      });
+
+      req.setTimeout(this.timeoutMs, () => {
+        req.destroy(new Error(`Timeout after ${this.timeoutMs}ms`));
+      });
+
+      req.on("error", reject);
+      if (bodyStr) req.write(bodyStr);
+      req.end();
+    });
+  }
+}
diff --git a/claude-code-plugin/lib/hook.ts b/claude-code-plugin/lib/hook.ts
new file mode 100644
index 0000000..6f16f4c
--- /dev/null
+++ b/claude-code-plugin/lib/hook.ts
@@ -0,0 +1,495 @@
+/**
+ * Unified hook entry point. Dispatched by the first CLI arg.
+ *
+ * Usage from cc plugin hook config:
+ *   node ${CLAUDE_PLUGIN_ROOT}/dist/lib/hook.mjs <event-name>
+ *
+ * Where <event-name> is one of:
+ *   session-start | user-prompt-submit | post-tool-use | stop |
+ *   search | search-stdin | status | clear-session
+ */
+
+import { GatewayClient } from "./gateway-client.js";
+import { getSessionKey } from "./session-key.js";
+import { readAllTurns } from "./transcript.js";
+import { DaemonManager, readDaemonState } from "./daemon.js";
+import { appendFile, mkdir, readdir, readFile, rename, stat, writeFile } from "node:fs/promises";
+import { createReadStream } from "node:fs";
+import { createInterface } from "node:readline";
+import { basename, join } from "node:path";
+import { homedir } from "node:os";
+
+const MAX_INJECT_CHARS = 10_000;
+const MAX_CAPTURE_TURNS = 50;
+
+export type HookEvent =
+  | "session-start"
+  | "user-prompt-submit"
+  | "post-tool-use"
+  | "stop"
+  | "search"
+  | "search-stdin"
+  | "status"
+  | "clear-session";
+
+export interface HookInput {
+  stdin: string;
+  client: GatewayClient;
+  args?: string[];
+}
+
+export async function handleHook(event: HookEvent, input: HookInput): Promise<string> {
+  const data = parseStdin(input.stdin);
+  switch (event) {
+    case "session-start":
+      return handleSessionStart(data, input.client);
+    case "user-prompt-submit":
+      return handleUserPromptSubmit(data, input.client);
+    case "post-tool-use":
+      return handlePostToolUse(data, input.client);
+    case "stop":
+      return handleStop(data, input.client);
+    case "search":
+      return handleSearch(input.args ?? [], input.client);
+    case "search-stdin":
+      return handleSearchStdin(input.stdin, input.client);
+    case "status":
+      return handleStatus(input.client);
+    case "clear-session":
+      return handleClearSession(data, input.client);
+    default:
+      return "";
+  }
+}
+
+interface HookStdin {
+  session_id?: string;
+  transcript_path?: string;
+  cwd?: string;
+  prompt?: string;
+  source?: string;
+  tool_name?: string;
+  tool_input?: unknown;
+  tool_response?: unknown;
+  tool_use_id?: string;
+  stop_hook_active?: boolean;
+}
+
+function parseStdin(raw: string): HookStdin {
+  if (!raw) return {};
+  try {
+    return JSON.parse(raw) as HookStdin;
+  } catch {
+    return {};
+  }
+}
+
+async function handleSessionStart(_data: HookStdin, client: GatewayClient): Promise<string> {
+  await client.health();
+  return "";
+}
+
+async function handleUserPromptSubmit(data: HookStdin, client: GatewayClient): Promise<string> {
+  const prompt = data.prompt ?? "";
+  const cwd = data.cwd ?? process.cwd();
+  if (!prompt) return "";
+
+  const sessionKey = getSessionKey(cwd);
+
+  // Primary path: L1/L2/L3 recall (structured atoms + persona + scene).
+  const recall = await client.recall(prompt, sessionKey);
+  let context = recall.context ?? "";
+
+  // Fallback 1: daemon /search/conversations (FTS5 BM25 on L0 table).
+  if (!context) {
+    const conv = await client.searchConversations(prompt, {
+      limit: 3,
+      sessionKey,
+    });
+    if (conv.total > 0 && conv.results) {
+      context = `## Past conversations (relevant to current prompt)\n\n${conv.results}`;
+    }
+  }
+
+  // Fallback 2: direct L0 jsonl file scan. Covers the case where FTS5 is
+  // unavailable (e.g. Node.js built-in node:sqlite lacks fts5 module) AND
+  // no embedding service is configured. Scans <data dir>/conversations/ with
+  // simple keyword matching; the data dir resolves as the daemon's does
+  // (TDAI_DATA_DIR if set, else resolveDataDir()), so the hook needs no env.
+  if (!context) {
+    const dataDir = process.env.TDAI_DATA_DIR ?? resolveDataDir();
+    if (dataDir) {
+      context = await searchL0JsonlDirect(join(dataDir, "conversations"), prompt, sessionKey, 3);
+    }
+  }
+
+  if (!context) return "";
+
+  if (context.length > MAX_INJECT_CHARS) {
+    context =
+      context.slice(0, MAX_INJECT_CHARS - 100) +
+      "\n\n[…recall truncated — use /memory-search for full results…]";
+  }
+  return JSON.stringify({
+    hookSpecificOutput: {
+      hookEventName: "UserPromptSubmit",
+      additionalContext: context,
+    },
+  });
+}
+
+async function handlePostToolUse(_data: HookStdin, _client: GatewayClient): Promise<string> {
+  // No-op fallback. PostToolUse capture is intentionally deferred to a
+  // follow-up PR — see spec §5.3 for the buffer endpoint design. The
+  // hooks.json registration was removed so this handler is unreachable
+  // by default; it remains here only as a safety net if someone manually
+  // re-enables the PostToolUse hook before the follow-up lands.
+  return "";
+}
+
+async function handleStop(data: HookStdin, client: GatewayClient): Promise<string> {
+  if (data.stop_hook_active === true) return "";
+  if (!data.transcript_path) return "";
+
+  // cc may trigger Stop before the last assistant block is flushed to disk.
+  // Poll the file size until two consecutive 100ms ticks see identical bytes,
+  // capped at 2s. Replaces a fragile 800ms hard sleep that still missed slow
+  // disks on real-machine validation.
+  await waitForTranscriptStable(data.transcript_path, 2_000);
+
+  const allTurns = await readAllTurns(data.transcript_path);
+  if (allTurns.length === 0) return "";
+
+  // Persist a per-session cursor so the next Stop only sends turns appended
+  // after this one. Without it, every Stop posts the latest N turns and the
+  // Gateway writes them to L0 again, duplicating long sessions across calls.
+  const dataDir = resolveDataDir();
+  const cursorId = sanitizeCursorId(
+    data.session_id ?? (basename(data.transcript_path).replace(/\.jsonl$/, "") || "default"),
+  );
+  const lastSent = await readCursor(dataDir, cursorId);
+
+  let newTurns = allTurns.slice(lastSent);
+  if (newTurns.length === 0) return "";
+
+  // Bound the first capture so a pre-existing long transcript doesn't dump
+  // hundreds of turns in a single /capture request.
+  if (newTurns.length > MAX_CAPTURE_TURNS) {
+    newTurns = newTurns.slice(-MAX_CAPTURE_TURNS);
+  }
+
+  const cwd = data.cwd ?? process.cwd();
+  const sessionKey = getSessionKey(cwd);
+
+  const messages = newTurns.flatMap((t) => [
+    { role: "user" as const, content: t.user },
+    { role: "assistant" as const, content: t.assistant },
+  ]);
+
+  const lastTurn = newTurns[newTurns.length - 1];
+  const captured = await client.captureTurn({
+    user_content: lastTurn.user,
+    assistant_content: lastTurn.assistant,
+    messages,
+    session_key: sessionKey,
+    session_id: data.session_id,
+  });
+  if (captured) await writeCursor(dataDir, cursorId, allTurns.length); // null = failed; retry next Stop
+  return "";
+}
+
+async function waitForTranscriptStable(path: string, maxMs: number): Promise<void> {
+  const start = Date.now();
+  let lastSize = -1;
+  let stableTicks = 0;
+  while (Date.now() - start < maxMs) {
+    try {
+      const st = await stat(path);
+      if (st.size === lastSize) {
+        stableTicks++;
+        if (stableTicks >= 2) return;
+      } else {
+        stableTicks = 0;
+        lastSize = st.size;
+      }
+    } catch {
+      // not yet written
+    }
+    await new Promise((r) => setTimeout(r, 100));
+  }
+}
+
+function resolveDataDir(): string {
+  return process.env.CLAUDE_PLUGIN_DATA ?? join(homedir(), ".tdai-memory");
+}
+
+function sanitizeCursorId(id: string): string {
+  return id.replace(/[^A-Za-z0-9_-]/g, "_").slice(0, 64) || "default";
+}
+
+async function readCursor(dataDir: string, cursorId: string): Promise<number> {
+  try {
+    const raw = await readFile(join(dataDir, "cursors", `${cursorId}.json`), "utf-8");
+    const obj = JSON.parse(raw) as { lastSentIndex?: unknown };
+    return typeof obj.lastSentIndex === "number" && obj.lastSentIndex >= 0
+      ? obj.lastSentIndex
+      : 0;
+  } catch {
+    return 0;
+  }
+}
+
+async function writeCursor(dataDir: string, cursorId: string, lastSentIndex: number): Promise<void> {
+  const dir = join(dataDir, "cursors");
+  await mkdir(dir, { recursive: true });
+  const tmp = join(dir, `${cursorId}.json.tmp`);
+  const final = join(dir, `${cursorId}.json`);
+  await writeFile(
+    tmp,
+    JSON.stringify({ lastSentIndex, updatedAt: new Date().toISOString() }),
+    { mode: 0o600 },
+  );
+  // Atomic replace so a crashed write never corrupts the cursor file.
+  await rename(tmp, final);
+}
+
+async function handleSearch(args: string[], client: GatewayClient): Promise<string> {
+  const query = args.join(" ").trim();
+  if (!query) return "Usage: /memory-search <query>";
+  const result = await client.searchMemories(query, { limit: 10 });
+  return result.results || "No memories found.";
+}
+
+/**
+ * Read the query from stdin instead of argv. Used by the memory-search skill
+ * to avoid the cc `$ARGUMENTS` literal-replaceAll RCE surface (see Anthropic
+ * GH issue #16163) — when the query rides on stdin it never touches a shell
+ * word-split or expansion stage.
+ */
+async function handleSearchStdin(rawStdin: string, client: GatewayClient): Promise<string> {
+  const query = rawStdin.trim();
+  if (!query) return "Usage: pipe the query to stdin";
+  const result = await client.searchMemories(query, { limit: 10 });
+  return result.results || "No memories found.";
+}
+
+async function handleStatus(client: GatewayClient): Promise<string> {
+  const ok = await client.health();
+  const dataDir = resolveDataDir();
+  const hookLog = join(dataDir, "hook.log");
+  const daemonLog = join(dataDir, "daemon.log");
+  const header = ok ? "TDAI memory daemon: healthy" : "TDAI memory daemon: unreachable";
+  return `${header}\nhook log:   ${hookLog}\ndaemon log: ${daemonLog}`;
+}
+
+async function handleClearSession(data: HookStdin, client: GatewayClient): Promise<string> {
+  const cwd = data.cwd ?? process.cwd();
+  const sessionKey = getSessionKey(cwd);
+  await client.sessionEnd(sessionKey);
+  return `Cleared session buffer for: ${sessionKey}`;
+}
+
+// ============================================================================
+// L0 jsonl direct search (last-resort fallback)
+// ============================================================================
+
+interface L0JsonlRecord {
+  sessionKey?: string;
+  role?: string;
+  content?: string;
+  recordedAt?: string;
+}
+
+async function searchL0JsonlDirect(
+  convDir: string,
+  query: string,
+  sessionKey: string,
+  limit: number,
+): Promise<string> {
+  let files: string[];
+  try {
+    files = (await readdir(convDir)).filter((f) => f.endsWith(".jsonl"));
+  } catch {
+    return "";
+  }
+  if (files.length === 0) return "";
+
+  // Sort by mtime desc so newer conversations are scanned first. Filename
+  // ordering used to assume "YYYY-MM-DD.jsonl" naming, which broke for any
+  // other scheme (e.g. cc transcript UUIDs).
+  const withMtime = await Promise.all(
+    files.map(async (f) => {
+      try {
+        const st = await stat(join(convDir, f));
+        return { name: f, mtime: st.mtimeMs };
+      } catch {
+        return { name: f, mtime: 0 };
+      }
+    }),
+  );
+  withMtime.sort((a, b) => b.mtime - a.mtime);
+  const sortedFiles = withMtime.map((e) => e.name);
+
+  // CJK 2-gram tokens, sans a small stop set. The previous list stopped
+  // common content-bearing pronouns ("我们/你们/这个/可以/有没/没有" etc.)
+  // which silently shredded recall for everyday Chinese queries — keep only
+  // genuinely low-signal interrogative / connective fragments here.
+  const CJK_STOP = new Set([
+    "之前", "前聊", "聊的", "还记", "记得", "得么", "得吗",
+    "一下", "怎么", "什么", "关于", "知道", "以前", "上次",
+    "如何", "为何", "为啥", "哪里", "哪些", "为什",
+    "请问", "请帮", "帮我", "麻烦",
+  ]);
+  const keywords: string[] = [];
+  for (const seg of query.toLowerCase().replace(/[^\w一-鿿]/g, " ").split(/\s+/)) {
+    if (!seg) continue;
+    if (/[一-鿿]/.test(seg)) {
+      for (let i = 0; i <= seg.length - 2; i++) {
+        const gram = seg.slice(i, i + 2);
+        if (!CJK_STOP.has(gram)) keywords.push(gram);
+      }
+    } else if (seg.length >= 2) {
+      keywords.push(seg);
+    }
+  }
+  if (keywords.length === 0) return "";
+
+  type Match = { role: string; content: string; recordedAt: string; hits: number };
+  const matches: Match[] = [];
+  const seen = new Set<string>();
+
+  for (const f of sortedFiles) {
+    // Stream the file line-by-line: large jsonl (multi-MB) used to be
+    // readFile'd into memory in full, which OOM'd on long-running sessions.
+    let rl;
+    try {
+      rl = createInterface({
+        input: createReadStream(join(convDir, f), { encoding: "utf-8" }),
+        crlfDelay: Infinity,
+      });
+    } catch {
+      continue;
+    }
+    try {
+      for await (const line of rl) {
+        if (!line.trim()) continue;
+        try {
+          const rec = JSON.parse(line) as L0JsonlRecord;
+          if (rec.sessionKey !== sessionKey) continue;
+          const text = rec.content ?? "";
+          const textLower = text.toLowerCase();
+          const hits = keywords.filter((kw) => textLower.includes(kw)).length;
+          if (hits === 0) continue;
+          // Deduplicate identical content (e.g. repeated user prompts).
+          const fingerprint = text.slice(0, 120);
+          if (seen.has(fingerprint)) continue;
+          seen.add(fingerprint);
+          matches.push({
+            role: rec.role ?? "unknown",
+            content: text.length > 2000 ? text.slice(0, 2000) + "…" : text,
+            recordedAt: rec.recordedAt ?? "",
+            hits,
+          });
+        } catch {
+          // skip malformed lines
+        }
+      }
+    } finally {
+      rl.close();
+    }
+  }
+
+  if (matches.length === 0) return "";
+
+  // Rank: assistant messages first (more informative than user prompts),
+  // then by keyword hits (desc), then content length (desc).
+  const rolePriority = (r: string) => (r === "assistant" ? 1 : 0);
+  matches.sort(
+    (a, b) =>
+      rolePriority(b.role) - rolePriority(a.role) ||
+      b.hits - a.hits ||
+      b.content.length - a.content.length,
+  );
+
+  const selected = matches.slice(0, limit);
+  const lines = [`Found ${selected.length} matching conversation(s):`, ""];
+  for (const m of selected) {
+    lines.push("---");
+    lines.push(`**[${m.role}]** ${m.recordedAt}`);
+    lines.push("");
+    lines.push(m.content);
+    lines.push("");
+  }
+  return `## Past conversations (relevant to current prompt)\n\n${lines.join("\n")}`;
+}
+
+// ============================================================================
+// CLI entry: only runs when this file is executed directly (e.g. `node dist/lib/hook.mjs`)
+// ============================================================================
+
+async function main(): Promise<void> {
+  const event = (process.argv[2] ?? "") as HookEvent;
+  const args = process.argv.slice(3);
+
+  const dataDir = resolveDataDir();
+  const logPath = join(dataDir, "hook.log");
+
+  try {
+    const stdin = await readStdin();
+
+    const mgr = new DaemonManager({ dataDir });
+    let state = await readDaemonState(dataDir);
+
+    if (event === "session-start" && !state) {
+      try {
+        state = await mgr.ensureRunning(process.ppid);
+      } catch (err) {
+        await safeLog(logPath, `session-start: spawn failed: ${(err as Error).message}`);
+      }
+    }
+
+    if (!state) {
+      await safeLog(logPath, `${event}: no daemon, skipped`);
+      return;
+    }
+
+    const token = await mgr.readToken(state.tokenPath);
+    const client = new GatewayClient({
+      baseUrl: `http://127.0.0.1:${state.port}`,
+      token,
+      timeoutMs: event === "user-prompt-submit" ? 4_000 : 10_000,
+      logPath,
+    });
+
+    const out = await handleHook(event, { stdin, client, args });
+    if (out) process.stdout.write(out);
+  } catch (err) {
+    await safeLog(logPath, `${event}: ${(err as Error).message}`);
+  }
+}
+
+function readStdin(): Promise<string> {
+  return new Promise((resolve) => {
+    if (process.stdin.isTTY) {
+      resolve("");
+      return;
+    }
+    const chunks: Buffer[] = [];
+    process.stdin.on("data", (c) => chunks.push(c));
+    process.stdin.on("end", () => resolve(Buffer.concat(chunks).toString("utf-8")));
+    process.stdin.on("error", () => resolve(""));
+  });
+}
+
+async function safeLog(path: string, msg: string): Promise<void> {
+  try {
+    await appendFile(path, `[${new Date().toISOString()}] ${msg}\n`);
+  } catch {
+    // ignore
+  }
+}
+
+const isMainModule = import.meta.url === `file://${process.argv[1]}`;
+if (isMainModule) {
+  main().catch(() => process.exit(0));
+}
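
The keyword extraction inside `searchL0JsonlDirect` above splits CJK runs into overlapping 2-grams and keeps non-CJK words of two or more characters. Here is a standalone sketch of that step, assuming the same character range (U+4E00 to U+9FFF). `extractKeywords` is a hypothetical name, and `STOP_BIGRAMS` is a shortened stand-in for the patch's stop set.

```typescript
// Shortened, illustrative stop set; the patch's list is longer.
const STOP_BIGRAMS = new Set(["什么", "怎么", "记得"]);

// Hypothetical helper mirroring the tokenization described above.
function extractKeywords(query: string): string[] {
  const keywords: string[] = [];
  // Replace everything except word characters and CJK ideographs with spaces.
  const cleaned = query.toLowerCase().replace(/[^\w\u4e00-\u9fff]/g, " ");
  for (const seg of cleaned.split(/\s+/)) {
    if (!seg) continue;
    if (/[\u4e00-\u9fff]/.test(seg)) {
      // Any segment containing CJK is cut into overlapping 2-grams.
      for (let i = 0; i + 2 <= seg.length; i++) {
        const gram = seg.slice(i, i + 2);
        if (!STOP_BIGRAMS.has(gram)) keywords.push(gram);
      }
    } else if (seg.length >= 2) {
      keywords.push(seg);
    }
  }
  return keywords;
}

const demoKeywords = extractKeywords("SQLite 缓存策略");
// → ["sqlite", "缓存", "存策", "策略"]
```

Overlapping 2-grams avoid the need for a word segmenter, at the cost of some meaningless pairs such as "存策". A segment that mixes Latin and CJK characters without a space is cut into 2-grams as a whole, so writing the query with spaces between scripts gives cleaner keywords.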
diff --git a/claude-code-plugin/lib/session-key.ts b/claude-code-plugin/lib/session-key.ts
new file mode 100644
index 0000000..d9b21ef
--- /dev/null
+++ b/claude-code-plugin/lib/session-key.ts
@@ -0,0 +1,22 @@
+/**
+ * Compute a stable session key for a given working directory.
+ *
+ * Default: SHA-256 of the normalized absolute path, first 16 hex chars (64 bits).
+ * Override: TDAI_SESSION_KEY env var, if non-empty.
+ *
+ * Used by hook handlers to partition memory by project rather than by
+ * Claude Code session, so multiple cc terminals on the same project share
+ * recall results.
+ */
+
+import { createHash } from "node:crypto";
+import { resolve } from "node:path";
+
+export function getSessionKey(cwd: string): string {
+  const override = process.env.TDAI_SESSION_KEY;
+  if (override && override.length > 0) {
+    return override;
+  }
+  const normalized = resolve(cwd);
+  return createHash("sha256").update(normalized).digest("hex").slice(0, 16);
+}
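
To illustrate the per-project partitioning that `getSessionKey` above provides, here is a small sketch under the same scheme (SHA-256 of the resolved path, first 16 hex characters). `projectKey` is a hypothetical stand-in that takes the override as a parameter instead of reading `TDAI_SESSION_KEY`, so it can be exercised without touching the environment.

```typescript
import { createHash } from "node:crypto";
import { resolve } from "node:path";

// Hypothetical stand-in for getSessionKey with an explicit override argument.
function projectKey(cwd: string, override?: string): string {
  if (override) return override;
  return createHash("sha256").update(resolve(cwd)).digest("hex").slice(0, 16);
}

const keyA = projectKey("/tmp/proj");
const keyB = projectKey("/tmp/proj/../proj/"); // same directory, different spelling
const keyC = projectKey("/tmp/other");
const sameProject = keyA === keyB; // true: resolve() normalizes both spellings
const differentProject = keyA !== keyC;
```

`path.resolve` normalizes `..` segments and trailing slashes but does not touch the filesystem, so a symlinked path to the same directory yields a different key. Setting the override is one way to force terminals onto a shared key in that case.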
diff --git a/claude-code-plugin/lib/transcript.ts b/claude-code-plugin/lib/transcript.ts
new file mode 100644
index 0000000..12acd50
--- /dev/null
+++ b/claude-code-plugin/lib/transcript.ts
@@ -0,0 +1,163 @@
+/**
+ * Parse cc transcript jsonl files defensively. cc's transcript format is
+ * NOT a documented stable API — fields may rename across versions. This
+ * module returns null on any unexpected shape rather than throwing.
+ */
+
+import { readFile } from "node:fs/promises";
+
+export interface TranscriptEntry {
+  type: "user" | "assistant" | string;
+  role: string;
+  content: string;
+  /** True when the raw message.content was an array (tool_result, skill
+   *  output, multi-modal input). Used by readAllTurns to avoid treating
+   *  injected system messages as real user prompts. */
+  contentIsArray: boolean;
+  uuid?: string;
+  parentUuid?: string;
+  timestamp?: string;
+}
+
+export interface Turn {
+  user: string;
+  assistant: string;
+}
+
+/**
+ * Parse a single JSONL line. Returns null on malformed or unrecognized shape.
+ */
+export function parseTranscriptLine(line: string): TranscriptEntry | null {
+  let obj: unknown;
+  try {
+    obj = JSON.parse(line);
+  } catch {
+    return null;
+  }
+  if (!obj || typeof obj !== "object") return null;
+  const o = obj as Record<string, unknown>;
+
+  const type = typeof o.type === "string" ? o.type : null;
+  if (!type) return null;
+
+  const message = o.message as Record<string, unknown> | undefined;
+  if (!message || typeof message !== "object") return null;
+
+  const role = typeof message.role === "string" ? message.role : type;
+
+  const content = extractContent(message.content);
+  if (content === null) return null;
+
+  return {
+    type,
+    role,
+    content,
+    contentIsArray: Array.isArray(message.content),
+    uuid: typeof o.uuid === "string" ? o.uuid : undefined,
+    parentUuid: typeof o.parentUuid === "string" ? o.parentUuid : undefined,
+    timestamp: typeof o.timestamp === "string" ? o.timestamp : undefined,
+  };
+}
+
+function extractContent(content: unknown): string | null {
+  if (typeof content === "string") return content;
+  if (Array.isArray(content)) {
+    const parts: string[] = [];
+    for (const item of content) {
+      if (!item || typeof item !== "object") continue;
+      const it = item as Record<string, unknown>;
+      if (typeof it.text === "string") parts.push(it.text);
+    }
+    return parts.length > 0 ? parts.join("\n") : null;
+  }
+  return null;
+}
+
+/**
+ * Read the latest complete user+assistant turn from a transcript jsonl file.
+ * Returns null if the file is missing, empty, or contains no complete turn.
+ *
+ * A single turn may span multiple transcript entries when the assistant
+ * response is split by tool-use / tool-result cycles. This function merges
+ * all assistant text blocks between the last real user prompt and the end
+ * of the file so the full response is captured — not just the first or
+ * last fragment.
+ */
+export async function readLatestTurn(path: string): Promise<Turn | null> {
+  let raw: string;
+  try {
+    raw = await readFile(path, "utf-8");
+  } catch {
+    return null;
+  }
+  const lines = raw.split(/\r?\n/).filter((l) => l.length > 0);
+  if (lines.length === 0) return null;
+
+  // Walk backwards collecting ALL assistant text blocks until we hit a
+  // real user prompt (tool_result entries return null from
+  // parseTranscriptLine, so they are silently skipped).
+  const assistantParts: string[] = [];
+  let user: string | null = null;
+
+  for (let i = lines.length - 1; i >= 0; i--) {
+    const entry = parseTranscriptLine(lines[i]);
+    if (!entry) continue;
+    if (entry.role === "assistant") {
+      if (entry.content) assistantParts.unshift(entry.content);
+    } else if (entry.role === "user" && !entry.contentIsArray) {
+      // Only treat string-content user entries as real prompts.
+      // Array-content entries are tool_result / skill output / attachments.
+      if (assistantParts.length > 0) {
+        user = entry.content;
+        break;
+      }
+    }
+  }
+
+  if (user === null || assistantParts.length === 0) return null;
+  return { user, assistant: assistantParts.join("\n\n") };
+}
+
+/**
+ * Read ALL complete user+assistant turns from a transcript. Each turn
+ * merges multi-part assistant responses (split by tool cycles) into a
+ * single string, same as {@link readLatestTurn}.
+ */
+export async function readAllTurns(path: string): Promise<Turn[]> {
+  let raw: string;
+  try {
+    raw = await readFile(path, "utf-8");
+  } catch {
+    return [];
+  }
+  const lines = raw.split(/\r?\n/).filter((l) => l.length > 0);
+  if (lines.length === 0) return [];
+
+  const turns: Turn[] = [];
+  let currentUser: string | null = null;
+  let assistantParts: string[] = [];
+
+  for (const line of lines) {
+    const entry = parseTranscriptLine(line);
+    if (!entry) continue;
+
+    if (entry.role === "user" && !entry.contentIsArray) {
+      // Only string-content user entries are real prompts.
+      // Array-content entries (tool_result, skill output) are skipped.
+      if (currentUser !== null && assistantParts.length > 0) {
+        turns.push({ user: currentUser, assistant: assistantParts.join("\n\n") });
+      }
+      currentUser = entry.content;
+      assistantParts = [];
+    } else if (entry.role === "assistant" && entry.content) {
+      assistantParts.push(entry.content);
+    }
+  }
+
+  // Flush final turn.
+  if (currentUser !== null && assistantParts.length > 0) {
+    turns.push({ user: currentUser, assistant: assistantParts.join("\n\n") });
+  }
+
+  return turns;
+}
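Review note: the turn-merging rule in `readAllTurns` can be restated as a small in-memory sketch. The names `SketchEntry`, `toEntry`, and `groupTurns` are hypothetical and exist only for illustration; the sample lines are an assumed approximation of the transcript shape, which the module itself treats as unstable. The sketch assumes that a string-content `user` entry starts a turn, array-content `user` entries are ignored, and all assistant text blocks in between are joined with a blank line.

```typescript
// Hypothetical, simplified restatement of the merging rule; not the plugin's code.
interface SketchEntry {
  role: string;
  text: string | null; // joined text blocks, or null when there are none
  isArray: boolean; // true when the raw message.content was an array
}

interface SketchTurn {
  user: string;
  assistant: string;
}

function toEntry(line: string): SketchEntry | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return null; // malformed JSON is skipped, not thrown
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const message = (parsed as Record<string, unknown>).message;
  if (typeof message !== "object" || message === null) return null;
  const m = message as Record<string, unknown>;
  const role = typeof m.role === "string" ? m.role : "";
  const content = m.content;
  if (typeof content === "string") return { role, text: content, isArray: false };
  if (!Array.isArray(content)) return null;
  const texts: string[] = [];
  for (const item of content) {
    if (typeof item === "object" && item !== null) {
      const t = (item as Record<string, unknown>).text;
      if (typeof t === "string") texts.push(t);
    }
  }
  return { role, text: texts.length > 0 ? texts.join("\n") : null, isArray: true };
}

function groupTurns(lines: string[]): SketchTurn[] {
  const turns: SketchTurn[] = [];
  let user: string | null = null;
  let parts: string[] = [];
  for (const line of lines) {
    const e = toEntry(line);
    if (e === null || e.text === null) continue;
    if (e.role === "user" && !e.isArray) {
      // A real prompt closes the previous turn, if it was complete.
      if (user !== null && parts.length > 0) {
        turns.push({ user, assistant: parts.join("\n\n") });
      }
      user = e.text;
      parts = [];
    } else if (e.role === "assistant" && e.text.length > 0) {
      parts.push(e.text);
    }
  }
  // Flush the final turn.
  if (user !== null && parts.length > 0) {
    turns.push({ user, assistant: parts.join("\n\n") });
  }
  return turns;
}

const sample: string[] = [
  JSON.stringify({ type: "user", message: { role: "user", content: "list files" } }),
  JSON.stringify({
    type: "assistant",
    message: {
      role: "assistant",
      content: [{ type: "text", text: "Running ls." }, { type: "tool_use", id: "t1", name: "Bash" }],
    },
  }),
  JSON.stringify({
    type: "user",
    message: { role: "user", content: [{ type: "tool_result", tool_use_id: "t1", content: "a.txt" }] },
  }),
  JSON.stringify({
    type: "assistant",
    message: { role: "assistant", content: [{ type: "text", text: "There is one file: a.txt." }] },
  }),
  "not json",
];

const turns = groupTurns(sample);
```

In this sample the tool_result entry carries no text block, so it is dropped, and both assistant fragments end up in a single turn for the prompt "list files".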
diff --git a/claude-code-plugin/skills/memory-clear-session/SKILL.md b/claude-code-plugin/skills/memory-clear-session/SKILL.md
new file mode 100644
index 0000000..61126fa
--- /dev/null
+++ b/claude-code-plugin/skills/memory-clear-session/SKILL.md
@@ -0,0 +1,11 @@
+---
+name: memory-clear-session
+description: Manually clear the current session's accumulated memory buffer for this working directory. DESTRUCTIVE — call only when the user explicitly asks to forget the current context.
+disable-model-invocation: true
+---
+
+The user has explicitly requested to clear this session's memory buffer.
+
+!`node "${CLAUDE_PLUGIN_ROOT}/dist/lib/hook.mjs" clear-session`
+
+Confirm to the user that the session buffer was cleared. Long-term memories (L1/L2/L3) are untouched.
diff --git a/claude-code-plugin/skills/memory-search/SKILL.md b/claude-code-plugin/skills/memory-search/SKILL.md
new file mode 100644
index 0000000..4014a2b
--- /dev/null
+++ b/claude-code-plugin/skills/memory-search/SKILL.md
@@ -0,0 +1,21 @@
+---
+name: memory-search
+description: Search long-term memory (TencentDB Agent Memory) for relevant past interactions, preferences, or decisions. Use when the user asks "do you remember…" or references past work in this project.
+argument-hint: <query>
+---
+
+The user wants to search the long-term memory store for the following query:
+
+$ARGUMENTS
+
+Run the search via the Bash tool. The plugin reads the query from **stdin** to keep user-controlled text outside any shell word-split / expansion stage (cc currently performs a literal `replaceAll` on `$ARGUMENTS`, so passing it as an argv element would expose a command-injection surface — see Anthropic GH issue #16163).
+
+Use a quoted here-document with a long, unlikely-to-collide sentinel:
+
+```bash
+node "${CLAUDE_PLUGIN_ROOT}/dist/lib/hook.mjs" search-stdin <<'__TDAI_QUERY_EOF__'
+<paste the user's query verbatim, on one or more lines, exactly as shown above>
+__TDAI_QUERY_EOF__
+```
+
+Then summarize the matching memories to answer the user's question. If no memories were returned, say so plainly.
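Review note: quoting the here-document delimiter disables expansion inside the body, but the body still ends at the first line that is exactly the delimiter. The sketch below shows one possible guard for that edge case. `terminatesHeredoc` is a hypothetical helper, not part of the plugin; only the sentinel string is taken from the skill file above.

```typescript
// Sentinel copied from the SKILL.md above; the helper itself is hypothetical.
const SENTINEL = "__TDAI_QUERY_EOF__";

// A here-document ends at the first line that equals the delimiter exactly,
// so only a whole-line match can cut the query short.
function terminatesHeredoc(query: string, sentinel: string = SENTINEL): boolean {
  return query.split("\n").some((line) => line === sentinel);
}

const benign = terminatesHeredoc("what did we decide about $(whoami) and `ls`?");
const inline = terminatesHeredoc("mention __TDAI_QUERY_EOF__ in the middle of a line");
const hostile = terminatesHeredoc("first line\n__TDAI_QUERY_EOF__\necho injected");
```

Only `hostile` is flagged: shell metacharacters stay literal inside a quoted here-document, and the sentinel appearing mid-line does not end it.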
diff --git a/claude-code-plugin/skills/memory-status/SKILL.md b/claude-code-plugin/skills/memory-status/SKILL.md
new file mode 100644
index 0000000..adc5f35
--- /dev/null
+++ b/claude-code-plugin/skills/memory-status/SKILL.md
@@ -0,0 +1,10 @@
+---
+name: memory-status
+description: Check the health of the TDAI memory daemon. Reports whether the local gateway is running and reachable.
+---
+
+Checking TDAI memory daemon status...
+
+!`node "${CLAUDE_PLUGIN_ROOT}/dist/lib/hook.mjs" status`
+
+Report the result to the user.
diff --git a/claude-code-plugin/skills/tdai-memory/SKILL.md b/claude-code-plugin/skills/tdai-memory/SKILL.md
new file mode 100644
index 0000000..6c7c8c1
--- /dev/null
+++ b/claude-code-plugin/skills/tdai-memory/SKILL.md
@@ -0,0 +1,29 @@
+---
+name: tdai-memory
+description: TencentDB Agent Memory provides long-term memory (user preferences, past decisions, style) and short-term project context. Use this skill to understand how to leverage memory in this conversation.
+---
+
+# Using TencentDB Agent Memory
+
+This plugin gives Claude long-term + symbolic short-term memory.
+
+## What happens automatically
+
+- Every prompt: relevant past memories are pre-loaded into context (via `UserPromptSubmit` hook → `/recall`)
+- Every turn: the user/assistant exchange is captured to L0 (via `Stop` hook → `/capture`); structured L1/L2/L3 extraction runs in the background
+
+## Manual control (slash skills)
+
+- `/memory-search <query>` — search past memories for a specific topic
+- `/memory-status` — check daemon health
+- `/memory-clear-session` — clear the current session's buffer (manual invocation only)
+
+## Hints for Claude
+
+When the user asks "do you remember…" or references prior work, the recalled context (in the `<system-reminder>` block this turn) is your source. If the context is missing, suggest the user run `/memory-search <query>`.
+
+## Where data lives
+
+Memory is stored under `${CLAUDE_PLUGIN_DATA}/memory-tdai/` — a SQLite + sqlite-vec database plus markdown snapshots. Data is partitioned by working-directory hash by default; export `TDAI_SESSION_KEY=<custom>` to override.
+
+See the project README for full architecture details.
diff --git a/claude-code-plugin/tests/daemon.test.ts b/claude-code-plugin/tests/daemon.test.ts
new file mode 100644
index 0000000..aaf98ec
--- /dev/null
+++ b/claude-code-plugin/tests/daemon.test.ts
@@ -0,0 +1,149 @@
+import { describe, it, expect, beforeEach, afterEach } from "vitest";
+import { mkdtemp, rm, writeFile, readFile, stat } from "node:fs/promises";
+import { tmpdir } from "node:os";
+import { join } from "node:path";
+import {
+  DaemonManager,
+  readDaemonState,
+  writeDaemonState,
+} from "../lib/daemon.js";
+
+let dataDir: string;
+
+beforeEach(async () => {
+  dataDir = await mkdtemp(join(tmpdir(), "tdai-daemon-test-"));
+});
+
+afterEach(async () => {
+  await rm(dataDir, { recursive: true, force: true });
+});
+
+describe("DaemonManager state file", () => {
+  it("readDaemonState returns null when state.json missing", async () => {
+    const state = await readDaemonState(dataDir);
+    expect(state).toBeNull();
+  });
+
+  it("writeDaemonState writes a parseable JSON file", async () => {
+    await writeDaemonState(dataDir, {
+      pid: 999,
+      port: 8421,
+      ccPid: 998,
+      startedAt: "2026-05-15T10:00:00Z",
+      tokenPath: join(dataDir, "token"),
+    });
+    const state = await readDaemonState(dataDir);
+    expect(state).toEqual({
+      pid: 999,
+      port: 8421,
+      ccPid: 998,
+      startedAt: "2026-05-15T10:00:00Z",
+      tokenPath: join(dataDir, "token"),
+    });
+  });
+});
+
+describe("DaemonManager token file", () => {
+  it("generateToken creates a 600-mode file with 256-bit base64url token", async () => {
+    const mgr = new DaemonManager({ dataDir });
+    const tokenPath = await mgr.generateToken();
+    const content = await readFile(tokenPath, "utf-8");
+    expect(content).toMatch(/^[A-Za-z0-9_-]{43}$/);
+    const st = await stat(tokenPath);
+    expect(st.mode & 0o777).toBe(0o600);
+  });
+
+  it("readToken throws when permission is too loose", async () => {
+    const tokenPath = join(dataDir, "token");
+    await writeFile(tokenPath, "abc", { mode: 0o644 });
+    const mgr = new DaemonManager({ dataDir });
+    await expect(mgr.readToken(tokenPath)).rejects.toThrow(/permission/i);
+  });
+
+  it("readToken returns the trimmed token when permission is 600", async () => {
+    const tokenPath = join(dataDir, "token");
+    await writeFile(tokenPath, "secret-token\n", { mode: 0o600 });
+    const mgr = new DaemonManager({ dataDir });
+    const tok = await mgr.readToken(tokenPath);
+    expect(tok).toBe("secret-token");
+  });
+});
+
+describe("DaemonManager findFreePort", () => {
+  it("returns a free port within range", async () => {
+    const mgr = new DaemonManager({ dataDir });
+    const port = await mgr.findFreePort(18500, 18510);
+    expect(port).toBeGreaterThanOrEqual(18500);
+    expect(port).toBeLessThanOrEqual(18510);
+  });
+
+  it("throws when all ports are taken", async () => {
+    const http = await import("node:http");
+    const blockers: import("node:http").Server[] = [];
+    for (let p = 18600; p <= 18602; p++) {
+      const s = http.createServer();
+      await new Promise<void>((r) => s.listen(p, "127.0.0.1", () => r()));
+      blockers.push(s);
+    }
+    try {
+      const mgr = new DaemonManager({ dataDir });
+      await expect(mgr.findFreePort(18600, 18602)).rejects.toThrow(/no free port/i);
+    } finally {
+      for (const s of blockers) await new Promise<void>((r) => s.close(() => r()));
+    }
+  });
+});
+
+describe("DaemonManager probe", () => {
+  it("probe returns false when state.json is missing", async () => {
+    const mgr = new DaemonManager({ dataDir });
+    expect(await mgr.probe()).toBe(false);
+  });
+
+  it("probe returns false when daemon health check fails", async () => {
+    await writeDaemonState(dataDir, {
+      pid: 99999,
+      port: 1,
+      ccPid: process.pid,
+      startedAt: "2026-05-15T10:00:00Z",
+      tokenPath: join(dataDir, "token"),
+    });
+    await writeFile(join(dataDir, "token"), "x", { mode: 0o600 });
+    const mgr = new DaemonManager({ dataDir });
+    expect(await mgr.probe()).toBe(false);
+  });
+});
+
+describe("DaemonManager ensureRunning ccPid mismatch", () => {
+  // Confirms reuseExisting refuses a state.json whose ccPid differs from the
+  // caller's ccPid — guards against picking up a daemon spawned by a different
+  // cc instance on a shared box.
+  it("does NOT reuse a daemon recorded for a foreign ccPid", async () => {
+    const tokenPath = join(dataDir, "token");
+    await writeFile(tokenPath, "secret-foreign", { mode: 0o600 });
+    await writeDaemonState(dataDir, {
+      pid: 12345,
+      port: 18999, // nothing actually listening here
+      ccPid: 999_999, // some other cc
+      startedAt: "2026-05-15T10:00:00Z",
+      tokenPath,
+    });
+
+    const mgr = new DaemonManager({ dataDir, portStart: 18500, portEnd: 18510 });
+    // Stub spawn to a thin marker so we don't actually fork a daemon.
+    let spawnCalls = 0;
+    // eslint-disable-next-line @typescript-eslint/no-explicit-any
+    (mgr as any).spawn = async () => {
+      spawnCalls++;
+      return {
+        pid: 1,
+        port: 18500,
+        ccPid: process.pid,
+        startedAt: new Date().toISOString(),
+        tokenPath,
+      };
+    };
+    await mgr.ensureRunning(process.pid);
+    expect(spawnCalls).toBe(1);
+  });
+});
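Review note: the `{43}` in the token regex above follows from the encoding arithmetic: 32 bytes (256 bits) encode to 43 unpadded base64url characters. The sketch below checks that with a hand-written encoder so it needs no runtime library. `base64UrlLength` and `toBase64Url` are hypothetical illustration helpers; how `generateToken` actually produces its token is not shown in this diff.

```typescript
// Unpadded base64url length: every 3 bytes become 4 characters, and a
// trailing group of 1 or 2 bytes becomes 2 or 3 characters.
function base64UrlLength(byteCount: number): number {
  return Math.ceil((byteCount * 4) / 3);
}

const ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

// Minimal encoder for illustration only (no padding, URL-safe alphabet).
function toBase64Url(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const remaining = bytes.length - i;
    const b0 = bytes[i] ?? 0;
    const b1 = bytes[i + 1] ?? 0;
    const b2 = bytes[i + 2] ?? 0;
    out += ALPHABET.charAt(b0 >> 2);
    out += ALPHABET.charAt(((b0 & 0x03) << 4) | (b1 >> 4));
    if (remaining > 1) out += ALPHABET.charAt(((b1 & 0x0f) << 2) | (b2 >> 6));
    if (remaining > 2) out += ALPHABET.charAt(b2 & 0x3f);
  }
  return out;
}

const zeroToken = toBase64Url(new Uint8Array(32));
const tokenShape = /^[A-Za-z0-9_-]{43}$/;
```

A 32-byte input is ten full 3-byte groups (40 characters) plus one 2-byte remainder (3 characters), which gives the 43 characters the test expects.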
diff --git a/claude-code-plugin/tests/fixtures/transcript-sample.jsonl b/claude-code-plugin/tests/fixtures/transcript-sample.jsonl
new file mode 100644
index 0000000..141ce8b
--- /dev/null
+++ b/claude-code-plugin/tests/fixtures/transcript-sample.jsonl
@@ -0,0 +1,4 @@
+{"type":"user","message":{"role":"user","content":"first question"},"uuid":"u1","timestamp":"2026-05-15T10:00:00Z"}
+{"type":"assistant","message":{"role":"assistant","content":"first answer"},"uuid":"a1","parentUuid":"u1","timestamp":"2026-05-15T10:00:01Z"}
+{"type":"user","message":{"role":"user","content":"second question"},"uuid":"u2","timestamp":"2026-05-15T10:01:00Z"}
+{"type":"assistant","message":{"role":"assistant","content":"second answer"},"uuid":"a2","parentUuid":"u2","timestamp":"2026-05-15T10:01:01Z"}
\ No newline at end of file
diff --git a/claude-code-plugin/tests/gateway-client.test.ts b/claude-code-plugin/tests/gateway-client.test.ts
new file mode 100644
index 0000000..317cfca
--- /dev/null
+++ b/claude-code-plugin/tests/gateway-client.test.ts
@@ -0,0 +1,183 @@
+import { describe, it, expect, afterEach } from "vitest";
+import http from "node:http";
+import { GatewayClient } from "../lib/gateway-client.js";
+
+interface CapturedRequest {
+  method: string;
+  path: string;
+  headers: http.IncomingHttpHeaders;
+  body: string;
+}
+
+function startStubServer(
+  handler: (req: CapturedRequest) => { status: number; body: unknown },
+): Promise<{ port: number; close: () => Promise<void>; captured: CapturedRequest[] }> {
+  return new Promise((resolve) => {
+    const captured: CapturedRequest[] = [];
+    const server = http.createServer((req, res) => {
+      const chunks: Buffer[] = [];
+      req.on("data", (c) => chunks.push(c));
+      req.on("end", () => {
+        const captured1: CapturedRequest = {
+          method: req.method ?? "",
+          path: req.url ?? "",
+          headers: req.headers,
+          body: Buffer.concat(chunks).toString("utf-8"),
+        };
+        captured.push(captured1);
+        const { status, body } = handler(captured1);
+        const json = JSON.stringify(body);
+        res.writeHead(status, { "Content-Type": "application/json" });
+        res.end(json);
+      });
+    });
+    server.listen(0, "127.0.0.1", () => {
+      const port = (server.address() as { port: number }).port;
+      resolve({
+        port,
+        close: () => new Promise((r) => server.close(() => r())),
+        captured,
+      });
+    });
+  });
+}
+
+describe("GatewayClient", () => {
+  let stub: Awaited<ReturnType<typeof startStubServer>>;
+
+  afterEach(async () => {
+    if (stub) await stub.close();
+  });
+
+  it("sends Authorization: Bearer <token> on health probe", async () => {
+    stub = await startStubServer(() => ({
+      status: 200,
+      body: { status: "ok", version: "x", uptime: 1 },
+    }));
+
+    const client = new GatewayClient({
+      baseUrl: `http://127.0.0.1:${stub.port}`,
+      token: "secret-123",
+    });
+    const ok = await client.health();
+    expect(ok).toBe(true);
+    expect(stub.captured[0].headers.authorization).toBe("Bearer secret-123");
+  });
+
+  it("health returns false on non-200", async () => {
+    stub = await startStubServer(() => ({ status: 500, body: { error: "x" } }));
+    const client = new GatewayClient({
+      baseUrl: `http://127.0.0.1:${stub.port}`,
+      token: "t",
+    });
+    expect(await client.health()).toBe(false);
+  });
+
+  it("health returns false on connection error", async () => {
+    const client = new GatewayClient({
+      baseUrl: "http://127.0.0.1:1",
+      token: "t",
+    });
+    expect(await client.health()).toBe(false);
+  });
+
+  it("recall POSTs query and session_key, returns context string", async () => {
+    stub = await startStubServer(() => ({
+      status: 200,
+      body: { context: "recalled-content", strategy: "hybrid", memory_count: 3 },
+    }));
+
+    const client = new GatewayClient({
+      baseUrl: `http://127.0.0.1:${stub.port}`,
+      token: "t",
+    });
+    const result = await client.recall("hello", "session-abc");
+    expect(result.context).toBe("recalled-content");
+    expect(stub.captured[0].method).toBe("POST");
+    expect(stub.captured[0].path).toBe("/recall");
+    expect(JSON.parse(stub.captured[0].body)).toEqual({
+      query: "hello",
+      session_key: "session-abc",
+    });
+  });
+
+  it("recall returns empty context on error (silent failure)", async () => {
+    stub = await startStubServer(() => ({ status: 500, body: { error: "x" } }));
+    const client = new GatewayClient({
+      baseUrl: `http://127.0.0.1:${stub.port}`,
+      token: "t",
+    });
+    const result = await client.recall("hello", "k");
+    expect(result.context).toBe("");
+  });
+
+  it("captureTurn POSTs the expected payload", async () => {
+    stub = await startStubServer(() => ({
+      status: 200,
+      body: { l0_recorded: 1, scheduler_notified: true },
+    }));
+
+    const client = new GatewayClient({
+      baseUrl: `http://127.0.0.1:${stub.port}`,
+      token: "t",
+    });
+    await client.captureTurn({
+      user_content: "u",
+      assistant_content: "a",
+      session_key: "k",
+      session_id: "s",
+    });
+    expect(stub.captured[0].path).toBe("/capture");
+    expect(JSON.parse(stub.captured[0].body)).toEqual({
+      user_content: "u",
+      assistant_content: "a",
+      session_key: "k",
+      session_id: "s",
+    });
+  });
+
+  it("searchMemories POSTs query, returns results text", async () => {
+    stub = await startStubServer(() => ({
+      status: 200,
+      body: { results: "memory-text", total: 5, strategy: "hybrid" },
+    }));
+
+    const client = new GatewayClient({
+      baseUrl: `http://127.0.0.1:${stub.port}`,
+      token: "t",
+    });
+    const res = await client.searchMemories("query");
+    expect(res.results).toBe("memory-text");
+    expect(res.total).toBe(5);
+  });
+
+  it("searchConversations POSTs to /search/conversations", async () => {
+    stub = await startStubServer(() => ({
+      status: 200,
+      body: { results: "conv-text", total: 2 },
+    }));
+
+    const client = new GatewayClient({
+      baseUrl: `http://127.0.0.1:${stub.port}`,
+      token: "t",
+    });
+    const res = await client.searchConversations("q");
+    expect(res.results).toBe("conv-text");
+    expect(stub.captured[0].path).toBe("/search/conversations");
+  });
+
+  it("times out long-running requests", async () => {
+    const hangServer = http.createServer((_req, _res) => {});
+    await new Promise<void>((r) => hangServer.listen(0, "127.0.0.1", () => r()));
+    const port = (hangServer.address() as { port: number }).port;
+
+    const client = new GatewayClient({
+      baseUrl: `http://127.0.0.1:${port}`,
+      token: "t",
+      timeoutMs: 100,
+    });
+    const result = await client.recall("q", "k");
+    hangServer.close(); // close before asserting so a failed expect cannot leak the server
+    expect(result.context).toBe("");
+  });
+});
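Review note: the first test above asserts that the client sends `Authorization: Bearer secret-123`. For context, the receiving side could extract the token as sketched below. `parseBearer` is a hypothetical helper, not the gateway's implementation; it assumes the scheme keyword is matched case-insensitively and that the token follows the `b64token` character set from RFC 6750.

```typescript
// Hypothetical server-side counterpart to the header the client test asserts.
// Accepts "Bearer", "bearer", "BEARER", and so on, followed by one or more
// spaces and a b64token (letters, digits, - . _ ~ + /, then optional "=").
function parseBearer(header: string | undefined): string | null {
  if (header === undefined) return null;
  const match = /^Bearer +([A-Za-z0-9\-._~+\/]+=*)$/i.exec(header);
  return match?.[1] ?? null;
}

const fromClient = parseBearer("Bearer secret-123");
const lowerCase = parseBearer("bearer secret-123");
const wrongScheme = parseBearer("Basic c2VjcmV0");
const missing = parseBearer(undefined);
```

Comparing the extracted token against the expected value is a separate step and is deliberately left out of this sketch.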
diff --git a/claude-code-plugin/tests/hook.test.ts b/claude-code-plugin/tests/hook.test.ts
new file mode 100644
index 0000000..564a709
--- /dev/null
+++ b/claude-code-plugin/tests/hook.test.ts
@@ -0,0 +1,371 @@
+import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
+import { mkdtemp, rm } from "node:fs/promises";
+import { tmpdir } from "node:os";
+import { join } from "node:path";
+import { handleHook } from "../lib/hook.js";
+import type { GatewayClient, RecallResult } from "../lib/gateway-client.js";
+
+function makeFakeClient(overrides: Partial<GatewayClient> = {}): GatewayClient {
+  return {
+    health: vi.fn(async () => true),
+    recall: vi.fn(async (): Promise<RecallResult> => ({ context: "recalled" })),
+    captureTurn: vi.fn(async () => ({ l0_recorded: 1, scheduler_notified: true })),
+    searchMemories: vi.fn(async () => ({ results: "m", total: 1 })),
+    searchConversations: vi.fn(async () => ({ results: "c", total: 1 })),
+    sessionEnd: vi.fn(async () => {}),
+    ...overrides,
+  } as unknown as GatewayClient;
+}
+
+describe("handleHook: user-prompt-submit", () => {
+  it("emits hookSpecificOutput with additionalContext from /recall", async () => {
+    const client = makeFakeClient();
+    const stdin = JSON.stringify({
+      session_id: "s1",
+      cwd: "/tmp/proj",
+      prompt: "what did we do?",
+    });
+    const out = await handleHook("user-prompt-submit", { stdin, client });
+    const parsed = JSON.parse(out);
+    expect(parsed.hookSpecificOutput.hookEventName).toBe("UserPromptSubmit");
+    expect(parsed.hookSpecificOutput.additionalContext).toBe("recalled");
+  });
+
+  it("truncates additionalContext over 10000 chars", async () => {
+    const big = "x".repeat(20_000);
+    const client = makeFakeClient({
+      recall: vi.fn(async () => ({ context: big })),
+    } as Partial<GatewayClient>);
+    const stdin = JSON.stringify({ session_id: "s", cwd: "/tmp/p", prompt: "q" });
+    const out = await handleHook("user-prompt-submit", { stdin, client });
+    const parsed = JSON.parse(out);
+    expect(parsed.hookSpecificOutput.additionalContext.length).toBeLessThanOrEqual(10_000);
+    expect(parsed.hookSpecificOutput.additionalContext).toContain("truncated");
+  });
+
+  it("emits empty string when all fallbacks return nothing (no TDAI_DATA_DIR)", async () => {
+    const orig = process.env.TDAI_DATA_DIR;
+    delete process.env.TDAI_DATA_DIR;
+    try {
+      const client = makeFakeClient({
+        recall: vi.fn(async () => ({ context: "" })),
+        searchConversations: vi.fn(async () => ({ results: "", total: 0 })),
+      } as Partial<GatewayClient>);
+      const stdin = JSON.stringify({ session_id: "s", cwd: "/tmp/p", prompt: "q" });
+      const out = await handleHook("user-prompt-submit", { stdin, client });
+      expect(out).toBe("");
+    } finally {
+      if (orig !== undefined) process.env.TDAI_DATA_DIR = orig;
+    }
+  });
+
+  it("falls back to L0 jsonl direct search when daemon search returns nothing", async () => {
+    const fs = await import("node:fs/promises");
+    const path = await import("node:path");
+    const os = await import("node:os");
+    const tmpDir = path.join(os.tmpdir(), `tdai-hook-test-${Date.now()}`);
+    const convDir = path.join(tmpDir, "conversations");
+    await fs.mkdir(convDir, { recursive: true });
+
+    const sessionKey = "abc123";
+    const records = [
+      JSON.stringify({ sessionKey, role: "user", content: "我用 Go 写 Kubernetes operator", recordedAt: "2026-05-15T06:00:00Z" }),
+      JSON.stringify({ sessionKey, role: "assistant", content: "K8s operator 用 Go 是主流", recordedAt: "2026-05-15T06:00:01Z" }),
+      JSON.stringify({ sessionKey: "other", role: "user", content: "unrelated stuff", recordedAt: "2026-05-15T06:00:02Z" }),
+    ];
+    await fs.writeFile(path.join(convDir, "2026-05-15.jsonl"), records.join("\n"));
+
+    const orig = process.env.TDAI_DATA_DIR;
+    process.env.TDAI_DATA_DIR = tmpDir;
+    try {
+      const client = makeFakeClient({
+        recall: vi.fn(async () => ({ context: "" })),
+        searchConversations: vi.fn(async () => ({ results: "", total: 0 })),
+      } as Partial<GatewayClient>);
+      // getSessionKey("/tmp/p") hashes the cwd, so it will not produce
+      // "abc123". Rather than mocking getSessionKey or searching for a cwd
+      // with that hash, set the TDAI_SESSION_KEY override so the session key
+      // matches the fixture records.
+      const origSK = process.env.TDAI_SESSION_KEY;
+      process.env.TDAI_SESSION_KEY = sessionKey;
+      try {
+        const stdin = JSON.stringify({ session_id: "s", cwd: "/tmp/p", prompt: "K8s operator" });
+        const out = await handleHook("user-prompt-submit", { stdin, client });
+        expect(out).not.toBe("");
+        const parsed = JSON.parse(out);
+        expect(parsed.hookSpecificOutput.additionalContext).toContain("Past conversations");
+        expect(parsed.hookSpecificOutput.additionalContext).toContain("Kubernetes operator");
+        // "unrelated stuff" from other session should NOT appear
+        expect(parsed.hookSpecificOutput.additionalContext).not.toContain("unrelated");
+      } finally {
+        if (origSK !== undefined) process.env.TDAI_SESSION_KEY = origSK;
+        else delete process.env.TDAI_SESSION_KEY;
+      }
+    } finally {
+      if (orig !== undefined) process.env.TDAI_DATA_DIR = orig;
+      else delete process.env.TDAI_DATA_DIR;
+      await fs.rm(tmpDir, { recursive: true, force: true });
+    }
+  });
+
+  it("falls back to L0 conversation search when /recall returns empty context", async () => {
+    const searchConversations = vi.fn(async () => ({
+      results: "Found 1 matching message(s):\n---\n**[user]** ...",
+      total: 1,
+    }));
+    const client = makeFakeClient({
+      recall: vi.fn(async () => ({ context: "" })),
+      searchConversations,
+    } as Partial<GatewayClient>);
+    const stdin = JSON.stringify({ session_id: "s", cwd: "/tmp/p", prompt: "k8s operator" });
+    const out = await handleHook("user-prompt-submit", { stdin, client });
+    const parsed = JSON.parse(out);
+    expect(parsed.hookSpecificOutput.hookEventName).toBe("UserPromptSubmit");
+    expect(parsed.hookSpecificOutput.additionalContext).toContain("Past conversations");
+    expect(parsed.hookSpecificOutput.additionalContext).toContain("Found 1 matching");
+    // L0 fallback should be scoped to the current project (sessionKey).
+    const call = searchConversations.mock.calls[0];
+    expect(call[1]?.sessionKey).toBeTruthy();
+    expect(call[1]?.limit).toBe(3);
+  });
+
+  it("skips L0 fallback when /recall already returns context", async () => {
+    const searchConversations = vi.fn(async () => ({ results: "should-not-be-called", total: 1 }));
+    const client = makeFakeClient({
+      recall: vi.fn(async () => ({ context: "primary-recall" })),
+      searchConversations,
+    } as Partial<GatewayClient>);
+    const stdin = JSON.stringify({ session_id: "s", cwd: "/tmp/p", prompt: "q" });
+    const out = await handleHook("user-prompt-submit", { stdin, client });
+    const parsed = JSON.parse(out);
+    expect(parsed.hookSpecificOutput.additionalContext).toBe("primary-recall");
+    expect(searchConversations).not.toHaveBeenCalled();
+  });
+});
+
+describe("handleHook: stop", () => {
+  // Stop now persists a per-session cursor to $CLAUDE_PLUGIN_DATA/cursors/.
+  // Isolate it to a tmpdir per test so cursor state never leaks across runs
+  // (a previously-written cursor would make the next run see lastSent>0 and
+  // suppress the captureTurn call this test asserts on).
+  let cursorDir: string;
+  beforeEach(async () => {
+    cursorDir = await mkdtemp(join(tmpdir(), "tdai-stop-cursor-"));
+    vi.stubEnv("CLAUDE_PLUGIN_DATA", cursorDir);
+  });
+  afterEach(async () => {
+    await rm(cursorDir, { recursive: true, force: true });
+  });
+
+  it("exits silently when stop_hook_active is true", async () => {
+    const captureTurn = vi.fn();
+    const client = makeFakeClient({
+      captureTurn,
+    } as Partial<GatewayClient>);
+    const stdin = JSON.stringify({
+      session_id: "s",
+      transcript_path: "/tmp/t.jsonl",
+      stop_hook_active: true,
+    });
+    const out = await handleHook("stop", { stdin, client });
+    expect(out).toBe("");
+    expect(captureTurn).not.toHaveBeenCalled();
+  });
+
+  it("calls captureTurn when stop_hook_active is false", async () => {
+    const captureTurn = vi.fn(async () => null);
+    const client = makeFakeClient({
+      captureTurn,
+    } as Partial<GatewayClient>);
+    const fs = await import("node:fs/promises");
+    const path = await import("node:path");
+    const os = await import("node:os");
+    const tmp = path.join(os.tmpdir(), `tx-${Date.now()}.jsonl`);
+    await fs.writeFile(
+      tmp,
+      [
+        '{"type":"user","message":{"role":"user","content":"q"},"uuid":"u"}',
+        '{"type":"assistant","message":{"role":"assistant","content":"a"},"uuid":"a"}',
+      ].join("\n"),
+    );
+    try {
+      const stdin = JSON.stringify({
+        session_id: "s",
+        transcript_path: tmp,
+        cwd: "/tmp/proj",
+        stop_hook_active: false,
+      });
+      await handleHook("stop", { stdin, client });
+      expect(captureTurn).toHaveBeenCalledOnce();
+      const call = captureTurn.mock.calls[0][0];
+      expect(call.user_content).toBe("q");
+      expect(call.assistant_content).toBe("a");
+    } finally {
+      await fs.unlink(tmp);
+    }
+  });
+
+  it("only sends new turns on the second Stop (cursor incremental capture)", async () => {
+    // Two-turn transcript, fire Stop once. Then append a third turn and fire
+    // Stop again. The second call must POST only the new turn — without the
+    // cursor a long session would re-write every turn on each Stop.
+    const captureTurn = vi.fn(async () => null);
+    const client = makeFakeClient({
+      captureTurn,
+    } as Partial<GatewayClient>);
+    const fs = await import("node:fs/promises");
+    const path = await import("node:path");
+    const os = await import("node:os");
+    const tmp = path.join(os.tmpdir(), `tx-cursor-${Date.now()}.jsonl`);
+    const lines = [
+      '{"type":"user","message":{"role":"user","content":"q1"},"uuid":"u1"}',
+      '{"type":"assistant","message":{"role":"assistant","content":"a1"},"uuid":"a1"}',
+      '{"type":"user","message":{"role":"user","content":"q2"},"uuid":"u2"}',
+      '{"type":"assistant","message":{"role":"assistant","content":"a2"},"uuid":"a2"}',
+    ];
+    await fs.writeFile(tmp, lines.join("\n"));
+    try {
+      const stdin = JSON.stringify({
+        session_id: "cursor-test",
+        transcript_path: tmp,
+        cwd: "/tmp/proj",
+        stop_hook_active: false,
+      });
+      await handleHook("stop", { stdin, client });
+      expect(captureTurn).toHaveBeenCalledTimes(1);
+      const first = captureTurn.mock.calls[0][0];
+      expect(first.messages).toHaveLength(4); // 2 turns × (user + assistant)
+
+      // Append a third turn and fire Stop again.
+      await fs.appendFile(
+        tmp,
+        "\n" +
+          [
+            '{"type":"user","message":{"role":"user","content":"q3"},"uuid":"u3"}',
+            '{"type":"assistant","message":{"role":"assistant","content":"a3"},"uuid":"a3"}',
+          ].join("\n"),
+      );
+      await handleHook("stop", { stdin, client });
+      expect(captureTurn).toHaveBeenCalledTimes(2);
+      const second = captureTurn.mock.calls[1][0];
+      // Cursor should have skipped the first 2 turns — only q3/a3 sent.
+      expect(second.messages).toHaveLength(2);
+      expect(second.user_content).toBe("q3");
+      expect(second.assistant_content).toBe("a3");
+    } finally {
+      await fs.unlink(tmp);
+    }
+  });
+
+  it("skips captureTurn when no new turns since last cursor", async () => {
+    const captureTurn = vi.fn(async () => null);
+    const client = makeFakeClient({
+      captureTurn,
+    } as Partial<GatewayClient>);
+    const fs = await import("node:fs/promises");
+    const path = await import("node:path");
+    const os = await import("node:os");
+    const tmp = path.join(os.tmpdir(), `tx-nochange-${Date.now()}.jsonl`);
+    await fs.writeFile(
+      tmp,
+      [
+        '{"type":"user","message":{"role":"user","content":"q"},"uuid":"u"}',
+        '{"type":"assistant","message":{"role":"assistant","content":"a"},"uuid":"a"}',
+      ].join("\n"),
+    );
+    try {
+      const stdin = JSON.stringify({
+        session_id: "nochange-test",
+        transcript_path: tmp,
+        cwd: "/tmp/proj",
+        stop_hook_active: false,
+      });
+      await handleHook("stop", { stdin, client });
+      await handleHook("stop", { stdin, client });
+      // Second Stop sees the same transcript → cursor already at end → no call.
+      expect(captureTurn).toHaveBeenCalledTimes(1);
+    } finally {
+      await fs.unlink(tmp);
+    }
+  });
+
+  it("caps first capture at MAX_CAPTURE_TURNS (50) when transcript is long", async () => {
+    const captureTurn = vi.fn(async () => null);
+    const client = makeFakeClient({
+      captureTurn,
+    } as Partial<GatewayClient>);
+    const fs = await import("node:fs/promises");
+    const path = await import("node:path");
+    const os = await import("node:os");
+    const tmp = path.join(os.tmpdir(), `tx-cap-${Date.now()}.jsonl`);
+    const lines: string[] = [];
+    for (let i = 0; i < 60; i++) {
+      lines.push(`{"type":"user","message":{"role":"user","content":"q${i}"},"uuid":"u${i}"}`);
+      lines.push(`{"type":"assistant","message":{"role":"assistant","content":"a${i}"},"uuid":"a${i}"}`);
+    }
+    await fs.writeFile(tmp, lines.join("\n"));
+    try {
+      const stdin = JSON.stringify({
+        session_id: "cap-test",
+        transcript_path: tmp,
+        cwd: "/tmp/proj",
+        stop_hook_active: false,
+      });
+      await handleHook("stop", { stdin, client });
+      expect(captureTurn).toHaveBeenCalledTimes(1);
+      const call = captureTurn.mock.calls[0][0];
+      // Capped at 50 turns × (user + assistant) = 100 messages.
+      expect(call.messages).toHaveLength(100);
+      // Cap takes the LAST 50 turns; lastTurn is q59/a59.
+      expect(call.user_content).toBe("q59");
+      expect(call.assistant_content).toBe("a59");
+    } finally {
+      await fs.unlink(tmp);
+    }
+  });
+});
+
+describe("handleHook: post-tool-use", () => {
+  it("fire-and-forget — does not throw on success", async () => {
+    const client = makeFakeClient();
+    const stdin = JSON.stringify({
+      session_id: "s",
+      tool_name: "Read",
+      tool_use_id: "t1",
+    });
+    await expect(
+      handleHook("post-tool-use", { stdin, client }),
+    ).resolves.not.toThrow();
+  });
+});
+
+describe("handleHook: session-start", () => {
+  it("invokes health probe, succeeds silently", async () => {
+    const client = makeFakeClient();
+    const stdin = JSON.stringify({ session_id: "s", cwd: "/tmp/p", source: "startup" });
+    await expect(
+      handleHook("session-start", { stdin, client }),
+    ).resolves.not.toThrow();
+  });
+});
+
+describe("handleHook: search (slash command)", () => {
+  it("returns formatted memory search output", async () => {
+    const client = makeFakeClient({
+      searchMemories: vi.fn(async () => ({ results: "MEMORY_RESULTS", total: 3 })),
+    } as Partial<GatewayClient>);
+    const out = await handleHook("search", { stdin: "", client, args: ["my", "query"] });
+    expect(out).toContain("MEMORY_RESULTS");
+  });
+});
+
+describe("handleHook: invalid event", () => {
+  it("returns empty string on unknown event", async () => {
+    const client = makeFakeClient();
+    const out = await handleHook("nonsense" as never, {
+      stdin: "{}",
+      client,
+    });
+    expect(out).toBe("");
+  });
+});
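
Reviewer note: the three Stop tests above (incremental capture, no-change skip, 50-turn cap) pin down a small piece of cursor arithmetic. The following is a minimal sketch of that arithmetic under stated assumptions: `planCapture`, `CapturePlan`, and the idea of storing the cursor as a turn count are hypothetical names and choices made for illustration, not taken from `lib/hook.ts`, which may persist and apply its cursor differently.

```typescript
// Hypothetical sketch: names are illustrative, not taken from lib/hook.ts.
interface Turn {
  user: string;
  assistant: string;
}

interface CapturePlan {
  turns: Turn[]; // turns to send in this capture call
  nextCursor: number; // number of transcript turns consumed after this call
}

const MAX_CAPTURE_TURNS = 50; // the cap named in the test title above

function planCapture(
  allTurns: Turn[],
  cursor: number,
  maxTurns: number = MAX_CAPTURE_TURNS,
): CapturePlan | null {
  // Clamp the stored cursor into [0, allTurns.length].
  const start = Math.min(Math.max(cursor, 0), allTurns.length);
  const pending = allTurns.slice(start);
  if (pending.length === 0) return null; // nothing new: skip the capture call
  // Keep only the most recent maxTurns turns of the backlog.
  const turns = pending.slice(Math.max(0, pending.length - maxTurns));
  return { turns, nextCursor: allTurns.length };
}
```

Under this sketch, a 60-turn transcript with a cursor of 0 yields the last 50 turns, and a second call with the returned cursor yields `null`, which corresponds to the "no call" expectation in the no-change test.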
diff --git a/claude-code-plugin/tests/session-key.test.ts b/claude-code-plugin/tests/session-key.test.ts
new file mode 100644
index 0000000..bcffca5
--- /dev/null
+++ b/claude-code-plugin/tests/session-key.test.ts
@@ -0,0 +1,49 @@
+import { describe, it, expect, vi, afterEach } from "vitest";
+import { getSessionKey } from "../lib/session-key.js";
+
+describe("getSessionKey", () => {
+  afterEach(() => {
+    vi.unstubAllEnvs();
+  });
+
+  it("derives a 16-char hex key from cwd by default", () => {
+    const key = getSessionKey("/Users/alice/projects/foo");
+    expect(key).toMatch(/^[0-9a-f]{16}$/);
+  });
+
+  it("returns the same key for the same cwd", () => {
+    const k1 = getSessionKey("/Users/alice/projects/foo");
+    const k2 = getSessionKey("/Users/alice/projects/foo");
+    expect(k1).toBe(k2);
+  });
+
+  it("returns different keys for different cwd", () => {
+    const k1 = getSessionKey("/Users/alice/projects/foo");
+    const k2 = getSessionKey("/Users/alice/projects/bar");
+    expect(k1).not.toBe(k2);
+  });
+
+  it("normalizes the path (projects/./foo === projects/foo)", () => {
+    const k1 = getSessionKey("/Users/alice/projects/foo");
+    const k2 = getSessionKey("/Users/alice/projects/./foo");
+    expect(k1).toBe(k2);
+  });
+
+  it("normalizes trailing slashes", () => {
+    const k1 = getSessionKey("/Users/alice/projects/foo");
+    const k2 = getSessionKey("/Users/alice/projects/foo/");
+    expect(k1).toBe(k2);
+  });
+
+  it("honors TDAI_SESSION_KEY env override", () => {
+    vi.stubEnv("TDAI_SESSION_KEY", "custom-key-42");
+    const key = getSessionKey("/whatever");
+    expect(key).toBe("custom-key-42");
+  });
+
+  it("empty TDAI_SESSION_KEY falls back to cwd hash", () => {
+    vi.stubEnv("TDAI_SESSION_KEY", "");
+    const key = getSessionKey("/Users/alice/projects/foo");
+    expect(key).toMatch(/^[0-9a-f]{16}$/);
+  });
+});
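
Reviewer note: the tests above constrain `getSessionKey` to a 16-character hex digest of a normalized cwd, with a non-empty `TDAI_SESSION_KEY` taking precedence. A minimal sketch that would satisfy those constraints follows. The function name `sketchSessionKey`, the injectable `env` parameter, and the choice of SHA-256 are assumptions for illustration; the real `lib/session-key.ts` may hash or normalize differently.

```typescript
import { createHash } from "node:crypto";
import { resolve } from "node:path";

// Hypothetical sketch of the behavior the tests pin down.
function sketchSessionKey(
  cwd: string,
  env: Record<string, string | undefined> = process.env,
): string {
  // A non-empty TDAI_SESSION_KEY wins; an empty string falls through to the hash.
  const override = env.TDAI_SESSION_KEY;
  if (override) return override;
  // resolve() collapses "." segments and drops trailing slashes.
  const normalized = resolve(cwd);
  // Assumption: SHA-256 truncated to 16 hex chars (64 bits).
  return createHash("sha256").update(normalized).digest("hex").slice(0, 16);
}
```

Passing `env` explicitly keeps the sketch testable without stubbing `process.env`; the tests in this file take the other route and use `vi.stubEnv`.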
diff --git a/claude-code-plugin/tests/smoke.e2e.test.ts b/claude-code-plugin/tests/smoke.e2e.test.ts
new file mode 100644
index 0000000..6d6c38c
--- /dev/null
+++ b/claude-code-plugin/tests/smoke.e2e.test.ts
@@ -0,0 +1,69 @@
+import { describe, it, expect, beforeAll, afterAll, vi } from "vitest";
+import { mkdtemp, rm } from "node:fs/promises";
+import { tmpdir } from "node:os";
+import { join } from "node:path";
+import { TdaiGateway } from "../../src/gateway/server.js";
+import { GatewayClient } from "../lib/gateway-client.js";
+
+describe("cc-plugin smoke e2e (in-process gateway)", () => {
+  let dataDir: string;
+  let gateway: TdaiGateway;
+  const PORT = 19421;
+  const TOKEN = "smoke-e2e-token-" + Math.random().toString(36).slice(2);
+
+  beforeAll(async () => {
+    dataDir = await mkdtemp(join(tmpdir(), "tdai-smoke-"));
+    vi.stubEnv("TDAI_GATEWAY_TOKEN", TOKEN);
+    vi.stubEnv("TDAI_DATA_DIR", dataDir);
+    // vitest config has `unstubEnvs: true` (stubs reset before each test), so every test below re-stubs TDAI_GATEWAY_TOKEN.
+    gateway = new TdaiGateway({
+      server: { port: PORT, host: "127.0.0.1" },
+      data: { baseDir: dataDir },
+    } as never);
+    await gateway.start();
+  }, 60_000);
+
+  afterAll(async () => {
+    if (gateway) await gateway.stop();
+    if (dataDir) await rm(dataDir, { recursive: true, force: true });
+  });
+
+  it("rejects /health with a wrong token", async () => {
+    vi.stubEnv("TDAI_GATEWAY_TOKEN", TOKEN);
+    const client = new GatewayClient({
+      baseUrl: `http://127.0.0.1:${PORT}`,
+      token: "wrong-token",
+      timeoutMs: 5_000,
+    });
+    const ok = await client.health();
+    expect(ok).toBe(false);
+  });
+
+  it("accepts authenticated /health", async () => {
+    vi.stubEnv("TDAI_GATEWAY_TOKEN", TOKEN);
+    const client = new GatewayClient({
+      baseUrl: `http://127.0.0.1:${PORT}`,
+      token: TOKEN,
+      timeoutMs: 5_000,
+    });
+    const ok = await client.health();
+    expect(ok).toBe(true);
+  });
+
+  it("captures a turn end-to-end (L0 written)", async () => {
+    vi.stubEnv("TDAI_GATEWAY_TOKEN", TOKEN);
+    const client = new GatewayClient({
+      baseUrl: `http://127.0.0.1:${PORT}`,
+      token: TOKEN,
+      timeoutMs: 30_000,
+    });
+    const result = await client.captureTurn({
+      user_content: "smoke test user message",
+      assistant_content: "smoke test assistant response",
+      session_key: "smoke-key-1",
+      session_id: "smoke-session-1",
+    });
+    expect(result).not.toBeNull();
+    expect(result!.l0_recorded).toBeGreaterThanOrEqual(0);
+  });
+});
diff --git a/claude-code-plugin/tests/transcript.test.ts b/claude-code-plugin/tests/transcript.test.ts
new file mode 100644
index 0000000..23ccb85
--- /dev/null
+++ b/claude-code-plugin/tests/transcript.test.ts
@@ -0,0 +1,160 @@
+import { describe, it, expect } from "vitest";
+import { readLatestTurn, readAllTurns, parseTranscriptLine } from "../lib/transcript.js";
+import { resolve, dirname } from "node:path";
+import { fileURLToPath } from "node:url";
+import { writeFile, unlink } from "node:fs/promises";
+
+const __dirname = dirname(fileURLToPath(import.meta.url));
+const FIXTURE = resolve(__dirname, "fixtures/transcript-sample.jsonl");
+
+describe("parseTranscriptLine", () => {
+  it("parses a user message", () => {
+    const line = '{"type":"user","message":{"role":"user","content":"hi"},"uuid":"u1"}';
+    const parsed = parseTranscriptLine(line);
+    expect(parsed).toEqual({
+      type: "user",
+      role: "user",
+      content: "hi",
+      contentIsArray: false,
+      uuid: "u1",
+    });
+  });
+
+  it("parses an assistant message", () => {
+    const line = '{"type":"assistant","message":{"role":"assistant","content":"hello"},"uuid":"a1"}';
+    const parsed = parseTranscriptLine(line);
+    expect(parsed?.role).toBe("assistant");
+    expect(parsed?.content).toBe("hello");
+  });
+
+  it("returns null for malformed JSON", () => {
+    expect(parseTranscriptLine("{ not json }")).toBeNull();
+  });
+
+  it("returns null for messages without content", () => {
+    const line = '{"type":"user","message":{"role":"user"},"uuid":"u1"}';
+    expect(parseTranscriptLine(line)).toBeNull();
+  });
+
+  it("handles content array (multi-part messages) by joining strings", () => {
+    const line = '{"type":"user","message":{"role":"user","content":[{"type":"text","text":"hello"},{"type":"text","text":"world"}]},"uuid":"u1"}';
+    const parsed = parseTranscriptLine(line);
+    expect(parsed?.content).toBe("hello\nworld");
+    expect(parsed?.contentIsArray).toBe(true);
+  });
+
+  it("marks string content as contentIsArray=false", () => {
+    const line = '{"type":"user","message":{"role":"user","content":"plain text"},"uuid":"u1"}';
+    const parsed = parseTranscriptLine(line);
+    expect(parsed?.contentIsArray).toBe(false);
+  });
+});
+
+describe("readLatestTurn", () => {
+  it("returns the most recent user/assistant pair", async () => {
+    const turn = await readLatestTurn(FIXTURE);
+    expect(turn).not.toBeNull();
+    expect(turn!.user).toBe("second question");
+    expect(turn!.assistant).toBe("second answer");
+  });
+
+  it("returns null for a missing file", async () => {
+    const turn = await readLatestTurn("/tmp/nonexistent-transcript-tdai.jsonl");
+    expect(turn).toBeNull();
+  });
+
+  it("returns null for an empty file", async () => {
+    const tmpPath = resolve(__dirname, "fixtures/empty.jsonl");
+    await writeFile(tmpPath, "");
+    try {
+      const turn = await readLatestTurn(tmpPath);
+      expect(turn).toBeNull();
+    } finally {
+      await unlink(tmpPath);
+    }
+  });
+
+  it("merges multi-part assistant responses split by tool cycles", async () => {
+    const tmpPath = resolve(__dirname, "fixtures/multi-part.jsonl");
+    const lines = [
+      '{"type":"user","message":{"role":"user","content":"search deepseek"},"uuid":"u1"}',
+      '{"type":"assistant","message":{"role":"assistant","content":"Searching for DeepSeek info..."},"uuid":"a1"}',
+      '{"type":"user","message":{"role":"user","content":[{"type":"tool_result","tool_use_id":"t1","content":"search results here"}]},"uuid":"tr1"}',
+      '{"type":"assistant","message":{"role":"assistant","content":"DeepSeek V4 is a large MoE model with 685B parameters."},"uuid":"a2"}',
+    ];
+    await writeFile(tmpPath, lines.join("\n"));
+    try {
+      const turn = await readLatestTurn(tmpPath);
+      expect(turn).not.toBeNull();
+      expect(turn!.user).toBe("search deepseek");
+      expect(turn!.assistant).toContain("Searching for DeepSeek");
+      expect(turn!.assistant).toContain("685B parameters");
+    } finally {
+      await unlink(tmpPath);
+    }
+  });
+});
+
+describe("readAllTurns", () => {
+  it("returns all turns from the fixture", async () => {
+    const turns = await readAllTurns(FIXTURE);
+    expect(turns).toHaveLength(2);
+    expect(turns[0].user).toBe("first question");
+    expect(turns[0].assistant).toBe("first answer");
+    expect(turns[1].user).toBe("second question");
+    expect(turns[1].assistant).toBe("second answer");
+  });
+
+  it("returns empty array for missing file", async () => {
+    const turns = await readAllTurns("/tmp/nonexistent.jsonl");
+    expect(turns).toEqual([]);
+  });
+
+  it("does not split turn on skill output (array-content user entry)", async () => {
+    const tmpPath = resolve(__dirname, "fixtures/skill-output.jsonl");
+    const lines = [
+      '{"type":"user","message":{"role":"user","content":"search deepseek v4"},"uuid":"u1"}',
+      '{"type":"assistant","message":{"role":"assistant","content":"Searching with Tavily..."},"uuid":"a1"}',
+      '{"type":"assistant","message":{"role":"assistant","content":[{"type":"tool_use","id":"t1","name":"Skill"}]},"uuid":"a2"}',
+      '{"type":"user","message":{"role":"user","content":[{"type":"tool_result","tool_use_id":"t1","content":"skill launched"}]},"uuid":"tr1"}',
+      '{"type":"user","message":{"role":"user","content":[{"type":"text","text":"Base directory for this skill: /path/to/skill\\n\\n# tavily search\\n\\nSearch the web..."}]},"uuid":"sk1"}',
+      '{"type":"assistant","message":{"role":"assistant","content":[{"type":"tool_use","id":"t2","name":"Bash"}]},"uuid":"a3"}',
+      '{"type":"user","message":{"role":"user","content":[{"type":"tool_result","tool_use_id":"t2","content":"search results here 15000 chars"}]},"uuid":"tr2"}',
+      '{"type":"assistant","message":{"role":"assistant","content":"## DeepSeek V4 Summary\\n\\nDeepSeek V4 launched with 1.6T parameters."},"uuid":"a4"}',
+    ];
+    await writeFile(tmpPath, lines.join("\n"));
+    try {
+      const turns = await readAllTurns(tmpPath);
+      expect(turns).toHaveLength(1);
+      expect(turns[0].user).toBe("search deepseek v4");
+      expect(turns[0].assistant).toContain("Searching with Tavily");
+      expect(turns[0].assistant).toContain("DeepSeek V4 Summary");
+      expect(turns[0].assistant).toContain("1.6T parameters");
+    } finally {
+      await unlink(tmpPath);
+    }
+  });
+
+  it("merges multi-part assistant responses across tool cycles", async () => {
+    const tmpPath = resolve(__dirname, "fixtures/allturns-multi.jsonl");
+    const lines = [
+      '{"type":"user","message":{"role":"user","content":"q1"},"uuid":"u1"}',
+      '{"type":"assistant","message":{"role":"assistant","content":"a1"},"uuid":"a1"}',
+      '{"type":"user","message":{"role":"user","content":"q2"},"uuid":"u2"}',
+      '{"type":"assistant","message":{"role":"assistant","content":"part1"},"uuid":"a2"}',
+      '{"type":"user","message":{"role":"user","content":[{"type":"tool_result","tool_use_id":"t1","content":"tool output"}]},"uuid":"tr1"}',
+      '{"type":"assistant","message":{"role":"assistant","content":"part2 with details"},"uuid":"a3"}',
+    ];
+    await writeFile(tmpPath, lines.join("\n"));
+    try {
+      const turns = await readAllTurns(tmpPath);
+      expect(turns).toHaveLength(2);
+      expect(turns[0]).toEqual({ user: "q1", assistant: "a1" });
+      expect(turns[1].user).toBe("q2");
+      expect(turns[1].assistant).toContain("part1");
+      expect(turns[1].assistant).toContain("part2 with details");
+    } finally {
+      await unlink(tmpPath);
+    }
+  });
+});
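
Reviewer note: the `readAllTurns` tests above imply a grouping rule: only a user entry with plain string content opens a new turn, while array-content user entries (tool results, skill output) stay inside the current turn, and assistant text fragments are merged. The sketch below illustrates that rule over already-parsed entries. `groupTurns` and `ParsedEntry` are hypothetical names, and the `"\n"` join separator and the "emit only when both sides are present" rule are assumptions; the tests only check `toContain`, so the real `lib/transcript.ts` may differ on both points.

```typescript
// Hypothetical sketch; shapes mirror the parseTranscriptLine expectations above.
interface ParsedEntry {
  role: "user" | "assistant";
  content: string;
  contentIsArray: boolean;
}

interface Turn {
  user: string;
  assistant: string;
}

function groupTurns(entries: ParsedEntry[]): Turn[] {
  const turns: Turn[] = [];
  let user: string | null = null;
  let parts: string[] = [];

  const flush = (): void => {
    // Assumption: a turn is emitted only once it has both a prompt and a reply.
    if (user !== null && parts.length > 0) {
      turns.push({ user, assistant: parts.join("\n") });
    }
    user = null;
    parts = [];
  };

  for (const entry of entries) {
    if (entry.role === "user") {
      // Array-content user entries (tool results, skill output) do not split the turn.
      if (entry.contentIsArray) continue;
      flush();
      user = entry.content;
    } else if (user !== null && entry.content.trim() !== "") {
      // Assistant text fragments accumulate; empty tool_use-only entries are skipped.
      parts.push(entry.content);
    }
  }
  flush();
  return turns;
}
```

Keying the split on `contentIsArray` is what the "does not split turn on skill output" test exercises: the injected skill text arrives as a user entry, but with array content, so it never starts a new turn.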
diff --git a/claude-code-plugin/tsconfig.json b/claude-code-plugin/tsconfig.json
new file mode 100644
index 0000000..49f4459
--- /dev/null
+++ b/claude-code-plugin/tsconfig.json
@@ -0,0 +1,20 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ESNext",
+    "moduleResolution": "bundler",
+    "lib": ["ES2022"],
+    "strict": true,
+    "esModuleInterop": true,
+    "skipLibCheck": true,
+    "forceConsistentCasingInFileNames": true,
+    "resolveJsonModule": true,
+    "outDir": "./dist",
+    "rootDir": "./",
+    "declaration": false,
+    "noEmitOnError": true,
+    "types": ["node"]
+  },
+  "include": ["lib/**/*.ts", "tests/**/*.ts"],
+  "exclude": ["dist", "node_modules"]
+}
diff --git a/claude-code-plugin/tsdown.config.ts b/claude-code-plugin/tsdown.config.ts
new file mode 100644
index 0000000..2fe228b
--- /dev/null
+++ b/claude-code-plugin/tsdown.config.ts
@@ -0,0 +1,18 @@
+import { defineConfig } from "tsdown";
+
+export default defineConfig({
+  entry: ["./lib/hook.ts"],
+  outDir: "./dist/lib",
+  format: "esm",
+  platform: "node",
+  clean: true,
+  fixedExtension: true,
+  dts: false,
+  sourcemap: false,
+  // Plugin only bundles its own hook entry (no npm deps in hook.ts).
+  // The actual Gateway daemon is spawned via `npx tdai-memory-gateway`
+  // from the user's globally installed @tencentdb-agent-memory/memory-tencentdb.
+  deps: {
+    neverBundle: (id) => id.startsWith("node:"),
+  },
+});
diff --git a/claude-code-plugin/vitest.e2e.config.ts b/claude-code-plugin/vitest.e2e.config.ts
new file mode 100644
index 0000000..92a66f1
--- /dev/null
+++ b/claude-code-plugin/vitest.e2e.config.ts
@@ -0,0 +1,11 @@
+import { defineConfig } from "vitest/config";
+
+export default defineConfig({
+  test: {
+    environment: "node",
+    pool: "forks",
+    include: ["claude-code-plugin/tests/**/*.e2e.test.ts"],
+    testTimeout: 120_000,
+    hookTimeout: 120_000,
+  },
+});
diff --git a/package.json b/package.json
index 2d74158..6885453 100644
--- a/package.json
+++ b/package.json
@@ -7,7 +7,8 @@
   "bin": {
     "migrate-sqlite-to-tcvdb": "./bin/migrate-sqlite-to-tcvdb.mjs",
     "export-tencent-vdb": "./bin/export-tencent-vdb.mjs",
-    "read-local-memory": "./bin/read-local-memory.mjs"
+    "read-local-memory": "./bin/read-local-memory.mjs",
+    "tdai-memory-gateway": "./dist/src/gateway/cli.mjs"
   },
   "exports": {
     ".": {
@@ -29,6 +30,10 @@
     "test": "vitest run",
     "test:watch": "vitest",
     "test:coverage": "vitest run --coverage",
+    "build:cc-plugin": "tsdown -c claude-code-plugin/tsdown.config.ts",
+    "test:cc-plugin": "vitest run claude-code-plugin/tests/",
+    "build:all": "npm run build && npm run build:cc-plugin",
+    "test:cc-plugin:e2e": "vitest run -c claude-code-plugin/vitest.e2e.config.ts",
     "postinstall": "bash scripts/openclaw-after-tool-call-messages.patch.sh 2>/dev/null || true"
   },
   "files": [
@@ -45,6 +50,13 @@
     "scripts/openclaw-after-tool-call-messages.patch.sh",
     "scripts/setup-offload.sh",
     "hermes-plugin/",
+    "claude-code-plugin/.claude-plugin/",
+    "claude-code-plugin/.codex-plugin/",
+    "claude-code-plugin/hooks/",
+    "claude-code-plugin/skills/",
+    "claude-code-plugin/dist/",
+    "claude-code-plugin/README.md",
+    "claude-code-plugin/README_CN.md",
     "openclaw.plugin.json",
     "README.md",
     "CHANGELOG.md",
diff --git a/src/gateway/__tests__/auth.test.ts b/src/gateway/__tests__/auth.test.ts
new file mode 100644
index 0000000..083d3d6
--- /dev/null
+++ b/src/gateway/__tests__/auth.test.ts
@@ -0,0 +1,161 @@
+import { describe, it, expect, beforeAll, beforeEach, afterAll, vi } from "vitest";
+import http from "node:http";
+import { TdaiGateway } from "../server.js";
+
+async function request(
+  port: number,
+  path: string,
+  headers: Record<string, string> = {},
+  method = "GET",
+): Promise<{ status: number; body: string; wwwAuth: string | undefined }> {
+  return new Promise((resolve, reject) => {
+    const req = http.request(
+      { host: "127.0.0.1", port, path, method, headers },
+      (res) => {
+        const chunks: Buffer[] = [];
+        res.on("data", (c) => chunks.push(c));
+        res.on("end", () =>
+          resolve({
+            status: res.statusCode ?? 0,
+            body: Buffer.concat(chunks).toString("utf-8"),
+            wwwAuth: res.headers["www-authenticate"] as string | undefined,
+          }),
+        );
+      },
+    );
+    req.on("error", reject);
+    req.end();
+  });
+}
+
+describe("Gateway optional Bearer token", () => {
+  let gateway: TdaiGateway;
+  const PORT = 18421;
+  const TOKEN = "test-token-abc-123";
+
+  beforeAll(async () => {
+    vi.stubEnv("TDAI_GATEWAY_TOKEN", TOKEN);
+    gateway = new TdaiGateway({
+      server: { port: PORT, host: "127.0.0.1" },
+    } as never);
+    await gateway.start();
+  });
+
+  // vitest config has `unstubEnvs: true`, which resets stubs before each test.
+  // Re-stub here so the middleware (which reads process.env per-request) sees the token.
+  beforeEach(() => {
+    vi.stubEnv("TDAI_GATEWAY_TOKEN", TOKEN);
+  });
+
+  afterAll(async () => {
+    await gateway.stop();
+  });
+
+  it("rejects unauthenticated requests with 401 when token is configured", async () => {
+    const res = await request(PORT, "/health");
+    expect(res.status).toBe(401);
+  });
+
+  it("rejects wrong token with 401", async () => {
+    const res = await request(PORT, "/health", {
+      Authorization: "Bearer wrong-token",
+    });
+    expect(res.status).toBe(401);
+  });
+
+  it("accepts correct Bearer token", async () => {
+    const res = await request(PORT, "/health", {
+      Authorization: `Bearer ${TOKEN}`,
+    });
+    expect(res.status).toBe(200);
+  });
+
+  it("includes WWW-Authenticate header on 401 per RFC 6750 §3", async () => {
+    const res = await request(PORT, "/health");
+    expect(res.status).toBe(401);
+    expect(res.wwwAuth).toMatch(/^Bearer\s+realm=/);
+  });
+
+  it("accepts case-insensitive 'Bearer' scheme keyword per RFC 6750 §2.1", async () => {
+    for (const scheme of ["Bearer", "bearer", "BEARER", "BeArEr"]) {
+      const res = await request(PORT, "/health", {
+        Authorization: `${scheme} ${TOKEN}`,
+      });
+      expect(res.status, `scheme=${scheme}`).toBe(200);
+    }
+  });
+
+  it("rejects mangled Authorization headers", async () => {
+    const cases = [
+      `Basic ${TOKEN}`,
+      `Bearer`,
+      `Bearer `,
+      `Bearer  ${TOKEN}  extra`,
+      ``,
+      `Bearer ${TOKEN}x`,
+      `Bearer x${TOKEN}`,
+    ];
+    for (const h of cases) {
+      const res = await request(PORT, "/health", { Authorization: h });
+      expect(res.status, `auth=${JSON.stringify(h)}`).toBe(401);
+    }
+  });
+
+  it.each([
+    ["POST", "/recall"],
+    ["POST", "/capture"],
+    ["POST", "/search/memories"],
+    ["POST", "/search/conversations"],
+    ["POST", "/session/end"],
+    ["POST", "/seed"],
+  ])("enforces auth on %s %s (no token → 401)", async (method, path) => {
+    const res = await request(PORT, path, {}, method);
+    expect(res.status).toBe(401);
+  });
+
+  it("allows OPTIONS preflight without token (CORS)", async () => {
+    return new Promise<void>((resolve, reject) => {
+      const req = http.request(
+        {
+          host: "127.0.0.1",
+          port: PORT,
+          path: "/recall",
+          method: "OPTIONS",
+        },
+        (res) => {
+          expect(res.statusCode).toBe(204);
+          resolve();
+        },
+      );
+      req.on("error", reject);
+      req.end();
+    });
+  });
+});
+
+describe("Gateway with no token configured", () => {
+  let gateway: TdaiGateway;
+  const PORT = 18422;
+
+  beforeAll(async () => {
+    vi.stubEnv("TDAI_GATEWAY_TOKEN", "");
+    gateway = new TdaiGateway({
+      server: { port: PORT, host: "127.0.0.1" },
+    } as never);
+    await gateway.start();
+  });
+
+  // vitest config has `unstubEnvs: true`; re-stub each test so middleware sees empty token.
+  beforeEach(() => {
+    vi.stubEnv("TDAI_GATEWAY_TOKEN", "");
+  });
+
+  afterAll(async () => {
+    await gateway.stop();
+  });
+
+  it("accepts unauthenticated requests when token is empty (backward compat)", async () => {
+    const res = await request(PORT, "/health");
+    expect(res.status).toBe(200);
+  });
+});
diff --git a/src/gateway/cli.ts b/src/gateway/cli.ts
new file mode 100644
index 0000000..24b6915
--- /dev/null
+++ b/src/gateway/cli.ts
@@ -0,0 +1,108 @@
+#!/usr/bin/env node
+/**
+ * `tdai-memory-gateway` — standalone Gateway daemon entry.
+ *
+ * Exposed as a `bin` in package.json so users can run:
+ *   npx tdai-memory-gateway           # from a project that depends on the package
+ *   tdai-memory-gateway               # after `npm install -g @tencentdb-agent-memory/memory-tencentdb`
+ *
+ * Reads config from environment variables (see src/gateway/config.ts):
+ *   TDAI_TOKEN_PATH     path to a 0600 file holding the Bearer token (preferred —
+ *                       avoids leaking the token via /proc/<pid>/environ / `ps -E`)
+ *   TDAI_GATEWAY_TOKEN  Bearer token (fallback for Hermes-style direct env passing)
+ *   TDAI_GATEWAY_PORT   port to bind (default 8420)
+ *   TDAI_GATEWAY_HOST   bind host (default 127.0.0.1). Non-loopback values require
+ *                       TDAI_GATEWAY_ALLOW_REMOTE=1 to opt in (defence in depth).
+ *   TDAI_DATA_DIR       data root
+ *   TDAI_CC_PID         (optional) parent process pid; daemon self-exits when it dies
+ *
+ * Designed for use by host-agnostic plugins (Claude Code, Codex CLI) that spawn
+ * the Gateway as a sidecar without bundling npm dependencies.
+ */
+
+import { readFileSync } from "node:fs";
+import { TdaiGateway } from "./server.js";
+
+const LOOPBACK_HOSTS = new Set(["127.0.0.1", "localhost", "::1", "::ffff:127.0.0.1"]);
+
+function assertSafeHost(): void {
+  const host = process.env.TDAI_GATEWAY_HOST?.trim();
+  if (!host) return;
+  if (LOOPBACK_HOSTS.has(host)) return;
+  if (process.env.TDAI_GATEWAY_ALLOW_REMOTE === "1") return;
+  process.stderr.write(
+    `tdai-memory-gateway: refusing to bind TDAI_GATEWAY_HOST=${host} (non-loopback). ` +
+      `Set TDAI_GATEWAY_ALLOW_REMOTE=1 to opt in.\n`,
+  );
+  process.exit(2);
+}
+
+function loadTokenFromFile(): void {
+  const tokenPath = process.env.TDAI_TOKEN_PATH;
+  if (!tokenPath) return;
+  try {
+    const token = readFileSync(tokenPath, "utf-8").trim();
+    if (!token) {
+      process.stderr.write(`tdai-memory-gateway: TDAI_TOKEN_PATH=${tokenPath} is empty\n`);
+      process.exit(2);
+    }
+    // Set on the in-process env object only — this does NOT mutate the
+    // execve() environment block, so /proc/<pid>/environ / `ps -E` won't
+    // expose the token.
+    process.env.TDAI_GATEWAY_TOKEN = token;
+  } catch (err) {
+    process.stderr.write(
+      `tdai-memory-gateway: failed to read TDAI_TOKEN_PATH=${tokenPath}: ${String(err)}\n`,
+    );
+    process.exit(2);
+  }
+}
+
+async function main(): Promise<void> {
+  assertSafeHost();
+  loadTokenFromFile();
+  const gateway = new TdaiGateway();
+  await gateway.start();
+
+  let shuttingDown = false;
+  const shutdown = async (reason: string): Promise<void> => {
+    if (shuttingDown) return;
+    shuttingDown = true;
+    try {
+      await Promise.race([
+        gateway.stop(),
+        new Promise<void>((r) => setTimeout(r, 5_000)),
+      ]);
+    } catch {
+      // best effort
+    }
+    process.exit(reason === "error" ? 1 : 0);
+  };
+
+  process.on("SIGTERM", () => void shutdown("SIGTERM"));
+  process.on("SIGINT", () => void shutdown("SIGINT"));
+
+  const ccPid = parseInt(process.env.TDAI_CC_PID ?? "0", 10);
+  if (Number.isFinite(ccPid) && ccPid > 0) {
+    // Poll every 15s — short enough that a vanished cc doesn't keep the
+    // daemon alive long enough to collide with a fresh hook spawning a
+    // replacement, long enough that the syscall load stays negligible.
+    const timer = setInterval(() => {
+      try {
+        process.kill(ccPid, 0);
+      } catch (e) {
+        const err = e as NodeJS.ErrnoException;
+        if (err.code === "ESRCH") {
+          clearInterval(timer);
+          void shutdown("parent-exit");
+        }
+      }
+    }, 15_000);
+    timer.unref();
+  }
+}
+
+main().catch((err) => {
+  process.stderr.write(`tdai-memory-gateway failed: ${String(err)}\n`);
+  process.exit(1);
+});
diff --git a/src/gateway/server.ts b/src/gateway/server.ts
index bd7d0a0..5f16f17 100644
--- a/src/gateway/server.ts
+++ b/src/gateway/server.ts
@@ -15,6 +15,7 @@
  */
 
 import http from "node:http";
+import { timingSafeEqual } from "node:crypto";
 import { URL } from "node:url";
 import { TdaiCore } from "../core/tdai-core.js";
 import { StandaloneHostAdapter } from "../adapters/standalone/host-adapter.js";
@@ -176,7 +177,7 @@ export class TdaiGateway {
     // CORS headers (for development)
     res.setHeader("Access-Control-Allow-Origin", "*");
     res.setHeader("Access-Control-Allow-Methods", "GET, POST, OPTIONS");
-    res.setHeader("Access-Control-Allow-Headers", "Content-Type");
+    res.setHeader("Access-Control-Allow-Headers", "Content-Type, Authorization");
 
     if (method === "OPTIONS") {
       res.writeHead(204);
@@ -184,6 +185,8 @@ export class TdaiGateway {
       return;
     }
 
+    if (!this.authorize(req, res)) return;
+
     try {
       switch (`${method} ${pathname}`) {
         case "GET /health":
@@ -210,6 +213,35 @@ export class TdaiGateway {
     }
   }
 
+  /**
+   * Optional Bearer-token gate. When TDAI_GATEWAY_TOKEN is set (directly, or
+   * by cli.ts after reading the file named by TDAI_TOKEN_PATH), every
+   * non-OPTIONS request must carry a matching `Authorization: Bearer <token>`
+   * header. The token comparison is timing-safe; the "Bearer" scheme keyword
+   * is matched case-insensitively per RFC 6750 §2.1.
+   *
+   * Returns true if the request is authorized, false if a 401 has been sent.
+   */
+  private authorize(req: http.IncomingMessage, res: http.ServerResponse): boolean {
+    const expectedToken = process.env.TDAI_GATEWAY_TOKEN;
+    if (!expectedToken) return true;
+
+    const authHeader = req.headers.authorization ?? "";
+    const match = /^Bearer\s+(\S+)\s*$/i.exec(authHeader);
+    const provided = match?.[1] ?? "";
+    const expectedBuf = Buffer.from(expectedToken, "utf-8");
+    const providedBuf = Buffer.from(provided, "utf-8");
+    const ok =
+      expectedBuf.length > 0 &&
+      providedBuf.length === expectedBuf.length &&
+      timingSafeEqual(providedBuf, expectedBuf);
+    if (ok) return true;
+
+    res.setHeader("WWW-Authenticate", 'Bearer realm="tdai-gateway"');
+    sendError(res, 401, "Unauthorized");
+    return false;
+  }
+
   // ============================
   // Route handlers
   // ============================
diff --git a/tsdown.config.ts b/tsdown.config.ts
index 16b0073..89b0e06 100644
--- a/tsdown.config.ts
+++ b/tsdown.config.ts
@@ -11,7 +11,7 @@ function collectExternalDependencies(): string[] {
 }
 
 export default defineConfig({
-  entry: ["./index.ts"],
+  entry: ["./index.ts", "./src/gateway/cli.ts"],
   outDir: "./dist",
   format: "esm",
   platform: "node",
diff --git a/vitest.config.ts b/vitest.config.ts
index 1f9ce29..96d8f64 100644
--- a/vitest.config.ts
+++ b/vitest.config.ts
@@ -4,7 +4,11 @@ export default defineConfig({
   test: {
     environment: "node",
     pool: "forks",
-    include: ["src/**/*.test.ts", "__tests__/**/*.test.ts"],
+    include: [
+      "src/**/*.test.ts",
+      "__tests__/**/*.test.ts",
+      "claude-code-plugin/tests/**/*.test.ts",
+    ],
     exclude: ["dist/**", "node_modules/**", "**/*.e2e.test.ts"],
     testTimeout: 120_000,
     hookTimeout: 120_000,
@@ -15,7 +19,7 @@ export default defineConfig({
     coverage: {
       provider: "v8",
       reporter: ["text", "html", "lcov"],
-      include: ["src/**/*.ts", "index.ts"],
+      include: ["src/**/*.ts", "index.ts", "claude-code-plugin/lib/**/*.ts"],
       exclude: [
         "src/**/*.test.ts",
         "dist/**",