fix(plugin): make before_message_write hook a no-op by default (cache-friendly) by YOMXXX · Pull Request #52 · Tencent/TencentDB-Agent-Memory

YOMXXX · 2026-05-18T16:20:20Z

Summary

The before_message_write hook now defaults to a no-op for <relevant-memories> stripping, so the LLM prompt prefix stays stable across sub-agent / replay boundaries and prompt-cache hits are restored. The legacy strip behavior is preserved as an opt-in via TDAI_STRIP_RELEVANT_MEMORIES_ON_WRITE=1. Closes #11.

Root cause

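The hook at index.ts:591 strips the <relevant-memories> block from user messages before they are written to the session JSONL. The hook's comment gives two purposes:

1. Keep the transcript clean (replay should not see recall artifacts)
2. Prevent an L0 feedback loop (recalled content should not be re-recorded by L0 as the user's own words)

However, (2) is already handled independently by sanitizeText() at src/core/conversation/l0-recorder.ts:254: every L0-bound message passes through sanitizeText, which strips <relevant-memories> (and several other injected tags). The hook's strip therefore serves only (1), and it is the sole source of the cache miss reported in #11.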
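To illustrate why the L0 path stays protected even when the transcript keeps the memories block, here is a minimal sketch. `sanitizeTextSketch` and its regex are hypothetical stand-ins, not the real sanitizeText(), which per the description above also strips other injected tags.

```typescript
// Hypothetical stand-in for sanitizeText(); the real function is not
// reproduced here and handles more injected tags than this one regex.
const RELEVANT_MEMORIES_BLOCK =
  /<relevant-memories>[\s\S]*?<\/relevant-memories>\s*/g;

function sanitizeTextSketch(text: string): string {
  // Remove every <relevant-memories>...</relevant-memories> block plus
  // any whitespace that trails it, then trim the remainder.
  return text.replace(RELEVANT_MEMORIES_BLOCK, "").trim();
}

// What the session JSONL stores under the new default (memories preserved).
const persisted =
  "<relevant-memories>\nuser prefers dark mode\n</relevant-memories>\n" +
  "What theme should I use?";

// What an L0 recorder would capture after sanitizing the same message.
const l0Text = sanitizeTextSketch(persisted);
```

Under these assumptions the transcript and the L0 record diverge on purpose: the transcript keeps the prompt byte-for-byte, while L0 only sees the user's own words.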

Fix

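New default: the hook no longer strips <relevant-memories>; the user message is written to the session JSONL in full → when a sub-agent / replay re-reads the transcript, it lines up with the in-memory effectivePrompt prefix → the LLM cache hits.
Fallback to the old behavior: setting TDAI_STRIP_RELEVANT_MEMORIES_ON_WRITE=1 restores stripping. The env var is read on every hook invocation, so it can be toggled at runtime without restarting the host.
Unchanged:
- auto-recall.ts still injects <relevant-memories> into the current turn's user message
- sanitizeText in l0-recorder.ts still strips memories at L0 capture (the real feedback-loop defense)
- sanitize.ts itself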
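The runtime-toggle property comes from reading the flag at call time rather than caching it at startup. A minimal sketch of such a gate follows; `isStripEnabled` and the `Env` type are hypothetical names, and a plain object stands in for process.env so the snippet is self-contained.

```typescript
// Assumption: a plain object stands in for process.env.
type Env = Record<string, string | undefined>;

const STRIP_FLAG = "TDAI_STRIP_RELEVANT_MEMORIES_ON_WRITE";

// Hypothetical gate: evaluated on every call, so there is no cached value
// to invalidate. Only the literal string "1" enables stripping.
function isStripEnabled(env: Env): boolean {
  return env[STRIP_FLAG] === "1";
}

const env: Env = {};
const beforeToggle = isStripEnabled(env); // flag unset

env[STRIP_FLAG] = "1";
const afterToggle = isStripEnabled(env); // flipped without any restart

env[STRIP_FLAG] = "true";
const nonLiteral = isStripEnabled(env); // not the literal "1"
```

Comparing against the exact string keeps the opt-in unambiguous: values such as "true" or "1 " leave the new no-op default in place.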

Theoretical cache-hit model

Consider an N-turn conversation in which each user message carries a 1-3 KB <relevant-memories> block. On the sub-agent / replay path:

Strip behavior	LLM sees on turn N+1	Cache prefix vs in-memory turn N
Strip (current)	User msgs 1..N without memories	Differs in every user msg → miss
Don't strip (new)	User msgs 1..N with original memories	Identical → prefix cache hit
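The two rows can be illustrated with a toy comparison of the in-memory prompt against the prompt rebuilt from the transcript. This is only a character-level sketch: real prompt caches match on token prefixes, and `commonPrefixLength` and the sample strings are made up for illustration.

```typescript
// Length of the shared leading run of two strings (character-level proxy
// for the token-level prefix a prompt cache would compare).
function commonPrefixLength(a: string, b: string): number {
  const limit = Math.min(a.length, b.length);
  let i = 0;
  while (i < limit && a[i] === b[i]) i++;
  return i;
}

const memories = "<relevant-memories>\nfact A\n</relevant-memories>\n";
const turns = ["first question", "second question"];

// Prompt as it existed in memory when the turns were sent.
const inMemory = turns.map((t) => memories + t).join("\n");
// Prompt rebuilt from a transcript whose memories were stripped on write.
const replayStripped = turns.join("\n");
// Prompt rebuilt from a transcript that kept the memories.
const replayKept = turns.map((t) => memories + t).join("\n");

const strippedShared = commonPrefixLength(inMemory, replayStripped);
const keptShared = commonPrefixLength(inMemory, replayKept);
```

With stripping, the two prompts already differ at the first user message, so nothing after that point can be served from cache; without it, the whole history is a shared prefix.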

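Cost of not stripping: N × (1-3 KB) of extra transcript text. With N ≤ 30 and memories ≤ 3 KB, that is ≤ 90 KB ≈ 25 K tokens, well below the context-window limits of modern models (128 K – 1 M) and clearly smaller than the latency and per-token price penalty of a first-token cache miss (a cache hit typically costs ~0.1× the uncached price).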
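The upper bound above can be reproduced with a few lines of arithmetic. The bytes-per-token ratio is an assumption chosen to match the 90 KB ≈ 25 K tokens estimate; actual ratios depend on the tokenizer and on the language of the memories.

```typescript
// Assumption: ~3.6 bytes per token, and 1 KB = 1000 bytes.
const BYTES_PER_TOKEN = 3.6;

function extraTokens(turns: number, bytesPerTurn: number): number {
  return Math.round((turns * bytesPerTurn) / BYTES_PER_TOKEN);
}

const worstCaseBytes = 30 * 3000; // N = 30 turns, 3 KB of memories each
const worstCaseTokens = extraTokens(30, 3000);

// Share of the smallest context window mentioned above (128 K tokens).
const windowShare = worstCaseTokens / 128_000;
```

Under these assumptions the worst case is 90,000 bytes, about 25,000 tokens, or just under 20% of a 128 K window.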

Refactor

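The hook callback is extracted into an exported pure function, maybeStripRelevantMemoriesOnWrite(message), and the hook registration becomes a one-line wrapper. This lets the helper be unit-tested independently of the OpenClaw runtime, and the hook's semantics ("inspect the message and decide whether to replace its content") map one-to-one onto the helper's signature ({ content } | null).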
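A rough sketch of what such a helper could look like is shown below. It is not the PR's actual code: the `Message` and `Part` shapes, the regex, and the explicit `env` parameter are assumptions made so the snippet runs on its own (the real helper takes only `message` and reads the environment itself).

```typescript
// Assumed message shapes; the real OpenClaw types are not shown in this PR.
type Part = { type: string; text?: string };
type Message = { role: string; content: string | Part[] };
type Env = Record<string, string | undefined>;

const MEMORIES_TAG = "<relevant-memories>";
const MEMORIES_BLOCK = /<relevant-memories>[\s\S]*?<\/relevant-memories>\s*/g;

function stripBlock(text: string): string {
  return text.replace(MEMORIES_BLOCK, "");
}

// Returns replacement content, or null to leave the message untouched.
function maybeStripSketch(
  message: Message,
  env: Env,
): { content: string | Part[] } | null {
  // Default: no-op unless the flag is the literal "1".
  if (env["TDAI_STRIP_RELEVANT_MEMORIES_ON_WRITE"] !== "1") return null;
  // Role guard: only user messages carry injected memories.
  if (message.role !== "user") return null;

  const content = message.content;
  if (typeof content === "string") {
    // Short-circuit when there is nothing to strip.
    if (content.indexOf(MEMORIES_TAG) === -1) return null;
    return { content: stripBlock(content) };
  }

  // Parts content: rewrite only the parts that contain the tag.
  const parts: Part[] = [];
  let changed = false;
  for (const part of content) {
    if (typeof part.text === "string" && part.text.indexOf(MEMORIES_TAG) !== -1) {
      parts.push({ ...part, text: stripBlock(part.text) });
      changed = true;
    } else {
      parts.push(part);
    }
  }
  return changed ? { content: parts } : null;
}
```

In this shape the hook registration only has to forward the message and return the helper's result, and every branch can be exercised with plain objects.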

Tests

Added src/__tests__/before-message-write.test.ts (main has no src tests yet; same approach as #39 / #42 / #51), with 11 cases:

#	Scenario	Expected
1	env unset + user msg contains memories	helper returns `null`
2	`env=1` + string content + contains memories	stripped
3	`env=1` + parts content + one part contains memories	only that part is stripped
4	`env=1` + role=assistant + contains memories	untouched
5	`env=1` + user msg without memories tag	returns `null`
6	env is `"true"` / `"yes"` / `"0"` / `"1 "` / `""` / `"TRUE"` (anything other than the literal `"1"`)	treated as unset → returns `null` (it.each, 6 rows)

✓ npx vitest run src/__tests__/before-message-write.test.ts → 11/11 passed

Compatibility

Hermes plugin path: unaffected. Hermes fetches recalled content via Gateway HTTP /recall and assembles its own prompt; it does not go through the OpenClaw before_message_write hook.
Claude Code plugin path: unaffected. The cc plugin injects recall via additionalContext in cc's UserPromptSubmit hook, fully independent of the OpenClaw hook.
OpenClaw users who depend on a clean transcript (log shipping / audit / standalone replay tools): set TDAI_STRIP_RELEVANT_MEMORIES_ON_WRITE=1 to restore the old behavior.

Out of scope

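Measuring real LLM cache hit rates: there is no representative CI environment. The theoretical model is given above, and the Tencent team can run its own A/B validation before merging if desired.
Fine-grained per-turn stripping ("strip only the memories from turns other than the most recent"): the current OpenClaw hook interface cannot see history, so this cannot be implemented.
Changes to the L0 capture path: it is already correct.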

DCO

The commit carries Signed-off-by: 李冠辰 <liguanchen@xiaomi.com>.

@yunhao-tech

…-friendly)

The before_message_write hook in index.ts unconditionally stripped <relevant-memories> tags from user messages before persisting to the session JSONL. This destroyed LLM prompt-prefix stability across sub-agent / replay boundaries: each turn's in-memory effectivePrompt contained the memories block but the same turn re-read from the JSONL did not, so every cross-boundary prompt suffered a first-token cache miss. Reported in Tencent#11 by @yunhao-tech; similar direction suggested by @changxu21-spec.

The hook's stated dual purpose was (1) transcript cleanliness and (2) anti-feedback-loop. (2) is already handled independently by sanitizeText() in src/core/conversation/l0-recorder.ts:254 on every L0-bound message, so the hook strip is strictly about (1) and is strictly the cause of the cache miss.

Default behavior change: the hook becomes a no-op. The transcript now preserves <relevant-memories> blocks; prompt cache hits across boundaries; long agent loops no longer accumulate first-token latency.

Opt-in legacy strip: set TDAI_STRIP_RELEVANT_MEMORIES_ON_WRITE=1. Env is read on every hook invocation (no constructor-time caching), so operators can flip behavior without restarting OpenClaw / Hermes.

Strip logic is factored into an exported helper maybeStripRelevantMemoriesOnWrite(message) so it can be unit-tested without mocking the OpenClaw runtime.

Tests: new src/__tests__/before-message-write.test.ts, 11 cases covering env-unset default, env=1 strip for string/parts content, role guard, no-tag short-circuit, and a 6-row it.each over non-literal-"1" env values.

Unchanged:
- src/core/hooks/auto-recall.ts still injects <relevant-memories>.
- src/core/conversation/l0-recorder.ts still strips via sanitizeText on L0 capture (true anti-loop defense).
- src/utils/sanitize.ts itself.

Hermes plugin path and Claude Code plugin path are both unaffected (neither goes through the OpenClaw before_message_write hook).

Closes Tencent#11.
Signed-off-by: 李冠辰 <liguanchen@xiaomi.com>

YOMXXX wants to merge 1 commit into Tencent:main from YOMXXX:fix/cache-friendly-default