Skip to content

refactor(context): context双阈值预算预警实现#172

Merged
Nomikfk1215 merged 2 commits into1024XEngineer:mainfrom
Nomikfk1215:main
Apr 14, 2026
Merged

refactor(context): context双阈值预算预警实现#172
Nomikfk1215 merged 2 commits into1024XEngineer:mainfrom
Nomikfk1215:main

Conversation

@Nomikfk1215
Copy link
Copy Markdown
Collaborator

No description provided.

@Nomikfk1215
Copy link
Copy Markdown
Collaborator Author

@xgopilot

Copy link
Copy Markdown

@xgopilot xgopilot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Focused review findings after code-quality, performance, security, and docs passes.

Also noted: this PR includes .tmp/pr133-review as a gitlink (Subproject commit ...), which appears to be an accidental artifact and should be dropped from the patch.

if compactErr != nil {
return "", compactErr
requestTokens := int(tokenusage.ApproximateRequestTokens(messages))
compacted, compactErr := r.maybeAutoCompactSession(ctx, sess, promptTokens, requestTokens)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybeAutoCompactSession now runs on every loop iteration (the previous step == 0 gate was removed). In tool-heavy turns, this can compact mid-run and summarize away structured tool_result messages, which risks changing tool-loop behavior before the final assistant answer. Consider restricting auto-compaction to pre-first-turn (or preserving latest tool-result context explicitly) to avoid semantic loss.

requestTokens := int(tokenusage.ApproximateRequestTokens(messages))
compacted, compactErr := r.maybeAutoCompactSession(ctx, sess, promptTokens, requestTokens)
if compactErr != nil {
retried, retryErr := r.retryAfterPromptTooLong(ctx, sess, out, compactErr, &reactiveRetries, maxReactiveRetries)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When preflight compaction returns local prompt too long (from promptTokens), this path still attempts reactive compaction/retry. Because the latest user message is preserved, retries often cannot reduce the offending token source and can add avoidable compaction round-trips. Consider bypassing reactive retry for locally generated prompt-too-long preflight errors.


func IsContextTooLongMessage(message string) bool {
message = strings.ToLower(strings.TrimSpace(message))
if message == "" {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IsContextTooLongMessage relies on broad substring hints (for example "context window", "too many tokens") over arbitrary error text. This can misclassify unrelated failures as context-too-long and trigger unnecessary reactive compaction/retry. Consider tightening classification order (provider code/status first, then narrower fallback matching).

@xgopilot
Copy link
Copy Markdown

xgopilot bot commented Apr 13, 2026

Review Findings

  1. High internal/agent/runner.go:341-367
    maybeAutoCompactSession(...) now runs on every loop iteration (the step == 0 guard was removed). In tool-driven turns, this can compact mid-loop and replace structured tool_result context with a summary before the model finishes the turn, which can change tool-loop semantics and final-answer correctness.

  2. Medium internal/agent/runner.go:347-356, internal/agent/compaction.go:85-88
    When preflight detects local newPromptTooLongError(...) (based on promptTokens), the code still routes through retryAfterPromptTooLong(...). Because promptTokens is derived from the latest user input and remains unchanged, reactive compaction usually cannot resolve this and adds an avoidable compaction+retry cycle.

  3. Low internal/llm/errors.go:23-30, internal/llm/errors.go:96-107
    IsContextTooLongMessage relies on broad free-text hints ("context window", "too many tokens", etc.). This can misclassify unrelated provider errors as context-too-long and trigger reactive compaction/retry in non-context failures.

Validation

  • Ran targeted tests: go test ./internal/agent ./internal/config ./internal/llm -v (pass).
  • No CLAUDE.md file was present in this repository; reviewed using provided AGENTS.md instructions.

Summary

Review completed with 3 findings above. No code changes were made.

View job run

@Nomikfk1215
Copy link
Copy Markdown
Collaborator Author

@xgopilot 所以现在这个pr需要大改吗,还是只需小修复让其通过ci合pr检测即可?

@xgopilot
Copy link
Copy Markdown

xgopilot bot commented Apr 13, 2026

结论

不需要大改,属于小修复范围。

当前这版我本地跑了:go test ./internal/agent ./internal/config ./internal/llm -v,都通过。也就是说从“过单测/CI”的角度,已经比较接近可合。

但为了避免合入后出现行为风险,建议至少补这 2 个关键点(都可定点修改,不是重构级):

  1. internal/agent/runner.go:347-356 / :341 附近
    maybeAutoCompactSession(...) 现在每轮循环都跑,可能在 tool loop 中途压缩上下文,影响 tool_result 语义。
  2. internal/agent/runner.go:349 + internal/agent/compaction.go:85-88
    本地 preflight 的 prompt too long 还会走 reactive retry,通常是无效重试,建议直接短路处理。

另一个是低优先级改进:

  • internal/llm/errors.go:96-106 的上下文超限文案匹配偏宽,可能误判。

回答你的问题

  • 如果目标是“尽快通过合 PR 检测”:小修即可
  • 如果目标是“合入后稳定性更高”:建议先把上面前两项修掉再合。

View job run

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 13, 2026

Codecov Report

❌ Patch coverage is 80.00000% with 22 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
internal/agent/runner.go 59.25% 8 Missing and 3 partials ⚠️
internal/agent/compaction.go 84.37% 3 Missing and 2 partials ⚠️
internal/agent/prompt_too_long.go 81.81% 3 Missing and 1 partial ⚠️
internal/llm/errors.go 81.81% 1 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

@Nomikfk1215
Copy link
Copy Markdown
Collaborator Author

@xgopilot 现在能够合并了吗 还有什么大问题吗

@xgopilot
Copy link
Copy Markdown

xgopilot bot commented Apr 13, 2026

结论

可以合并,从代码风险角度看目前没有阻塞级大问题。

Review Findings

  1. Low internal/llm/errors.go:23, internal/llm/errors.go:96
    IsContextTooLongMessage 仍然依赖关键词匹配(如 too many tokens / context window)。在部分厂商返回文案里,仍有小概率把非上下文超限错误误判成 context_too_long,触发不必要的 reactive compaction。这个是低优先级,可后续迭代收紧。

已核对项

  • 之前提到的两个关键风险已修复:
    • tool loop 中途自动压缩问题(step == 0 预检门控恢复)
    • 本地 preflight prompt too long 的无效 reactive retry
  • 本地已运行并通过:go test ./internal/agent ./internal/config ./internal/llm -v

View job run

@Nomikfk1215 Nomikfk1215 merged commit e6532f3 into 1024XEngineer:main Apr 14, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant