MUL-2539 fix(pi): strip leaked tool-call markup safely #2956
Thanks for the detailed write-up. After internal discussion, here's where we landed, split by the three concerns in this PR.

1. Comment-read context reduction: please don't change CLI defaults here

We agree with the underlying problem (long-issue context bloat in What's already in

That covers the "don't default to ingesting the entire timeline" goal without changing the CLI's public contract. Since

2. Agent-authored comment output hardening (HEREDOC /
Thanks for the review. I followed the requested split:
Validation for this PR: `go test ./pkg/agent -run 'TestStripPiToolCallMarkup|TestDrainPiTextBuffer|TestFlushPiTextBuffer'`
Bohan-J
left a comment
Thanks for the clean follow-up, @drybalka-s — scoping this PR down to just the Pi sanitizer is exactly what we asked for, and the implementation looks solid.
What I verified on HEAD 2535ceaf:
- Scope is right. Only `server/pkg/agent/pi.go` and `server/pkg/agent/pi_test.go` change. The CLI default-behavior change and the HEREDOC / `--content-file` work are out of this PR, as agreed.
- The previous must-fix is resolved. `flushPiTextBuffer` now applies only the control-token regex to leftover pending text, and `safePiTextEmitLen` only holds back trailing bytes that could still complete a markup prefix. Plain assistant text ending in unmatched `call:` / `response:` is preserved verbatim, with a regression test covering it (`TestFlushPiTextBufferKeepsUnmatchedToolPrefixes`).
- Streaming correctness. Split tool-call and split control-token across chunks are both covered (`TestDrainPiTextBufferSplitToolCall`, `TestDrainPiTextBufferSplitControlToken`).
- Tests pass locally: `go test ./pkg/agent -run 'TestStripPiToolCallMarkup|TestDrainPiTextBuffer|TestFlushPiTextBuffer'`, all green.
- CI is green (backend, frontend, installer × 2).
One non-blocking nit for a future PR if you feel like it: when buffered text starts with `call:` / `response:` but the parser can't yet decide if it's markup (e.g. an assistant reply that legitimately says `call: see below ...`), `drainPiSanitizedText` holds it in the buffer until EOF. The final output is correct, but the live-stream emit gets delayed for that segment. Splitting `scanPiToolMarkupEnd` into not-markup vs incomplete-markup so the obvious-not-markup case can keep streaming would be nice, but it's polish, not a blocker.
Approving and merging. Thanks again!
Refs #3047
Summary
- Strip `call:*`, `response:*`, `<tool_call|>`, and control-token markup from assistant text.
- Preserve plain text ending in unmatched `call:` or `response:` prefixes.

Problem solved
Pi can emit raw tool-call markup through `text_delta` events. That markup can leak into visible assistant output and issue comments, especially when it is split across streaming deltas. The sanitizer must remove confirmed markup without dropping legitimate plain text.

Validation
`go test ./pkg/agent -run 'TestStripPiToolCallMarkup|TestDrainPiTextBuffer|TestFlushPiTextBuffer'`