fix: prevent agent loop stall on write_to_file filesystem errors (#703)#727
fix: prevent agent loop stall on write_to_file filesystem errors (#703)#727awschmeder wants to merge 2 commits into
Conversation
…oo-Code-Org#703) - Remove unguarded createDirectoriesForFile call from handlePartial; the call was a redundant optimization (execute() already creates dirs before open()) and its unguarded throw caused the partial-block advancement gate in presentAssistantMessage to be skipped, permanently stalling the agent loop - Move createDirectoriesForFile in execute() inside the try block so EROFS/ EACCES errors route through handleError with diffViewProvider.reset() cleanup and consecutive-mistake counting, rather than escaping unhandled - Add regression tests covering both failure paths
…_file filesystem failure When write_to_file hits a filesystem error (EROFS/EACCES) the streaming phase left the "Zoo wants to edit this file" spinner running, surfaced the same error twice (handlePartial + execute), and spawned a new partial tool message on every subsequent streaming delta. - Add Task.finalizePartialToolAsk() to finalize a partial tool ask without blocking on user input, dismissing the spinner. - handlePartial swallows streaming filesystem errors (after finalizing the spinner and resetting the diff view) so only the authoritative execute() error is reported, eliminating the duplicate error bubble. - Track partialStreamFailed so later streaming deltas short-circuit instead of re-attempting and spawning repeated partial tool messages. - Add regression tests for spinner finalization, single-error reporting, and no repeated partial messages.
📝 WalkthroughWalkthrough
ChangesPartial write-to-file error recovery
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/core/task/Task.ts`:
- Around line 1739-1745: Persist the finalized tool-ask state in
finalizePartialToolAsk by updating the underlying clineMessages entry, not just
the webview. After clearing the partial flag on the last ask/tool message, make
sure the task state is saved through the same persistence path used by
task.ask(..., false) in EditFileTool so the finalized message is written before
any reload/resume. Use the existing updateClineMessage flow only as the UI
update, and add the missing persistence step around finalizePartialToolAsk and
the clineMessages mutation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: fe35d544-0a5c-4962-a19e-ead7a0d45383
📒 Files selected for processing (3)
src/core/task/Task.tssrc/core/tools/WriteToFileTool.tssrc/core/tools/__tests__/writeToFileTool.spec.ts
| async finalizePartialToolAsk(): Promise<void> { | ||
| const lastMessage = this.clineMessages.at(-1) | ||
|
|
||
| if (lastMessage && lastMessage.partial && lastMessage.type === "ask" && lastMessage.ask === "tool") { | ||
| lastMessage.partial = false | ||
| await this.updateClineMessage(lastMessage) | ||
| } |
There was a problem hiding this comment.
🗄️ Data Integrity & Integration | 🟡 Minor | ⚡ Quick win
Persist the finalized tool-ask state.
Line 1744 only calls updateClineMessage(), which updates the webview but does not save this.clineMessages. The existing finalization path in src/core/tools/EditFileTool.ts:150-171 goes through task.ask(..., false), which persists the row first. If a swallowed partial-stream failure is followed by a reload/resume before another save happens, this message is still stored as partial: true and the spinner can come back.
Suggested fix
async finalizePartialToolAsk(): Promise<void> {
const lastMessage = this.clineMessages.at(-1)
if (lastMessage && lastMessage.partial && lastMessage.type === "ask" && lastMessage.ask === "tool") {
lastMessage.partial = false
+ await this.saveClineMessages()
await this.updateClineMessage(lastMessage)
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| async finalizePartialToolAsk(): Promise<void> { | |
| const lastMessage = this.clineMessages.at(-1) | |
| if (lastMessage && lastMessage.partial && lastMessage.type === "ask" && lastMessage.ask === "tool") { | |
| lastMessage.partial = false | |
| await this.updateClineMessage(lastMessage) | |
| } | |
| async finalizePartialToolAsk(): Promise<void> { | |
| const lastMessage = this.clineMessages.at(-1) | |
| if (lastMessage && lastMessage.partial && lastMessage.type === "ask" && lastMessage.ask === "tool") { | |
| lastMessage.partial = false | |
| await this.saveClineMessages() | |
| await this.updateClineMessage(lastMessage) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/core/task/Task.ts` around lines 1739 - 1745, Persist the finalized
tool-ask state in finalizePartialToolAsk by updating the underlying
clineMessages entry, not just the webview. After clearing the partial flag on
the last ask/tool message, make sure the task state is saved through the same
persistence path used by task.ask(..., false) in EditFileTool so the finalized
message is written before any reload/resume. Use the existing updateClineMessage
flow only as the UI update, and add the missing persistence step around
finalizePartialToolAsk and the clineMessages mutation.
write-to-file-error-correct.mov |
Related GitHub Issue
Closes: #703
Description
The bug (#703): When the model uses
write_to_fileto create a file whose parent directory cannot be created or written (e.g. a read-only mount or a path under a directory lacking write permission, producingEROFS/EACCES), the agent loop stalled permanently.handlePartialcalledcreateDirectoriesForFilewithout a.catch()guard. During streaming (block.partial === true) the unguarded throw was caught byBaseTool.handle->handleError, but the partial-block advancement gate inpresentAssistantMessagewas skipped (block.partialwas stilltrueand neitherdidRejectToolnordidAlreadyUseToolwas set), souserMessageContentReadywas never set and the task hung forever.The fix:
createDirectoriesForFilecall fromhandlePartial(execute()already creates parent directories beforediffViewProvider.open()).createDirectoriesForFileinexecute()inside thetryblock soEROFS/EACCESerrors route throughhandleErrorwithdiffViewProvider.reset()cleanup and consecutive-mistake counting instead of escaping unhandled.Correct streaming UI behavior under a filesystem error. Because the failure now occurs while a partial
write_to_fileblock is still streaming, the implementation is designed so the in-progress UI resolves cleanly rather than leaving artifacts. The following are properties of the implementation:Task.finalizePartialToolAsk()(a non-blocking helper that sets the last partialtoolask topartial: false) dismisses it. It deliberately avoidstask.ask(..., false), which would fall through to a blockingpWaitForthat would result in a non-terminating spinner left in the chat UI.handlePartialfinalizes the partial ask, resets the diff view, and swallows the streaming error (logged to the console) so only the authoritative non-partialexecute()path surfaces it to the user. This is safe because the loop advancement gate does not depend on the streaming throw -- it advances when the non-partial block arrives.partialStreamFailedflag (cleared between invocations viaresetPartialState()) short-circuits later deltas so they do not re-attemptopen()/update()and spawn a new partial tool message each time.Test Procedure
Automated (run from
src/):31 tests pass, including new regression tests that assert:
EROFSinhandlePartialdoes not callcreateDirectoriesForFileand does not stall.EROFS/EACCESinexecute()routes throughhandleErrorwithdiffViewProvider.reset()and does not proceed to open/save.handlePartialopen()/update()throws,finalizePartialToolAsk()anddiffViewProvider.reset()are called and the streaming error is swallowed (not surfaced viahandleError).execute()phases.open().npx tsc --noEmitpasses cleanly.Manual (Extension Development Host / F5):
Pre-Submission Checklist
Documentation Updates
Summary by CodeRabbit