fix(spaces): keep Current Widget transient fresh and trim-protected #29

Open
nsyring wants to merge 1 commit into agent0ai:main from nsyring:fix/current-widget-transient-stability

nsyring commented Apr 26, 2026

Summary

Refresh the Current Widget transient section after readWidget() and after a failed patchWidget() / renderWidget() / upsertWidget(), and mark the section as trimAllowed: false so the prompt-budget trimmer cannot mid-replace its content. Together these changes give the agent a stable authoritative widget source on every turn, eliminating the most reliable trigger of "agent rewrites the entire widget" loops.

Why

The space-widgets skill contract promises:

"After patchWidget(), renderWidget(), or reloadWidget(), use the refreshed Current Widget on the next turn"
"find must be one exact unique snippet copied from readWidget() output or from Current Widget source↓"

Currently the contract is broken at three points:

  1. readWidget() does not refresh the transient. Its result lands in chat history only, where the long-message middle-replacement (trimPromptLongMessage in app/L0/_all/mod/_core/agent_prompt/prompt-items.js) can later inject the placeholder string <<N characters removed to optimize context, ...>> directly into the visible widget source. The agent then either copies that placeholder into a find snippet (patch fails because the text does not exist in storage) or pastes it into a renderWidget(...) body (the placeholder reads as a JavaScript syntax error and the renderer crashes on first execution with Unexpected identifier 'characters').

  2. A failed write leaves the transient stale. When the runtime validator in spaces/storage.js throws (overlapping edits, "patchWidget is for partial renderer edits only", missing find snippet), the call short-circuits before buildWidgetToolResult(...) runs, so the transient still reflects the prior turn — or is missing entirely if no successful write has happened yet. The agent retries against an out-of-date or absent source, often falling back to a full renderWidget(...) rewrite which can drift further from the current file state.

  3. The transient itself was trim-eligible by default. Even with read and failed-write refreshes wired up, prompt-budget pressure would still let the trimmer mid-replace the transient's source content, reintroducing failure mode 1.

Together the three problems form the most reliable trigger of "agent rewrites the entire widget for a small request" that I reproduced repeatedly during local development. They are infrastructural — the skill rules tell the agent the right thing, but the runtime cannot deliver on the promise.
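
To make failure mode 1 concrete, here is a minimal sketch of a middle-replacement landing inside renderer source. `trimMiddle` and `compiles` are hypothetical stand-ins written for this illustration; the real `trimPromptLongMessage` may use different thresholds and placeholder wording, and the `...` in the placeholder is kept exactly as elided above. The sample renderer body is invented.

```javascript
// Hypothetical stand-in for the long-message middle-replacement.
// Keeps `keepEachSide` characters at both ends and swaps the middle
// for a placeholder (wording abbreviated exactly as in the description).
function trimMiddle(text, keepEachSide) {
  if (text.length <= keepEachSide * 2) return text;
  const removed = text.length - keepEachSide * 2;
  return (
    text.slice(0, keepEachSide) +
    `<<${removed} characters removed to optimize context, ...>>` +
    text.slice(-keepEachSide)
  );
}

// Invented renderer body standing in for real widget source.
const rendererSource = [
  "const rows = [];",
  "for (let i = 0; i < 50; i += 1) {",
  "  rows.push('row ' + i);",
  "}",
  "return rows.join('\\n');",
].join("\n");

const trimmedSource = trimMiddle(rendererSource, 20);

// Returns true when the text parses as a JavaScript function body.
function compiles(body) {
  try {
    new Function(body);
    return true;
  } catch (error) {
    return false;
  }
}

console.log(compiles(rendererSource)); // original source parses
console.log(compiles(trimmedSource)); // trimmed copy does not
```

The exact parser message depends on where the cut lands, so the sketch only checks whether the text still parses. Either way, a `find` snippet copied from `trimmedSource` cannot match what is in storage.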

What changed

app/L0/_all/mod/_core/spaces/store.js:

  • New refreshCurrentWidgetTransientFromStorage({spaceId, widgetId}) helper reads the widget from storage and republishes the Current Widget transient section with the same shape used after a successful write. Silent no-op if the widget cannot be resolved.
  • readWidget(widgetName) calls the new helper after the storage read succeeds.
  • patchWidget, renderWidget, and upsertWidget wrap their storage call in try/catch and refresh the transient from the (still unchanged) on-disk source before re-throwing, so the agent's next turn sees the real current source.
  • The Current Widget transient section is now created with trimAllowed: false. A comment at the set-site documents why: the placeholder-as-renderer corruption mode is correctness-breaking, not just performance-degrading.
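
The shape of these store.js changes can be sketched roughly as follows. This is not the code from the diff: the `storage` and `transient` maps are in-memory stand-ins, the functions are synchronous for brevity, and the argument lists are simplified assumptions. Only the function names and the `trimAllowed: false` flag come from this PR.

```javascript
// In-memory stand-ins (assumptions); the real storage and transient APIs differ.
const storage = new Map([["clock", { source: "return 'tick';" }]]);
const transient = new Map();

// Republishes the Current Widget section from storage.
// Returns false without throwing if the widget cannot be resolved.
function refreshCurrentWidgetTransientFromStorage({ spaceId, widgetId }) {
  const widget = storage.get(widgetId);
  if (!widget) return false;
  transient.set("Current Widget", {
    spaceId,
    widgetId,
    content: widget.source,
    trimAllowed: false, // the budget trimmer must not touch this section
  });
  return true;
}

// Simplified signature: reads the source and refreshes the transient.
function readWidget(spaceId, widgetId) {
  const widget = storage.get(widgetId);
  if (!widget) throw new Error(`Unknown widget: ${widgetId}`);
  refreshCurrentWidgetTransientFromStorage({ spaceId, widgetId });
  return widget.source;
}

// Simplified signature: applies one find/replace edit.
function patchWidget(spaceId, widgetId, { find, replace }) {
  try {
    const widget = storage.get(widgetId);
    if (!widget || !widget.source.includes(find)) {
      throw new Error("find snippet not present in stored source");
    }
    widget.source = widget.source.replace(find, replace);
    // Stand-in for the success-path refresh done by buildWidgetToolResult(...).
    refreshCurrentWidgetTransientFromStorage({ spaceId, widgetId });
    return widget.source;
  } catch (error) {
    // Failure path: republish the unchanged stored source, then re-throw.
    refreshCurrentWidgetTransientFromStorage({ spaceId, widgetId });
    throw error;
  }
}

// A failed patch still leaves the transient pointing at the stored source.
let patchError = null;
try {
  patchWidget("demo-space", "clock", { find: "tock", replace: "x" });
} catch (error) {
  patchError = error;
}
```

After the failed call, `transient.get("Current Widget").content` holds the stored source, so a retry has something real to copy a `find` snippet from.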

app/L0/_all/mod/_core/onscreen_agent/store.js and app/L0/_all/mod/_core/onscreen_agent/llm.js:

  • normalizeTransientSection(...) and createTransientPromptItem(...) now thread the trimAllowed flag through to the prompt-item definition. Without this propagation, the trimAllowed: false set on the section would be lost between transient.set(...) and the budget trimmer (which already honored trimAllowed: false in agent_prompt/prompt-items.js but had no caller passing it through).
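
As a rough illustration of the propagation described here, the sketch below threads `trimAllowed` from a section to a prompt item and lets a toy budget pass honor it. The object shapes, the `kind` field, and `applyToyBudget` are assumptions made for this example; only the two function names and the flag itself come from the PR.

```javascript
// Assumed section shape: { heading, content, trimAllowed? }.
function normalizeTransientSection(section) {
  return {
    heading: String(section.heading ?? ""),
    content: String(section.content ?? ""),
    // Trimmable by default; only an explicit `false` opts out.
    trimAllowed: section.trimAllowed !== false,
  };
}

// Assumed prompt-item shape; the key point is that trimAllowed survives.
function createTransientPromptItem(section) {
  const normalized = normalizeTransientSection(section);
  return {
    kind: "transient",
    heading: normalized.heading,
    text: normalized.content,
    trimAllowed: normalized.trimAllowed,
  };
}

// Toy budget pass (hypothetical): shortens only items that allow trimming.
function applyToyBudget(items, maxCharsPerItem) {
  return items.map((item) =>
    item.trimAllowed && item.text.length > maxCharsPerItem
      ? { ...item, text: item.text.slice(0, maxCharsPerItem) }
      : item
  );
}

const promptItems = [
  createTransientPromptItem({
    heading: "Current Widget",
    content: "x".repeat(50),
    trimAllowed: false,
  }),
  createTransientPromptItem({ heading: "Notes", content: "y".repeat(50) }),
];

const budgeted = applyToyBudget(promptItems, 10);
```

If either function dropped the flag, the first item would fall back to the trimmable default and be shortened like the second one.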

The change is scoped to the onscreen agent surface. Admin chat does not currently engage the long-message trimmer — admin/views/agent/api.js:formatTransientMessageBlock concatenates transient sections directly without going through applyPromptPartBudget — so the trimAllowed flag would be unread metadata there today. If admin chat later adopts a trimmer, that PR would need to wire the flag through its own normalizeTransientSection in the same pattern; doing it here would just add dead code with no enforcement of the invariant.

No new dependencies, no behavior change for the success paths — they continue to refresh the transient via buildWidgetToolResult(...) exactly as before. The new failure-path refresh runs before the original error is re-thrown, so callers see the same exception they did pre-PR.

Failure-path refresh ordering

The refresh runs before throw error, so:

  • The original exception still surfaces to the caller (skill contract preserved)
  • The transient state is consistent with the actual on-disk file by the time the next prompt build runs
  • If the refresh itself fails for any reason (storage error, missing widget) it returns false silently — never masks the original write error
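
The ordering guarantee can be sketched in isolation. Everything below is hypothetical scaffolding (`readFromStorage`, `refreshTransient`, `failingWrite`, and the `events` log are invented for this example); it simulates the worst case where the refresh also fails, and checks that the caller still receives the original write error.

```javascript
const events = [];

// Simulated storage read that fails, so the refresh itself cannot succeed.
function readFromStorage() {
  throw new Error("storage unavailable");
}

// Refresh helper: reports failure as `false` instead of throwing.
function refreshTransient() {
  try {
    readFromStorage();
    return true;
  } catch {
    return false;
  }
}

const writeError = new Error("overlapping edits");

// Write whose validation fails; the refresh runs before the re-throw.
function failingWrite() {
  try {
    throw writeError;
  } catch (error) {
    events.push(`refresh:${refreshTransient()}`);
    events.push("rethrow");
    throw error;
  }
}

let surfaced = null;
try {
  failingWrite();
} catch (error) {
  surfaced = error;
}
```

Here `surfaced` is the same object as `writeError`, and `events` records the refresh attempt ahead of the re-throw.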

No regression for the success paths

patchWidget / renderWidget / upsertWidget still call buildWidgetToolResult(...) on success, which already refreshes the transient. The new try/catch only adds a refresh on the failure path; it does not change what happens when the call succeeds.

readWidget(...) previously emitted the read result as a chat-history message only. With this PR it additionally publishes the same result via the transient envelope; it does not change what readWidget returns to the caller.

Test plan

  • node --check on every modified file passes
  • Existing tests pass (tests/spaces_prompt_context_test.mjs, tests/spaces_widget_import_test.mjs)
  • Manual end-to-end verification with two providers in npm run desktop:pack builds:
    • gpt-5.4 via OpenAI Codex provider — repeated targeted patchWidget edits across multi-turn widget development without Unexpected identifier 'characters' syntax errors and without falling back to renderWidget rewrites for small changes
    • Local qwen3-coder model — the same provider-agnostic behavior. Even a smaller local model now produces consistent targeted edits when working on existing widgets, where the same setup against main reliably triggers full-renderer rewrites

Out of scope (possible follow-ups)

  • Long-message placeholder byte-length stabilization in prompt-items.js so the placeholder remains syntactically inert (e.g. emitted as a JavaScript block comment) even when it does land inside trimmed history. Orthogonal to this PR; useful even if the transient stays clean, because readWidget results in chat history can still be trimmed.
  • Graceful fallback when a single widget source exceeds the configured transient budget. The current PR pins trimAllowed: false and lets the budget overflow if the widget is genuinely huge. In practice the default 30% transient budget at 120k+ max-tokens accommodates any widget I have shipped; if this becomes a real problem, a "widget too large; call readWidget(id) directly" hint section would be the right shape.

🤖 Generated with Claude Code

The Current Widget transient envelope is the authoritative source the
agent uses to build exact-snippet patches. Three bugs let trimming and
stale state corrupt that source:

1. readWidget did not refresh the transient. Its result landed in chat
   history only, where the long-message middle-replacement could
   later inject the placeholder string into the visible widget source.
   The agent then either copied that placeholder into a `find` snippet
   (patch fails because the text does not exist in storage) or pasted
   it into a renderWidget(...) body (the placeholder reads as a
   JavaScript syntax error - "Unexpected identifier 'characters'" -
   and the renderer crashes on first execution).

2. A failed patchWidget / renderWidget / upsertWidget left the
   transient stale. The runtime validator throws (overlapping edits,
   "patchWidget is for partial only", missing find snippet) before
   buildWidgetToolResult runs, so the transient still reflects the
   prior turn or is missing entirely. The agent retries against an
   out-of-date or absent source, often falling back to a full
   renderWidget rewrite.

3. The transient itself was trim-eligible by default. Even with read
   and failed-patch refreshes wired up, the prompt-budget trimmer
   would still mid-replace the source content under pressure,
   reintroducing the placeholder-in-renderer corruption.

Three changes close the loop:

- New refreshCurrentWidgetTransientFromStorage(...) helper reads the
  widget from storage and republishes the Current Widget section.
  Called from readWidget on success, and from patchWidget,
  renderWidget, and upsertWidget try/catch blocks before re-throwing.
  Refresh failures are silent so the original error is never masked.

- Current Widget transient section is now marked trimAllowed: false.
  Comment at the set-site documents why (placeholder-as-renderer
  syntax error). The onscreen-agent transient runtime and prompt-item
  normalizer thread the flag through to the agent_prompt
  applyPromptPartBudget step, which already honored
  trimAllowed: false but had no caller passing it through.

The change is scoped to the onscreen agent surface only. Admin chat
does not currently engage the long-message trimmer (it concatenates
transient sections directly in api.js:formatTransientMessageBlock).
If admin chat later adopts a trimmer, that PR would need to wire
trimAllowed through its own normalizeTransientSection in the same
pattern; this is intentionally left to that future change rather than
adding dead code here.

Verified end-to-end with both gpt-5.4 (Codex provider) and a local
qwen3-coder model: repeated targeted patchWidget edits across multiple
turns without falling back to renderWidget rewrites; transient source
remains byte-stable across budget pressure.