fix(vscode-lm): reliable auto context condensing #710

Open
simurg79 wants to merge 6 commits into Zoo-Code-Org:main from simurg79:fix/vscode-lm-condense

Conversation

simurg79 commented Jun 24, 2026

Contributor

Related GitHub Issue

Closes #714

Description

Fixes unreliable automatic context condensing on the VS Code LM (vscode-lm) provider. The provider reports maxTokens: -1 (unlimited) and an inflated live context window, so the auto-condense gate computed usage against the wrong denominator and effectively never fired even when the UI context gauge showed the window as full.
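
To make the denominator problem concrete, here is a small sketch. It borrows the claude-opus-4.8 figures quoted in the review thread further down (an advertised window of 679560 versus a maxInputTokens of 197897) purely as illustrative numbers. The 75% threshold, the token count, and the helper name usagePercent are assumptions, not values from the codebase.

```typescript
// Illustrative only: the threshold and usage figures are assumed, not taken from the PR.
const ASSUMED_THRESHOLD_PERCENT = 75
const usedTokens = 190_000

// Percentage of a window consumed by the conversation so far.
function usagePercent(used: number, window: number): number {
	return (100 * used) / window
}

const inflatedWindow = 679_560 // advertised window, far larger than the usable input
const usableInput = 197_897 // curated maxInputTokens for the same model

const againstInflated = usagePercent(usedTokens, inflatedWindow) // roughly 28%
const againstUsable = usagePercent(usedTokens, usableInput) // roughly 96%

// The gate only fires once usage reaches the threshold.
const firesWithInflated = againstInflated >= ASSUMED_THRESHOLD_PERCENT // false
const firesWithUsable = againstUsable >= ASSUMED_THRESHOLD_PERCENT // true
```

With the same conversation size, the gate stays silent against the inflated window while the usable-input denominator is already close to full, which matches the symptom described above.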

  • Treat maxTokens: -1 (unlimited) as the default output reserve in both willManageContext and manageContext instead of letting a negative value distort the window math.
  • Measure usage against the available input space (contextWindow - reservedForOutput), matching the UI gauge, with a safe fallback to the full window when the reserve is unknown/unlimited.
  • Add an optional getCondenseContextWindow() ApiHandler seam; only the vscode-lm provider overrides it to drive the gate from the curated static-table maxInputTokens. Every other provider falls back to modelInfo.contextWindow (behavior unchanged).
  • Refresh the VS Code LM model catalog and default model (claude-sonnet-4.5).
  • UI guards so the context bar treats a negative maxTokens as zero reserve and resolves an unlisted family to the default-model window.
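
The negative-reserve guard mentioned in these bullets can be sketched as follows. Both helper names are hypothetical; only the -1 sentinel and the contextWindow - reservedForOutput subtraction come from the PR text. The sketch models the zero-reserve behavior described for the UI bar and does not model any default reserve that the truncation path may substitute.

```typescript
// Hypothetical helpers; names are not from the codebase.
function naiveAvailableInput(contextWindow: number, maxTokens: number): number {
	// A -1 "unlimited" sentinel makes the available space LARGER than the window itself.
	return contextWindow - maxTokens
}

function guardedAvailableInput(contextWindow: number, maxTokens?: number): number {
	// Non-positive or missing values reserve nothing for output.
	const reservedForOutput = maxTokens !== undefined && maxTokens > 0 ? maxTokens : 0
	return contextWindow - reservedForOutput
}

const naive = naiveAvailableInput(167_790, -1) // 167_791, one token more than the window
const guarded = guardedAvailableInput(167_790, -1) // 167_790
```

The off-by-one in the naive version is small, but it shows the sentinel leaking into arithmetic that was written for real token counts.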

Test Procedure

  • packages/types/src/__tests__/vscode-llm.spec.ts: model catalog invariants.
  • src/core/context-management/__tests__/context-management.spec.ts: reserve guard + available-input denominator, including the availableInputTokens <= 0 → 100% fallback and end-to-end manageContext summarization.
  • src/api/providers/__tests__/vscode-lm.spec.ts: getCondenseContextWindow() resolution (static family, live fallback, non-positive guard).
  • webview-ui/src/components/chat/__tests__/TaskHeader.spec.tsx, useSelectedModel.spec.ts: UI reserve/window guards.
  • Full CI: all checks green; codecov/patch at 88.88% (≥ 80% target).

Pre-Submission Checklist

  • Issue Linked: Closes #714, "vscode-lm: automatic context condensing never triggers (maxTokens -1 + inflated window)" (see "Related GitHub Issue" above).
  • Scope: Changes are scoped to the vscode-lm auto-condense fix (one fix per PR).
  • Self-Review: I have performed a thorough self-review of my code.
  • Testing: Added/updated unit tests for the changed logic.
  • Documentation Impact: Considered; no user-facing docs required (behavior aligns auto-condense with the existing context gauge).
  • Contribution Guidelines: I have read and agree to the Contributor Guidelines.
  • No extension version bump (Zoo-Code release versioning is maintainer-managed); no agent-created changeset.
  • All CI checks pass.

Documentation Updates

  • No documentation updates are required; behavior aligns auto-condense with the existing context gauge.

Additional Notes

Port of simurg79/Roo-Code#11 into Zoo-Code. Applied by context (paths map 1:1); the upstream version bump was intentionally omitted.

Summary by CodeRabbit

  • New Features

    • Refreshed VS Code LLM provider model catalog and updated the default model selection.
    • Improved provider-aware auto-condense context-window behavior.
  • Bug Fixes

    • Auto-summarization thresholds now use available input space, with safer handling when output is “unlimited”.
    • Fixed context usage percentage display so negative/unlimited output settings don’t skew results.
  • Tests

    • Expanded coverage for VS Code LLM model selection, context-window derivation, and auto-condense gating scenarios.

coderabbitai Bot commented Jun 24, 2026

Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Updates the VS Code LM model catalog, adds a getCondenseContextWindow() seam on ApiHandler/VsCodeLmHandler, changes context-management threshold math to use available input space, and adjusts webview model selection and reserved-output handling.

Changes

VS Code LM condense gate fix

| Layer / File(s) | Summary |
|---|---|
| Catalog and API contract: packages/types/src/providers/vscode-llm.ts, src/api/index.ts, packages/types/src/__tests__/vscode-llm.spec.ts | vscodeLlmDefaultModelId changes to "claude-sonnet-4.5", vscodeLlmModels is replaced with a curated catalog that distinguishes contextWindow from maxInputTokens, and getCondenseContextWindow?() is added to ApiHandler. Catalog tests validate field positivity, exclusions, and default-model presence. |
| VsCodeLmHandler condense window: src/api/providers/vscode-lm.ts, src/api/providers/__tests__/vscode-lm.spec.ts | getCondenseContextWindow() is added to VsCodeLmHandler, using vscodeLlmModels[family].maxInputTokens with fallback to getModel().info.contextWindow. Tests cover large advertised values, small values, non-numeric fallback, static-table hits, unknown-family fallback, no-family fallback, and non-positive static entry guard. |
| Context management math: src/core/context-management/index.ts, src/core/context-management/__tests__/context-management.spec.ts | willManageContext and manageContext now treat non-positive maxTokens as unlimited and compute contextPercent against available input space (contextWindow - reservedForOutput) when enabled by the new flag. Regression tests cover the updated threshold math, maxTokens: -1, and reserve-≥-window edge cases. |
| Task condense-window wiring: src/core/task/Task.ts | Both the truncation path and the context-management pre-check in attemptApiRequest now read getCondenseContextWindow?.() when available, falling back to modelInfo.contextWindow. Abort-signal listeners are reformatted to multiline addEventListener with { once: true }. |
| Webview model selection and display: webview-ui/src/components/ui/hooks/useSelectedModel.ts, webview-ui/src/components/chat/TaskHeader.tsx, webview-ui/src/components/chat/__tests__/TaskHeader.spec.tsx, webview-ui/src/components/ui/hooks/__tests__/useSelectedModel.spec.ts | TaskHeader clamps negative maxTokens to zero for reserved output math. The vscode-lm branch of useSelectedModel introduces a default-model fallback and sets contextWindow explicitly from listedModel.maxInputTokens. Tests cover the unlimited-output display case and listed vs. unlisted family resolution. |
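
The seam described in the walkthrough can be sketched as below. This is a minimal illustration under stated assumptions, not the PR's actual code: ApiHandlerLike, CatalogRow, makeVsCodeLmLikeHandler, and condenseWindowFor are hypothetical stand-ins, and the two catalog rows use the figures quoted in this PR's review thread.

```typescript
// Minimal stand-ins for the real types; field names follow the walkthrough.
interface CatalogRow {
	contextWindow: number
	maxInputTokens: number
}

interface ApiHandlerLike {
	getModel(): { info: { contextWindow: number } }
	// Optional seam: only a provider that needs a different gate window implements it.
	getCondenseContextWindow?(): number
}

const catalog: Record<string, CatalogRow> = {
	"claude-sonnet-4.5": { contextWindow: 167_790, maxInputTokens: 167_790 },
	"claude-opus-4.8": { contextWindow: 679_560, maxInputTokens: 197_897 },
}

// Hypothetical factory standing in for VsCodeLmHandler.
function makeVsCodeLmLikeHandler(family: string | undefined, liveWindow: number): ApiHandlerLike {
	return {
		getModel: () => ({ info: { contextWindow: liveWindow } }),
		getCondenseContextWindow: () => {
			const row: CatalogRow | undefined = family ? catalog[family] : undefined
			// Use the curated value only when it is a positive number.
			return row && row.maxInputTokens > 0 ? row.maxInputTokens : liveWindow
		},
	}
}

// Caller side: prefer the seam, otherwise use the model's context window.
function condenseWindowFor(handler: ApiHandlerLike): number {
	return handler.getCondenseContextWindow?.() ?? handler.getModel().info.contextWindow
}
```

Keeping the method optional means handlers that never define it take the `??` branch, which is how the unchanged-behavior claim for other providers would hold in this sketch.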

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • Zoo-Code-Org/Zoo-Code#674: Also changes src/core/task/Task.ts cancellation plumbing, so it is related at the request-abort handling level.

Suggested labels

awaiting-review

Suggested reviewers

  • navedmerchant
  • hannesrudolph
  • edelauna
  • JamesRobert20

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title is concise and accurately reflects the main change: fixing vscode-lm auto context condensing. |
| Description check | ✅ Passed | The description follows the template and includes issue linkage, summary, test procedure, checklist, and docs notes. |
| Linked Issues check | ✅ Passed | The changes address the linked issue by fixing reserve math, using available input space, and scoping the provider-specific override to vscode-lm. |
| Out of Scope Changes check | ✅ Passed | No clear out-of-scope changes are present; the model catalog, UI guards, and tests all support the vscode-lm condensing fix. |

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.changeset/vscode-lm-condense-fix.md:
- Around line 1-5: The changeset file `.changeset/vscode-lm-condense-fix.md` was
created but should not be included in this PR since changesets are managed by
maintainers outside the normal development workflow. Delete the entire
`.changeset/vscode-lm-condense-fix.md` file and allow maintainers to create the
proper changeset entry separately.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 011fdda6-5a92-4a6e-a80c-e1b4d20c91ab

📥 Commits

Reviewing files that changed from the base of the PR and between e8acc6a and 1eadaea.

📒 Files selected for processing (13)
  • .changeset/vscode-lm-condense-fix.md
  • packages/types/src/__tests__/vscode-llm.spec.ts
  • packages/types/src/providers/vscode-llm.ts
  • src/api/index.ts
  • src/api/providers/__tests__/vscode-lm.spec.ts
  • src/api/providers/vscode-lm.ts
  • src/core/context-management/__tests__/context-management.spec.ts
  • src/core/context-management/index.ts
  • src/core/task/Task.ts
  • webview-ui/src/components/chat/TaskHeader.tsx
  • webview-ui/src/components/chat/__tests__/TaskHeader.spec.tsx
  • webview-ui/src/components/ui/hooks/__tests__/useSelectedModel.spec.ts
  • webview-ui/src/components/ui/hooks/useSelectedModel.ts

Comment thread .changeset/vscode-lm-condense-fix.md Outdated
codecov Bot commented Jun 24, 2026

Codecov Report

❌ Patch coverage is 90.90909% with 3 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/core/task/Task.ts | 81.81% | 2 Missing ⚠️ |
| src/core/context-management/index.ts | 92.30% | 0 Missing and 1 partial ⚠️ |

Add targeted tests for the previously-uncovered ported branches: the availableInputTokens<=0 fallback to 100% in willManageContext/manageContext, getCondenseContextWindow() guard fallbacks, and the vscode-lm UI family-miss window resolution. Raises patch coverage to satisfy the codecov/patch 80% gate.

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/api/providers/__tests__/vscode-lm.spec.ts`:
- Around line 504-521: The test is mutating the static model row for the
selector family, but the mocked client currently uses a different family so
VsCodeLmHandler never consults that row. Update the test setup in the
VsCodeLmHandler/getCondenseContextWindow case so the mockLanguageModelChat
family matches "claude-opus-4.8", or avoid assigning client so the selector
family path is used; this ensures the zeroed maxInputTokens row is actually
exercised and the live-window fallback assertion is valid.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 7cff98c0-cb87-4d62-9e6d-29e09bf5e9b0

📥 Commits

Reviewing files that changed from the base of the PR and between 1eadaea and 62a556c.

📒 Files selected for processing (2)
  • src/api/providers/__tests__/vscode-lm.spec.ts
  • src/core/context-management/__tests__/context-management.spec.ts

Comment thread src/api/providers/__tests__/vscode-lm.spec.ts Outdated
…w guard test

- Remove .changeset/vscode-lm-condense-fix.md (changesets are maintainer-managed per AGENTS.md; CodeRabbit flagged).

- Fix getCondenseContextWindow() non-positive-guard test so the selector family (claude-opus-4.8) drives the lookup and the zeroed static row actually exercises the maxInputTokens > 0 guard before falling back.

@edelauna edelauna left a comment

Contributor

Thanks for addressing this, had some comments around the implementation.

Comment on lines +318 to +320
// contextWindow MUST equal maxInputTokens: that is the exact value the gate consumes via
// getModel().info.contextWindow = Math.max(0, client.maxInputTokens) in src/api/providers/vscode-lm.ts,
// so the UI bar and the condense gate share a single source of truth.

Contributor

This comment says the gate consumes getModel().info.contextWindow, but Task.ts now calls getCondenseContextWindow?.() as the primary path (which returns the static table's maxInputTokens). getModel() is the fallback, not the primary. Worth updating the comment so it doesn't mislead future readers?

Contributor Author

Fixed in d128a5c — corrected the comment to state the condense gate's primary window comes from getCondenseContextWindow() (the static-table maxInputTokens), with getModel().info.contextWindow as the fallback.

expect(result.current.provider).toBe("vscode-lm")
expect(result.current.id).toBe(`copilot/${family}`)
// The bar and the condense gate share one source of truth: contextWindow === maxInputTokens.
expect(result.current.info?.contextWindow).toBe(vscodeLlmModels[family].maxInputTokens)

Contributor

This test uses vscodeLlmDefaultModelId (claude-sonnet-4.5) where contextWindow === maxInputTokens === 167790. If someone accidentally swapped listedModel.maxInputTokens for listedModel.contextWindow on line 324 of useSelectedModel.ts, this test would still pass. Adding one test with claude-opus-4.8 (the only row where contextWindow: 679560 ≠ maxInputTokens: 197897) would catch that mutation.

Contributor Author

Done in d128a5c — added a claude-opus-4.8 case (contextWindow 679560 != maxInputTokens 197897) asserting contextWindow === maxInputTokens and not the advertised window, so a field swap is now caught.
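
The reasoning behind this extra test case can be shown with a short sketch. The row values are the ones quoted in this thread; the intended and swapped functions are hypothetical stand-ins for the correct and accidentally mutated field reads in useSelectedModel.ts.

```typescript
interface CatalogRow {
	contextWindow: number
	maxInputTokens: number
}

const sonnetRow: CatalogRow = { contextWindow: 167_790, maxInputTokens: 167_790 }
const opusRow: CatalogRow = { contextWindow: 679_560, maxInputTokens: 197_897 }

// Intended behavior: the UI window comes from maxInputTokens.
const intended = (row: CatalogRow): number => row.maxInputTokens
// The accidental mutation the reviewer describes: reading the wrong field.
const swapped = (row: CatalogRow): number => row.contextWindow

// On the default-model row both versions agree, so an assertion there cannot tell them apart.
const sonnetDetectsSwap = intended(sonnetRow) !== swapped(sonnetRow) // false
// On the opus row they differ, so the same assertion fails for the mutated version.
const opusDetectsSwap = intended(opusRow) !== swapped(opusRow) // true
```

A test is only as discriminating as its fixture: a row where the two fields coincide verifies the wiring but not which field is wired.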

Comment thread src/core/context-management/index.ts Outdated
Comment on lines +202 to +204
const reservedForOutput = maxTokens && maxTokens > 0 ? maxTokens : 0
const availableInputTokens = contextWindow - reservedForOutput
const contextPercent = availableInputTokens > 0 ? (100 * prevContextTokens) / availableInputTokens : 100

Contributor

This changes the contextPercent denominator from contextWindow to contextWindow - maxTokens for every provider where maxTokens > 0, not just vscode-lm. For example, Anthropic with a 200K window and maxTokens=8192 will see condense fire ~4% earlier than before. Was that intentional? If so, might be worth a note in the PR description since it's a behavioral change for all providers.
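
The "~4%" figure follows directly from the two numbers in the comment. A quick worked sketch, where the 80% threshold and the helper name tokensAtPercent are assumptions chosen only to make the shift visible:

```typescript
const contextWindow = 200_000
const maxTokens = 8_192

const fullWindowDenominator = contextWindow
const availableInputDenominator = contextWindow - maxTokens // 191_808

// Tokens needed to reach a given percentage under each denominator.
function tokensAtPercent(percent: number, denominator: number): number {
	return (percent / 100) * denominator
}

// Relative shrink of the denominator: 8_192 / 200_000 is about 4.1%.
const shrinkPercent = (100 * maxTokens) / contextWindow

// With an assumed 80% threshold: 160_000 tokens before, about 153_446 after.
const before = tokensAtPercent(80, fullWindowDenominator)
const after = tokensAtPercent(80, availableInputDenominator)
```

Whatever the configured threshold is, the token count at which it is reached drops by the same proportion, so the effect applies uniformly across threshold settings.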

Contributor Author

Addressed in d128a5c — scoped the available-input-space denominator to vscode-lm only via an opt-in flag (useAvailableInputForContextPercent), derived in Task.ts from the getCondenseContextWindow seam (only vscode-lm implements it). All other providers now divide by the full context window again, so there's no behavioral change for Anthropic/etc. The maxTokens:-1 reserve guard remains global. Added tests proving both the default full-window path and the vscode-lm opt-in path.
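
A minimal sketch of the opt-in behavior described in this reply. The flag name useAvailableInputForContextPercent and the three-line percent formula come from this thread; the function names, the input shape, and the exact way the flag is derived from the seam are assumptions for illustration.

```typescript
interface ContextPercentInput {
	prevContextTokens: number
	contextWindow: number
	maxTokens?: number
	// Opt-in flag named in the reply; when absent, the full window is the denominator.
	useAvailableInputForContextPercent?: boolean
}

function computeContextPercent(input: ContextPercentInput): number {
	const { prevContextTokens, contextWindow, maxTokens } = input
	// Global guard: -1 (unlimited) or a missing value reserves nothing.
	const reservedForOutput = maxTokens && maxTokens > 0 ? maxTokens : 0
	const denominator = input.useAvailableInputForContextPercent
		? contextWindow - reservedForOutput
		: contextWindow
	// A non-positive denominator is reported as a full window.
	return denominator > 0 ? (100 * prevContextTokens) / denominator : 100
}

// Assumed derivation: the flag is on exactly when the handler implements the seam.
function shouldUseAvailableInput(handler: { getCondenseContextWindow?: () => number }): boolean {
	return typeof handler.getCondenseContextWindow === "function"
}
```

Under these assumptions, a handler without the seam computes 50% for 100000 tokens in a 200000-token window regardless of maxTokens, while an opted-in handler with maxTokens of 8192 computes a little over 52%.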

github-actions Bot added the awaiting-author label Jun 25, 2026
Bertan Ari and others added 3 commits June 26, 2026 06:14
…lm; address review

Address review feedback from edelauna on Zoo-Code-Org#710:
- Scope the available-input-space condense percent denominator to vscode-lm only (via the getCondenseContextWindow seam); all other providers keep dividing by the full context window. The maxTokens:-1 reserve guard remains global.
- Correct the misleading useSelectedModel comment: the gate's primary window is getCondenseContextWindow() (static maxInputTokens), not getModel().info.contextWindow.
- Strengthen the listed-family test with a claude-opus-4.8 case (contextWindow != maxInputTokens) to catch a field swap.
Simplify comments added in PR Zoo-Code-Org#710 to be brief and rationale-focused; no logic, assertions, or test values changed.
Labels

awaiting-author PR is waiting for the author to address requested changes

Development

Successfully merging this pull request may close these issues.

vscode-lm: automatic context condensing never triggers (maxTokens -1 + inflated window)

2 participants