web_fetch_content tool#749
Conversation
|
Important Review skippedNo new commits to review since the last review. ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughAdds a new built-in Changesfetch_web_content built-in tool
Sequence Diagram(s)sequenceDiagram
participant NativeToolCallParser
participant presentAssistantMessage
participant FetchWebContentTool
participant User as User Approval
participant fetch as HTTP fetch
NativeToolCallParser->>presentAssistantMessage: parsed fetch_web_content ToolUse (url, prompt)
presentAssistantMessage->>FetchWebContentTool: handle(params, task, callbacks)
FetchWebContentTool->>FetchWebContentTool: validate URL + enforce HTTP/HTTPS
FetchWebContentTool->>User: askApproval(fetchWebContent, url)
User-->>FetchWebContentTool: approved
FetchWebContentTool->>fetch: GET url (AbortController timeout, max bytes)
fetch-->>FetchWebContentTool: streaming response body
FetchWebContentTool->>FetchWebContentTool: htmlToText / JSON.stringify / raw text
FetchWebContentTool-->>presentAssistantMessage: pushToolResult(formatted content)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
7599218 to
fc37493
Compare
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
0767e1d to
6b7c05a
Compare
6b7c05a to
ba0ca59
Compare
|
@coderabbitai please review |
|
I’ll review the PR now. ✅ Action performedReview finished.
|
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/shared/__tests__/modes.spec.ts (1)
607-617: 🎯 Functional Correctness | 🟡 Minor | ⚡ Quick winAdd the matching
code-mode assertion forweb.The new contract is that the
webgroup is available in bothdebugandcode, but this spec only pinsdebug. Ifcodedropsweb, this layer will still stay green.As per coding guidelines, "Use package-local unit tests for pure logic, parsing, state transitions, validation, serialization, request construction, retry decisions, and error handling."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/shared/__tests__/modes.spec.ts` around lines 607 - 617, The debug mode spec only asserts that the `web` group exists on `debug`, so it can miss regressions in `code`. Update `modes.spec.ts` by adding a matching assertion for the `code` mode using the same `modes.find(...)`/`toMatchObject(...)` pattern, and verify its `groups` also include `web` alongside the existing `debug` checks.Source: Coding guidelines
🧹 Nitpick comments (1)
src/shared/tools.ts (1)
119-119: 🗄️ Data Integrity & Integration | 🔵 Trivial | ⚡ Quick winAlign
promptnullability with the native args contract.
FetchWebContentTool.executeand the new tool schema both acceptprompt: null, butNativeToolArgs.fetch_web_contentnarrows it tostring | undefined. The parser currently casts through that mismatch, so future callers lose type protection here. Preferprompt?: string | nullifnullremains part of the wire format.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/shared/tools.ts` at line 119, The `NativeToolArgs.fetch_web_content` type is narrower than the actual `FetchWebContentTool.execute` and tool schema contract because it excludes `null` for `prompt`. Update the `fetch_web_content` entry in `NativeToolArgs` to allow `prompt?: string | null`, and make sure the parser and any related type references in `FetchWebContentTool` continue to align with that wire format so callers keep proper type safety.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/core/assistant-message/__tests__/NativeToolCallParser.spec.ts`:
- Around line 376-389: The NativeToolCallParser streaming tests only cover a
complete JSON payload in one call, so they miss accumulator and partial-JSON
behavior in NativeToolCallParser.processStreamingChunk. Update the
fetch_web_content and related streaming specs in NativeToolCallParser.spec.ts to
split url and prompt across multiple fragmented chunks, asserting intermediate
partial results and final assembled nativeArgs so the accumulator logic in
startStreamingToolCall/processStreamingChunk is exercised. Use the existing
NativeToolCallParser and processStreamingChunk test cases as the place to add
these multi-chunk assertions.
In `@src/core/tools/FetchWebContentTool.ts`:
- Around line 175-215: The FetchWebContentTool validation currently only blocks
non-http(s) schemes, so it still allows SSRF targets like localhost,
private/link-local IPs, and metadata endpoints; add host-target filtering in
FetchWebContentTool before the fetch call and in the redirect path. Use the
existing url/parsedUrl handling in FetchWebContentTool to resolve the hostname,
reject internal or loopback destinations, and re-check each redirect destination
before following it. Also update src/core/auto-approval/tools.ts so
fetchWebContent is not treated as read-only auto-approval until these
network-safety checks are in place.
In `@webview-ui/src/components/chat/__tests__/ChatRow.fetch-web-content.spec.tsx`:
- Around line 60-136: Add a Vitest case in ChatRow.fetch-web-content.spec.tsx to
cover the completed fetch state handled by ChatRow’s fetchWebContent rendering
path. Right now the suite only exercises type: "ask" with ask: "tool", so the
didFetch branch for type: "say" and say: "tool" remains untested; add a local
render assertion that the completed web content UI still shows the expected
URL/message. Use the existing ChatRow test helpers and keep the coverage under
webview-ui/src/components/chat/__tests__ rather than adding an e2e test.
In `@webview-ui/src/components/chat/ChatRow.tsx`:
- Around line 1019-1027: The `didFetch` branch in `ChatRow` is unreachable
because `fetchWebContent` is handled only in the `message.ask === "tool"` path,
while completed fetch events are routed through the later `message.type ===
"say"` switch and never rendered by `case "tool"`. Update the message routing so
`fetchWebContent` is handled in the `message.type === "say"` tool-rendering flow
as well, or otherwise make the `fetchWebContent` case reachable for completed
tool fetches. Use the existing `case "fetchWebContent"` and `case "tool"`
branches in `ChatRow` to keep the want-to-fetch/did-fetch states aligned.
---
Outside diff comments:
In `@src/shared/__tests__/modes.spec.ts`:
- Around line 607-617: The debug mode spec only asserts that the `web` group
exists on `debug`, so it can miss regressions in `code`. Update `modes.spec.ts`
by adding a matching assertion for the `code` mode using the same
`modes.find(...)`/`toMatchObject(...)` pattern, and verify its `groups` also
include `web` alongside the existing `debug` checks.
---
Nitpick comments:
In `@src/shared/tools.ts`:
- Line 119: The `NativeToolArgs.fetch_web_content` type is narrower than the
actual `FetchWebContentTool.execute` and tool schema contract because it
excludes `null` for `prompt`. Update the `fetch_web_content` entry in
`NativeToolArgs` to allow `prompt?: string | null`, and make sure the parser and
any related type references in `FetchWebContentTool` continue to align with that
wire format so callers keep proper type safety.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: a036f0c0-1591-4915-a7c7-f740d0b9ce47
⛔ Files ignored due to path filters (1)
pnpm-lock.yamlis excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (53)
packages/types/src/mode.tspackages/types/src/tool.tspackages/types/src/vscode-extension-host.tsschemas/roomodes.jsonsrc/core/assistant-message/NativeToolCallParser.tssrc/core/assistant-message/__tests__/NativeToolCallParser.spec.tssrc/core/assistant-message/presentAssistantMessage.tssrc/core/auto-approval/tools.tssrc/core/prompts/tools/native-tools/fetch_web_content.tssrc/core/prompts/tools/native-tools/index.tssrc/core/tools/FetchWebContentTool.tssrc/core/tools/__tests__/fetchWebContentTool.spec.tssrc/package.jsonsrc/shared/__tests__/modes.spec.tssrc/shared/tools.tswebview-ui/src/components/chat/ChatRow.tsxwebview-ui/src/components/chat/__tests__/ChatRow.fetch-web-content.spec.tsxwebview-ui/src/i18n/locales/ca/chat.jsonwebview-ui/src/i18n/locales/ca/prompts.jsonwebview-ui/src/i18n/locales/de/chat.jsonwebview-ui/src/i18n/locales/de/prompts.jsonwebview-ui/src/i18n/locales/en/chat.jsonwebview-ui/src/i18n/locales/en/prompts.jsonwebview-ui/src/i18n/locales/es/chat.jsonwebview-ui/src/i18n/locales/es/prompts.jsonwebview-ui/src/i18n/locales/fr/chat.jsonwebview-ui/src/i18n/locales/fr/prompts.jsonwebview-ui/src/i18n/locales/hi/chat.jsonwebview-ui/src/i18n/locales/hi/prompts.jsonwebview-ui/src/i18n/locales/id/chat.jsonwebview-ui/src/i18n/locales/id/prompts.jsonwebview-ui/src/i18n/locales/it/chat.jsonwebview-ui/src/i18n/locales/it/prompts.jsonwebview-ui/src/i18n/locales/ja/chat.jsonwebview-ui/src/i18n/locales/ja/prompts.jsonwebview-ui/src/i18n/locales/ko/chat.jsonwebview-ui/src/i18n/locales/ko/prompts.jsonwebview-ui/src/i18n/locales/nl/chat.jsonwebview-ui/src/i18n/locales/nl/prompts.jsonwebview-ui/src/i18n/locales/pl/chat.jsonwebview-ui/src/i18n/locales/pl/prompts.jsonwebview-ui/src/i18n/locales/pt-BR/chat.jsonwebview-ui/src/i18n/locales/pt-BR/prompts.jsonwebview-ui/src/i18n/locales/ru/chat.jsonwebview-ui/src/i18n/locales/ru/prompts.jsonwebview-ui/src/i18n/locales/tr/chat.jsonwebview-ui/src/i18n/locales/tr/prompts.jsonwebview-ui/src/i18n/locales/vi/chat.jsonwebview-ui/src/i18n/locales/vi/prompts.jsonwebview-ui/src/i18n/locales/zh-CN/chat.jsonwebview-ui/src/i18n/locales/zh-CN/prompts.jsonwebview-ui/src/i18n/locales/zh-TW/chat.jsonwebview-ui/src/i18n/locales/zh-TW/prompts.json
| describe("fetch_web_content tool", () => { | ||
| it("should emit a partial ToolUse with nativeArgs.url during streaming", () => { | ||
| const id = "toolu_streaming_fetch_1" | ||
| NativeToolCallParser.startStreamingToolCall(id, "fetch_web_content") | ||
|
|
||
| const fullArgs = JSON.stringify({ url: "https://example.com", prompt: "Find info" }) | ||
| const result = NativeToolCallParser.processStreamingChunk(id, fullArgs) | ||
|
|
||
| expect(result).not.toBeNull() | ||
| expect(result?.nativeArgs).toBeDefined() | ||
| const nativeArgs = result?.nativeArgs as { url: string; prompt?: string } | ||
| expect(nativeArgs.url).toBe("https://example.com") | ||
| expect(nativeArgs.prompt).toBe("Find info") | ||
| }) |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Exercise fragmented streaming chunks here.
These cases only pass a complete JSON payload in one processStreamingChunk() call, so they never hit the accumulator/partial-JSON behavior in src/core/assistant-message/NativeToolCallParser.ts:250-288. A bug that only appears when url/prompt arrive across multiple chunks would still pass this suite.
As per coding guidelines, "Use package-local unit tests for pure logic, parsing, state transitions, validation, serialization, request construction, retry decisions, and error handling."
Also applies to: 423-468
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/core/assistant-message/__tests__/NativeToolCallParser.spec.ts` around
lines 376 - 389, The NativeToolCallParser streaming tests only cover a complete
JSON payload in one call, so they miss accumulator and partial-JSON behavior in
NativeToolCallParser.processStreamingChunk. Update the fetch_web_content and
related streaming specs in NativeToolCallParser.spec.ts to split url and prompt
across multiple fragmented chunks, asserting intermediate partial results and
final assembled nativeArgs so the accumulator logic in
startStreamingToolCall/processStreamingChunk is exercised. Use the existing
NativeToolCallParser and processStreamingChunk test cases as the place to add
these multi-chunk assertions.
Source: Coding guidelines
| // Only allow http and https protocols | ||
| if (!["http:", "https:"].includes(parsedUrl.protocol)) { | ||
| task.consecutiveMistakeCount++ | ||
| task.recordToolError("fetch_web_content") | ||
| task.didToolFailInCurrentTurn = true | ||
| pushToolResult( | ||
| formatResponse.toolError(`Invalid protocol: ${parsedUrl.protocol}. Only http and https are supported.`), | ||
| ) | ||
| return | ||
| } | ||
|
|
||
| task.consecutiveMistakeCount = 0 | ||
|
|
||
| // Build the approval message | ||
| const sharedMessageProps: ClineSayTool = { | ||
| tool: "fetchWebContent", | ||
| url: url, | ||
| } | ||
|
|
||
| const completeMessage = JSON.stringify(sharedMessageProps satisfies ClineSayTool) | ||
| const didApprove = await askApproval("tool", completeMessage) | ||
|
|
||
| if (!didApprove) { | ||
| return | ||
| } | ||
|
|
||
| // Execute the fetch | ||
| const controller = new AbortController() | ||
| const timeout = setTimeout(() => controller.abort(), DEFAULT_TIMEOUT_MS) | ||
|
|
||
| try { | ||
| const response = await fetch(url, { | ||
| method: "GET", | ||
| headers: { | ||
| "User-Agent": "Mozilla/5.0 (compatible; ZooCode/1.0.0)", | ||
| Accept: "text/html,application/xhtml+xml,application/xml;q=0.9,text/plain;q=0.8,*/*;q=0.7", | ||
| "Accept-Language": "en-US,en;q=0.9", | ||
| }, | ||
| redirect: "follow", | ||
| signal: controller.signal, | ||
| }) |
There was a problem hiding this comment.
🔒 Security & Privacy | 🟠 Major | 🏗️ Heavy lift
Block private-network and metadata targets before fetching.
The current validation only checks scheme, so this still permits SSRF to localhost, RFC1918/private ranges, link-local addresses, and cloud metadata endpoints, and then returns that response body back into the chat. That is especially risky here because src/core/auto-approval/tools.ts now treats fetchWebContent as read-only, so auto-approve can bypass the manual consent gate entirely. Deny internal/loopback/link-local targets, resolve and re-check every redirect destination, and keep this tool out of read-only auto-approval until those checks exist.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/core/tools/FetchWebContentTool.ts` around lines 175 - 215, The
FetchWebContentTool validation currently only blocks non-http(s) schemes, so it
still allows SSRF targets like localhost, private/link-local IPs, and metadata
endpoints; add host-target filtering in FetchWebContentTool before the fetch
call and in the redirect path. Use the existing url/parsedUrl handling in
FetchWebContentTool to resolve the hostname, reject internal or loopback
destinations, and re-check each redirect destination before following it. Also
update src/core/auto-approval/tools.ts so fetchWebContent is not treated as
read-only auto-approval until these network-safety checks are in place.
| describe("ChatRow - fetchWebContent tool", () => { | ||
| beforeEach(() => { | ||
| vi.clearAllMocks() | ||
| }) | ||
|
|
||
| it("should display fetchWebContent ask message with URL", () => { | ||
| const message: any = { | ||
| type: "ask", | ||
| ask: "tool", | ||
| ts: Date.now(), | ||
| text: JSON.stringify({ | ||
| tool: "fetchWebContent", | ||
| url: "https://example.com", | ||
| }), | ||
| partial: false, | ||
| } | ||
|
|
||
| renderChatRowWithProviders(message) | ||
|
|
||
| expect(screen.getByText("Zoo wants to fetch web content")).toBeInTheDocument() | ||
| expect(screen.getByText("https://example.com")).toBeInTheDocument() | ||
| }) | ||
|
|
||
| it("should display the Globe icon for fetchWebContent", () => { | ||
| const message: any = { | ||
| type: "ask", | ||
| ask: "tool", | ||
| ts: Date.now(), | ||
| text: JSON.stringify({ | ||
| tool: "fetchWebContent", | ||
| url: "https://docs.example.com/api", | ||
| }), | ||
| partial: false, | ||
| } | ||
|
|
||
| renderChatRowWithProviders(message) | ||
|
|
||
| expect(screen.getByLabelText("Web fetch icon")).toBeInTheDocument() | ||
| }) | ||
|
|
||
| it("should display the URL in the tool use block", () => { | ||
| const message: any = { | ||
| type: "ask", | ||
| ask: "tool", | ||
| ts: Date.now(), | ||
| text: JSON.stringify({ | ||
| tool: "fetchWebContent", | ||
| url: "https://api.github.com/repos/owner/repo", | ||
| }), | ||
| partial: false, | ||
| } | ||
|
|
||
| renderChatRowWithProviders(message) | ||
|
|
||
| expect(screen.getByText("https://api.github.com/repos/owner/repo")).toBeInTheDocument() | ||
| }) | ||
|
|
||
| it("should not return null for fetchWebContent tool (regression test)", () => { | ||
| const message: any = { | ||
| type: "ask", | ||
| ask: "tool", | ||
| ts: Date.now(), | ||
| text: JSON.stringify({ | ||
| tool: "fetchWebContent", | ||
| url: "https://www.delfi.lt", | ||
| }), | ||
| partial: false, | ||
| } | ||
|
|
||
| const { container } = renderChatRowWithProviders(message) | ||
|
|
||
| // The container should have rendered content (not null) | ||
| expect(container.innerHTML).not.toBe("") | ||
| expect(screen.getByText("Zoo wants to fetch web content")).toBeInTheDocument() | ||
| expect(screen.getByText("https://www.delfi.lt")).toBeInTheDocument() | ||
| }) | ||
| }) |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
Add a say/tool coverage case for the completed fetch state.
Every test here uses ask/tool, so the didFetch path is untested. A local Vitest case for type: "say", say: "tool" would catch the current regression and matches the webview testing guidance. As per coding guidelines, "Prefer local webview-ui tests for React/webview behavior such as component rendering, local state, hooks, form dirty-state, validation, or prop wiring. Add or update Vitest coverage under webview-ui/src/**/__tests__ instead of reaching for apps/vscode-e2e."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@webview-ui/src/components/chat/__tests__/ChatRow.fetch-web-content.spec.tsx`
around lines 60 - 136, Add a Vitest case in ChatRow.fetch-web-content.spec.tsx
to cover the completed fetch state handled by ChatRow’s fetchWebContent
rendering path. Right now the suite only exercises type: "ask" with ask: "tool",
so the didFetch branch for type: "say" and say: "tool" remains untested; add a
local render assertion that the completed web content UI still shows the
expected URL/message. Use the existing ChatRow test helpers and keep the
coverage under webview-ui/src/components/chat/__tests__ rather than adding an
e2e test.
Source: Coding guidelines
| case "fetchWebContent": | ||
| return ( | ||
| <> | ||
| <div style={headerStyle}> | ||
| <Globe className="w-4 shrink-0" aria-label="Web fetch icon" /> | ||
| <span style={{ fontWeight: "bold" }}> | ||
| {message.type === "ask" | ||
| ? t("chat:webFetch.wantsToFetch") | ||
| : t("chat:webFetch.didFetch")} |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
didFetch is unreachable here.
This branch sits behind tool, but tool is only parsed for message.ask === "tool". say/tool fetch events still go through the later message.type === "say" switch, whose case "tool" does not render fetchWebContent, so completed fetches disappear from the timeline.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@webview-ui/src/components/chat/ChatRow.tsx` around lines 1019 - 1027, The
`didFetch` branch in `ChatRow` is unreachable because `fetchWebContent` is
handled only in the `message.ask === "tool"` path, while completed fetch events
are routed through the later `message.type === "say"` switch and never rendered
by `case "tool"`. Update the message routing so `fetchWebContent` is handled in
the `message.type === "say"` tool-rendering flow as well, or otherwise make the
`fetchWebContent` case reachable for completed tool fetches. Use the existing
`case "fetchWebContent"` and `case "tool"` branches in `ChatRow` to keep the
want-to-fetch/did-fetch states aligned.
edelauna
left a comment
There was a problem hiding this comment.
So in reviewing this I noticed a couple implementation/security details we should consider, but I also just realized - how is this feature any different than asking the agent to just curl a website?
I wonder if instead this feature should be more stand alone, and a type of structured web parser.
Based on this article: https://mikhail.io/2025/10/claude-code-web-tools/ it seems like Claude fetches a page, converts it to markdown then has a different model go through and parse the contents? I wonder if this feature should be similar to the code embeddings feature where a user can specify a provider and model that will perform a structured parsing of a web fetch?
| "searchFiles", | ||
| "codebaseSearch", | ||
| "runSlashCommand", | ||
| "fetchWebContent", |
There was a problem hiding this comment.
Does grouping an outbound network fetch with file reads make sense here? With alwaysAllowReadOnly on, every fetch_web_content call auto-approves with no user prompt. I think network calls warrant their own opt-in flag, or a setting to always require explicit approval?
| signal: controller.signal, | ||
| }) | ||
|
|
||
| clearTimeout(timeout) |
There was a problem hiding this comment.
The timeout is cancelled here as soon as headers arrive, so the body-read loop below runs with no deadline. A server that sends headers quickly but streams the body slowly could hold the connection open indefinitely — the 30s limit only covers time-to-first-byte. Should the timer stay active through the full read?
| Accept: "text/html,application/xhtml+xml,application/xml;q=0.9,text/plain;q=0.8,*/*;q=0.7", | ||
| "Accept-Language": "en-US,en;q=0.9", | ||
| }, | ||
| redirect: "follow", |
There was a problem hiding this comment.
With redirect: "follow", a server can redirect to a private IP or cloud metadata endpoint (169.254.169.254, localhost, etc.) and the redirect destination is never validated — only the original URL was checked. Would limiting redirects to the same host work here?
| redirect: "follow", | |
| redirect: "manual", |
(Then check response.status >= 300 && response.status < 400 and only follow if new URL(response.headers.get('location')!, url).origin === new URL(url).origin)
| content = text | ||
| } | ||
| } else { | ||
| content = text |
There was a problem hiding this comment.
If the server responds with image/jpeg, application/pdf, or another binary type (the Accept header allows */*), this falls through and returns the raw decoded bytes to the model as text. Should binary MIME types be rejected before attempting to decode?
| `Size: ${totalSize} bytes`, | ||
| ``, | ||
| `--- Content ---`, | ||
| content.slice(0, MAX_CONTENT_CHARS), |
There was a problem hiding this comment.
Web content lands directly in the model's context with no trust boundary marker. If a page contains visible text like "Ignore all previous instructions...", it passes through as-is. Would wrapping this in something like <untrusted_web_content source="..."> (and noting in the tool description that content inside this tag is third-party data, not instructions) help?
| } | ||
|
|
||
| if (prompt) { | ||
| outputLines.push(``, `--- Analysis Request ---`, `Prompt: ${prompt}`) |
There was a problem hiding this comment.
The user's analysis prompt arrives after up to 50k chars of untrusted content. A malicious page could also embed --- Analysis Request --- in its visible text before the real one appears. Would putting the prompt instruction before the content block help with both ordering and injection resistance?
| clearTimeout(timeout) | ||
|
|
||
| if (!response.ok) { | ||
| pushToolResult(formatResponse.toolError(`HTTP ${response.status}: ${response.statusText}`)) |
There was a problem hiding this comment.
Other tools (e.g. GenerateImageTool) call task.say("error", ...) before pushToolResult so the error surfaces as a visible bubble in chat — without it, only the model sees this. Same gap exists at the size-limit path a bit further down.
| // Process content based on type | ||
| let content: string | ||
| if (contentType.includes("text/html") || contentType.includes("application/xhtml")) { | ||
| content = htmlToText(text) |
There was a problem hiding this comment.
cheerio.load() is synchronous. For responses near the 5 MB limit, parsing can block the extension host event loop for hundreds of milliseconds, freezing IntelliSense and file watchers. Would slicing the HTML to a smaller cap before parsing (e.g. 500 KB) be a reasonable trade-off, since MAX_CONTENT_CHARS will truncate the output anyway?
Related GitHub Issue
Closes: #748
Description
Implements web_fetch_content tool.
Test Procedure
Manual testing.
Pre-Submission Checklist
Screenshots / Videos
Documentation Updates
Additional Notes
Please let me know if this is something that can be added to ZooCode and I will finish polishing it. I suspect at least docs will need updates.
Summary by CodeRabbit
New Features
Bug Fixes