[security] fix(core): guard RSS transcript fetches #238
Codex review: needs maintainer review before merge. Reviewed June 4, 2026, 10:29 PM ET / 02:29 UTC.
Summary
Reproducibility: yes, from source inspection: current main decodes the RSS transcript URL and fetches it with automatic redirect following. The PR body also includes redacted terminal proof showing a loopback transcript URL blocked with
Review metrics: 2 noteworthy metrics.
Merge readiness
Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
Rank-up moves:
Risk before merge
Maintainer options:
Next step before merge
Security
Review details
Best possible solution: Land a maintainer-approved security hardening that treats feed-declared transcript URLs as untrusted and keeps core RSS transcript behavior aligned with the daemon URL guard semantics.
Do we have a high-confidence way to reproduce the issue? Yes, from source inspection: current main decodes the RSS transcript URL and fetches it with automatic redirect following. The PR body also includes redacted terminal proof showing a loopback transcript URL blocked with
Is this the best way to solve the issue? Yes, with maintainer acceptance of the compatibility change: validating before fetch, pinning DNS, and revalidating redirects is the narrow maintainable fix for this trust boundary. The safer long-term refinement would be sharing this guard with the existing daemon URL guard to avoid drift.
AGENTS.md: found and applied where relevant.
Codex review notes: model gpt-5.5, reasoning high; reviewed against 821e76613ded.
Label changes:
Label justifications:
Evidence reviewed
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.
How this review workflow works
|
|
@clawsweeper re-review I pushed a follow-up at
Local validation and GitHub CI are both green:
|
|
🦞🧹 I asked ClawSweeper to review this item again. |
|
@clawsweeper re-review |
|
🦞👀 Command router queued. I will update this comment with the next step. Re-review progress:
|
|
Closing this as superseded by #239, which fixes the same RSS podcast:transcript SSRF boundary in the same core path and has the latest runtime proof/body updates, including blocked local/private proof, public transcript success proof, and the core-vs-daemon guard boundary note.

Keeping #239 open as the active PR to avoid duplicate open fixes for the same issue. |
Summary
This PR hardens the RSS podcast transcript trust boundary so feed-controlled <podcast:transcript> URLs cannot drive host-side transcript fetching into local or private network services.
Security issues covered
Before this PR
tryFetchTranscriptFromFeedXml() decoded the feed-controlled transcript url attribute and passed it directly to the transcript fetch implementation.
After this PR
dns.lookup(..., { all: true, verbatim: true }); empty results or any blocked address reject the transcript URL before fetch.
Location target repeats the same URL, DNS, private-address, and pinned-dispatcher handling before the next request.
Why this matters
Podcast/RSS feed XML is content controlled by the feed publisher, and feed URLs may be user-supplied in deployments that summarize external podcasts. Without a URL and DNS boundary at the nested transcript-fetch layer, feed content can move the host process from normal remote content retrieval into requests against services that were only intended to be reachable from the local machine or private network.
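The URL and DNS boundary described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the names `isBlockedAddress`, `assertTranscriptUrlAllowed`, and `Resolver` are hypothetical, the blocked ranges are a simplified subset, and DNS pinning (the pinned dispatcher) is not covered. The resolver is injected so the check can be exercised without touching the network; in Node it could plausibly wrap `dns.promises.lookup(host, { all: true, verbatim: true })`.

```typescript
// Hypothetical sketch; names and ranges are illustrative, not taken from the PR.
type ResolvedAddress = { address: string; family: number };
type Resolver = (hostname: string) => Promise<ResolvedAddress[]>;

// Parse a dotted-quad IPv4 literal into octets, or return null if it is not one.
function parseIPv4(address: string): number[] | null {
  const parts = address.split(".");
  if (parts.length !== 4) return null;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  return octets.every((o) => Number.isInteger(o) && o >= 0 && o <= 255) ? octets : null;
}

// True when the address is loopback, private, link-local, or unrecognized.
function isBlockedAddress(address: string): boolean {
  const lower = address.toLowerCase();
  // Treat dotted IPv4-mapped IPv6 (::ffff:a.b.c.d) as the embedded IPv4 address.
  const candidate = lower.startsWith("::ffff:") ? lower.slice(7) : lower;
  const v4 = parseIPv4(candidate);
  if (v4) {
    const [a, b] = v4;
    return (
      a === 0 || a === 10 || a === 127 ||
      (a === 169 && b === 254) ||
      (a === 172 && b >= 16 && b <= 31) ||
      (a === 192 && b === 168)
    );
  }
  if (!lower.includes(":")) return true; // not an IP literal: fail closed
  // Simplified IPv6 check: loopback, unspecified, fc00::/7, fe80::/10.
  return lower === "::1" || lower === "::" || /^f[cd]/.test(lower) || /^fe[89ab]/.test(lower);
}

// Validate a feed-supplied transcript URL before any request is made.
async function assertTranscriptUrlAllowed(rawUrl: string, resolve: Resolver): Promise<URL> {
  const url = new URL(rawUrl); // throws on malformed input
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error(`blocked scheme: ${url.protocol}`);
  }
  const host = url.hostname.replace(/^\[|\]$/g, ""); // strip IPv6 brackets
  const isLiteral = parseIPv4(host) !== null || host.includes(":");
  const addresses = isLiteral
    ? [{ address: host, family: host.includes(":") ? 6 : 4 }]
    : await resolve(host);
  // Empty results or any blocked address reject the URL before fetch.
  if (addresses.length === 0 || addresses.some((a) => isBlockedAddress(a.address))) {
    throw new Error(`blocked transcript host: ${host}`);
  }
  return url;
}
```

Rejecting when any resolved address is blocked, rather than only when all are, avoids relying on which address the HTTP client happens to pick. A check like this is only sound if the later request connects to one of the validated addresses, which is the role of DNS pinning.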
Attack flow
Affected code
packages/core/src/content/transcript/providers/podcast/rss-transcript.ts, tests/security.rss-transcript-ssrf.test.ts, packages/core/package.json, pnpm-lock.yaml
Root cause
Issue: RSS podcast transcript URLs can trigger SSRF to local/private network services
CVSS assessment
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:L
Rationale:
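The base score implied by this vector can be recomputed from the CVSS v3.1 base-score equations for unchanged scope. The sketch below is an illustrative calculation, not part of the PR; the function and variable names are hypothetical, and the metric weights are the published v3.1 values for each metric in the vector.

```typescript
// CVSS v3.1 metric weights for AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:L.
const weights = { av: 0.85, ac: 0.77, pr: 0.85, ui: 0.85, c: 0.56, i: 0, a: 0.22 };

// Roundup as defined by the v3.1 specification: the smallest value with one
// decimal place that is greater than or equal to the input, computed on
// integers to avoid floating-point artifacts.
function roundUp(value: number): number {
  const scaled = Math.round(value * 100000);
  return scaled % 10000 === 0 ? scaled / 100000 : (Math.floor(scaled / 10000) + 1) / 10;
}

// Base score for Scope: Unchanged.
function baseScoreScopeUnchanged(w: typeof weights): number {
  const iss = 1 - (1 - w.c) * (1 - w.i) * (1 - w.a); // impact sub-score
  const impact = 6.42 * iss;
  const exploitability = 8.22 * w.av * w.ac * w.pr * w.ui;
  return impact <= 0 ? 0 : roundUp(Math.min(impact + exploitability, 10));
}

const baseScore = baseScoreScopeUnchanged(weights); // 8.2, in the High range (7.0 to 8.9)
```

Here the impact term is about 4.22 and the exploitability term about 3.89, so the sum of roughly 8.10 rounds up to 8.2.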
Safe reproduction / after-fix proof
Direct loopback transcript URL
<podcast:transcript> element whose url points at a loopback literal such as http://127.0.0.1:8080/transcript.txt.
Redacted terminal output from an after-fix runtime scenario with a real local HTTP listener:
Public URL redirecting or rebinding to private space
<podcast:transcript> URL that initially points at a public-looking HTTPS URL.
The regression tests added in this PR use mocked fetches/DNS where appropriate and do not contact production services or real private-network endpoints.
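The redirect case can be illustrated with a small loop that disables automatic redirect following and revalidates every hop. This is a sketch under assumptions, not the PR's code: `fetchWithRevalidatedRedirects`, `FetchLike`, and `Validator` are hypothetical names, the fetch implementation and the validator are injected so the behavior can be exercised with mocks, and DNS pinning is out of scope here.

```typescript
// Hypothetical sketch; types are narrowed to what the loop needs.
type MinimalResponse = { status: number; headers: { get(name: string): string | null } };
type FetchLike = (url: string, init: { redirect: "manual" }) => Promise<MinimalResponse>;
type Validator = (url: string) => Promise<void>; // rejects when the URL is not allowed

const REDIRECT_STATUSES = [301, 302, 303, 307, 308];

async function fetchWithRevalidatedRedirects(
  startUrl: string,
  fetchImpl: FetchLike,
  validate: Validator,
  maxRedirects = 5,
): Promise<MinimalResponse> {
  let current = startUrl;
  for (let hop = 0; hop <= maxRedirects; hop++) {
    // Validate every URL, including redirect targets, before requesting it.
    await validate(current);
    const response = await fetchImpl(current, { redirect: "manual" });
    const location = response.headers.get("location");
    if (!REDIRECT_STATUSES.includes(response.status) || location === null) {
      return response; // final response: not a redirect
    }
    // Resolve relative Location values against the URL that produced them.
    current = new URL(location, current).toString();
  }
  throw new Error("too many redirects");
}
```

With a mock fetch that answers a public-looking URL with a 302 to `http://127.0.0.1:8080/`, and a validator that rejects loopback hosts, the returned promise rejects and the loopback target is never requested.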
Expected vulnerable behavior
localhost, loopback/private address literals, or hostnames resolving to private space can be sent to the host-side fetch implementation.
Changes in this PR
redirect: "manual".
Location before following it and applies the same DNS/private-address/pinned-dispatcher behavior to redirect targets.
Files changed
packages/core/src/content/transcript/providers/podcast/rss-transcript.ts
packages/core/package.json, pnpm-lock.yaml
tests/security.rss-transcript-ssrf.test.ts
Maintainer impact
Fix rationale
Type of change
Test plan
pnpm -s test tests/security.rss-transcript-ssrf.test.ts - 5 tests passed.
git diff --check && pnpm -s typecheck - passed.
pnpm -s format:check - passed; all matched files used the correct format.
pnpm -s lint - passed with 0 warnings and 0 errors.
pnpm -s tsx /tmp/summarize-rss-transcript-ssrf-proof.ts - runtime proof passed; local listener received 0 requests.
Executed with:
pnpm -s test tests/security.rss-transcript-ssrf.test.ts
git diff --check && pnpm -s typecheck
pnpm -s format:check
pnpm -s lint
pnpm -s tsx /tmp/summarize-rss-transcript-ssrf-proof.ts
Note: local validation ran under Node v22.22.2 even though the repo declares node >=24; the same targeted tests, typecheck, format check, and lint completed successfully in this workspace.
Disclosure notes