Skip to content

Enhance embed URL handling and validation system#7

Open
zaibkhan wants to merge 1 commit into
embed-url-handling-prefrom
embed-url-handling-post
Open

Enhance embed URL handling and validation system#7
zaibkhan wants to merge 1 commit into
embed-url-handling-prefrom
embed-url-handling-post

Conversation

@zaibkhan

Copy link
Copy Markdown

No description provided.

@codoki-pr-intelligence

codoki-pr-intelligence Bot commented Sep 10, 2025

Copy link
Copy Markdown

Codoki PR Review

Summary: Remove unsafe postMessage origin, prevent data loss
What’s good: Cook method propagation in PostCreator and configurable validation in PostRevisor indicate thoughtful extensibility.
Review Status: ❌ Requires changes
Overall Priority: Critical
Review Update:
• Coverage: Reviewed all 28 files across 3 batches

Issues (Critical & High only)

Severity Issue Why it matters
Critical Security — Unsanitized raw HTML import enables XSS …/models/topic_embed.rb
Security: Untrusted HTML is being persisted and later rendered with 'raw' by using cook_method :raw_html and skipping validations. This allows XSS (e.g., <img src=x onerror=alert(1)> from a feed item will execute in the embed). Additionally, earlier in this method the appended footer interpolates url directly into an href attribute without escaping, enabling attribute injection if the URL contains quotes. Please sanitize imported HTML and escape the appended URL. At minimum, avoid :raw_html here and let the standard cooking/sanitization pipeline run.
Critical Correctness — Invalid ERB syntax (end if) breaks rendering …/embed/best.html.erb
ERB syntax error: end if is invalid and will crash template rendering. Close the conditional with a plain end.
Critical Security — XSS risk by returning unsanitized raw HTML in Post#cook …/models/post.rb
This returns raw HTML directly and bypasses the normal sanitization pipeline, which opens an XSS vector if the content originates from remote feeds or untrusted sources (e.g., '<script>alert(1)</script>' would be rendered as-is). At minimum, sanitize the HTML before returning, or route it through a sanitizer that matches Discourse's allowlist.
Critical Security — XSS risk and incorrect targetOrigin in postMessage …/layouts/embed.html.erb
Security: request.referer is user-controlled and is embedded into a JS string without JS-context escaping, enabling XSS via crafted Referer. Additionally, postMessage targetOrigin should be a trusted origin (scheme+host+port), not a full URL, and referer can be nil causing runtime errors. Parse and restrict to a known/whitelisted origin and JSON-escape it before embedding.
High Correctness — Controller enqueues undefined job instead of TopicRetriever …/controllers/embed_controller.rb
The controller enqueues a job :retrieve_topic that isn't defined in this PR, and the spec asserts that TopicRetriever is instantiated and retrieve is called. This will break tests and the feature at runtime. Delegate to TopicRetriever here (and if you later add a job, have it call the same retriever to keep guards consistent).

Showing top 5 issues. Critical: 4, High: 4. See inline suggestions for more.

Key Feedback (click to expand)
  • Needs improvement: Remove force: true from create_table calls to avoid unintended data loss; sanitize and restrict postMessage targetOrigin to a trusted origin and use JS-safe escaping.
  • Testing: Add a view/request spec that renders app/views/layouts/embed.html.erb with a crafted Referer containing quotes and verifies the script remains syntactically valid and does not echo raw input (e.g., assert response.body includes a JSON-encoded origin). Consider a migration spec or safety check to ensure no destructive table drops (force: true) make it into production migrations.
  • Documentation: Document acceptable values and semantics for posts.cook_method and who may set skip_validations during revisions, to prevent misuse.
  • Compatibility: Using force: true in create_table will drop existing tables on deploy, which is disruptive for rolling upgrades; prefer additive migrations to maintain compatibility.
  • Performance: No performance concerns in the provided changes.
  • Security: Interpolating request.referer directly into a JavaScript string in the embed layout creates an XSS vector and misuses postMessage's targetOrigin (expects an origin, not an arbitrary URL). Parse and whitelist the origin, and JSON-escape it before embedding.
  • Open questions: Who is allowed to pass skip_validations to PostRevisor#update_post? If external callers can set it, how is misuse prevented? Also, should topic_embeds uniqueness be embed_url-only or a composite key with topic_id/post_id?

Confidence: 1/5 — Unsafe to merge (4 critical · 4 high · status: Requires changes · scope: large PR (28 files))

Sequence Diagram

sequenceDiagram
    participant Parent
    participant Embed
    Parent->>Embed: load iframe
    Embed->>Parent: postMessage({type: 'discourse-resize', height: document.body.offsetHeight}, origin)
Loading

React with 👍 or 👎 if you found this review useful.

Comment thread app/models/topic_embed.rb
# If there is no embed, create a topic, post and the embed.
if embed.blank?
Topic.transaction do
creator = PostCreator.new(user, title: title, raw: absolutize_urls(url, contents), skip_validations: true, cook_method: Post.cook_methods[:raw_html])

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛑 Critical: Security: Untrusted HTML is being persisted and later rendered with 'raw' by using cook_method :raw_html and skipping validations. This allows XSS (e.g., <img src=x onerror=alert(1)> from a feed item will execute in the embed). Additionally, earlier in this method the appended footer interpolates url directly into an href attribute without escaping, enabling attribute injection if the URL contains quotes. Please sanitize imported HTML and escape the appended URL. At minimum, avoid :raw_html here and let the standard cooking/sanitization pipeline run.

Suggested change
creator = PostCreator.new(user, title: title, raw: absolutize_urls(url, contents), skip_validations: true, cook_method: Post.cook_methods[:raw_html])
creator = PostCreator.new(user, title: title, raw: absolutize_urls(url, contents), skip_validations: true, cook_method: Post.cook_methods[:regular])

if topic_id
@topic_view = TopicView.new(topic_id, current_user, {best: 5})
else
Jobs.enqueue(:retrieve_topic, user_id: current_user.try(:id), embed_url: embed_url)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ High: The controller enqueues a job :retrieve_topic that isn't defined in this PR, and the spec asserts that TopicRetriever is instantiated and retrieve is called. This will break tests and the feature at runtime. Delegate to TopicRetriever here (and if you later add a job, have it call the same retriever to keep guards consistent).

Suggested change
Jobs.enqueue(:retrieve_topic, user_id: current_user.try(:id), embed_url: embed_url)
TopicRetriever.new(embed_url).retrieve

<%= link_to(I18n.t('embed.title'), @topic_view.topic.url, class: 'button', target: '_blank') %>
<%- else %>
<%= link_to(I18n.t('embed.start_discussion'), @topic_view.topic.url, class: 'button', target: '_blank') %>
<%- end if %>

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛑 Critical: ERB syntax error: end if is invalid and will crash template rendering. Close the conditional with a plain end.

Suggested change
<%- end if %>
<%- end %>

Comment thread app/models/post.rb

def cook(*args)
# For some posts, for example those imported via RSS, we support raw HTML. In that
# case we can skip the rendering pipeline.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛑 Critical: This returns raw HTML directly and bypasses the normal sanitization pipeline, which opens an XSS vector if the content originates from remote feeds or untrusted sources (e.g., '<script>alert(1)</script>' would be rendered as-is). At minimum, sanitize the HTML before returning, or route it through a sanitizer that matches Discourse's allowlist.

Suggested change
# case we can skip the rendering pipeline.
return Sanitize.fragment(raw, Sanitize::Config::RELAXED) if cook_method == Post.cook_methods[:raw_html]

user = User.where(id: args[:user_id]).first
end

TopicRetriever.new(args[:embed_url], no_throttle: user.try(:staff?)).retrieve

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ High: Fetching an arbitrary embed_url without validating its host risks SSRF and unintended internal calls. Validate the URL's host against SiteSetting.embeddable_host (allowing subdomains if intended) before retrieving.

Suggested change
TopicRetriever.new(args[:embed_url], no_throttle: user.try(:staff?)).retrieve
uri = URI.parse(args[:embed_url]) rescue nil
raise Discourse::InvalidParameters.new(:embed_url) unless uri && uri.host && SiteSetting.embeddable_host.present? && (uri.host == SiteSetting.embeddable_host || uri.host.end_with?(".#{SiteSetting.embeddable_host}"))
TopicRetriever.new(args[:embed_url], no_throttle: user.try(:staff?)).retrieve


function postMessageReceived(e) {
if (!e) { return; }
if (discourseUrl.indexOf(e.origin) === -1) { return; }

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔷 Medium: The origin check is reversed; it should verify that e.origin starts with the configured discourseUrl, otherwise a crafty partial match could pass or legitimate origins with trailing slashes/ports could fail.

Suggested change
if (discourseUrl.indexOf(e.origin) === -1) { return; }
if (!e.origin || e.origin.indexOf(discourseUrl.replace(/\/$/, '')) !== 0) { return; }

window.onload = function() {
if (parent) {
// Send a post message with our loaded height
parent.postMessage({type: 'discourse-resize', height: document['body'].offsetHeight}, '<%= request.referer %>');

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛑 Critical: Security: request.referer is user-controlled and is embedded into a JS string without JS-context escaping, enabling XSS via crafted Referer. Additionally, postMessage targetOrigin should be a trusted origin (scheme+host+port), not a full URL, and referer can be nil causing runtime errors. Parse and restrict to a known/whitelisted origin and JSON-escape it before embedding.

Suggested change
parent.postMessage({type: 'discourse-resize', height: document['body'].offsetHeight}, '<%= request.referer %>');
parent.postMessage({type: 'discourse-resize', height: document.body.offsetHeight}, <%= (begin uri = URI.parse(request.referer.to_s); "#{uri.scheme}://#{uri.host}#{(uri.port && ![80,443].include?(uri.port)) ? ":#{uri.port}" : ""}" rescue "*").to_json %>);

@@ -0,0 +1,13 @@
class CreateTopicEmbeds < ActiveRecord::Migration
def change
create_table :topic_embeds, force: true do |t|

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ High: Using force: true will drop the table if it exists, which can cause irreversible data loss during deploys. Prefer additive migrations and explicit drops only when intentionally migrating data with backups.

Suggested change
create_table :topic_embeds, force: true do |t|
create_table :topic_embeds do |t|

class CreateTopTopics < ActiveRecord::Migration
def change
create_table :top_topics do |t|
create_table :top_topics, force: true do |t|

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ High: force: true will drop top_topics if present on deploy, risking data loss and downtime. Avoid destructive table recreation unless absolutely intended and coordinated.

Suggested change
create_table :top_topics, force: true do |t|
create_table :top_topics do |t|

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants