Skip to content

Enhance embed URL handling and validation system#4

Open
everettbu wants to merge 1 commit into
embed-url-handling-pre-copilotfrom
embed-url-handling-post-copilot
Open

Enhance embed URL handling and validation system#4
everettbu wants to merge 1 commit into
embed-url-handling-pre-copilotfrom
embed-url-handling-post-copilot

Conversation

@everettbu

Copy link
Copy Markdown

Test 4

Copilot AI review requested due to automatic review settings July 26, 2025 19:13

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a comprehensive embed URL handling and validation system for Discourse, enabling external sites to embed discussions through RSS/ATOM feeds and direct URL imports. The implementation includes controllers, models, jobs, and validation logic to manage embedded content.

Key changes:

  • Added TopicEmbed model with URL validation and content import functionality
  • Implemented feed polling system with RSS/ATOM support
  • Created embed controller with URL validation and security checks

Reviewed Changes

Copilot reviewed 26 out of 28 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
app/models/topic_embed.rb Core model for managing embedded topics with URL validation and content import
app/controllers/embed_controller.rb Controller handling embed requests with security validation
app/jobs/scheduled/poll_feed.rb Scheduled job for RSS/ATOM feed polling and import
lib/topic_retriever.rb Service class for retrieving and validating embed URLs
spec/models/topic_embed_spec.rb Test coverage for TopicEmbed model functionality
spec/controllers/embed_controller_spec.rb Test coverage for embed controller validation
config/site_settings.yml Configuration settings for embedding functionality
app/views/embed/best.html.erb View template for displaying embedded content

Comment thread lib/topic_retriever.rb
Comment on lines +14 to +20
def invalid_host?
SiteSetting.embeddable_host != URI(@embed_url).host
rescue URI::InvalidURIError
# An invalid URI is an invalid host
true
end

Copilot AI Jul 26, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The host validation allows any embeddable_host value without validating that it's a legitimate domain. Consider adding validation to ensure embeddable_host contains only valid domain characters and formats.

Suggested change
def invalid_host?
SiteSetting.embeddable_host != URI(@embed_url).host
rescue URI::InvalidURIError
# An invalid URI is an invalid host
true
end
def invalid_host?
return true unless valid_domain?(SiteSetting.embeddable_host)
SiteSetting.embeddable_host != URI(@embed_url).host
rescue URI::InvalidURIError
# An invalid URI is an invalid host
true
end
def valid_domain?(domain)
# Regular expression to validate domain names
domain_regex = /\A(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}\z/
domain_regex.match?(domain)
end

Copilot uses AI. Check for mistakes.
Comment thread app/models/topic_embed.rb
require 'ruby-readability'

opts = opts || {}
doc = Readability::Document.new(open(url).read,

Copilot AI Jul 26, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using open(url).read without proper URL validation or timeout controls can lead to SSRF vulnerabilities and denial of service attacks. Consider using a safer HTTP client with timeout and URL validation.

Suggested change
doc = Readability::Document.new(open(url).read,
uri = URI.parse(url)
raise ArgumentError, "Invalid URL" unless uri.is_a?(URI::HTTP) || uri.is_a?(URI::HTTPS)
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = (uri.scheme == "https")
http.read_timeout = 10 # Set a timeout of 10 seconds
response = http.get(uri.request_uri)
raise "Failed to fetch URL: #{response.code}" unless response.is_a?(Net::HTTPSuccess)
doc = Readability::Document.new(response.body,

Copilot uses AI. Check for mistakes.
Comment on lines +26 to +33
raise Discourse::InvalidAccess.new('invalid referer host') if URI(request.referer || '').host != SiteSetting.embeddable_host

response.headers['X-Frame-Options'] = "ALLOWALL"
rescue URI::InvalidURIError
raise Discourse::InvalidAccess.new('invalid referer host')
end


Copilot AI Jul 26, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Referer header validation can be bypassed as it's controlled by the client. This security check is unreliable for preventing unauthorized embedding. Consider implementing a more robust authentication mechanism.

Suggested change
raise Discourse::InvalidAccess.new('invalid referer host') if URI(request.referer || '').host != SiteSetting.embeddable_host
response.headers['X-Frame-Options'] = "ALLOWALL"
rescue URI::InvalidURIError
raise Discourse::InvalidAccess.new('invalid referer host')
end
validate_embed_token!
response.headers['X-Frame-Options'] = "ALLOWALL"
end
def validate_embed_token!
token = params[:embed_token]
raise Discourse::InvalidAccess.new('missing embed token') if token.blank?
raise Discourse::InvalidAccess.new('invalid embed token') unless valid_embed_token?(token)
end
def valid_embed_token?(token)
# Replace `SiteSetting.embed_secret` with your actual secret or token validation logic
ActiveSupport::SecurityUtils.secure_compare(token, SiteSetting.embed_secret)
end

Copilot uses AI. Check for mistakes.
return if user.blank?

require 'simple-rss'
rss = SimpleRSS.parse open(SiteSetting.feed_polling_url)

Copilot AI Jul 26, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using open() without timeout or size limits can lead to SSRF attacks and resource exhaustion. Consider using a safer HTTP client with proper constraints.

Copilot uses AI. Check for mistakes.
Comment thread app/models/topic_embed.rb
Comment on lines +65 to +71
a['href'] = "#{prefix}/#{href.sub(/^\/+/, '')}"
end
end
fragment.css('img').each do |a|
src = a['src']
if src.present? && src.start_with?('/')
a['src'] = "#{prefix}/#{src.sub(/^\/+/, '')}"

Copilot AI Jul 26, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The URL construction logic concatenates prefix with a slash and then removes leading slashes from href, which could result in malformed URLs. Consider using URI.join for proper URL resolution.

Suggested change
a['href'] = "#{prefix}/#{href.sub(/^\/+/, '')}"
end
end
fragment.css('img').each do |a|
src = a['src']
if src.present? && src.start_with?('/')
a['src'] = "#{prefix}/#{src.sub(/^\/+/, '')}"
a['href'] = URI.join(prefix, href).to_s
end
end
fragment.css('img').each do |a|
src = a['src']
if src.present? && src.start_with?('/')
a['src'] = URI.join(prefix, src).to_s

Copilot uses AI. Check for mistakes.
Comment on lines +34 to +35
post.cooked.start_with?("hello world new post <a href=\"http://eviltrout.com/hello\">hello</a> <img src=\"http://eviltrout.com/images/wat.jpg\">").should be_true

Copilot AI Jul 26, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] This test assertion uses start_with? with a very long hardcoded string, making it brittle and hard to maintain. Consider breaking this into multiple focused assertions or using more flexible matching patterns.

Suggested change
post.cooked.start_with?("hello world new post <a href=\"http://eviltrout.com/hello\">hello</a> <img src=\"http://eviltrout.com/images/wat.jpg\">").should be_true
post.cooked.should include("hello world new post")
post.cooked.should include("<a href=\"http://eviltrout.com/hello\">hello</a>")
post.cooked.should include("<img src=\"http://eviltrout.com/images/wat.jpg\">")

Copilot uses AI. Check for mistakes.
@github-actions

Copy link
Copy Markdown

This pull request has been automatically marked as stale because it has been open for 60 days with no activity. To keep it open, remove the stale tag, push code, or add a comment. Otherwise, it will be closed in 14 days.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants