Enhance embed URL handling and validation system#7
Conversation
There was a problem hiding this comment.
Pull request overview
This PR enhances the embed functionality by introducing a comprehensive system for displaying external website content within Discourse forums. The implementation includes RSS/ATOM feed polling capabilities to automatically import and update posts from external sources.
Changes:
- Added embedding infrastructure to display external content in responsive iframes with dynamic height adjustment
- Implemented RSS/ATOM feed polling system to automatically create and update posts from external feeds
- Added comprehensive test coverage for embed functionality, feed polling, and topic retrieval
Reviewed changes
Copilot reviewed 26 out of 28 changed files in this pull request and generated 14 comments.
Show a summary per file
| File | Description |
|---|---|
| spec/models/topic_embed_spec.rb | Tests for TopicEmbed model validations and import functionality |
| spec/jobs/poll_feed_spec.rb | Tests for feed polling job execution and validation |
| spec/controllers/embed_controller_spec.rb | Tests for embed controller access control and topic retrieval |
| spec/components/topic_retriever_spec.rb | Tests for topic retriever host validation and throttling |
| lib/topic_retriever.rb | Core logic for retrieving and validating embedded topics |
| lib/tasks/disqus.thor | Updated Disqus import task to use TopicEmbed.import_remote |
| lib/post_revisor.rb | Added skip_validations option for post revisions |
| lib/post_creator.rb | Added support for custom cook_method in post creation |
| db/migrate/20131223171005_create_top_topics.rb | Added force: true to table creation |
| db/migrate/20131219203905_add_cook_method_to_posts.rb | Migration to add cook_method column to posts |
| db/migrate/20131217174004_create_topic_embeds.rb | Migration to create topic_embeds table |
| config/site_settings.yml | Added embedding configuration settings |
| config/routes.rb | Added route for embed/best endpoint |
| config/locales/server.en.yml | Added embed-related server-side translations |
| config/locales/client.en.yml | Added "Embedding" to client settings navigation |
| app/views/layouts/embed.html.erb | Layout for embedded content with iframe height messaging |
| app/views/embed/loading.html.erb | Loading view for embed retrieval with auto-reload |
| app/views/embed/best.html.erb | Display view for embedded topic with top posts |
| app/models/topic_embed.rb | Model for managing embedded topics with URL absolutization |
| app/models/post.rb | Added cook_methods enum and raw_html cooking support |
| app/jobs/scheduled/poll_feed.rb | Scheduled job for RSS/ATOM feed polling |
| app/jobs/regular/retrieve_topic.rb | Background job for asynchronous topic retrieval |
| app/controllers/embed_controller.rb | Controller for embed endpoints with security checks |
| app/assets/stylesheets/embed.css.scss | Styles for embedded content display |
| app/assets/javascripts/embed.js | Client-side iframe communication and resizing |
| Gemfile | Added ruby-readability and simple-rss dependencies |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| TopicEmbed.import(user, "invalid url", title, contents).should be_nil | ||
| TopicEmbed.count.should == 0 |
There was a problem hiding this comment.
The test uses deprecated RSpec 'should' syntax. Consider using the modern 'expect' syntax instead: expect(TopicEmbed.import(user, 'invalid url', title, contents)).to be_nil and expect(TopicEmbed.count).to eq(0).
| TopicEmbed.import(user, "invalid url", title, contents).should be_nil | |
| TopicEmbed.count.should == 0 | |
| expect(TopicEmbed.import(user, "invalid url", title, contents)).to be_nil | |
| expect(TopicEmbed.count).to eq(0) |
| let!(:post) { TopicEmbed.import(user, url, title, contents) } | ||
|
|
||
| it "works as expected with a new URL" do | ||
| post.should be_present |
There was a problem hiding this comment.
The test uses deprecated RSpec 'should' syntax. Consider using the modern 'expect' syntax instead: expect(post).to be_present.
| post.cook_method.should == Post.cook_methods[:raw_html] | ||
| post.cooked.should == post.raw |
There was a problem hiding this comment.
The test uses deprecated RSpec 'should' syntax. Consider using the modern 'expect' syntax instead: expect(post.cook_method).to eq(Post.cook_methods[:raw_html]) and expect(post.cooked).to eq(post.raw).
| post.cooked.should == post.raw | ||
|
|
||
| # It converts relative URLs to absolute | ||
| post.cooked.start_with?("hello world new post <a href=\"http://eviltrout.com/hello\">hello</a> <img src=\"http://eviltrout.com/images/wat.jpg\">").should be_true |
There was a problem hiding this comment.
The test uses deprecated RSpec 'should' syntax. Consider using the modern 'expect' syntax instead: expect(post.cooked).to start_with(\"hello world new post <a href=\\\"http://eviltrout.com/hello\\\">hello</a> <img src=\\\"http://eviltrout.com/images/wat.jpg\\\">\").
| # It converts relative URLs to absolute | ||
| post.cooked.start_with?("hello world new post <a href=\"http://eviltrout.com/hello\">hello</a> <img src=\"http://eviltrout.com/images/wat.jpg\">").should be_true | ||
|
|
||
| TopicEmbed.where(topic_id: post.topic_id).should be_present |
There was a problem hiding this comment.
The test uses deprecated RSpec 'should' syntax. Consider using the modern 'expect' syntax instead: expect(TopicEmbed.where(topic_id: post.topic_id)).to be_present.
| response.should be_success | ||
| response.headers['X-Frame-Options'].should == "ALLOWALL" |
There was a problem hiding this comment.
The test uses deprecated RSpec 'should' syntax. Consider using the modern 'expect' syntax instead: expect(response).to be_success and expect(response.headers['X-Frame-Options']).to eq('ALLOWALL').
| setTimeout(function() { | ||
| document.location.reload(); | ||
| }, 30000); |
There was a problem hiding this comment.
The 30-second reload interval is a magic number that should be extracted to a configuration constant or variable with a descriptive name for better maintainability.
| setTimeout(function() { | |
| document.location.reload(); | |
| }, 30000); | |
| const EMBED_LOADING_RELOAD_INTERVAL_MS = 30000; | |
| setTimeout(function() { | |
| document.location.reload(); | |
| }, EMBED_LOADING_RELOAD_INTERVAL_MS); |
| require 'ruby-readability' | ||
|
|
||
| opts = opts || {} | ||
| doc = Readability::Document.new(open(url).read, |
There was a problem hiding this comment.
Using open(url).read without URL validation or network timeouts could lead to SSRF (Server-Side Request Forgery) vulnerabilities or DoS through slow/unresponsive URLs. Consider adding URL validation, timeout limits, and restricting to allowed domains.
| return if user.blank? | ||
|
|
||
| require 'simple-rss' | ||
| rss = SimpleRSS.parse open(SiteSetting.feed_polling_url) |
There was a problem hiding this comment.
Using open() without timeout limits or error handling could lead to vulnerabilities or system hangs. Consider adding network timeouts, error handling for HTTP failures, and validation of the response.
| raise Discourse::InvalidAccess.new('embeddable host not set') if SiteSetting.embeddable_host.blank? | ||
| raise Discourse::InvalidAccess.new('invalid referer host') if URI(request.referer || '').host != SiteSetting.embeddable_host | ||
|
|
||
| response.headers['X-Frame-Options'] = "ALLOWALL" |
There was a problem hiding this comment.
Setting X-Frame-Options to 'ALLOWALL' is not a valid value. The valid values are 'DENY', 'SAMEORIGIN', or 'ALLOW-FROM uri'. This could leave the application vulnerable to clickjacking attacks. To allow embedding from specific origins, use 'ALLOW-FROM' with the embeddable_host.
| response.headers['X-Frame-Options'] = "ALLOWALL" | |
| response.headers['X-Frame-Options'] = "ALLOW-FROM #{request.protocol}#{SiteSetting.embeddable_host}" |
Test 4
Summary by CodeRabbit
New Features
Tests
Chores
✏️ Tip: You can customize this high-level summary in your review settings.
Replicated from ai-code-review-evaluation/discourse-coderabbit#4