From 3ac34cf0574d98c84b1075a5873e8eb2737224e7 Mon Sep 17 00:00:00 2001 From: "R.S.I." <31208859+polywock@users.noreply.github.com> Date: Sun, 30 Mar 2025 03:27:34 -0400 Subject: [PATCH 1/9] Add proposal for `topFrameMatches` and `excludeTopFrameMatches` API proposal to allow content script registration (both static and dynamic) to be restricted based on the origin of the top-level frame. --- .../content-script-top-frame-matching.md | 139 ++++++++++++++++++ 1 file changed, 139 insertions(+) create mode 100644 proposals/content-script-top-frame-matching.md diff --git a/proposals/content-script-top-frame-matching.md b/proposals/content-script-top-frame-matching.md new file mode 100644 index 00000000..9f51268f --- /dev/null +++ b/proposals/content-script-top-frame-matching.md @@ -0,0 +1,139 @@ + +# Proposal: Content Script Top Frame Origin Matching + +**Summary** + +API proposal to allow content script registration (both static and dynamic) to be restricted based on the origin of the top-level frame using standard [match patterns](https://developer.chrome.com/docs/extensions/develop/concepts/match-patterns), enabling more intuitive and secure site blocking/allowing functionality for extensions. 
+ +**Document Metadata** + +* **Author:** [Polywock](https://github.com/polywock) +* **Sponsoring Browser:** *(Seeking browser sponsorship)* +* **Status:** Draft *(Seeking feedback and browser interest)* +* **Proposal Champions:** [Kzar](https://github.com/kzar), [Carlosjeurissen](https://github.com/carlosjeurissen), [Polywock](https://github.com/polywock) +* **Created:** 2025-03-30 +* **Related Issues:** + * [w3c/webextensions#117](https://github.com/w3c/webextensions/issues/117) + * [w3c/webextensions#668](https://github.com/w3c/webextensions/issues/668) + * [Chromium Issue 40202338](https://issues.chromium.org/issues/40202338) + +## Motivation + +### Objective + +This proposal introduces a mechanism to further restrict content script injection by adding a filter based on the **origin** of the **top-level document**. This allows for more precise control over where scripts execute, while also enabling developers to create site blocklists or allowlists that better align with user expectations, improve performance, and enhance security. + + +### Use Cases + +1. **Intuitive Site Blocking/Allowing:** Many extensions offer users the ability to disable functionality on specific websites. Currently, using frame-level exclusion rules (like `excludeMatches`) on a dynamic content script is a common approach. However, this leads to counter-intuitive behavior: + * If a user blocks `https://example.com/*` using `excludeMatches`, the extension's content script *still runs* on `https://example.com/page` if an embedded iframe loads content from a *different*, non-blocked domain (assuming `all_frames: true`). + * Conversely, if a user visits `https://anothersite.com/*` which embeds an iframe from the blocked `https://example.com/*`, the extension *is blocked* from running within that embedded `example.com` frame, even though the user likely only intended to block the extension when visiting `example.com` directly as the main page. + +2. 
**Security:** As highlighted in [#117](https://github.com/w3c/webextensions/issues/117), restricting content scripts based on the top-level frame's origin enhances security. For scripts registered with `all_frames: true`, developers can ensure they only execute when the main page's origin is one they expect, preventing accidental injection into sensitive contexts or reducing the impact of potential vulnerabilities. + +3. **Performance:** By preventing script injection at the browser level based on top-frame origin criteria, extensions avoid the performance implications of the [current workaround](#alternatives). + + +### Known Consumers + +Developer interest is evident in the related GitHub and Chromium discussion ([#117](https://github.com/w3c/webextensions/issues/117), [#668](https://github.com/w3c/webextensions/issues/668), [40202338](https://issues.chromium.org/issues/40202338)). This feature addresses a common pattern (site blocking/allowing by origin) that currently requires less secure and efficient workarounds. + +## Specification + +This proposal expands the definition of content scripts in both the `manifest.json` and the scripting API (`scripting.registerContentScripts`). + +### Schema + +#### Manifest `content_scripts` Entry + +The object definition within the `content_scripts` array in `manifest.json` is expanded to include two new optional properties accepting arrays of match patterns. + +```json5 +{ + // If specified: Only inject if the top-level frame's origin matches at least one of these patterns. + "top_frame_matches": ["MatchPattern"], + + // If specified: Only inject if the top-level frame's origin isn't a match for any pattern. 
+ "exclude_top_frame_matches": ["MatchPattern"], +``` +*Where `MatchPattern` is a string conforming to the standard [Extension Match Pattern syntax](https://developer.chrome.com/docs/extensions/develop/concepts/match-patterns).* + +#### `scripting.RegisteredContentScript` Type + +The `RegisteredContentScript` type used by `scripting.registerContentScripts()` and `scripting.updateContentScripts()` is expanded similarly: + +```typescript +dictionary RegisteredContentScript { + + // If provided, only inject if the top-level frame's origin matches at least one of these patterns. + MatchPattern[]? topFrameMatches; + + // If provided, Only inject if the top-level frame's origin isn't a match for any pattern. + MatchPattern[]? excludeTopFrameMatches; +} +``` + +### Behavior / Implementation + +1. **Validation:** When processing `content_scripts` from `manifest.json` or a call to `scripting.registerContentScripts` / `scripting.updateContentScripts`: + * The browser must validate all patterns provided in `topFrameMatches` and `excludeTopFrameMatches`. + * If any pattern contains a path component other than the wildcard path `/*` (i.e., it specifies a specific path like `/foo` or `/bar/*`), the browser must treat this as an error. Patterns without an explicit path or those explicitly using `/*` are considered valid. + * For static declarations in `manifest.json`, this should likely result in a manifest parsing error, preventing the extension from loading. + * For dynamic API calls (`registerContentScripts`, `updateContentScripts`), the promise **must** be rejected with an appropriate error (e.g., `Match patterns must not specify a path other than '/*'.`). + +2. **Injection Logic:** Assuming validation passes, a content script will be injected into a frame if and only if *all* the following conditions are met: + * All existing checks based on the frame's own URL and context are satisfied. 
+ * And if `topFrameMatches` was specified, the origin of the top-level frame's URL matches at least one pattern in `topFrameMatches`. + * And if `excludeTopFrameMatches` was specified, the origin of the top-level frame's URL does *not* match any validated pattern in `excludeTopFrameMatches`. + +**Origin Definition:** The origin consists of the scheme, host, and port (e.g., `https://www.example.com:443`). Match patterns are compared against this origin. The path component of the top-level URL is ignored during matching, as enforced by the validation rule disallowing specific paths in the patterns. + +**Example** An extension that applies a dark theme to all sites and frames is active globally, except for https://www.example.com, where the user has disabled it. + +```json +{ + "matches": ["https://*/*"], + "exclude_top_frame_matches": ["https://www.example.com/*"], + "all_frames": true, + "js": ["force_dark_theme.js"] +} +``` + +### New Permissions + +No new permissions are required. The `topFrameMatches` and `excludeTopFrameMatches` properties only serve to *restrict* where a content script can run, based on the host permissions already requested by the `matches` property. Existing host permission warnings remain appropriate and sufficient. + + +## Security and Privacy + +### Exposed Sensitive Data + +This API does not expose any new data to the extension. It uses the origin of the top-level frame, which is generally less specific than the full URL and already implicitly available by content scripts within that frame. + +### Additional Security Considerations + +This feature enhances the principle of least privilege by allowing developers to be more specific about the top-level origins in which their scripts should operate. + +## Alternatives + +### Existing Workarounds + +Developers can achieve similar *behavior* (but not with same performance or security) by: + +1. Registering a content script with broad `matches` (e.g., `<all_urls>`). +2. 
Inside the content script, determine the top-level frame's origin `location.ancestorOrigins` or by messaging the background script. +3. Asynchronously fetch the user's blocklist/allowlist (likely stored by origin) from `browser.storage`. +4. Compare the top-level origin against the list. +5. If the origin is blocked (or not allowed), exit early. + +**Limitations of Workarounds:** + +1. **Inefficiency:** The content script still must be injected, potentially across dozens or hundreds of tabs. Even though it exits immediately without further logic, the effect of having these scripts loaded may have significant performance implications. +3. **Asynchronous:** Checking `browser.storage` is asynchronous. Scripts needing synchronous initialization (e.g., modifying the DOM early via `run_at: document_start`) cannot reliably block themselves before potentially executing some code. +4. **Attack Surface:** The content script still must be injected, potentially in sensitive sites that the user intended to block. Vulnerabilities in the script or its dependencies could theoretically be exploited. +5. **Conflicts:** The mere act of injecting a script (especially via the `MAIN` content script world) can cause conflicts with website code, potentially due to polyfills, bundler runtime code, etc. + +### Open Web API + +N/A. *This feature is specific to the WebExtensions model.* From 9396eb465a221c9dada0583c90c10ca963b69133 Mon Sep 17 00:00:00 2001 From: "R.S.I." <31208859+polywock@users.noreply.github.com> Date: Sun, 30 Mar 2025 04:21:25 -0400 Subject: [PATCH 2/9] Update to include missing links. 
--- proposals/content-script-top-frame-matching.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/proposals/content-script-top-frame-matching.md b/proposals/content-script-top-frame-matching.md index 9f51268f..a871acc1 100644 --- a/proposals/content-script-top-frame-matching.md +++ b/proposals/content-script-top-frame-matching.md @@ -13,6 +13,7 @@ API proposal to allow content script registration (both static and dynamic) to b * **Proposal Champions:** [Kzar](https://github.com/kzar), [Carlosjeurissen](https://github.com/carlosjeurissen), [Polywock](https://github.com/polywock) * **Created:** 2025-03-30 * **Related Issues:** + * [w3c/webextensions#763](https://github.com/w3c/webextensions/issues/763) * [w3c/webextensions#117](https://github.com/w3c/webextensions/issues/117) * [w3c/webextensions#668](https://github.com/w3c/webextensions/issues/668) * [Chromium Issue 40202338](https://issues.chromium.org/issues/40202338) @@ -30,14 +31,13 @@ This proposal introduces a mechanism to further restrict content script injectio * If a user blocks `https://example.com/*` using `excludeMatches`, the extension's content script *still runs* on `https://example.com/page` if an embedded iframe loads content from a *different*, non-blocked domain (assuming `all_frames: true`). * Conversely, if a user visits `https://anothersite.com/*` which embeds an iframe from the blocked `https://example.com/*`, the extension *is blocked* from running within that embedded `example.com` frame, even though the user likely only intended to block the extension when visiting `example.com` directly as the main page. -2. **Security:** As highlighted in [#117](https://github.com/w3c/webextensions/issues/117), restricting content scripts based on the top-level frame's origin enhances security. 
For scripts registered with `all_frames: true`, developers can ensure they only execute when the main page's origin is one they expect, preventing accidental injection into sensitive contexts or reducing the impact of potential vulnerabilities. +2. **Security:** Restricting content scripts based on the top-level frame's origin enhances security. For scripts registered with `all_frames: true`, developers can ensure they only execute when the main page's origin is one they expect, preventing accidental injection into sensitive contexts or reducing the impact of potential vulnerabilities. 3. **Performance:** By preventing script injection at the browser level based on top-frame origin criteria, extensions avoid the performance implications of the [current workaround](#alternatives). ### Known Consumers - -Developer interest is evident in the related GitHub and Chromium discussion ([#117](https://github.com/w3c/webextensions/issues/117), [#668](https://github.com/w3c/webextensions/issues/668), [40202338](https://issues.chromium.org/issues/40202338)). This feature addresses a common pattern (site blocking/allowing by origin) that currently requires less secure and efficient workarounds. +Developer interest is evident in the related GitHub and Chromium discussion ([#763](https://github.com/w3c/webextensions/issues/763), [#117](https://github.com/w3c/webextensions/issues/117), [#668](https://github.com/w3c/webextensions/issues/668), [40202338](https://issues.chromium.org/issues/40202338)). This feature addresses a common pattern (site blocking/allowing by origin) that currently requires less secure and efficient workarounds. ## Specification From a3bcfa856f4ee8e00b2e0b3855b81b856bc2a6d2 Mon Sep 17 00:00:00 2001 From: "R.S.I." 
<31208859+polywock@users.noreply.github.com> Date: Thu, 8 May 2025 16:57:07 -0400 Subject: [PATCH 3/9] Make changes based on notes by @Rob--W and @carlosjeurissen --- .../content-script-top-frame-matching.md | 73 ++++++++++++------- 1 file changed, 47 insertions(+), 26 deletions(-) diff --git a/proposals/content-script-top-frame-matching.md b/proposals/content-script-top-frame-matching.md index a871acc1..e6db58bf 100644 --- a/proposals/content-script-top-frame-matching.md +++ b/proposals/content-script-top-frame-matching.md @@ -11,7 +11,7 @@ API proposal to allow content script registration (both static and dynamic) to b * **Sponsoring Browser:** *(Seeking browser sponsorship)* * **Status:** Draft *(Seeking feedback and browser interest)* * **Proposal Champions:** [Kzar](https://github.com/kzar), [Carlosjeurissen](https://github.com/carlosjeurissen), [Polywock](https://github.com/polywock) -* **Created:** 2025-03-30 +* **Created:** 2025-03-30 * **Related Issues:** * [w3c/webextensions#763](https://github.com/w3c/webextensions/issues/763) * [w3c/webextensions#117](https://github.com/w3c/webextensions/issues/117) @@ -47,17 +47,22 @@ This proposal expands the definition of content scripts in both the `manifest.js #### Manifest `content_scripts` Entry -The object definition within the `content_scripts` array in `manifest.json` is expanded to include two new optional properties accepting arrays of match patterns. +The object definition within the `content_scripts` array in `manifest.json` is expanded to include two new optional properties accepting arrays of match patterns. ```json5 { - // If specified: Only inject if the top-level frame's origin matches at least one of these patterns. - "top_frame_matches": ["MatchPattern"], + // ... existing content_script properties like "matches", "exclude_matches", etc. - // If specified: Only inject if the top-level frame's origin isn't a match for any pattern. 
- "exclude_top_frame_matches": ["MatchPattern"], + // If specified: Only inject if the top-level frame's origin matches at least one of these patterns. + "top_frame_matches": ["MatchPattern"], + + // If specified: Only inject if the top-level frame's origin isn't a match for any pattern. + "exclude_top_frame_matches": ["MatchPattern"] +} ``` -*Where `MatchPattern` is a string conforming to the standard [Extension Match Pattern syntax](https://developer.chrome.com/docs/extensions/develop/concepts/match-patterns).* + +*Where `MatchPattern` is a string conforming to the standard [Extension Match Pattern syntax](https://developer.chrome.com/docs/extensions/develop/concepts/match-patterns). + #### `scripting.RegisteredContentScript` Type @@ -65,35 +70,51 @@ The `RegisteredContentScript` type used by `scripting.registerContentScripts()` ```typescript dictionary RegisteredContentScript { - - // If provided, only inject if the top-level frame's origin matches at least one of these patterns. + // ... existing RegisteredContentScript properties like "matches", "excludeMatches", etc. + + // If provided, only inject if the top-level frame's origin matches at least one of these patterns. MatchPattern[]? topFrameMatches; - // If provided, Only inject if the top-level frame's origin isn't a match for any pattern. + // If provided, Only inject if the top-level frame's origin isn't a match for any pattern. MatchPattern[]? excludeTopFrameMatches; } ``` -### Behavior / Implementation +### Behavior / Implementation 1. **Validation:** When processing `content_scripts` from `manifest.json` or a call to `scripting.registerContentScripts` / `scripting.updateContentScripts`: - * The browser must validate all patterns provided in `topFrameMatches` and `excludeTopFrameMatches`. - * If any pattern contains a path component other than the wildcard path `/*` (i.e., it specifies a specific path like `/foo` or `/bar/*`), the browser must treat this as an error. 
Patterns without an explicit path or those explicitly using `/*` are considered valid. - * For static declarations in `manifest.json`, this should likely result in a manifest parsing error, preventing the extension from loading. - * For dynamic API calls (`registerContentScripts`, `updateContentScripts`), the promise **must** be rejected with an appropriate error (e.g., `Match patterns must not specify a path other than '/*'.`). + * The browser must first validate all patterns provided in `topFrameMatches` and `excludeTopFrameMatches` as they would match patterns provided by `matches` and `excludeMatches`. + * Additionally, if any pattern contains a path component other than the wildcard path `/*` (i.e., it specifies a specific path like `/foo` or `/bar/*`), the browser must treat this as an error. Patterns without an explicit path or those explicitly using `/*` are considered valid. This restriction ensures these patterns are intended to match origins. + * For static declarations in `manifest.json`, this should result in a manifest parsing error, preventing the extension from loading. + * For dynamic API calls (`registerContentScripts`, `updateContentScripts`), the promise must be rejected with an appropriate error (e.g., `Match patterns for top_frame_matches/exclude_top_frame_matches must not specify a path.`). 2. **Injection Logic:** Assuming validation passes, a content script will be injected into a frame if and only if *all* the following conditions are met: - * All existing checks based on the frame's own URL and context are satisfied. - * And if `topFrameMatches` was specified, the origin of the top-level frame's URL matches at least one pattern in `topFrameMatches`. - * And if `excludeTopFrameMatches` was specified, the origin of the top-level frame's URL does *not* match any validated pattern in `excludeTopFrameMatches`. + * All existing checks based on the frame's own URL and context are satisfied (e.g., `matches`, `excludeMatches`). 
+ * And if `topFrameMatches` was specified, the **top-level document's origin** matches at least one pattern in `topFrameMatches`. + * And if `excludeTopFrameMatches` was specified, the **top-level document's origin** does *not* match any pattern in `excludeTopFrameMatches`. + +**Determining the Origin for Top-Level Document Matching** + +The **top-level document's origin** is determined as follows: + +1. First, obtain the "URL for matching" for the top-level document by applying the "Determine the URL for matching a document" algorithm, as specified in the W3C WebExtensions specification ([section 18.1](https://w3c.github.io/webextensions/specification/index.html#determine-the-url-for-matching-a-document)). The `match_origin_as_fallback` parameter of this algorithm must be interpreted as `true`. + +2. If the W3C algorithm returns a "URL for matching": + * This URL is then canonicalized to its origin part for the purpose of this matching. This means retaining the scheme and authority (hostname and port, if specified or non-default), while any path, query, or fragment components are discarded. + * The resulting string (e.g., `https://example.com`, `http://localhost:8080`) is the "top-level document's origin" that is compared against the patterns in `top_frame_matches` and `exclude_top_frame_matches`. + + +**Handling Undeterminable Origins for Matching** + +If the top-level document's origin cannot be determined, the `topFrameMatches` and `excludeTopFrameMatches` criteria are not applied. The determination of whether to inject the content script will then depend solely on other factors (e.g., the frame's own URL against matches and excludeMatches). + -**Origin Definition:** The origin consists of the scheme, host, and port (e.g., `https://www.example.com:443`). Match patterns are compared against this origin. The path component of the top-level URL is ignored during matching, as enforced by the validation rule disallowing specific paths in the patterns. 
-**Example** An extension that applies a dark theme to all sites and frames is active globally, except for https://www.example.com, where the user has disabled it. +**Static Usage Example:** An extension that applies a dark theme to all frames except for when the top frame's origin matches `https://www.example.com/*`. ```json { - "matches": ["https://*/*"], + "matches": ["<all_urls>"], "exclude_top_frame_matches": ["https://www.example.com/*"], "all_frames": true, "js": ["force_dark_theme.js"] @@ -109,7 +130,7 @@ No new permissions are required. The `topFrameMatches` and `excludeTopFrameMatch ### Exposed Sensitive Data -This API does not expose any new data to the extension. It uses the origin of the top-level frame, which is generally less specific than the full URL and already implicitly available by content scripts within that frame. +This API does not expose any new data to the extension. It uses the top-level document's origin, which is generally less specific than the full URL and already implicitly available to content scripts running within frames of that top-level document. ### Additional Security Considerations @@ -122,18 +143,18 @@ This feature enhances the principle of least privilege by allowing developers to Developers can achieve similar *behavior* (but not with same performance or security) by: 1. Registering a content script with broad `matches` (e.g., `<all_urls>`). -2. Inside the content script, determine the top-level frame's origin `location.ancestorOrigins` or by messaging the background script. +2. Inside the content script, determine the top-level frame's origin via `location.ancestorOrigins` or by messaging the background script. 3. Asynchronously fetch the user's blocklist/allowlist (likely stored by origin) from `browser.storage`. 4. Compare the top-level origin against the list. -5. If the origin is blocked (or not allowed), exit early. +5. If the origin is blocked (or not allowed), exit early. **Limitations of Workarounds:** 1. 
**Inefficiency:** The content script still must be injected, potentially across dozens or hundreds of tabs. Even though it exits immediately without further logic, the effect of having these scripts loaded may have significant performance implications. 3. **Asynchronous:** Checking `browser.storage` is asynchronous. Scripts needing synchronous initialization (e.g., modifying the DOM early via `run_at: document_start`) cannot reliably block themselves before potentially executing some code. 4. **Attack Surface:** The content script still must be injected, potentially in sensitive sites that the user intended to block. Vulnerabilities in the script or its dependencies could theoretically be exploited. -5. **Conflicts:** The mere act of injecting a script (especially via the `MAIN` content script world) can cause conflicts with website code, potentially due to polyfills, bundler runtime code, etc. +5. **Conflicts:** The mere act of injecting a script (especially via the MAIN content script world) can cause conflicts with website code that are often hard to diagnose, potentially due to factors outside an extension developer's immediate control like polyfills, bundler runtime code, etc. ### Open Web API -N/A. *This feature is specific to the WebExtensions model.* +N/A. *This feature is specific to the WebExtensions model.* \ No newline at end of file From fa51bbeee86ae11010484b0f51b84ee42e4e6197 Mon Sep 17 00:00:00 2001 From: "R.S.I." 
<31208859+polywock@users.noreply.github.com> Date: Thu, 8 May 2025 17:07:04 -0400 Subject: [PATCH 4/9] Update content-script-top-frame-matching.md --- proposals/content-script-top-frame-matching.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/proposals/content-script-top-frame-matching.md b/proposals/content-script-top-frame-matching.md index e6db58bf..9e2d438c 100644 --- a/proposals/content-script-top-frame-matching.md +++ b/proposals/content-script-top-frame-matching.md @@ -80,10 +80,11 @@ dictionary RegisteredContentScript { } ``` + ### Behavior / Implementation 1. **Validation:** When processing `content_scripts` from `manifest.json` or a call to `scripting.registerContentScripts` / `scripting.updateContentScripts`: - * The browser must first validate all patterns provided in `topFrameMatches` and `excludeTopFrameMatches` as they would match patterns provided by `matches` and `excludeMatches`. + * The browser must first validate all patterns provided in `topFrameMatches` and `excludeTopFrameMatches` as they would validate patterns provided by `matches` and `excludeMatches`. * Additionally, if any pattern contains a path component other than the wildcard path `/*` (i.e., it specifies a specific path like `/foo` or `/bar/*`), the browser must treat this as an error. Patterns without an explicit path or those explicitly using `/*` are considered valid. This restriction ensures these patterns are intended to match origins. * For static declarations in `manifest.json`, this should result in a manifest parsing error, preventing the extension from loading. * For dynamic API calls (`registerContentScripts`, `updateContentScripts`), the promise must be rejected with an appropriate error (e.g., `Match patterns for top_frame_matches/exclude_top_frame_matches must not specify a path.`). 
@@ -93,6 +94,7 @@ dictionary RegisteredContentScript { * And if `topFrameMatches` was specified, the **top-level document's origin** matches at least one pattern in `topFrameMatches`. * And if `excludeTopFrameMatches` was specified, the **top-level document's origin** does *not* match any pattern in `excludeTopFrameMatches`. + **Determining the Origin for Top-Level Document Matching** The **top-level document's origin** is determined as follows: @@ -109,7 +111,6 @@ The **top-level document's origin** is determined as follows: If the top-level document's origin cannot be determined, the `topFrameMatches` and `excludeTopFrameMatches` criteria are not applied. The determination of whether to inject the content script will then depend solely on other factors (e.g., the frame's own URL against matches and excludeMatches). - **Static Usage Example:** An extension that applies a dark theme to all frames except for when the top frame's origin matches `https://www.example.com/*`. ```json @@ -157,4 +158,4 @@ Developers can achieve similar *behavior* (but not with same performance or secu ### Open Web API -N/A. *This feature is specific to the WebExtensions model.* \ No newline at end of file +N/A. *This feature is specific to the WebExtensions model.* From 5aff0ac276b85c12e4725debff5dc6cb93e39ae1 Mon Sep 17 00:00:00 2001 From: "R.S.I." 
<31208859+polywock@users.noreply.github.com> Date: Fri, 9 May 2025 16:19:17 -0400 Subject: [PATCH 5/9] Apply suggestions from code review Co-authored-by: carlosjeurissen <1038267+carlosjeurissen@users.noreply.github.com> --- proposals/content-script-top-frame-matching.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/proposals/content-script-top-frame-matching.md b/proposals/content-script-top-frame-matching.md index 9e2d438c..6abc3e81 100644 --- a/proposals/content-script-top-frame-matching.md +++ b/proposals/content-script-top-frame-matching.md @@ -61,7 +61,7 @@ The object definition within the `content_scripts` array in `manifest.json` is e } ``` -*Where `MatchPattern` is a string conforming to the standard [Extension Match Pattern syntax](https://developer.chrome.com/docs/extensions/develop/concepts/match-patterns). +*Where `MatchPattern` is a string containing a single match pattern as specified in MDN's [Match patterns](https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Match_patterns).* #### `scripting.RegisteredContentScript` Type @@ -94,7 +94,9 @@ dictionary RegisteredContentScript { * And if `topFrameMatches` was specified, the **top-level document's origin** matches at least one pattern in `topFrameMatches`. * And if `excludeTopFrameMatches` was specified, the **top-level document's origin** does *not* match any pattern in `excludeTopFrameMatches`. +If any of the match patterns is not supported or understood by the browser, a soft error should be thrown. In `topFrameMatches`, the pattern will be skipped. If `topFrameMatches` is, or thus will end up being empty, no content scripts will be injected. +If `excludeTopFrameMatches` is empty, the property will be ignored and not considered when injecting content scripts. 
**Determining the Origin for Top-Level Document Matching** The **top-level document's origin** is determined as follows: From b7f2813a30771e9624fe1a0b989199b668fb713a Mon Sep 17 00:00:00 2001 From: "R.S.I." <31208859+polywock@users.noreply.github.com> Date: Sat, 10 May 2025 00:10:59 -0400 Subject: [PATCH 6/9] Update content-script-top-frame-matching.md --- .../content-script-top-frame-matching.md | 20 +++++++++---------- 1 file changed, 9 insertions(+), 11 deletions(-) diff --git a/proposals/content-script-top-frame-matching.md b/proposals/content-script-top-frame-matching.md index 6abc3e81..b8a0360c 100644 --- a/proposals/content-script-top-frame-matching.md +++ b/proposals/content-script-top-frame-matching.md @@ -84,22 +84,21 @@ dictionary RegisteredContentScript { ### Behavior / Implementation 1. **Validation:** When processing `content_scripts` from `manifest.json` or a call to `scripting.registerContentScripts` / `scripting.updateContentScripts`: - * The browser must first validate all patterns provided in `topFrameMatches` and `excludeTopFrameMatches` as they would validate patterns provided by `matches` and `excludeMatches`. + * The browser must first validate all patterns provided in `topFrameMatches` and `excludeTopFrameMatches` as it would validate patterns provided through `matches` and `excludeMatches`. That includes validating that all provided patterns are not malformed. If malformed URL patterns are found, the browser must treat this as an error. + * Empty arrays are valid values for both `topFrameMatches` and `excludeTopFrameMatches`. * Additionally, if any pattern contains a path component other than the wildcard path `/*` (i.e., it specifies a specific path like `/foo` or `/bar/*`), the browser must treat this as an error. Patterns without an explicit path or those explicitly using `/*` are considered valid. This restriction ensures these patterns are intended to match origins. 
- * For static declarations in `manifest.json`, this should result in a manifest parsing error, preventing the extension from loading. - * For dynamic API calls (`registerContentScripts`, `updateContentScripts`), the promise must be rejected with an appropriate error (e.g., `Match patterns for top_frame_matches/exclude_top_frame_matches must not specify a path.`). + * Handling validation errors: + * For static declarations in `manifest.json`, validation errors should result in a manifest parsing error, preventing the extension from loading. + * For dynamic API calls (`registerContentScripts`, `updateContentScripts`), validation errors results in the promise being rejected with an with an appropriate error (e.g., `Match patterns for top_frame_matches must not specify a path.` or `One of more match patterns in top_frame_matches weren't able to be parsed`). -2. **Injection Logic:** Assuming validation passes, a content script will be injected into a frame if and only if *all* the following conditions are met: +3. **Injection Logic:** Assuming validation passes, a content script will be injected into a frame if and only if *all* the following conditions are met: * All existing checks based on the frame's own URL and context are satisfied (e.g., `matches`, `excludeMatches`). - * And if `topFrameMatches` was specified, the **top-level document's origin** matches at least one pattern in `topFrameMatches`. - * And if `excludeTopFrameMatches` was specified, the **top-level document's origin** does *not* match any pattern in `excludeTopFrameMatches`. + * And if `topFrameMatches` was specified, the **top-level document's origin** matches at least one pattern in `topFrameMatches`. If `topFrameMatches` is an empty array, the content script will effectively never run. + * And if `excludeTopFrameMatches` was specified, the **top-level document's origin** does *not* match any pattern in `excludeTopFrameMatches`. 
If `excludeTopFrameMatches` is an empty array, the property will be ignored and not considered when injecting content scripts. -If any of the match patterns is not supported or understood by the browser, a soft error should be thrown. In `topFrameMatches`, the pattern will be skipped. If `topFrameMatches` is, or thus will end up being empty, no content scripts will be injected. -If `excludeTopFrameMatches` is empty, the property will be ignored and not considered when injecting content scripts. -**Determining the Origin for Top-Level Document Matching** -The **top-level document's origin** is determined as follows: +The **Top-level document's origin** is determined as follows: 1. First, obtain the "URL for matching" for the top-level document by applying the "Determine the URL for matching a document" algorithm, as specified in the W3C WebExtensions specification ([section 18.1](https://w3c.github.io/webextensions/specification/index.html#determine-the-url-for-matching-a-document)). The `match_origin_as_fallback` parameter of this algorithm must be interpreted as `true`. @@ -107,7 +106,6 @@ The **top-level document's origin** is determined as follows: * This URL is then canonicalized to its origin part for the purpose of this matching. This means retaining the scheme and authority (hostname and port, if specified or non-default), while any path, query, or fragment components are discarded. * The resulting string (e.g., `https://example.com`, `http://localhost:8080`) is the "top-level document's origin" that is compared against the patterns in `top_frame_matches` and `exclude_top_frame_matches`. - **Handling Undeterminable Origins for Matching** If the top-level document's origin cannot be determined, the `topFrameMatches` and `excludeTopFrameMatches` criteria are not applied. The determination of whether to inject the content script will then depend solely on other factors (e.g., the frame's own URL against matches and excludeMatches). 
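Review note on the path restriction this patch adds under **Validation**: a minimal sketch of that one rule, assuming a pattern has the usual `scheme://host/path` shape. `hasOriginOnlyPath` is a hypothetical helper, not part of any browser API. It does not validate scheme or host syntax, and accepting `<all_urls>` is an assumption, since the proposal text does not address it.

```typescript
// Hypothetical helper for illustration only; not a browser API.
// Checks just the path rule: a path, if present, must be exactly "/*".
function hasOriginOnlyPath(pattern: string): boolean {
  // Assumption: the special "<all_urls>" pattern carries no path to restrict.
  if (pattern === "<all_urls>") return true;

  const schemeEnd = pattern.indexOf("://");
  if (schemeEnd === -1) return false; // no scheme separator: treat as malformed

  // Everything after "://" is the host followed by an optional path.
  const rest = pattern.slice(schemeEnd + 3);
  const pathStart = rest.indexOf("/");
  if (pathStart === -1) return true; // no explicit path

  return rest.slice(pathStart) === "/*";
}
```

Under this sketch, `https://example.com/*` and `https://example.com` pass, while `https://example.com/foo` and `https://example.com/bar/*` fail and would be reported as validation errors.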
From 131fcc58ebfffcec5ff3c4281882a547b164e86a Mon Sep 17 00:00:00 2001 From: "R.S.I." <31208859+polywock@users.noreply.github.com> Date: Sun, 11 May 2025 18:42:04 -0400 Subject: [PATCH 7/9] Update content-script-top-frame-matching.md --- proposals/content-script-top-frame-matching.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/content-script-top-frame-matching.md b/proposals/content-script-top-frame-matching.md index b8a0360c..482f9481 100644 --- a/proposals/content-script-top-frame-matching.md +++ b/proposals/content-script-top-frame-matching.md @@ -93,8 +93,8 @@ dictionary RegisteredContentScript { 3. **Injection Logic:** Assuming validation passes, a content script will be injected into a frame if and only if *all* the following conditions are met: * All existing checks based on the frame's own URL and context are satisfied (e.g., `matches`, `excludeMatches`). - * And if `topFrameMatches` was specified, the **top-level document's origin** matches at least one pattern in `topFrameMatches`. If `topFrameMatches` is an empty array, the content script will effectively never run. - * And if `excludeTopFrameMatches` was specified, the **top-level document's origin** does *not* match any pattern in `excludeTopFrameMatches`. If `excludeTopFrameMatches` is an empty array, the property will be ignored and not considered when injecting content scripts. + * And if `topFrameMatches` was specified, the **top-level document's origin** must match at least one pattern in `topFrameMatches`. If `topFrameMatches` is an empty array, the content script will effectively never run. + * And if `excludeTopFrameMatches` was specified, the **top-level document's origin** must *not* match any pattern in `excludeTopFrameMatches`. If `excludeTopFrameMatches` is an empty array, the property will be ignored and not considered when injecting content scripts. 
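Review note on the **Injection Logic** conditions touched by this patch: a minimal sketch of the two top-frame checks, assuming the top-level document's origin has already been determined. `TopFrameFilter`, `PatternMatcher`, and `passesTopFrameFilter` are hypothetical names used for illustration. Pattern matching itself is delegated to a caller-supplied predicate, and the existing `matches` / `excludeMatches` checks on the frame's own URL still have to pass separately.

```typescript
// Hypothetical shapes for illustration only; not a browser API.
interface TopFrameFilter {
  topFrameMatches?: string[];
  excludeTopFrameMatches?: string[];
}

// Caller-supplied predicate: does `origin` match the match pattern `pattern`?
type PatternMatcher = (pattern: string, origin: string) => boolean;

function passesTopFrameFilter(
  filter: TopFrameFilter,
  topOrigin: string,
  matchesPattern: PatternMatcher,
): boolean {
  const { topFrameMatches, excludeTopFrameMatches } = filter;

  // If specified, at least one pattern must match. `some` on an empty array
  // is false, so an empty `topFrameMatches` means the script never runs.
  if (
    topFrameMatches !== undefined &&
    !topFrameMatches.some((p) => matchesPattern(p, topOrigin))
  ) {
    return false;
  }

  // If specified, no pattern may match. An empty array excludes nothing.
  if (
    excludeTopFrameMatches !== undefined &&
    excludeTopFrameMatches.some((p) => matchesPattern(p, topOrigin))
  ) {
    return false;
  }

  return true;
}
```

Both empty-array cases fall out of `Array.prototype.some` returning `false` on an empty array, so neither needs a special branch.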
From 4eb25eee862f92413547c3eebbac543502484694 Mon Sep 17 00:00:00 2001 From: polywock <31208859+polywock@users.noreply.github.com> Date: Fri, 24 Oct 2025 04:16:44 -0400 Subject: [PATCH 8/9] Fixing typos, links. --- proposals/content-script-top-frame-matching.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/proposals/content-script-top-frame-matching.md b/proposals/content-script-top-frame-matching.md index 482f9481..89e44d6f 100644 --- a/proposals/content-script-top-frame-matching.md +++ b/proposals/content-script-top-frame-matching.md @@ -3,14 +3,14 @@ **Summary** -API proposal to allow content script registration (both static and dynamic) to be restricted based on the origin of the top-level frame using standard [match patterns](https://developer.chrome.com/docs/extensions/develop/concepts/match-patterns), enabling more intuitive and secure site blocking/allowing functionality for extensions. +API proposal to allow content script registration (both static and dynamic) to be restricted based on the origin of the top-level frame using standard match patterns ([Mdn](https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Match_patterns), [Chrome Docs](https://developer.chrome.com/docs/extensions/develop/concepts/match-patterns)), enabling more intuitive and secure site blocking/allowing functionality for extensions. 
**Document Metadata** * **Author:** [Polywock](https://github.com/polywock) * **Sponsoring Browser:** *(Seeking browser sponsorship)* * **Status:** Draft *(Seeking feedback and browser interest)* -* **Proposal Champions:** [Kzar](https://github.com/kzar), [Carlosjeurissen](https://github.com/carlosjeurissen), [Polywock](https://github.com/polywock) +* **Proposal Champions:** [Dave Vandyke](https://github.com/kzar), [Carlos Jeurissen](https://github.com/carlosjeurissen), [Raymond Hill](https://github.com/gorhill), [Polywock](https://github.com/polywock) * **Created:** 2025-03-30 * **Related Issues:** * [w3c/webextensions#763](https://github.com/w3c/webextensions/issues/763) @@ -61,7 +61,7 @@ The object definition within the `content_scripts` array in `manifest.json` is e } ``` -*Where `MatchPattern` is a string containing a single match pattern as specified in [mdn [Match patterns](https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Match_patterns). +*Where `MatchPattern` is a string containing a single match pattern. #### `scripting.RegisteredContentScript` Type @@ -85,11 +85,11 @@ dictionary RegisteredContentScript { 1. **Validation:** When processing `content_scripts` from `manifest.json` or a call to `scripting.registerContentScripts` / `scripting.updateContentScripts`: * The browser must first validate all patterns provided in `topFrameMatches` and `excludeTopFrameMatches` as they would validate patterns provided through `matches` and `excludeMatches`. That includes validating that all provided patterns are not malformed. If malformed URL patterns are found, the browser must treat this as an error. - * Empty arrays are valid values for both `topFramesMatches` and `excludeTopFramesMatches`. + * Empty arrays are valid values for both `topFrameMatches` and `excludeTopFrameMatches`. 
* Additionally, if any pattern contains a path component other than the wildcard path `/*` (i.e., it specifies a specific path like `/foo` or `/bar/*`), the browser must treat this as an error. Patterns without an explicit path or those explicitly using `/*` are considered valid. This restriction ensures these patterns are intended to match origins. * Handling validation errors: * For static declarations in `manifest.json`, validation errors should result in a manifest parsing error, preventing the extension from loading. - * For dynamic API calls (`registerContentScripts`, `updateContentScripts`), validation errors results in the promise being rejected with an with an appropriate error (e.g., `Match patterns for top_frame_matches must not specify a path.` or `One of more match patterns in top_frame_matches weren't able to be parsed`). + * For dynamic API calls (`registerContentScripts`, `updateContentScripts`), validation errors results in the promise being rejected with an with an appropriate error (e.g., `Match patterns for top_frame_matches must not specify a path.` or `One or more match patterns in top_frame_matches weren't able to be parsed`). 3. **Injection Logic:** Assuming validation passes, a content script will be injected into a frame if and only if *all* the following conditions are met: * All existing checks based on the frame's own URL and context are satisfied (e.g., `matches`, `excludeMatches`). @@ -104,18 +104,18 @@ The **Top-level document's origin** is determined as follows: 2. If the W3C algorithm returns a "URL for matching": * This URL is then canonicalized to its origin part for the purpose of this matching. This means retaining the scheme and authority (hostname and port, if specified or non-default), while any path, query, or fragment components are discarded. 
- * The resulting string (e.g., `https://example.com`, `http://localhost:8080`) is the "top-level document's origin" that is compared against the patterns in `top_frame_matches` and `exclude_top_frame_matches`. + * The resulting string is the "top-level document's origin" that is compared against the patterns in `top_frame_matches` and `exclude_top_frame_matches`. **Handling Undeterminable Origins for Matching** -If the top-level document's origin cannot be determined, the `topFrameMatches` and `excludeTopFrameMatches` criteria are not applied. The determination of whether to inject the content script will then depend solely on other factors (e.g., the frame's own URL against matches and excludeMatches). +If the top-level document’s origin cannot be determined and either (a) `topFrameMatches` is specified, or (b) a non-empty `excludeTopFrameMatches` is specified, the browser MUST NOT inject the content script. This prevents accidental execution in ambiguous or sensitive contexts. **Static Usage Example:** An extension that applies a dark theme to all frames except for when the top frame's origin matches `https://www.example.com/*`. ```json { - "matches": [">"], + "matches": ["<all_urls>"], "exclude_top_frame_matches": ["https://www.example.com/*"], "all_frames": true, "js": ["force_dark_theme.js"] From 7d53c2156bd01b51012f3772b856af43360de8ac Mon Sep 17 00:00:00 2001 From: polywock <31208859+polywock@users.noreply.github.com> Date: Sat, 25 Oct 2025 23:16:06 -0400 Subject: [PATCH 9/9] Slightly simplify. --- proposals/content-script-top-frame-matching.md | 16 ++-------------- 1 file changed, 2 insertions(+), 14 deletions(-) diff --git a/proposals/content-script-top-frame-matching.md b/proposals/content-script-top-frame-matching.md index 89e44d6f..d69dfaeb 100644 --- a/proposals/content-script-top-frame-matching.md +++ b/proposals/content-script-top-frame-matching.md @@ -94,7 +94,7 @@ dictionary RegisteredContentScript { 3.
**Injection Logic:** Assuming validation passes, a content script will be injected into a frame if and only if *all* the following conditions are met: * All existing checks based on the frame's own URL and context are satisfied (e.g., `matches`, `excludeMatches`). * And if `topFrameMatches` was specified, the **top-level document's origin** must match at least one pattern in `topFrameMatches`. If `topFrameMatches` is an empty array, the content script will effectively never run. - * And if `excludeTopFrameMatches` was specified, the **top-level document's origin** must *not* match any pattern in `excludeTopFrameMatches`. If `excludeTopFrameMatches` is an empty array, the property will be ignored and not considered when injecting content scripts. + * And if `excludeTopFrameMatches` was specified, the **top-level document's origin** must *not* match any pattern in `excludeTopFrameMatches`. @@ -108,19 +108,7 @@ The **Top-level document's origin** is determined as follows: **Handling Undeterminable Origins for Matching** -If the top-level document’s origin cannot be determined and either (a) `topFrameMatches` is specified, or (b) a non-empty `excludeTopFrameMatches` is specified, the browser MUST NOT inject the content script. This prevents accidental execution in ambiguous or sensitive contexts. - - -**Static Usage Example:** An extension that applies a dark theme to all frames except for when the top frame's origin matches `https://www.example.com/*`. - -```json -{ - "matches": ["<all_urls>"], - "exclude_top_frame_matches": ["https://www.example.com/*"], - "all_frames": true, - "js": ["force_dark_theme.js"] -} -``` +If the top-level document’s origin cannot be determined and either `topFrameMatches` or `excludeTopFrameMatches` is specified, the browser MUST NOT inject the content script. This prevents accidental execution in ambiguous or sensitive contexts. ### New Permissions