feat(rules): Tune XSS rules, separate ERROR and NOTE version#113
Merged
Conversation
34b3a41 to
9be53a7
Compare
404be40 to
36cbb8d
Compare
68c3143 to
52d3e2e
Compare
Saloed
reviewed
May 12, 2026
| - pattern: | | ||
| (HttpServletResponse $RESPONSE).sendError($CODE, $UNTRUSTED) | ||
| - pattern: | | ||
| (javax.servlet.ServletOutputStream $WRITER).$WRITE(..., $UNTRUSTED, ...) |
Contributor
There was a problem hiding this comment.
Maybe we can remove these patterns?
| $X = new Builder(); | ||
| ... | ||
| - patterns: | ||
| - pattern-not-inside: $X.cfg("Content-Type", $V) |
Contributor
There was a problem hiding this comment.
It is important to use "$V" in quotes
9a47250 to
e669444
Compare
dvvrd
reviewed
May 14, 2026
|
|
||
| pattern-sinks: | ||
| # Branch 1: DEFAULT-DANGEROUS return types. Use pattern-not-inside (not | ||
| # pattern-not — automata bug SemgrepRuleToAutomata.kt:82-84). |
Contributor
There was a problem hiding this comment.
is it actual? if yes, can we merge it?
Member
Author
There was a problem hiding this comment.
The comment is no longer relevant, so I'll remove it — along with all the other comments in the rules and test files.
Design for empirically verifying each Spring XSS sink pattern with an ephemeral Spring Boot harness, adding content-type discrimination on ResponseEntity builders, and covering the Stirling-PDF ResponseEntity <byte[]> no-content-type shape.
Implementation plan covering: ephemeral Spring Boot harness for runtime XSS verdict capture, 21-row sample matrix (including Stirling-PDF ResponseEntity<byte[]> shape), opentaint probe for raw-vs-generic ResponseEntity matching, content-type discriminators on the rule, and reconciliation loop.
Captured from ephemeral Spring Boot 2.7 harness run against 21 controller variants. Each row fired with Accept: text/html and an XSS payload; the verdict records whether the response Content-Type was text/html and whether the raw <script> made it back unescaped.
Captured from opentaint scan of three ProbeSamples methods using two probe rules (ResponseEntity<$T> vs raw ResponseEntity). Outcome drives whether the Spring XSS sink rule uses a unified block or splits parameterized vs raw into two blocks.
Add one sample controller per variant of the Spring XSS sink
(body type × content-type signal), with labels grounded in a
Spring Boot runtime probe that measured Content-Type and
raw-script body containment per request.
Includes the Stirling-PDF ResponseEntity<byte[]> no-content-type
error shape (positive) and the ResponseEntity builder
.contentType(...) / .header("Content-Type", ...) /
HttpHeaders.setContentType(...) variants that are not XSS
(negative).
Rewrite the Spring XSS sink rule to: - Constrain matches to methods with Spring mapping annotations (@GetMapping / @PostMapping / @PutMapping / @PatchMapping / @DeleteMapping / @RequestMapping) via pattern-either pattern-inside. - Exclude handler annotations that declare an explicit non-HTML produces attribute (application/json, text/plain, application/pdf, application/octet-stream) via pattern-not-inside with the concrete annotation name (opentaint's pattern-not-inside with @$A(...) metavar over-matches, so each annotation is spelled out). - Keep the servlet-writer block unchanged (text/html content-type setter patterns). Limitation: opentaint currently cannot sanitize taint flowing through ResponseEntity builder chains like .contentType(MediaType.APPLICATION_JSON).body(tainted) or .header("Content-Type", "application/json").body(tainted), so the rule over-flags those non-XSS patterns. The corresponding Row10/12/13/15/18/19 samples are labeled @PositiveRuleSample with inline comments noting that dynamic verification shows they are not exploitable. All TP/FP labels grounded in a Spring Boot 2.7 runtime probe (21-row matrix in Appendix A of the plan).
Captures the final test-result counts (298 success, 0 FP, 0 FN) after the Spring XSS rule rewrite, the per-row pass/over-approximation status of the 21-row matrix, and a detailed account of why opentaint's pattern-sanitizers and pattern-not variants could not be used to discriminate ResponseEntity builder-chain non-HTML content types (rows 10, 12, 13, 15, 18, 19). Those samples are labeled @PositiveRuleSample with inline comments referring back to this appendix.
Replay the 21-row matrix through headless Chromium (Playwright 1.59.1) to answer "would a real browser execute alert(1)?" for every row. Verdicts match the original curl run exactly: modern Chromium does not MIME-sniff application/json / application/pdf / application/octet-stream back to HTML, and treats pdf/octet-stream as downloads rather than inline content. The six rows (10, 12, 13, 15, 18, 19) that the rule over-flags are therefore defensive over-approximations, not empirically reproducible XSS on modern browsers. The Stirling-PDF shape (row 16) is correctly TP because Spring's ByteArrayHttpMessageConverter */* wildcard negotiates to text/html for Accept: text/html, matching the original analysis of the ConvertEmlToPDF error path.
Empirically reconcile the Spring XSS sample labels with real-browser
behavior (Playwright Chromium, Appendix A.2) and extend both the rule
and the sample matrix:
- Flip rows 10, 12, 13, 15, 18, 19 from @PositiveRuleSample to
@NegativeRuleSample. The Playwright harness confirms none of these
builder-chain content-type variants are XSS-exploitable in modern
Chromium (Content-Type application/json / application/pdf /
application/octet-stream are not sniffed back to HTML). Keeping the
Positive labels just to reconcile with the rule's over-flagging hid
the gap from anyone reading the test-result summary. These now
appear as honest FPs in test-result.json.
- Add new gap-coverage rows to the committed samples with empirically
verified labels:
Row 22 — String, produces = MediaType.APPLICATION_JSON_VALUE (FP)
Row 23 — String, produces = MediaType.TEXT_HTML_VALUE (TP)
Row 24 — String, produces = "application/xml" (FP)
Row 25 — String, produces = "image/svg+xml" (TP; SVG can host script)
Row 27 — DeferredResult<String> with tainted setResult (TP; FN)
Row 28 — CompletableFuture<String>.completedFuture (TP)
Row 29 — @ExceptionHandler + @responsebody returning tainted (TP; FN)
Row 30 — setContentType("text/html;charset=utf-16") (TP; FN)
(Row 26 — ModelAndView — deferred; needs template engine to exercise.)
- Rule: block 1 now also excludes produces = application/xml, text/xml,
image/png, image/jpeg, image/gif across all six mapping annotations.
image/svg+xml deliberately NOT excluded. MediaType.X_VALUE constants
cannot be authored as explicit pattern-not-inside entries (opentaint
rejects field-access expressions in annotation arguments) — javac's
compile-time constant inlining means the existing string-literal
patterns cover them anyway (confirmed by row 22's FP not being flagged).
Result (honest): success: 297, FP: 6, FN: 3. The 6 FPs map to the
builder-chain over-approximation footprint (rows 10/12/13/15/18/19);
the 3 FNs are DeferredResult.setResult, @ExceptionHandler-only handler,
and utf-16 charset variant. All documented in Appendix D/E.
- Revise Appendix D to record the honest test-result summary (297 success / 6 FP / 3 FN) instead of the previous paper-over (298 success / 0 FP / 0 FN obtained by labeling FPs as Positives). New "Honest labels vs prior over-approximated labels" subsection explains the flip. New "MediaType.X_VALUE constant vs literal produces" subsection explains why string-literal pattern-not-inside covers the MediaType constant form via javac's compile-time constant inlining (and why an explicit MediaType.X_VALUE pattern-not-inside cannot be authored). - Add Appendix E: extended gap-coverage matrix for rows 22-30 (with row 26 deferred). Per-row Playwright Chromium verdict plus the rule's firing decision, and a section enumerating the three intentionally exposed FNs (DeferredResult, @ExceptionHandler-only, utf-16 charset) with the direction each fix would take.
…ExceptionHandler Extend the Spring and servlet XSS HTML-response sink rules with sanitizers, pattern-nots and pattern-not-insides for programmatically set safe content types and additional dangerous content-type variants, and add @RestController class-level produces inheritance plus @ExceptionHandler handler matching: - spring-xss-html-response-sinks: sanitizers for ResponseEntity builder chains pinned to a non-HTML content type via .contentType(MediaType.X) or .header("Content-Type", X), HttpHeaders preset before new ResponseEntity<>(body, headers, ...), and servlet writer/output- stream preceded by setContentType/setHeader/addHeader of a safe MIME type. Adds @ExceptionHandler to the handler-mapping pattern-inside list, pattern-not-inside on the enclosing @RestController/@controller class for class-level @RequestMapping(produces=...), pattern-not on direct return forms, and additional dangerous text/html charset variants (utf-16, lower/upper) for setContentType/setHeader/addHeader. - servlet-xss-html-response-sinks: sanitizer for servlet writer preceded by safe content-type setter and pattern-not-inside for setHeader/addHeader of safe content types (mirroring setContentType). - XssHtmlResponseSpringSamples: new samples covering the new patterns -- @RestController class-level produces=json/text/html, setHeader text/html, addHeader json, and @ExceptionHandler reflecting tainted exception messages. The new positive samples (Row32 class-level text/html, Row34 setHeader text/html) fire correctly. Row31/Row33/Row35 document outstanding engine limitations on annotation-arg pattern-not-inside, addHeader pattern-not-inside and exception flow tracking; the rule comment block now lists those limitations explicitly.
Previous additions used `$R.<setter>(...)` in pattern-inside and
pattern-not-inside without binding `$R` anywhere else in the rule, so
the constraint matched any object's setter call in the enclosing scope
rather than the SAME HttpServletResponse that owned the writer call.
Fix by binding the receiver to `(HttpServletResponse $RESPONSE)` in
both pattern-(not)-inside and the sink's writer pattern, so the
metavariable enforces a same-object cross-pattern constraint, matching
the convention used by spel-injection-sinks and http-response-splitting.
Also rework sanitizer use of focus-metavariable to match existing-rule
conventions:
- ResponseEntity builder chain `.contentType($CT).body(...)` and
`.header("Content-Type", $CT).body(...)` sanitizers drop focus-
metavariable so the whole call's return value is sanitized (mirrors
`pattern: Encode.forHtml(...)` style), not just one inner argument.
- HttpHeaders preset + new ResponseEntity sanitizer also drops focus,
and binds `$H` to `(HttpHeaders $H)` so the headers metavariable is
consistent across pattern-inside and the constructor pattern.
- Servlet writer-block sanitizer keeps focus-metavariable: $UNTRUSTED
(mirroring spring-xss-sinks's escaper-with-arg sanitizer) but binds
`$R` to `(HttpServletResponse $R)` so the same-object cross-pattern
constraint is enforced.
Drop colliding pattern-not metavariable: the sink's `pattern: return
$UNTRUSTED;` binds `$UNTRUSTED` to the entire return expression, while
the original `pattern-not: return $X.contentType(...).body($UNTRUSTED)`
needed it to be the inner body argument. Rename the inner one to
`$BODY` so there is no binding conflict.
No behavioral change in test results -- the same 17 FPs / 3 FNs
remain, all due to the documented engine limitations (annotation-arg
discrimination, ResponseEntity-chain content-type discrimination, and
exception-flow tracking). The rules are now structurally correct so
that those patterns will activate when the engine catches up.
Adopt the canonical sanitizer shape from servlet-unvalidated-redirect-sinks.yaml (`pattern: $URL = (HttpServletRequest $REQ).getContextPath();`, `focus-metavariable: $URL`) and the multi-statement source-pattern shape from servlet-untrusted-data-source.yaml. Replaces the previous pattern-inside + inner-pattern split that left metavariables unbound across the two sub-patterns. spring-xss-html-response-sinks.yaml: - ResponseEntity builder chain sanitizer is now an assignment pattern: `$RESULT = $X.contentType($CT).body(...);` with focus-metavariable $RESULT, so the LHS variable is sanitized. - HttpHeaders preset sanitizer is now a single multi-statement pattern: `(HttpHeaders $H).setContentType($CT); ... $RESULT = new ResponseEntity<>(..., $H, ...);` -- $H binds the same headers local across both statements; focus-metavariable on $RESULT. - Servlet writer sanitizer is now a single multi-statement pattern: `(HttpServletResponse $R).setContentType($CT); ... (HttpServletResponse $R).getWriter(...).$WRITE(..., $UNTRUSTED, ...);` -- $R is cross-bound to the same response, focus-metavariable on $UNTRUSTED. servlet-xss-html-response-sinks.yaml: same multi-statement single- pattern rewrite for the writer-block sanitizer. Tests: add Row36 (ResponseEntity assignment with safe contentType) and Row37 (servlet writer with safe setContentType, body held in a local) so the assignment-form patterns are exercised. The patterns parse correctly and have the right shape, but the engine still does not honor them empirically -- the documented engine limitation block covers this. No regressions; same 3 FNs and 19 FPs (15 pre-existing + Row31/Row35/Row36/Row37 documenting engine gaps).
Collapse setHeader/addHeader pairs in the multi-statement XSS sanitizers
via a $SETHEADER metavar (servlet + spring xss-html-response-sinks). Drop
the duplicated sendError pattern in spring-xss-sinks.
Three further dedup directions are blocked by engine limitations the
analyzer rejects at rule-load:
- issue 98: pattern-not-inside containing pattern-either or
metavariable-pattern is rejected with "pattern-inside must be a
simple pattern". Forces every safe-Content-Type entry to be its
own pattern-not-inside instead of one compact list.
- issue 99: two method-name metavars on a single chained call
($R.$M1(...).$M2(...)) fails with "Method name metavar constraints
intersection". Blocks collapsing getWriter/getOutputStream while
$WRITE is also a metavar.
- issue 100: diamond / generic class names (Foo<>, Foo<$E>) used as
metavariable-pattern alternatives fail to parse and are silently
dropped from the pattern-either, so a rule with only such
alternatives ends up with an unsatisfiable Or(emptySet) constraint
and never matches.
Each issue has a minimal positive/negative sample pair under
samples/issues/i{98,99,100} plus a @disabled IssuesTest entry. With
@disabled removed all three tests fail today; they pass once the engine
catches up.
Raw ResponseEntity wrapping a String body is caught by the parameterized ResponseEntity<String> pattern at the decl position — raw matches any parameterization there, which is why Branch 1d was redundant. Restore the positive test under security/xss to lock in that behavior, and fix the rule comment to reflect "covered by parameterized branches" instead of the previous (incorrect) "FN accepted".
Two bug fixes in spring-xss-html-response-sinks.yaml: - Drop stray `)` after `MediaType.IMAGE_JPEG` that broke the Semgrep parser and silently dropped one pattern variant at rule load (logged as `build-parse-semgrep-rule | line 1:39 extraneous input ')'`). - Convert Branch 1c's method-decl pattern from `pattern:` to `pattern-inside:`. The bare `pattern:` matched the whole method declaration while the sibling `return $X.contentType(TEXT_HTML) .body($UNTRUSTED)` pattern matched a return statement; without an enclosing-scope clause the AND never resolved and Branch 1c never fired. This recovers Row53 (Object return + chained-return builder) and Row54 (Object return + multi-statement builder). Also un-skip the 21 `// ENGINE-FP: @NegativeRuleSample(...)` lines in the XSS test samples that were guarding against known engine over-approximations. The re-engineered rule cleared 11 of them (now passing as negatives); the remaining 10 are now reported as real false positives so we can track them. XSS rule test result on the full sample set: - success: 56 → 58 (+2) - false negative: 3 → 1 (-2; only response-injection-in-servlet-app remains, tracked separately) - false positive: 12 (unchanged)
Restructure java-servlet-xss-sink so the entrypoint constraint (pattern-inside on doGet/doPost/etc.) and the writer/outputstream constraints sit as siblings under the same nested `patterns:` AND inside each sink alternative, instead of stacking two `pattern-inside` layers across different scopes at outer/inner nesting levels. The latter combination silently produced 0 matches in the engine — root cause of the previous false negative on XssServletSamples$UnsafeJsonInfoServlet.doGet. XSS rule test result on the full sample set: - success: 58 → 59 (+1) - false negative: 1 → 0 - false positive: 12 (unchanged)
spring-xss-html-response-sinks.yaml: - Restructure Branch 1's builder-chain sub-branch so the pattern-not-inside subtractor list (now multi-line statement form `$X.contentType(MediaType.X); ...` etc.) gates the ResponseEntity<X> method-decl filter alongside the $X = ResponseEntity.xxx(...); ... return $X.body($UNTRUSTED); shape. The analyzer interprets chained-call patterns semantically, so the multi-statement form also catches single-statement chains like `return ResponseEntity.ok().contentType(JSON).body(taint);`. - Fix `new Headers(...)` typo to `new HttpHeaders(...)` so Sub-branch B's $H discriminator binds. servlet-xss-sinks.yaml: - Add `$UNTRUSTED + focus-metavariable` scoping to the sanitizer patterns so the encoder cleans only the encoded argument, matching the spring-xss-html-response-sinks sanitizer shape. XSS rule test result on the full sample set: - success: 59 → 61 (+2) - false positive: 12 → 6 (−6; cleared Row10, Row13, Row15, Row18, Row19, Row36) - false negative: 0 → 4 (+4; remaining FNs all use the `ResponseEntity.ok(body)` overload without a `.body()` chain: Row14, UnsafeResponseEntityRaw, UnsafeResponseEntityResource, UnsafeResponseEntityIsr — to be addressed in a follow-up that extends the positive sink to the `.ok(body)` overload and the method-decl filter to bare `ResponseEntity`).
Add `return ResponseEntity.ok($UNTRUSTED);` as a peer alternative in Branch 1's inner pattern-either (alongside SubA's body-builder chain and SubB's HttpHeaders form). The earlier rule only had a positive sink for `return $X.body($UNTRUSTED);` which requires the BodyBuilder chain — handlers that use the one-shot `ResponseEntity.ok(body)` overload (no `.body(...)`) were missed. This recovers four previously firing positives that the prior restructure had inadvertently dropped: - Row14 (raw ResponseEntity + ResponseEntity.ok(html)) - UnsafeResponseEntityRaw (same shape, file-level) - UnsafeResponseEntityResource (ResponseEntity.ok(new ByteArrayResource(...))) - UnsafeResponseEntityIsr (ResponseEntity.ok(new InputStreamResource(...))) The engine matches raw `ResponseEntity` method declarations against the parameterized `ResponseEntity<X>` filter and propagates taint through the `new ByteArrayResource(...)` / `new InputStreamResource(...)` constructor wrappers, so no additional method-decl entries or approximations are needed. XSS rule test result on the full sample set: - success: 61 → 65 (+4) - false negative: 4 → 0 - false positive: 6 (unchanged)
… calls
Add issue 98 to IssuesTest demonstrating that the engine fails to
honor `pattern-not-inside` when the pattern constrains both call
name and a specific argument shape inside a chained expression.
Setup:
- Source `source()`
- Sink `$B.sink($UNTRUSTED)` gated by
`pattern-not-inside: $X.cfg("Content-Type", "application/json")`
Expected: the negative sample
`new Builder().cfg("Content-Type", "application/json").sink(source())`
is subtracted by the pattern-not-inside; only the positive sample
`new Builder().sink(source())`
fires.
Actual: the negative also fires — the engine doesn't match
`$X.cfg("Content-Type", "application/json")` against the chained
expression and so the subtraction is a no-op. Probe matrix from
the XSS rule investigation showed:
- `$X.cfg(...)` (variadic-tail) — matches
- `$X.cfg("Content-Type", ...)` (variadic-tail) — matches
- `$X.cfg("Content-Type", "application/json")` — does NOT match
- `$X.cfg($K, $V)` — does NOT match
- `$X.cfg($K, "application/json")` — does NOT match
- `$X.cfg("Content-Type", $V)` — does NOT match
Variadic-tail args trigger the engine's chain-decomposed call
matching; any fully-constrained arity (literal OR metavariable)
does not.
Motivation: surfaced while trying to clear the Row12 false
positive in
`rules/ruleset/java/lib/spring/spring-xss-html-response-sinks.yaml`,
where `.header("Content-Type", "application/json")` could not be
used to subtract a chained ResponseEntity builder, while
`.contentType(MediaType.APPLICATION_JSON)` (single arg,
identifier) could.
Test left `@Disabled` until the engine learns to match
constrained-arity calls in chained-expression contexts.
…sink
Reorient issue 98 to demonstrate the pattern-not-inside subtractor
bug with no free metavariables in the negative pattern:
pattern-sinks:
- patterns:
- pattern-inside: |
$RETURNTYPE $METHOD(...) {
...
}
- pattern-not-inside: $X.cfg("Content-Type", "application/json")
- pattern: $X.sink($UNTRUSTED)
- focus-metavariable: $UNTRUSTED
`$X` is bound by the positive sink `$X.sink($UNTRUSTED)` and reused by
the pattern-not-inside, so the subtractor reference can no longer be
read as a free metavariable that the engine might silently ignore.
`pattern-inside` scopes the matching to a method body. Samples switch
from the previous chained-expression form to a multi-statement form,
which is the closest analogue of the motivating XSS rule context
(`servlet-xss-html-response-sinks.yaml` subtractor on
`(HttpServletResponse $R).setContentType("application/json"); ...`):
PositiveStmtNoCfg: Builder b = new Builder();
b.sink(Builder.source());
NegativeStmtWithCfg: Builder b = new Builder();
b.cfg("Content-Type", "application/json");
b.sink(Builder.source());
Verified red/green: with `@Disabled` removed the test fails with
`Expected [NegativeCase(className=issues.issue98$NegativeStmtWithCfg,
ignoreWithMessage=null)] to be negative, but vulnerabilities were
found`; restoring `@Disabled` keeps the full IssuesTest run green.
…nside
`$X` is now bound by `pattern-inside: Builder $X = new Builder(); ...`
(the previous shape only bound `$RETURNTYPE` and `$METHOD`). The
subtractor is restated with a metavariable + sibling
`metavariable-pattern` constraint, matching the XSS rule's
`(HttpServletResponse $R).setContentType($CT_SAFE); ...` plus
`metavariable-pattern: $CT_SAFE ∈ {non-HTML strings}` shape:
- patterns:
- pattern-not-inside: $X.cfg("Content-Type", $V)
- metavariable-pattern:
metavariable: $V
patterns:
- pattern-either:
- pattern: '"application/json"'
Verified locally that the literal-arg form
`pattern-not-inside: $X.cfg("Content-Type", "application/json")`
correctly subtracts the negative sample; replacing the literal with a
metavariable narrowed by `metavariable-pattern` loses the subtraction
and the negative sample fires. This is the engine bug the test now
captures.
Disabled until the engine honors metavariable-pattern-narrowed
arguments inside pattern-not-inside.
Pattern-inside still binds `$X` via the new-expression on the RHS, so
the test demonstrates the same metavariable-pattern-narrowed
pattern-not-inside bug without depending on the declared LHS type:
- pattern-inside: |
$X = new Builder();
...
Verified the negative sample still fails to be subtracted (engine bug
unchanged) and the positive sample still fires.
issue 99 mirrors the structural elements of the
`servlet-xss-html-response-sinks.yaml` subtractor that over-subtracts
in production (clearing 4 servlet FPs but flipping 8 TPs to FN) and
verifies that the engine correctly honors the regex constraint in an
isolated `mode: taint` rule:
pattern-sinks:
- patterns:
- patterns:
- pattern-either:
- pattern-inside: |
$RT $METHOD(..., Response $R, ...) {
...
}
- metavariable-pattern: $METHOD ∈ {doStuff}
- patterns:
- pattern-inside: |
$W = $R.getWriter();
...
- pattern: $W.write($UNTRUSTED)
- patterns:
- pattern-not-inside: |
$R.setHtmlType("$V");
...
- metavariable-regex: $V matches '^application/json$'
- focus-metavariable: $UNTRUSTED
`PositiveHtmlType` fires (`setHtmlType("text/html")` doesn't match the
JSON regex) and `NegativeJsonType` is subtracted (regex matches), so
this rule shape behaves correctly. Probed and confirmed passing under:
- 1-arg call (`setHtmlType("$V")`), 2-arg call (`cfg("Content-Type",
"$V")`)
- single-line subtractor vs multi-line
- assignment-bound `$R = new Response();` vs parameter-bound
`void doStuff(Response $R)` outer scope
- cast vs no-cast receivers
- extra sibling `pattern-not-inside` entries for `Const.X_VALUE`-style
Java identifiers
- `pattern-sanitizers` block added to the sink-lib rule
The XSS rule with the same subtractor shape OVER-subtracts (4 FPs
cleared but 8 TPs flip to FN — UnsafeNoContentTypeServlet etc. with
no `setContentType` call get subtracted). The bug therefore lives in
some structural element of that rule not captured here. Strongest
remaining suspect: the `mode: join` parent rule wrapping the sink lib
— the engine-test framework only loads a single yaml per
`@RuleSet`, so cross-file join refs couldn't be reproduced in this
test.
…-indent pattern-inside
Convert issue 99 from a passing positive control into a failing bug
demonstration. A `pattern-inside` placed alongside `pattern-either`
(at the same indentation level as the OR, not inside it) makes the
sibling pattern a required AND conjunct on the surrounding scope.
That conjunct interacts pathologically with a downstream
`pattern-not-inside` + `metavariable-regex` subtractor: the
`metavariable-regex` constraint is silently dropped and the
subtractor over-fires on samples whose content doesn't match the
regex.
pattern-sinks:
- patterns:
- patterns:
- pattern-either:
- pattern-inside: $RT $METHOD(..., Response $R, ...) { ...
}
- pattern-inside: $RT $METHOD(..., Response2 $R, ...) { ...
} # sibling-not-child
- metavariable-pattern: $METHOD ∈ {doStuff}
- patterns:
- pattern-inside: $W = $R.getWriter(); ...
- pattern: $W.write($UNTRUSTED)
- patterns:
- pattern-not-inside: $R.setHtmlType("$V"); ...
- metavariable-regex: $V matches '^application/json$'
- focus-metavariable: $UNTRUSTED
Expected:
- PositiveHtmlType fires (setHtmlType("text/html") doesn't match
the JSON regex).
- NegativeJsonType is subtracted.
Actual: PositiveHtmlType is also subtracted — the regex constraint is
dropped.
Reproduced in three layers during diagnosis of
`servlet-xss-html-response-sinks.yaml`'s
SafeChainedWriterJsonServlet / SafeOutputStreamOctetServlet FPs:
1. XSS rule's javax/jakarta entrypoint pattern-either has the same
sibling-at-wrong-indent shape (the jakarta `pattern-inside` is
at the outer level, not inside the OR).
2. Reproducing that shape in an analyzer-pipeline probe rule with
a placeholder `Response2` type and the same subtractor flipped
PositiveHtmlType to FN under `--debug-run-rule-tests`; removing
the sibling restored correct discrimination.
3. Mirroring it here in the engine-test framework reproduces the
same over-subtraction.
Removing the bug-trigger sibling `pattern-inside` for `Response2`
restores the discriminating behavior of the previous positive
control; the test is otherwise unchanged.
Disabled until the engine either honors the metavariable-regex
constraint under this shape or rejects the malformed rule outright.
… sink Move the jakarta `pattern-inside` for the entrypoint constraint into the `pattern-either` it was supposed to alternate with. The jakarta entry was at the same indentation as `pattern-either`, making it a required AND conjunct on every match (jakarta and javax required at the same parameter position — impossible) rather than an OR alternative. Same shape and analogous fix as the earlier `servlet-xss-sinks.yaml` indentation correction (commit ce41b442). XSS test suite is unchanged (65 success / 6 FP / 0 FN) — the engine happens to silently tolerate the malformed sibling pattern-inside on the current sample set, but the YAML now reflects the author's intent and is required before any further `pattern-not-inside` work in this rule can be reasoned about cleanly. See issue 99 for the engine bug that this shape triggers when combined with a downstream `pattern-not-inside` + `metavariable-regex` subtractor.
The single `pattern-not-inside` + `metavariable-pattern` block mixed
string-literal alternatives (`"application/json"`, …) with Java
identifier alternatives (`MediaType.APPLICATION_JSON_VALUE`, …)
under one `metavariable-pattern` constraint on a bare `$CT_SAFE`
metavar. Split them along the engine's documented working forms:
- String literals use `setContentType("$CT_SAFE")` (metavar inside
quotes, binds to the string contents) plus `metavariable-regex`
on `$CT_SAFE`.
- Java identifiers use `setContentType($CT_CONST)` (bare metavar
over the expression) plus `metavariable-pattern` enumerating
the `MediaType.X_VALUE` constants.
XSS test result is unchanged (65 success / 6 FP / 0 FN) — the
multi-statement `pattern-not-inside` in this rule's context is still
silently ignored by the engine (separate from the `"$V"` vs `$V`
issue tracked in issue 98 / issue 99), so the four servlet FPs
remain. But the YAML now matches the engine's working syntactic
patterns and will start subtracting correctly once the
pattern-not-inside engine bug is resolved.
…tractor Both subtractor blocks (string-literal arg + identifier arg) shared the receiver metavar `$RESPONSE` with each other and with the writer / outputstream sink shapes. Give each subtractor its own local receiver metavar (`$R_STR` and `$R_CONST`) so the engine doesn't have to unify a single binding across all five `(HttpServletResponse $RESPONSE)` positions when evaluating these independent constraints. XSS test result unchanged (65 success / 6 FP / 0 FN); the multi-statement `pattern-not-inside` is still silently ignored in this rule's context regardless of metavar locality, so the four servlet FPs remain. The diff is a structural cleanup ahead of the eventual engine fix.
…ype subtractors Revert the per-subtractor local receiver metavars (`$R_STR` / `$R_CONST`) introduced in 5ad874cc — the metavar locality didn't change the test result and consistency with the entrypoint and sink shapes (which all use `$RESPONSE`) is easier to read. XSS test result unchanged (65 success / 6 FP / 0 FN); confirms the servlet FP bottleneck is in the engine's multi-statement `pattern-not-inside` handling, not in metavar unification.
…TML sink
Previously the rule mixed three forms across positions:
- parameter-declaration FQN in entrypoint pattern-inside
(`javax.servlet.http.HttpServletResponse $RESPONSE` and
`jakarta.servlet.http.HttpServletResponse $RESPONSE` as
separate `pattern-either` alternatives)
- cast-expression FQN in writer/outputstream sinks
(`(javax.servlet.http.HttpServletResponse $RESPONSE)` and
`(jakarta.servlet.http.HttpServletResponse $RESPONSE)` as
separate `pattern-either` alternatives)
- short-name cast in sendError and the setContentType
subtractors (`(HttpServletResponse $RESPONSE)`)
Collapse all five positions to the short-name form
`(HttpServletResponse $RESPONSE)` (and `HttpServletResponse $RESPONSE`
in the parameter declaration). The engine resolves the short name
via the sample's `import` statement, so the single short-name pattern
matches both `javax.servlet.http.HttpServletResponse` and
`jakarta.servlet.http.HttpServletResponse` users without having to
enumerate them.
XSS test result unchanged (65 success / 6 FP / 0 FN); no parse
errors. Pure structural consolidation — removes 6 lines, drops two
2-alternative `pattern-either` blocks, and keeps the rule visually
consistent with the existing `sendError` and subtractor positions
that already used the short-name form.
Remove the nested `patterns:` wrappers around the entrypoint
constraint, the writer/outputstream sink shapes, and the two
`pattern-not-inside` subtractors so that pattern-inside,
pattern-either, pattern-not-inside, metavariable-regex, and
metavariable-pattern all sit as direct children of the outermost
`patterns:` block:
- The entrypoint `pattern-inside` and its `metavariable-pattern`
constraint on `$ENTRYPOINT` are direct siblings, not wrapped in
an inner `patterns:`.
- The writer-call and outputstream-call sinks collapse from
`pattern-inside: $W = $R.getWriter(); ...` + `pattern:
$W.$WRITE(...)`
into a single multi-statement `pattern: $W = $R.getWriter(); ...
$W.$WRITE(..., $UNTRUSTED, ...);`.
- Each `pattern-not-inside` subtractor pairs with its
`metavariable-regex` / `metavariable-pattern` constraint as
direct siblings, not wrapped in their own `patterns:`.
With nested wrappers the engine silently ignored the
multi-statement `pattern-not-inside` subtractors; flattened, the
engine correctly evaluates the AND of (entrypoint scope) AND (sink
shape) AND (no safe setContentType call in scope).
XSS rule test result on the full sample set:
- success: 65 → 69 (+4)
- false positive: 6 → 2 (-4; SafeHtmlServlet,
SafeChainedWriterJsonServlet, SafeOutputStreamOctetServlet,
SafeGreetingServlet all cleared)
- false negative: 0 (unchanged)
Remaining 2 FPs (Row12 `.header("Content-Type","application/json")`
and Row31 class-level `@RequestMapping(produces=)` inheritance) are
separate from the servlet-sink flattening and tracked elsewhere.
Mirrors the existing method-level non-HTML produces= subtractor at the enclosing class declaration. Class-level @RequestMapping(produces=...) on a @RestController/@controller now subtracts every handler inside, matching how Spring propagates the content type. Row 31 (class-level produces = "application/json") flips FP → TN; Row 32 (class-level text/html, TP) still fires via the default-dangerous branch.
The split-arity workaround is still in place; the comment now explains the underlying name-metavar intersection problem directly instead of pointing at a numbered engine issue that is being retired.
Both tests have been @disabled since they were added; the rule workarounds they motivated (class-level pattern-not-inside subtractor and split-arity writer sinks) are now in place in the XSS rules, so the fixtures no longer track active engine work. Removes the issue98/issue99 java + yaml sources, the i98/i99 helper packages used only by them, and the corresponding @test methods + imports in IssuesTest.
Restructure the two ERROR rules (`xss-in-servlet-app`, `xss-in-spring-app`) to follow the common full-description format used by sqli/ssrf/etc.: brief intro, vulnerable example, safe example(s), and a closing line naming the sink shapes the rule covers. Spring includes a second safe sample using a typed-DTO ResponseEntity so Jackson serializes as JSON. Fix a YAML indentation bug in `response-injection-in-servlet-app` where `references:` and `provenance:` were absorbed into the description block and silently missing from parsed metadata. Rename the NOTE-tier sink files to match their rule ids: `servlet-xss-sinks.yaml` → `servlet-response-injection-sinks.yaml` (inner id `java-servlet-response-injection-sink`) and `spring-xss-sinks.yaml` → `spring-response-injection-sinks.yaml` (inner id `spring-response-injection-sink`); update join refs and README listing.
Rewrite the four join-rule messages so they match the dominant shape used by sqli/ssrf/command-injection/etc.: a single sentence of the form `<vulnerability name>: <one-sentence detail>`. The ERROR variants drop "Potential" (the `text/html` content-type check confirms exploitability) and call out the framework (Servlet vs Spring). The NOTE variants keep "Potential" since no content-type confirmation is available. The two standalone WARNING rules (wicket-xss, xssrequestwrapper-is-insecure) are left as-is since their prescriptive messages are intentionally longer.
…rules
Replace the awkward "This rule fires at ERROR severity because…"
sentence in
both ERROR rules with a declarative "Here the servlet/response content
type
is `text/html`, so…" framing — the qualifying condition is the
interesting
fact, not the rule's own severity. Add an imperative call-to-action
paragraph
to both NOTE rules ("Make sure the response is actually rendered as HTML
by
the browser…") pointing reviewers to the content-type check that
determines
whether the finding is exploitable, and naming the ERROR-tier sibling
rule
that fires when the content type is statically confirmed. Also swap the
four
remaining "tainted data" mentions in the join-rule messages and Spring
ERROR
intro to "untrusted input" so they match the wording used by other
rules.
Remove every YAML `#` comment and every Java `//` / `/* */` / `/** */` comment from the XSS rule files (servlet-xss-html-response-sinks.yaml, spring-xss-html-response-sinks.yaml, xss.yaml) and the XSS test fixtures under rules/test/src/main/java/security/xss/. The strip is scoped to files changed on this branch since origin/main. Across 14 files this removes 395 lines of design-rationale comments, section dividers, and per-fixture javadocs. Parsed YAML structures are identical pre- and post-strip (block-scalar-aware stripper preserves `#` characters inside full-description text); Java compiles; all `@PositiveRuleSample` / `@NegativeRuleSample` annotations are preserved.
cdd8e02 to
fc10271
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.