feat(tesseract): Support SQL API grouped sub-query joins#11138
Claude finished @ovr's task in 2m 38s. Review complete: 1 high, 1 medium, 4 low.
Other observations
No security issues from the cubesql producer side, but the trust boundary on |
    pub sql: String,
    #[serde(rename = "joinType")]
    pub join_type: Option<String>,
    pub alias: String,
}
Alias contract is trust-based. `alias` is consumed verbatim (no quoting) and the inner `sql` is emitted verbatim into the outer query. The PR description states the SQL API sends a pre-quoted alias and a controlled inner SELECT, which is fine for the cubesql producer, but the trait itself doesn't enforce or document the invariant. If any other caller of `buildSqlAndParamsRust` ever populates `subqueryJoins` (the JS `BaseQuery` now forwards `this.options.subqueryJoins` from any caller), an unquoted/attacker-controlled alias would land in the FROM clause untouched.

Consider tightening the doc-comment on this struct to state explicitly that:

- `alias` MUST be a fully-quoted identifier produced by the caller, and
- `sql` is emitted verbatim and MUST come from a trusted, planner-side path.

Optionally validate at compile time that `alias` starts/ends with the quote character (`"` or the dialect-specific one). Worth at least a `// SAFETY: …`-style note here so the next reader doesn't reintroduce quoting at a different layer.
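A minimal sketch of what such a check could look like. `is_prequoted_alias` is a hypothetical helper that does not exist in the PR, and where the dialect's quote character comes from is left as an assumption:

```rust
/// Hypothetical helper (not part of the PR): returns true only when `alias`
/// is wrapped in the dialect's identifier quote character and every quote
/// inside the identifier is escaped by doubling.
fn is_prequoted_alias(alias: &str, quote: char) -> bool {
    let q = quote.len_utf8();
    // Need an opening quote, a closing quote and at least one byte between them.
    if alias.len() <= 2 * q || !alias.starts_with(quote) || !alias.ends_with(quote) {
        return false;
    }
    let inner = &alias[q..alias.len() - q];
    // Drop escaped (doubled) quotes; any quote left over would end the
    // identifier early and let the rest of the string leak into the FROM clause.
    let doubled: String = [quote, quote].iter().collect();
    !inner.replace(&doubled, "").contains(quote)
}
```

With this, `is_prequoted_alias("\"t\"", '"')` passes, while a bare `t` or an alias with a stray quote in the middle is rejected. It only guards the alias; the inner `sql` still has to come from a trusted path.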
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## master #11138 +/- ##
==========================================
+ Coverage 79.17% 83.68% +4.50%
==========================================
Files 474 257 -217
Lines 96880 79384 -17496
Branches 3558 0 -3558
==========================================
- Hits 76708 66434 -10274
+ Misses 19654 12950 -6704
+ Partials 518 0 -518
Flags with carried forward coverage won't be shown.
52f9357 to
5f8da2b
Compare
The SQL API encodes grouped join push-downs (e.g. `JOIN (SELECT ... GROUP BY ... ORDER BY ... LIMIT n) t ON ...`) as a `subqueryJoins` entry on the query. Only the legacy planner consumed it (via `customSubQueryJoins`); `buildSqlAndParamsRust` dropped it before the native call and Tesseract had no way to receive it. As a result the joined sub-query was omitted entirely: queries returned unfiltered rows (the inner LIMIT/ORDER vanished) and projections referencing the sub-query alias failed with "missing FROM-clause entry".

Thread `subqueryJoins` end to end:

- BaseQuery.js: pass `subqueryJoins` into both native queryParams.
- New `SubqueryJoin` bridge (sql/joinType/alias + `on` member expression).
- QueryPropertiesCompiler compiles each `on` into a `SqlCall`; stored on QueryProperties as `LogicalSubqueryJoinItem`s.
- SimpleQueryPlanner folds them into the query's LogicalJoin.
- Physical builder emits them as `SingleSource::RawSubquerySql` joins; the no-join fast path now also checks `subquery_joins` so alias-only projections still materialize the join.

The opaque sub-query SQL (incl. its ORDER BY/LIMIT) is emitted verbatim, so top-N semantics are preserved. The SQL API sends the join alias pre-quoted and references it verbatim in the ON condition, so the raw sub-query source is emitted with its alias as-is (no re-quoting). Changes are gated on a non-empty `subquery_joins`, leaving existing plans untouched.
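To illustrate the verbatim emission described above, here is a rough sketch, not the PR's actual code. `SubqueryJoinSketch`, `render_subquery_join`, and `render_from` are hypothetical names; `on_sql` stands in for the already-rendered `on` expression, and the `"left"` join-type value and the inner-join default are assumptions:

```rust
/// Illustrative stand-in for the bridge data. Only `sql`, `join_type` and
/// `alias` appear in the diff; `on_sql` is a hypothetical pre-rendered ON clause.
struct SubqueryJoinSketch {
    sql: String,
    join_type: Option<String>,
    alias: String,
    on_sql: String,
}

/// Emits one join. The inner SQL and the alias are inserted as-is:
/// no re-quoting and no rewriting of the inner ORDER BY / LIMIT.
fn render_subquery_join(join: &SubqueryJoinSketch) -> String {
    // Assumption: "left" selects a LEFT JOIN, anything else falls back to INNER.
    let keyword = match join.join_type.as_deref() {
        Some(t) if t.eq_ignore_ascii_case("left") => "LEFT JOIN",
        _ => "INNER JOIN",
    };
    format!("{} ({}) {} ON {}", keyword, join.sql, join.alias, join.on_sql)
}

/// Appends the joins to a base FROM source. An empty list returns the base
/// unchanged, mirroring the "gated on a non-empty `subquery_joins`" behavior.
fn render_from(base: &str, joins: &[SubqueryJoinSketch]) -> String {
    let mut out = base.to_string();
    for join in joins {
        out.push(' ');
        out.push_str(&render_subquery_join(join));
    }
    out
}
```

Because the sketch never touches the alias or the inner SQL, its correctness rests entirely on the producer sending a pre-quoted alias and a trusted sub-query.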
5f8da2b to
bd3af59
Compare