Use the reserved __giql_ prefix for CLUSTER/MERGE synthesized identifiers #161

Description

@conradbzura

The CLUSTER and MERGE expanders (src/giql/expanders/cluster.py, src/giql/expanders/merge.py) synthesize three internal SQL identifiers outside the reserved __giql_ namespace that the epic #137 convention establishes for synthesized names (documented at giql/expander.py, EXPAND_ALIAS_PREFIX):

  • is_new_cluster — the per-row "new cluster" flag column projected by the inner lag_calc subquery and summed by the outer window.
  • lag_calc — the CLUSTER derived-table alias.
  • clustered — the MERGE derived-table alias.

MERGE's __giql_cluster_id already follows the convention; these three do not.

Expected Behavior

Synthesized identifiers should live in the reserved __giql_ namespace (e.g. __giql_is_new_cluster, __giql_lag_calc, __giql_clustered) so they cannot collide with user-supplied column names.
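A minimal sketch of the intended naming rule, for illustration only. `RESERVED_PREFIX`, `synthesized`, and `collides` are hypothetical names, not the project's API; the real prefix constant is the `EXPAND_ALIAS_PREFIX` mentioned above, whose definition is not reproduced here.

```python
# Hypothetical stand-in for the reserved namespace described in this issue.
RESERVED_PREFIX = "__giql_"


def synthesized(name):
    """Return `name` placed in the reserved namespace (idempotent)."""
    if name.startswith(RESERVED_PREFIX):
        return name
    return RESERVED_PREFIX + name


def collides(user_columns, synthesized_names):
    """Return the synthesized names that clash with user-supplied columns."""
    user = {column.lower() for column in user_columns}
    return sorted(n for n in synthesized_names if n.lower() in user)


# The three identifiers named in this issue, before and after renaming.
legacy = ["is_new_cluster", "lag_calc", "clustered"]
renamed = [synthesized(n) for n in legacy]

# A user projection that happens to contain a column called is_new_cluster.
user_columns = ["chrom", "start", "end", "is_new_cluster"]
legacy_clashes = collides(user_columns, legacy)
renamed_clashes = collides(user_columns, renamed)
```

With the legacy names the user column clashes; with the prefixed names it does not, unless the user deliberately writes into the reserved namespace.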

Root Cause

A user table whose projection includes a real column named is_new_cluster silently produces a wrong/ambiguous query: the inner lag_calc subquery projects two is_new_cluster columns and the outer SUM(is_new_cluster) OVER (...) becomes ambiguous. The lag_calc / clustered aliases would similarly shadow a same-named user CTE/relation.
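To make the shape concrete, here is a simplified approximation of the CLUSTER query with the renamed identifiers, run against SQLite through Python's standard `sqlite3` module. The SQL is a hand-written sketch, not the SQL the expander actually emits; the `peaks` table, its `start_pos`/`end_pos` columns, and the `cluster_id` output alias are assumptions made for the example. It requires an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The user table deliberately carries its own is_new_cluster column.
conn.execute(
    "CREATE TABLE peaks ("
    "chrom TEXT, start_pos INTEGER, end_pos INTEGER, is_new_cluster INTEGER)"
)
conn.executemany(
    "INSERT INTO peaks VALUES (?, ?, ?, ?)",
    [
        ("chr1", 100, 200, 7),
        ("chr1", 150, 300, 7),
        ("chr1", 500, 600, 7),
    ],
)

# Inner subquery: flag rows that do not overlap the previous interval.
# Outer query: a running sum of the flag yields a cluster id.
# Both synthesized names live in the reserved namespace, so the user's
# is_new_cluster column passes through untouched.
query = """
SELECT chrom, start_pos, end_pos, is_new_cluster,
       SUM(__giql_is_new_cluster)
           OVER (PARTITION BY chrom ORDER BY start_pos) AS cluster_id
FROM (
    SELECT *,
           CASE WHEN LAG(end_pos)
                         OVER (PARTITION BY chrom ORDER BY start_pos) >= start_pos
                THEN 0 ELSE 1 END AS __giql_is_new_cluster
    FROM peaks
) AS __giql_lag_calc
ORDER BY chrom, start_pos
"""
result = conn.execute(query).fetchall()
```

The first two intervals overlap and share cluster 1, the third starts cluster 2, and the user's own `is_new_cluster` values (all 7) come back unchanged.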

This is pre-existing behavior carried over byte-for-byte from the legacy ClusterTransformer / MergeTransformer, so it was deliberately out of scope for #144 (a byte-identical relocation — renaming changes emitted SQL). Surfaced as advisory finding A12 in the #144 review (.sdlc/reviews/issue-#144/review-1.md).

Fix: rename the synthesized identifiers into the __giql_ namespace and add a regression test transpiling a CLUSTER over a table whose projection includes a user column named is_new_cluster (asserting correct disambiguation). Note this intentionally changes emitted SQL for affected shapes, so the CLUSTER/MERGE transpilation snapshots/oracle expectations must be updated in lockstep.
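One possible shape for a snapshot-level guard, sketched on plain SQL strings so that it does not assume anything about the transpiler's API. `legacy_aliases_in` is a hypothetical helper, and the two sample strings are abbreviated stand-ins for emitted SQL, not real transpiler output.

```python
import re

# The three legacy names listed in this issue.
LEGACY_NAMES = ("is_new_cluster", "lag_calc", "clustered")


def legacy_aliases_in(sql):
    """Return legacy synthesized names still introduced via `AS <name>`."""
    found = []
    for name in LEGACY_NAMES:
        # Anchoring on `AS` plus whitespace means `AS __giql_lag_calc`
        # does not match the bare `lag_calc` pattern.
        pattern = rf"\bAS\s+{re.escape(name)}\b"
        if re.search(pattern, sql, flags=re.IGNORECASE):
            found.append(name)
    return found


# Abbreviated stand-ins for emitted SQL, before and after the rename.
before = "SELECT * FROM (SELECT *, 1 AS is_new_cluster FROM t) AS lag_calc"
after = (
    "SELECT * FROM (SELECT *, 1 AS __giql_is_new_cluster FROM t) "
    "AS __giql_lag_calc"
)

before_hits = legacy_aliases_in(before)
after_hits = legacy_aliases_in(after)
```

A real regression test would feed the transpiler's output for a CLUSTER over a table with a user column named is_new_cluster into a check like this, alongside the updated snapshot expectations.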

Labels: bug
