Resolve genomic columns for CLUSTER/MERGE over a derived-table FROM #164

Description

@conradbzura

genomic_columns (in src/giql/expanders/cluster.py) only resolves a custom column mapping when the enclosing SELECT's FROM clause is a bare exp.Table. When CLUSTER or MERGE runs over a derived table — e.g. (SELECT ... FROM custom) AS sub — the lookup finds no table name and silently falls back to the canonical chrom / start / end column names. If the underlying source uses a custom column mapping, the emitted SQL then references columns that do not exist.

Expected Behavior

CLUSTER/MERGE over a derived-table (or CTE) FROM should resolve the correct genomic columns rather than silently defaulting. At minimum, when the columns cannot be resolved through the subquery, the expansion should raise a clear ValueError instead of emitting SQL that references non-existent columns.
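
One possible shape for that behavior is sketched below. It is not the project's implementation: Table, Select and Subquery are simplified, hypothetical stand-ins for the sqlglot nodes, and resolve_genomic_columns, MAPPINGS and the chr / pos_start / pos_end column names are invented for illustration. The sketch also assumes the derived table passes the genomic columns through unrenamed, which real code would have to verify.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union

Columns = Tuple[str, str, str]
CANONICAL: Columns = ("chrom", "start", "end")


@dataclass
class Table:
    """Hypothetical stand-in for a bare table reference (exp.Table in the issue)."""
    name: str


@dataclass
class Select:
    """Hypothetical stand-in for a SELECT; from_ is what its FROM clause wraps."""
    from_: Optional[Union["Table", "Subquery"]] = None


@dataclass
class Subquery:
    """Hypothetical stand-in for a derived table such as (SELECT ...) AS sub."""
    this: Select
    alias: str


def resolve_genomic_columns(select: Select, mappings: Dict[str, Columns]) -> Columns:
    """Resolve genomic columns through derived tables, or fail loudly."""
    source = select.from_
    # Descend through nested derived tables until something else is reached.
    while isinstance(source, Subquery):
        source = source.this.from_
    if isinstance(source, Table):
        # Unmapped tables keep the canonical names, as with a bare-table FROM.
        return mappings.get(source.name, CANONICAL)
    # Anything else (here: a missing FROM) cannot be resolved, so raise
    # instead of silently returning the canonical defaults.
    raise ValueError("cannot resolve genomic columns: FROM does not lead to a base table")


# Assumed mapping for illustration only.
MAPPINGS: Dict[str, Columns] = {"custom": ("chr", "pos_start", "pos_end")}

derived = Select(from_=Subquery(this=Select(from_=Table("custom")), alias="sub"))
print(resolve_genomic_columns(derived, MAPPINGS))
```

The sketch does not cover CTEs or subqueries that rename or drop the genomic columns; those are the cases where raising ValueError is likely the safer outcome.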

Root Cause

genomic_columns reads the FROM table name only via isinstance(from_clause.this, exp.Table); a Subquery/derived-table FROM yields no name, so the function returns the canonical defaults. This is pre-existing behavior carried over byte-for-byte from the legacy ClusterTransformer and is currently untested. Surfaced in the PR #162 / #144 round-2 review (advisory A14).
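
The fallback can be modeled in a few lines. This is a simplified model rather than the code in src/giql/expanders/cluster.py: the Table, Select and Subquery classes are hypothetical stand-ins for the sqlglot nodes, and genomic_columns_current, MAPPINGS and the custom column names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union

Columns = Tuple[str, str, str]
CANONICAL: Columns = ("chrom", "start", "end")


@dataclass
class Table:
    """Hypothetical stand-in for exp.Table."""
    name: str


@dataclass
class Select:
    """Hypothetical stand-in for a SELECT and the source of its FROM clause."""
    from_: Optional[Union["Table", "Subquery"]] = None


@dataclass
class Subquery:
    """Hypothetical stand-in for a derived table, (SELECT ...) AS alias."""
    this: Select
    alias: str


# Assumed custom mapping: table name -> (chrom, start, end) column names.
MAPPINGS: Dict[str, Columns] = {"custom": ("chr", "pos_start", "pos_end")}


def genomic_columns_current(select: Select) -> Columns:
    """Model of the described lookup: only a bare Table FROM yields a name."""
    source = select.from_
    table_name = source.name if isinstance(source, Table) else None
    # With no table name the lookup misses and the defaults are returned.
    return MAPPINGS.get(table_name, CANONICAL)


direct = Select(from_=Table("custom"))
derived = Select(from_=Subquery(this=Select(from_=Table("custom")), alias="sub"))

print(genomic_columns_current(direct))   # mapping is found for the bare table
print(genomic_columns_current(derived))  # derived table falls back to defaults
```

In this model the second call returns the canonical names even though the inner table has a custom mapping, which mirrors the silent fallback described above.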

Labels: bug
