Reduce the generator to a stock serializer, remove the migration flag, and document the extension hook — Closes #146 #168

Merged
conradbzura merged 1 commit into main from 146-reduce-generator-to-serializer on Jul 2, 2026

@conradbzura

Summary

Complete epic #137. With every GIQL operator migrated onto the ExpandOperators AST pass and its (target, operator) registry, the transpiler no longer needs a custom generator: reduce it to stock per-target sqlglot serialization, remove the now-dead GIQL_EXPAND migration flag and legacy emit path, and promote the expander registry to a supported public extension point for custom targets and operator overrides.

The custom BaseGIQLGenerator and the giql.generators package are deleted; the final AST is serialized with the stock sqlglot serializer selected by the active target's sqlglot_dialect. The five NEAREST helpers and the distance-CASE string builder move out of the generator into the expander modules. The ExpanderRegistry becomes a target plugin hub: a custom Target is declared via register_target (or as a side effect of register) and selected with transpile(dialect="<name>"). The hook symbols are exported from the top-level giql package and documented with a runnable worked example.

Trade-off worth noting: serialization is byte-identical to the previous generator for the generic and DataFusion targets; the only change is that the DuckDB target now makes DuckDB's default NULLS FIRST ordering explicit. The query-level expander seam for whole-query rewrites (the deferred INTERSECTS IEJoin fold and the NEAREST SELECT * column leak, #160) is intentionally left as follow-up.

Closes #146

Proposed changes

Reduce the generator to a stock serializer

Delete src/giql/generators/ (BaseGIQLGenerator + package) and serialize the expanded AST in transpile with ast.sql(dialect=target.sqlglot_dialect). Relocate the six former generator helpers out of the generator: the five NEAREST-only helpers (_nearest_output_encoding, _nearest_passthrough, _detect_nearest_mode, _raise_nearest_reference_error, _extract_bool_param) become module functions in giql.expanders.nearest, and the shared distance-CASE string builder moves to a new giql.expanders._distance.generate_distance_case.

Remove the GIQL_EXPAND flag and the legacy emit path

Drop the per-operator GIQL_EXPAND opt-in flag from all nine operator classes and the gate in the ExpandOperators pass. Every operator now expands unconditionally; a resolve miss raises a clear ValueError naming the operator and target instead of falling back to a legacy *_sql emitter that no longer exists.
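
To illustrate the resolve-miss behavior described above, here is a minimal sketch of a (target, operator) registry whose lookup raises instead of falling back. SketchRegistry, its method names, the string-based expander type, and the fallback to a "generic" entry are all assumptions made for illustration; the real ExpanderRegistry API may differ.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

# Hypothetical stand-in: an expander maps an operator node (here a plain
# string) to its expanded form. The real expanders work on sqlglot ASTs.
Expander = Callable[[str], str]


@dataclass
class SketchRegistry:
    """Toy (target, operator) registry; not the real ExpanderRegistry."""

    _expanders: Dict[Tuple[str, str], Expander] = field(default_factory=dict)

    def register(self, target: str, operator: str, expander: Expander) -> None:
        self._expanders[(target, operator)] = expander

    def resolve(self, target: str, operator: str) -> Expander:
        # Assumption: a target-specific expander wins, then the generic one.
        for key in ((target, operator), ("generic", operator)):
            if key in self._expanders:
                return self._expanders[key]
        # No legacy emitter exists any more, so fail loudly and name both
        # the operator and the target.
        raise ValueError(
            f"no expander registered for operator {operator!r} on target {target!r}"
        )


registry = SketchRegistry()
registry.register("generic", "WITHIN", lambda node: f"expanded({node})")

# Resolves through the generic entry because "duckdb" has no override.
print(registry.resolve("duckdb", "WITHIN")("a WITHIN b"))

try:
    registry.resolve("duckdb", "DISJOIN")
except ValueError as exc:
    print(exc)
```

Raising at resolve time makes a missing registration visible at the point of use, rather than producing SQL through a silent fallback path.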

Make the expander registry a target plugin hub

Add ExpanderRegistry.register_target and ExpanderRegistry.target, have register declare its target as a side effect, and extend clear/snapshot/restore (now via a RegistrySnapshot value type) to cover declared targets. resolve_target consults the registry only for a non-built-in dialect name, so built-in names still win and transpile(dialect="<custom>") resolves a registered custom target end to end.
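
The precedence and snapshot semantics above can be sketched with plain dataclasses. Everything here (SketchTarget, SketchSnapshot, SketchTargetRegistry, the BUILT_INS table and its placeholder dialect values, and this resolve_target signature) is a hypothetical model of the described behavior, not the giql implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class SketchTarget:
    name: str
    sqlglot_dialect: Optional[str] = None


# Assumed built-in names, taken from the targets this PR mentions.
# The dialect values are placeholders, not the real configuration.
BUILT_INS = {
    "generic": SketchTarget("generic"),
    "duckdb": SketchTarget("duckdb", "duckdb"),
    "datafusion": SketchTarget("datafusion"),
}


@dataclass(frozen=True)
class SketchSnapshot:
    # An immutable value type, so two snapshots compare with ==.
    targets: Tuple[Tuple[str, SketchTarget], ...]


@dataclass
class SketchTargetRegistry:
    _targets: Dict[str, SketchTarget] = field(default_factory=dict)

    def register_target(self, target: SketchTarget) -> None:
        self._targets[target.name] = target

    def target(self, name: str) -> Optional[SketchTarget]:
        return self._targets.get(name)

    def snapshot(self) -> SketchSnapshot:
        return SketchSnapshot(tuple(sorted(self._targets.items())))

    def restore(self, snap: SketchSnapshot) -> None:
        self._targets = dict(snap.targets)

    def clear(self) -> None:
        self._targets.clear()


def resolve_target(name: str, registry: SketchTargetRegistry) -> SketchTarget:
    # Built-in names win; the registry is consulted only for other names.
    if name in BUILT_INS:
        return BUILT_INS[name]
    found = registry.target(name)
    if found is None:
        raise ValueError(f"unknown target {name!r}")
    return found
```

In this sketch, registering a target named "duckdb" has no effect on resolve_target("duckdb", ...), which mirrors the built-in name precedence case in the test table.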

Export and document the extension hook

Re-export register, REGISTRY, ExpanderRegistry, ExpansionContext, OperatorExpander, Target, Capabilities, and the three built-in targets from the top-level giql package. Add docs/transpilation/extending.rst with runnable custom-target and operator-override examples plus the node-local boundary and the pre-quoted-identifier caveat; extend the API reference and cross-link schema-mapping.

Test cases

| # | Test Suite | Given | When | Then | Coverage Target |
|---|---|---|---|---|---|
| 1 | TestPublicApiSurface | Each extension-hook name in giql.__all__ | Accessed from the giql root and compared to its submodule origin | It resolves and is the same object | Public export surface |
| 2 | TestPublicApiSurface | The deleted giql.generators package | import giql.generators | Raises ModuleNotFoundError | Generator removal contract |
| 3 | TestCustomTargetInjection | A capability-only custom target declared via register_target | transpile(dialect="postgres") | Transpiles through the generic expanders instead of raising | resolve_target registry fallback |
| 4 | TestCustomTargetInjection | A custom target with a Within expander override | The query is transpiled against it | The emitted SQL carries the override form | Per-target operator override |
| 5 | TestCustomTargetInjection | A custom target registered under a built-in name | resolve_target("duckdb") | Returns the built-in target, not the registry entry | Built-in name precedence |
| 6 | TestTargetDrivesSerialization | A custom target with sqlglot_dialect="duckdb" | A window operator is transpiled under it versus generic | Only the custom output carries NULLS FIRST | Custom sqlglot_dialect threading |
| 7 | TestExpandOperatorsPass | A cleared registry and a DISJOIN AST | The pass runs against generic and against a custom target | Raises ValueError naming the operator and target | No-expander raise |
| 8 | TestRegistrySnapshot | Snapshots straddling an expander or target-only registration | Compared with == | Reported unequal | RegistrySnapshot equality |
| 9 | TestTargetRegistration | A declared custom target, captured then cleared | The snapshot is restored | The target resolves again | Target-aware snapshot/restore |
| 10 | TestTranspileDialects | Every migrated operator including DISJOIN/CLUSTER/MERGE | Transpiled for datafusion versus generic | Byte-identical output | Cross-target serialization parity |
| 11 | TestTranspileDialects | A window-carrying operator query | Null-ordering tokens stripped from duckdb and generic output | The two are identical | NULLS FIRST is the sole DuckDB delta |
| 12 | TestDistanceExpanderStringParity | The four DISTANCE shapes over branch-covering rows | expand_distance (AST) and generate_distance_case (string) evaluated in DuckDB | Identical scalars per row | Distance drift guard after relocation |
| 13 | TestNearestDataFusionFallbackShape | A correlated signed NEAREST with max_distance on DataFusion | Transpiled to the decorrelated window form | Retains the signed CASE and the distance filter | Relocated NEAREST fallback branches |
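
As a rough illustration of the comparison in test case 11, the sketch below strips null-ordering tokens from two SQL strings and checks that nothing else differs. The helper name, the regex, and both SQL strings are hand-written assumptions for illustration; they are not actual transpiler output or the real test code.

```python
import re

# Matches " NULLS FIRST" / " NULLS LAST" together with the leading whitespace.
# Caveat: a plain regex would also strip these tokens inside string literals.
_NULL_ORDERING = re.compile(r"\s+NULLS\s+(?:FIRST|LAST)\b", re.IGNORECASE)


def strip_null_ordering(sql: str) -> str:
    """Remove explicit null-ordering tokens so two serializations can be compared."""
    return _NULL_ORDERING.sub("", sql)


# Illustrative strings only: the first makes the null ordering explicit,
# the second leaves it implicit.
duckdb_sql = "SELECT ROW_NUMBER() OVER (ORDER BY start NULLS FIRST) FROM peaks"
generic_sql = "SELECT ROW_NUMBER() OVER (ORDER BY start) FROM peaks"

# Before stripping the strings differ; afterwards they are identical, which
# is the shape of the "sole delta" claim.
assert duckdb_sql != generic_sql
assert strip_null_ordering(duckdb_sql) == strip_null_ordering(generic_sql)
```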

@conradbzura conradbzura self-assigned this Jul 2, 2026
Every GIQL operator now expands to standard AST in the normalization
passes, so the custom generator no longer earns its keep. Delete the
giql.generators package and BaseGIQLGenerator and serialize the final
AST with the stock sqlglot serializer selected by the active target's
sqlglot_dialect. The five NEAREST helpers and the distance-CASE string
builder move out of the generator into the expander modules.

Remove the GIQL_EXPAND migration flag. With every operator migrated the
per-node opt-in gate is dead weight, so it and the legacy emit path are
gone; a resolve miss now raises a clear error instead of falling back to
a legacy emitter that no longer exists.

Make the expander registry a target plugin hub. A custom Target can be
declared with register_target, or as a side effect of register, and
selected with transpile(dialect=name), so the registry becomes a
supported public extension point. resolve_target consults the registry
for a non-built-in name while built-in names still win. Export the hook
symbols from the top-level giql package and document the extension point
with a runnable worked example.

Serialization is byte-identical to the previous generator for the
generic and DataFusion targets. The only change is that the DuckDB
target now makes DuckDB's default NULLS FIRST ordering explicit.

Claude-Session: https://claude.ai/code/session_01ALxmQysPad4W68wuWuft6W
@conradbzura conradbzura force-pushed the 146-reduce-generator-to-serializer branch from 9cf1ed4 to 3b92c43 Compare July 2, 2026 15:39
@conradbzura conradbzura marked this pull request as ready for review July 2, 2026 15:41
@conradbzura conradbzura merged commit 178d704 into main Jul 2, 2026
3 checks passed