feat(storage): create-table from a source table + BigQuery partition/clustering #468
yustme wants to merge 2 commits
…clustering

Extend `storage create-table` to mirror keboola/connection#7697:

- `--source-table-id` (+ optional `--source-branch-id`): copy an existing table's data into the requested partition/clustering layout instead of building from `--column`. The schema is derived from the source, so `--column`/`--not-null`/`--default` are forbidden. This is the supported way to repartition a populated BigQuery table; pair with `swap-tables`.
- `--column` is now optional and mutually exclusive with `--source-table-id`.
- New BigQuery layout flags (also usable on a plain columns create): `--time-partitioning-type`/`-field`/`-expiration-ms`, `--range-partitioning-field`/`-start`/`-end`/`-interval` (bounds are strings), `--clustering-field`. Time and range partitioning are mutually exclusive.
- BigQuery-only with a one-call pre-flight guard: when any source/layout flag is used, the project backend is verified first and a non-BigQuery project fails fast (exit 2) before the create. Plain `--column` creates are unaffected. Connection 422 codes remain as a server-side backstop.

Client builds the tables-definition body conditionally (source XOR columns, optional layout).

Adds unit tests (client/service/CLI), a backend-aware E2E step, and full agent doc-sync. Bumps to 0.66.0.
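The pre-flight guard described in this commit message can be sketched roughly as follows. This is an illustration only, not the PR's code: `FakeClient`, `get_project_backend`, `UsageError`, and `create_table_guarded` are hypothetical names, and the backend identifier `"bigquery"` is an assumption.

```python
from dataclasses import dataclass, field


class UsageError(Exception):
    """Hypothetical error type that carries the CLI exit code."""

    def __init__(self, message: str, exit_code: int = 2) -> None:
        super().__init__(message)
        self.exit_code = exit_code


@dataclass
class FakeClient:
    """Stand-in for the real client; records calls so their order is visible."""

    backend: str
    calls: list = field(default_factory=list)

    def get_project_backend(self) -> str:
        # Hypothetical one-call backend lookup.
        self.calls.append("backend-lookup")
        return self.backend

    def create_table(self, body: dict) -> dict:
        self.calls.append("create")
        return {"status": "created", "body": body}


def create_table_guarded(client, body: dict, uses_bigquery_flags: bool) -> dict:
    """Verify the backend only when a source/layout flag is in play."""
    if uses_bigquery_flags:
        backend = client.get_project_backend()
        if backend != "bigquery":
            # Fail fast, before any create request is sent.
            raise UsageError(
                f"source/layout flags require a BigQuery project, got {backend!r}"
            )
    # Plain column creates reach this line without the extra lookup.
    return client.create_table(body)
```

With this shape, a plain create costs one request, a guarded create on BigQuery costs two, and a guarded create on any other backend stops after the lookup.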
Code Review

Overall a clean, well-structured PR. The 3-layer architecture (command → service → client) is maintained, validations are thorough, test coverage spans all layers, and documentation is updated across all mandatory surfaces. Backward compatibility for plain …

Findings (all addressed in 920b5bd)

What's done well

All findings addressed. LGTM 👍
- Reject `--source-branch-id` without `--source-table-id` (was silently dropped)
- Keep skipped if-not-exists envelope schema-consistent with created path
- Show range partitioning bounds in human output
- Restore uv.lock revision 3 (unrelated downgrade)
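A minimal sketch of the flag validation this PR describes, including the `--source-branch-id` rule added in the commit above. The function name, parameter names, and message wording are hypothetical; only the rules themselves come from the PR text.

```python
from typing import Optional, Sequence


def validate_create_table_args(
    columns: Sequence[str] = (),
    not_null: Sequence[str] = (),
    default: Sequence[str] = (),
    source_table_id: Optional[str] = None,
    source_branch_id: Optional[str] = None,
    has_time_partitioning: bool = False,
    has_range_partitioning: bool = False,
) -> list:
    """Return a list of problems; an empty list means the flags are consistent."""
    errors = []
    # A branch id only makes sense next to a source table.
    if source_branch_id and not source_table_id:
        errors.append("--source-branch-id requires --source-table-id")
    # Exactly one of --column / --source-table-id must be given.
    if source_table_id and columns:
        errors.append("--column and --source-table-id are mutually exclusive")
    if not source_table_id and not columns:
        errors.append("one of --column or --source-table-id is required")
    # --not-null / --default attach to column specs, so they cannot
    # accompany a source-table copy.
    if source_table_id and (not_null or default):
        errors.append("--not-null/--default cannot be used with --source-table-id")
    # Only one partitioning scheme per table.
    if has_time_partitioning and has_range_partitioning:
        errors.append("time and range partitioning are mutually exclusive")
    return errors
```

Collecting every problem instead of raising on the first one is a design choice of this sketch; it lets a caller report all conflicting flags in a single message.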
What
Brings the Keboola connection capability from keboola/connection#7697 (DMD-1677) to the CLI:
`storage create-table` can now create a table by copying a source table into a different BigQuery partition/clustering layout, and exposes the partition/clustering layout flags the `tables-definition` endpoint already supported but the CLI never surfaced.

This is the supported way to repartition a populated BigQuery table: copy it into the new layout, then flip it into place with the existing `storage swap-tables`.

Changes

- `--source-table-id` (+ optional `--source-branch-id`): derive the new table's schema from a source table and copy its rows into the requested layout. Mutually exclusive with `--column` (and `--not-null`/`--default`, which attach to column specs).
- `--column` is now optional (was required); exactly one of `--column`/`--source-table-id` must be given.
- `--time-partitioning-type`/`-field`/`-expiration-ms`, `--range-partitioning-field`/`-start`/`-end`/`-interval` (range bounds are strings), `--clustering-field` (repeatable). Time and range partitioning are mutually exclusive.
- BigQuery-only with a one-call pre-flight guard: when any source/layout flag is used, the project backend is verified first and a non-BigQuery project fails fast (exit 2) before the create. Plain `--column` creates make no extra call. The connection 422 codes (`backendDoesNotSupportSourceTable`, `sourceAliasNotPersisted`, `sourceTableMissingReferencedColumn`, `sourceTableNotFound`) remain a server-side backstop.

The client (`KeboolaClient.create_table`) builds the `tables-definition` body conditionally: `source` XOR `columns`, plus the optional `timePartitioning`/`rangePartitioning`/`clustering` objects, mirroring the exact shapes connection expects.

Example
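As a rough illustration of the conditional body shaping described above (not the PR's actual implementation), the sketch below builds a request body with `source` XOR `columns` and the optional layout objects. The top-level keys `source`, `columns`, `timePartitioning`, `rangePartitioning`, and `clustering` come from the description; the function name and every nested key (`tableId`, `branchId`, `range`, `fields`, and so on) are assumptions.

```python
from typing import Any, Mapping, Optional, Sequence


def build_tables_definition_body(
    name: str,
    columns: Optional[Sequence[Mapping[str, Any]]] = None,
    source_table_id: Optional[str] = None,
    source_branch_id: Optional[str] = None,
    time_partitioning: Optional[Mapping[str, Any]] = None,
    range_partitioning: Optional[Mapping[str, Any]] = None,
    clustering_fields: Sequence[str] = (),
) -> dict:
    """Hypothetical body builder: source XOR columns, plus optional layout."""
    if bool(columns) == bool(source_table_id):
        raise ValueError("exactly one of columns / source_table_id is required")

    body: dict = {"name": name}
    if source_table_id:
        # Nested key names are assumed, not taken from the API reference.
        source = {"tableId": source_table_id}
        if source_branch_id:
            source["branchId"] = source_branch_id
        body["source"] = source
    else:
        body["columns"] = [dict(c) for c in columns]

    # Layout objects are only emitted when requested.
    if time_partitioning:
        body["timePartitioning"] = dict(time_partitioning)
    if range_partitioning:
        body["rangePartitioning"] = {
            "field": range_partitioning["field"],
            # Range bounds travel as strings.
            "range": {
                key: str(range_partitioning[key])
                for key in ("start", "end", "interval")
            },
        }
    if clustering_fields:
        body["clustering"] = {"fields": list(clustering_fields)}
    return body
```

Called with a source table id, a range-partitioning mapping, and clustering fields, the function returns a body with `source`, `rangePartitioning`, and `clustering` but no `columns` key; a plain columns create yields only `name` and `columns`.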
Tests
- `tests/test_storage_create_table.py`: client body shaping (source vs columns, partition/clustering, string range bounds), service-layer XOR + partition validation, the BigQuery pre-flight guard (fires before POST; skipped for plain creates), and CLI flag pass-through / `--column` no longer required.
- `tests/test_e2e.py`: on BigQuery runs the source-copy + swap; on other backends asserts the pre-flight guard rejects with exit 2.
- `test_storage_write.py` call-signature assertions.
- `make check` green (lint, format, typecheck, skill, version, command-sync, changelog, error-codes, 4198 tests).

Docs / version
- `context.py`, `CLAUDE.md`, `commands-reference.md`, `keboola-expert.md` (tool matrix), `gotchas.md`, `storage-types-workflow.md`, regenerated `SKILL.md`.
- Version bumped to 0.66.0 (`make version-sync`).