
Add CREATE TABLE ... WITH (...) support for ad-hoc K8s-stored tables#193

Draft
ryannedolan wants to merge 2 commits into main from create-table-with


@ryannedolan
Collaborator

Summary

  • Added Table CRD.
  • Added create table ... with DDL.
  • Added planner rule to match such tables.

Details

Currently, all tables must exist within a Database, backed by a JDBC driver. Often, we want to point to a table-like object that doesn't exist in the catalog, or that differs appreciably from a table in the catalog. For this reason, we add create table ... with, which lets users add connector configuration to the catalog directly. Such tables exist in the catalog but do not necessarily have a physical Database associated with them. The pipeline planner simply injects whatever configuration is provided in the with clause, plus any configuration from matching TableTemplates.

Testing Done

0: Hoptimator> create table foo (name varchar) with (connector 'likafka', topic 'my-topic');
...
0: Hoptimator> !pipeline create materialized view venice.bar as select * from foo
CREATE DATABASE IF NOT EXISTS `DEFAULT` WITH ();
CREATE TABLE IF NOT EXISTS `DEFAULT`.`FOO` (`NAME` VARCHAR) WITH ('CONNECTOR'='likafka', 'TOPIC'='my-topic', 'offline.table.name'='ads_offline', 'job.properties.account'='foo', 'pipeline'='VENICE.BAR');
CREATE DATABASE IF NOT EXISTS `VENICE` WITH ();
CREATE TABLE IF NOT EXISTS `VENICE`.`BAR` (`NAME` VARCHAR) WITH ('connector'='venice', 'key.fields'='KEY', 'key.fields-prefix'='', 'key.type'='PRIMITIVE', 'partial-update-mode'='true', 'storeName'='BAR', 'value.fields-include'='EXCEPT_KEY', 'offline.table.name'='ads_offline', 'job.properties.account'='foo', 'pipeline'='VENICE.BAR');
INSERT INTO `VENICE`.`BAR` (`NAME`) SELECT `NAME` FROM `DEFAULT`.`FOO`;

Enable users to define tables with explicit connector configuration that
are stored as Table CRDs in Kubernetes, without requiring a pre-existing
database or catalog entry. This is similar to Flink's CREATE TABLE.

Syntax: CREATE TABLE foo.bar (id INT, name VARCHAR) WITH (connector 'kafka', topic 'my-topic')

Key changes:
- Custom SqlCreateTable replacing Calcite's built-in (adds options field)
- Grammar/parser updated to accept WITH clause on CREATE TABLE
- TableSource extends Source to carry column definitions + options
- Table CRD (tables.crd.yaml) stores columns, options, and schema path
- K8sStoredTable + StoredTableScanRule enables pure K8s tables in the
  planner pipeline without requiring a backing JDBC database
- K8sConnector merges stored options with template-derived config
  (WITH options take precedence)
- Fix !pipeline command to use query() for plain SELECTs (no sink)
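
The option-merging behavior in the list above could be sketched roughly as follows. This is a minimal illustration, not the actual K8sConnector code: the class name `ConnectorOptions` and the method `merge` are hypothetical, and the sketch assumes option keys are compared case-sensitively, which may not match the real implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; not part of the PR. */
public final class ConnectorOptions {
  private ConnectorOptions() {}

  /**
   * Overlays the options stored from the WITH clause on top of
   * template-derived config. On a key conflict the WITH value wins.
   * Assumption: keys are matched case-sensitively.
   */
  public static Map<String, String> merge(
      Map<String, String> templateConfig, Map<String, String> withOptions) {
    // Start from the template-derived config, then let WITH options overwrite.
    Map<String, String> merged = new LinkedHashMap<>(templateConfig);
    merged.putAll(withOptions);
    return merged;
  }
}
```

Copying into a new map keeps both inputs unmodified, so the stored options and the template output can be reused for other tables.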

Safe to deploy without the Table CRD installed -- only CREATE TABLE
WITH will fail; all other functionality is unaffected.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment on lines +70 to +71
- table
- database
Collaborator Author

I don't think database should be required. It looks like Claude is currently filling it in as DEFAULT, but it should be null in most cases. I think the only reason to include a database is if you explicitly want to opt in to connector configuration from a database-specific TableTemplate.

Collaborator Author

I think database should be null in this case?

Tables created without a physical database backing (e.g. `CREATE TABLE
foo (...)`) now have a null database rather than incorrectly using the
schema name. This better represents that the table is a pure K8s table.

- DDL executor sets database=null when schema is not a Database
- Table CRD no longer requires the database field
- K8sTableTable loads tables without a database
- K8sConnector falls back to schema name for template name variable
- VeniceDeployerProvider is now null-safe on source.database()
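
The null-database fallback described above might look something like the sketch below. The class `TemplateNames` and the method `nameVariable` are hypothetical names chosen for illustration; the real K8sConnector logic may differ.

```java
import java.util.Objects;

/** Hypothetical helper; not part of the PR. */
public final class TemplateNames {
  private TemplateNames() {}

  /**
   * Picks the value for the template name variable: the database name
   * when the table has one, otherwise the schema name. The database is
   * null for pure K8s tables with no physical Database behind them.
   */
  public static String nameVariable(String database, String schema) {
    if (database != null) {
      return database;
    }
    // Assumption: a schema name is always available for a stored table.
    return Objects.requireNonNull(schema, "schema");
  }
}
```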

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>