Skip to content

fix: preserve parameterized physicalType in SQL type converter#1093

Merged
jschoedl merged 8 commits intodatacontract:mainfrom
barry0451:fix/issue-1086-physicaltype-precision-in-sql-converter
Mar 30, 2026
Merged

fix: preserve parameterized physicalType in SQL type converter#1093
jschoedl merged 8 commits intodatacontract:mainfrom
barry0451:fix/issue-1086-physicaltype-precision-in-sql-converter

Conversation

@barry0451
Copy link
Copy Markdown
Contributor

Problem

When an ODCS SchemaProperty has a physicalType with precision or length constraints (e.g. VARCHAR(255), DECIMAL(10,2)), the SQL type converter ignores the constraints and returns the wrong type or None.

Root cause: The per-server converters match bare type names like ["string", "varchar", "text"]. A physicalType of "VARCHAR(255)" doesn't match "varchar", so the converter falls through and returns None.

Fix

Added an early return in convert_to_sql_type(): if the ODCS physicalType already contains parentheses, it is returned verbatim before any server-specific converter runs.

Also adds _extract_base_type() helper (strips params from a type string) for future use.

Tests

63 parametrized tests covering:

  • VARCHAR(255), DECIMAL(10,2), NUMERIC(18,4), CHAR(10), NVARCHAR(500), FLOAT(24) across all 8 server types
  • Regression tests confirming bare types still route through server-specific converters correctly

Fixes #1086

Co-authored-by: Garrett Sutula 11978161+garrettsutula@users.noreply.github.com

Copy link
Copy Markdown
Collaborator

@jschoedl jschoedl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some kind of platform-dependent code is still needed to check if the parametrized types are actually supported. If the physicalType is set to, e.g., NVARCHAR(50), exporting to Postgres would not work (Postgres has no NVARCHAR type).

@barry0451 barry0451 force-pushed the fix/issue-1086-physicaltype-precision-in-sql-converter branch from a205a93 to c49aa88 Compare March 25, 2026 00:25
@barry0451
Copy link
Copy Markdown
Contributor Author

Thanks for the feedback — good catch.

The naive verbatim pass-through is replaced with proper server-aware translation:

  1. Strip parameters from the physicalType (e.g. NVARCHAR(50) → base=NVARCHAR, params=50)
  2. Run the base type through the server-specific converter
  3. Re-attach params only if the translated type accepts them on that server (via a _TYPES_ACCEPTING_PARAMS per-server mapping)

Examples of the new behaviour:

  • NVARCHAR(50) on Postgres → converter returns None for unknown base type → fallback verbatim NVARCHAR(50) (Postgres will reject this cleanly rather than silently producing wrong SQL)
  • VARCHAR(255) on Postgres → base varchartext, text doesn't accept params → text
  • DECIMAL(10,2) on Postgres → base decimaldecimal, accepts params → decimal(10,2)
  • VARCHAR(255) on Snowflake → base varcharVARCHAR, accepts params → VARCHAR(255)

21 new tests cover helpers and cross-server translation cases.

barry0451 and others added 2 commits March 25, 2026 00:30
Fixes #1086

When an ODCS SchemaProperty has a physicalType with precision/length
constraints (e.g. VARCHAR(255), DECIMAL(10,2)), convert_to_sql_type()
previously fell through to per-server converters that matched only bare
base types, causing them to return None.

Fix: short-circuit in convert_to_sql_type() before dispatching to any
server-specific converter when the ODCS physicalType already contains
parentheses. Also adds _extract_base_type() helper for future use.

Tests: 63 parametrized cases covering all server types and common
parameterized type patterns.
…verter

Addresses review feedback: instead of passing parameterized types verbatim,
strip the parameter, run base type through server converter, then re-attach
the parameter to the translated type where supported.

- Add _extract_base_type() and _extract_params() helpers
- Add _TYPES_ACCEPTING_PARAMS per-server mapping of types that accept params
- convert_to_sql_type() now: strips params -> translates base type -> re-attaches
  params only if the translated type accepts them on the target server
- NVARCHAR(50) on Postgres -> converter returns None for unknown type -> verbatim fallback
- VARCHAR(255) on Postgres -> translates to 'text', text doesn't accept params -> 'text'
- DECIMAL(10,2) on Postgres -> translates to 'decimal', accepts params -> 'decimal(10,2)'
- 21 new targeted tests covering helpers and cross-server param translation

Fixes #1086 (was originally mis-tagged #1039)

Co-authored-by: Garrett Sutula <11978161+garrettsutula@users.noreply.github.com>
@barry0451 barry0451 force-pushed the fix/issue-1086-physicaltype-precision-in-sql-converter branch from c49aa88 to 122fc59 Compare March 25, 2026 00:30
barry0451 and others added 2 commits March 25, 2026 00:38
Co-authored-by: Garrett Sutula <11978161+garrettsutula@users.noreply.github.com>
Co-authored-by: Garrett Sutula <11978161+garrettsutula@users.noreply.github.com>
@barry0451 barry0451 force-pushed the fix/issue-1086-physicaltype-precision-in-sql-converter branch from 80b41b0 to 783d2f9 Compare March 25, 2026 00:44
barry0451 and others added 4 commits March 25, 2026 00:58
…onvert_base_to_sql_type

The refactor dropped the original fallback return _get_type(field) that existed on
main when server_type is None or unrecognized. This broke SodaCL export which calls
convert_to_sql_type(prop, None) and expects the raw logical/physical type string back.

Co-authored-by: Garrett Sutula <11978161+garrettsutula@users.noreply.github.com>
…types

Compound physicalType values like TIMESTAMP(6) WITH TIME ZONE and
INTERVAL DAY(2) TO SECOND(6) contain '(' but are not simple parameterized
types — they have content after the closing ')'. The previous guard
intercepted any physicalType with '(' and destructively decomposed these
compound expressions.

Guard now requires both:
- physicalType ends with ')'
- physicalType contains exactly one '('

This passes compound Oracle types through verbatim (handled correctly by
the per-server oracle converter) while still intercepting simple cases
like VARCHAR(255) and DECIMAL(10,2) for cross-server translation.

Fixes test_test_oracle.py failing on TIMESTAMP(6) WITH TIME ZONE fields.

Co-authored-by: Garrett Sutula <11978161+garrettsutula@users.noreply.github.com>
…QL export (#1086)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator

@jschoedl jschoedl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM! I refined the listing of parametrizable types a bit. I think that NVARCHAR should actually be mapped to TEXT in the postgres case, and sql_type_converter could soon deserve a rewrite - but those are out of scope for this PR.

@jschoedl jschoedl merged commit 453c4aa into datacontract:main Mar 30, 2026
5 checks passed
davidb-tada pushed a commit to davidb-tada/datacontract-cli that referenced this pull request Mar 31, 2026
…ontract#1093)

* fix: preserve parameterized physicalType in SQL type converter

Fixes datacontract#1086

When an ODCS SchemaProperty has a physicalType with precision/length
constraints (e.g. VARCHAR(255), DECIMAL(10,2)), convert_to_sql_type()
previously fell through to per-server converters that matched only bare
base types, causing them to return None.

Fix: short-circuit in convert_to_sql_type() before dispatching to any
server-specific converter when the ODCS physicalType already contains
parentheses. Also adds _extract_base_type() helper for future use.

Tests: 63 parametrized cases covering all server types and common
parameterized type patterns.

* fix: translate parameterized physicalType through server-specific converter

Addresses review feedback: instead of passing parameterized types verbatim,
strip the parameter, run base type through server converter, then re-attach
the parameter to the translated type where supported.

- Add _extract_base_type() and _extract_params() helpers
- Add _TYPES_ACCEPTING_PARAMS per-server mapping of types that accept params
- convert_to_sql_type() now: strips params -> translates base type -> re-attaches
  params only if the translated type accepts them on the target server
- NVARCHAR(50) on Postgres -> converter returns None for unknown type -> verbatim fallback
- VARCHAR(255) on Postgres -> translates to 'text', text doesn't accept params -> 'text'
- DECIMAL(10,2) on Postgres -> translates to 'decimal', accepts params -> 'decimal(10,2)'
- 21 new targeted tests covering helpers and cross-server param translation

Fixes datacontract#1086 (was originally mis-tagged datacontract#1039)

Co-authored-by: Garrett Sutula <11978161+garrettsutula@users.noreply.github.com>

* fix: remove unused pytest import and sort imports per ruff

Co-authored-by: Garrett Sutula <11978161+garrettsutula@users.noreply.github.com>

* ci: trigger CI run after ruff fix

Co-authored-by: Garrett Sutula <11978161+garrettsutula@users.noreply.github.com>

* fix: restore fallback to _get_type for unknown/None server_type in _convert_base_to_sql_type

The refactor dropped the original fallback return _get_type(field) that existed on
main when server_type is None or unrecognized. This broke SodaCL export which calls
convert_to_sql_type(prop, None) and expects the raw logical/physical type string back.

Co-authored-by: Garrett Sutula <11978161+garrettsutula@users.noreply.github.com>

* fix: only intercept simple TYPE(params) pattern, not compound Oracle types

Compound physicalType values like TIMESTAMP(6) WITH TIME ZONE and
INTERVAL DAY(2) TO SECOND(6) contain '(' but are not simple parameterized
types — they have content after the closing ')'. The previous guard
intercepted any physicalType with '(' and destructively decomposed these
compound expressions.

Guard now requires both:
- physicalType ends with ')'
- physicalType contains exactly one '('

This passes compound Oracle types through verbatim (handled correctly by
the per-server oracle converter) while still intercepting simple cases
like VARCHAR(255) and DECIMAL(10,2) for cross-server translation.

Fixes test_test_oracle.py failing on TIMESTAMP(6) WITH TIME ZONE fields.

Co-authored-by: Garrett Sutula <11978161+garrettsutula@users.noreply.github.com>

* fix: add MySQL support and correct parameterized type whitelist for SQL export (datacontract#1086)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Garrett Sutula <11978161+garrettsutula@users.noreply.github.com>
Co-authored-by: Jakob Schödl <jakob.schoedl@entropy-data.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
davidb-tada pushed a commit to davidb-tada/datacontract-cli that referenced this pull request Mar 31, 2026
…ontract#1093)

* fix: preserve parameterized physicalType in SQL type converter

Fixes datacontract#1086

When an ODCS SchemaProperty has a physicalType with precision/length
constraints (e.g. VARCHAR(255), DECIMAL(10,2)), convert_to_sql_type()
previously fell through to per-server converters that matched only bare
base types, causing them to return None.

Fix: short-circuit in convert_to_sql_type() before dispatching to any
server-specific converter when the ODCS physicalType already contains
parentheses. Also adds _extract_base_type() helper for future use.

Tests: 63 parametrized cases covering all server types and common
parameterized type patterns.

* fix: translate parameterized physicalType through server-specific converter

Addresses review feedback: instead of passing parameterized types verbatim,
strip the parameter, run base type through server converter, then re-attach
the parameter to the translated type where supported.

- Add _extract_base_type() and _extract_params() helpers
- Add _TYPES_ACCEPTING_PARAMS per-server mapping of types that accept params
- convert_to_sql_type() now: strips params -> translates base type -> re-attaches
  params only if the translated type accepts them on the target server
- NVARCHAR(50) on Postgres -> converter returns None for unknown type -> verbatim fallback
- VARCHAR(255) on Postgres -> translates to 'text', text doesn't accept params -> 'text'
- DECIMAL(10,2) on Postgres -> translates to 'decimal', accepts params -> 'decimal(10,2)'
- 21 new targeted tests covering helpers and cross-server param translation

Fixes datacontract#1086 (was originally mis-tagged datacontract#1039)

Co-authored-by: Garrett Sutula <11978161+garrettsutula@users.noreply.github.com>

* fix: remove unused pytest import and sort imports per ruff

Co-authored-by: Garrett Sutula <11978161+garrettsutula@users.noreply.github.com>

* ci: trigger CI run after ruff fix

Co-authored-by: Garrett Sutula <11978161+garrettsutula@users.noreply.github.com>

* fix: restore fallback to _get_type for unknown/None server_type in _convert_base_to_sql_type

The refactor dropped the original fallback return _get_type(field) that existed on
main when server_type is None or unrecognized. This broke SodaCL export which calls
convert_to_sql_type(prop, None) and expects the raw logical/physical type string back.

Co-authored-by: Garrett Sutula <11978161+garrettsutula@users.noreply.github.com>

* fix: only intercept simple TYPE(params) pattern, not compound Oracle types

Compound physicalType values like TIMESTAMP(6) WITH TIME ZONE and
INTERVAL DAY(2) TO SECOND(6) contain '(' but are not simple parameterized
types — they have content after the closing ')'. The previous guard
intercepted any physicalType with '(' and destructively decomposed these
compound expressions.

Guard now requires both:
- physicalType ends with ')'
- physicalType contains exactly one '('

This passes compound Oracle types through verbatim (handled correctly by
the per-server oracle converter) while still intercepting simple cases
like VARCHAR(255) and DECIMAL(10,2) for cross-server translation.

Fixes test_test_oracle.py failing on TIMESTAMP(6) WITH TIME ZONE fields.

Co-authored-by: Garrett Sutula <11978161+garrettsutula@users.noreply.github.com>

* fix: add MySQL support and correct parameterized type whitelist for SQL export (datacontract#1086)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Garrett Sutula <11978161+garrettsutula@users.noreply.github.com>
Co-authored-by: Jakob Schödl <jakob.schoedl@entropy-data.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SQL Type Converter cannot handle precision/length contraints for physicalType

2 participants