Description
This issue tracks the implementation of ANSI SQL semantics in the cuDF JIT expression engine.
The goal of this work is to enable ANSI-compliant operator execution through cuDF JIT code generation, so that expression evaluation stays fully device-side while preserving Spark-compatible overflow, nullability, precision, and error semantics.
This effort builds on the initial prototype introduced in #22224 and continues the broader work described in #21676.
The implementation focuses on:
- Extending transforms with ANSI-aware operators
- Preserving ANSI overflow and precision semantics during code generation
- Supporting nullable execution paths and TRY_* semantics
- Enabling decimal-aware arithmetic and rescaling
- Maintaining parity with existing libcudf expression semantics where applicable
Implementation Notes
The ANSI operators are implemented as first-class JIT expression operators and lower directly into generated CUDA/PTX code paths. Special care is taken to:
- Preserve overflow detection semantics across signed arithmetic operations
- Propagate validity masks consistently for nullable execution
- Support decimal scale adjustments and precision enforcement in-device
- Avoid unnecessary host-side branching or fallback execution
- Maintain compatibility with Spark ANSI mode behavior
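The decimal bullet above can be illustrated with a small host-side sketch. It is not the cuDF implementation: the `Decimal64` struct, the function names, the Spark-style positive scale, and the round-half-up behavior on scale reduction are all assumptions made for illustration, and real device code would carry `__device__` qualifiers and operate on column data.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical representation for this sketch: value = unscaled * 10^(-scale).
struct Decimal64 {
  std::int64_t unscaled;
  int scale;  // number of fractional digits (assumed Spark-style convention)
};

// Returns 10^exp, or nullopt when exp is outside [0, 18] (10^19 overflows int64).
inline std::optional<std::int64_t> pow10_i64(int exp)
{
  if (exp < 0 || exp > 18) { return std::nullopt; }
  std::int64_t result = 1;
  for (int i = 0; i < exp; ++i) { result *= 10; }
  return result;
}

// Precision enforcement: the unscaled value must have at most `precision` digits.
inline bool fits_precision(std::int64_t unscaled, int precision)
{
  if (precision < 1) { return false; }
  if (precision > 18) { return true; }  // every int64 value is below 10^19
  std::int64_t const bound = *pow10_i64(precision);
  return unscaled > -bound && unscaled < bound;
}

// Rescale to `new_scale`. Returns nullopt on overflow or when the scale
// difference exceeds 18 digits (treated as out of scope for this sketch).
inline std::optional<Decimal64> rescale(Decimal64 d, int new_scale)
{
  if (new_scale == d.scale) { return d; }
  bool const widen = new_scale > d.scale;
  auto const factor = pow10_i64(widen ? new_scale - d.scale : d.scale - new_scale);
  if (!factor) { return std::nullopt; }
  if (widen) {
    // Multiplying by a positive factor: check both bounds before multiplying.
    if (d.unscaled > std::numeric_limits<std::int64_t>::max() / *factor ||
        d.unscaled < std::numeric_limits<std::int64_t>::min() / *factor) {
      return std::nullopt;
    }
    return Decimal64{d.unscaled * *factor, new_scale};
  }
  // Narrowing: divide and round half away from zero (assumed rounding mode).
  std::int64_t quotient        = d.unscaled / *factor;
  std::int64_t const remainder = d.unscaled % *factor;
  std::int64_t const magnitude = remainder < 0 ? -remainder : remainder;
  if (magnitude * 2 >= *factor) { quotient += (d.unscaled < 0 ? -1 : 1); }
  return Decimal64{quotient, new_scale};
}
```

In this sketch a failed rescale or precision check is reported as an empty result. An ANSI-mode caller would turn that into an execution error, while a TRY-style caller would turn it into a null output.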
TRY_* operators mirror their ANSI counterparts but convert runtime arithmetic failures, such as overflow, into null outputs rather than raising execution errors.
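The difference between the erroring and TRY behaviors can be sketched on the host as follows. This is an illustrative model only: `checked_add`, `ansi_add`, and `ansi_try_add` are hypothetical names, `std::optional` stands in for a value plus its validity bit, and a C++ exception stands in for whatever error-reporting mechanism the generated device code actually uses.

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <stdexcept>

// A nullable 32-bit integer: empty means the row is null.
using nullable_i32 = std::optional<std::int32_t>;

// Hypothetical helper: signed addition with explicit overflow detection.
// Returns an empty result when the exact sum does not fit in int32_t.
inline nullable_i32 checked_add(std::int32_t a, std::int32_t b)
{
  constexpr auto max = std::numeric_limits<std::int32_t>::max();
  constexpr auto min = std::numeric_limits<std::int32_t>::min();
  if ((b > 0 && a > max - b) || (b < 0 && a < min - b)) { return std::nullopt; }
  return a + b;
}

// Sketch of the erroring variant: null inputs propagate to a null output,
// and overflow raises an error (modeled here as an exception).
inline nullable_i32 ansi_add(nullable_i32 a, nullable_i32 b)
{
  if (!a || !b) { return std::nullopt; }
  auto const sum = checked_add(*a, *b);
  if (!sum) { throw std::overflow_error("integer overflow in ansi_add"); }
  return sum;
}

// Sketch of the TRY variant: identical null propagation, but overflow
// becomes a null output instead of an error.
inline nullable_i32 ansi_try_add(nullable_i32 a, nullable_i32 b)
{
  if (!a || !b) { return std::nullopt; }
  return checked_add(*a, *b);
}
```

With these definitions, adding 1 to the largest `int32_t` value throws in `ansi_add` and yields an empty (null) result in `ansi_try_add`, while a null input yields a null output in both.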
Work Breakdown
The original prototype in #22224 was intentionally split into smaller reviewable units:
Future Work
Potential follow-on work includes:
- Additional ANSI comparison and predicate operators
- Expanded decimal intrinsic support
- LTO-IR/PTX codegen for operator fusion
- Consolidation of duplicated AST/JIT operator semantics
Implemented Operators
Nullability / Conditional Semantics
- NULLIFY_IF
- COALESCE
- IF_ELSE
ANSI Arithmetic
- ANSI_ADD
- ANSI_SUB
- ANSI_MUL
- ANSI_DIV
- ANSI_MOD
- ANSI_ABS
- ANSI_NEG
ANSI TRY Semantics
- ANSI_TRY_ADD
- ANSI_TRY_SUB
- ANSI_TRY_MUL
- ANSI_TRY_DIV
- ANSI_TRY_MOD
- ANSI_TRY_ABS
- ANSI_TRY_NEG
Precision / Decimal Semantics
- ANSI_PRECISION_CHECK
- ANSI_TRY_PRECISION_CHECK
- RESCALE
- CAST_TO_DEC32
- CAST_TO_DEC64
- CAST_TO_DEC128
Integer / Floating-Point Casts
- CAST_TO_B8
- CAST_TO_I8
- CAST_TO_I16
- CAST_TO_U8
- CAST_TO_U16
- CAST_TO_U32
- CAST_TO_U64
- CAST_TO_F32
Bitwise Operators
- BIT_SHIFT_LEFT
- BIT_SHIFT_RIGHT