[FEA][STORY] Support ANSI Operators using JIT #22598

@lamarrr

Description

This issue tracks the implementation of ANSI SQL semantics in the cuDF JIT expression engine.

The goal is to enable ANSI-compliant operator execution through cuDF JIT code generation, so that expression evaluation stays fully device-side while preserving Spark-compatible overflow, nullability, precision, and error semantics.

This effort builds on the initial prototype introduced in #22224 and continues the broader work described in #21676.

The implementation focuses on:

  • Extending transforms with ANSI-aware operators
  • Preserving ANSI overflow and precision semantics during code generation
  • Supporting nullable execution paths and TRY_* semantics
  • Enabling decimal-aware arithmetic and rescaling
  • Maintaining parity with existing libcudf expression semantics where applicable

Implemented Operators

Nullability / Conditional Semantics

  • NULLIFY_IF
  • COALESCE
  • IF_ELSE
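
As a rough illustration of what these three operators compute, the sketch below models a nullable element with `std::optional` on the host. The function names, the signatures, and the reading of NULLIFY_IF as "null when the condition is true" are assumptions made for illustration; they are not the cuDF JIT API, and a null condition is deliberately left out of scope.

```cpp
#include <initializer_list>
#include <optional>

// Host-side model only: std::optional<T> stands in for a nullable column element.
// All names and signatures here are hypothetical, not the cuDF JIT API.

// NULLIFY_IF (assumed meaning): null when `condition` is true, otherwise `value`.
template <typename T>
std::optional<T> nullify_if(bool condition, std::optional<T> value) {
  if (condition) { return std::nullopt; }
  return value;
}

// COALESCE: the first non-null argument, or null when every argument is null.
template <typename T>
std::optional<T> coalesce(std::initializer_list<std::optional<T>> values) {
  for (auto const& v : values) {
    if (v.has_value()) { return v; }
  }
  return std::nullopt;
}

// IF_ELSE: picks one of two nullable branches; the chosen branch may itself be null.
template <typename T>
std::optional<T> if_else(bool condition, std::optional<T> if_true, std::optional<T> if_false) {
  return condition ? if_true : if_false;
}
```

For example, `coalesce<int>({std::nullopt, 7, 9})` yields 7, and `if_else<int>(false, 1, std::nullopt)` yields null because the selected branch is null.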

ANSI Arithmetic

  • ANSI_ADD
  • ANSI_SUB
  • ANSI_MUL
  • ANSI_DIV
  • ANSI_MOD
  • ANSI_ABS
  • ANSI_NEG
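
The checks that ANSI-mode arithmetic implies can be sketched on the host for 32-bit integers as follows. This is a minimal model under stated assumptions: the function names are hypothetical, and exceptions stand in for whatever error-reporting mechanism the device code uses, which this issue does not describe.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical host-side model of ANSI-mode arithmetic on 32-bit integers.
// Device code cannot throw; exceptions here only mark where an error would be raised.

std::int32_t ansi_add(std::int32_t a, std::int32_t b) {
  // Widen to 64 bits so the exact sum is representable, then range-check it.
  std::int64_t const wide = static_cast<std::int64_t>(a) + b;
  if (wide > std::numeric_limits<std::int32_t>::max() ||
      wide < std::numeric_limits<std::int32_t>::min()) {
    throw std::overflow_error("integer overflow in add");
  }
  return static_cast<std::int32_t>(wide);
}

std::int32_t ansi_div(std::int32_t a, std::int32_t b) {
  if (b == 0) { throw std::domain_error("division by zero"); }
  // INT32_MIN / -1 is the only quotient that does not fit in 32 bits.
  if (a == std::numeric_limits<std::int32_t>::min() && b == -1) {
    throw std::overflow_error("integer overflow in divide");
  }
  return a / b;
}

std::int32_t ansi_neg(std::int32_t a) {
  // Negating INT32_MIN has no 32-bit representation.
  if (a == std::numeric_limits<std::int32_t>::min()) {
    throw std::overflow_error("integer overflow in negate");
  }
  return -a;
}
```

Widening before the range check keeps the sketch free of signed-overflow undefined behavior; the same idea does not extend to 64-bit inputs, which would need a different check.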

ANSI TRY Semantics

  • ANSI_TRY_ADD
  • ANSI_TRY_SUB
  • ANSI_TRY_MUL
  • ANSI_TRY_DIV
  • ANSI_TRY_MOD
  • ANSI_TRY_ABS
  • ANSI_TRY_NEG

Precision / Decimal Semantics

  • ANSI_PRECISION_CHECK
  • ANSI_TRY_PRECISION_CHECK
  • RESCALE
  • CAST_TO_DEC32
  • CAST_TO_DEC64
  • CAST_TO_DEC128
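
To make the precision and rescale operators concrete, here is a host-side sketch over a decimal stored as an unscaled 64-bit integer. It assumes the Spark-style convention that scale counts fractional digits, limits precision to 18 digits, and truncates toward zero when the scale shrinks. All names are hypothetical, and the rounding behavior of the actual RESCALE operator is not stated in this issue.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical model: a decimal is an unscaled int64 plus a scale that counts
// fractional digits (Spark-style convention, assumed here). Exponents must be 0..18.
std::int64_t power_of_ten(int exponent) {
  std::int64_t result = 1;
  for (int i = 0; i < exponent; ++i) { result *= 10; }
  return result;
}

// True when the unscaled value has at most `precision` decimal digits.
bool fits_precision(std::int64_t unscaled, int precision) {
  std::int64_t const bound = power_of_ten(precision);
  return unscaled > -bound && unscaled < bound;
}

// TRY-style precision check: null instead of an error when the value does not fit.
std::optional<std::int64_t> try_precision_check(std::int64_t unscaled, int precision) {
  if (!fits_precision(unscaled, precision)) { return std::nullopt; }
  return unscaled;
}

// Rescale between two scales. Growing the scale multiplies and reports overflow
// as null; shrinking it divides, truncating toward zero (an assumption).
std::optional<std::int64_t> rescale(std::int64_t unscaled, int from_scale, int to_scale) {
  if (to_scale >= from_scale) {
    std::int64_t const factor = power_of_ten(to_scale - from_scale);
    if (unscaled > std::numeric_limits<std::int64_t>::max() / factor ||
        unscaled < std::numeric_limits<std::int64_t>::min() / factor) {
      return std::nullopt;
    }
    return unscaled * factor;
  }
  return unscaled / power_of_ten(from_scale - to_scale);
}
```

With these definitions, rescaling 12.34 (unscaled 1234, scale 2) to scale 4 gives unscaled 123400, and the result can then be passed through the precision check against the target type's precision.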

Integer / Floating-Point Casts

  • CAST_TO_B8
  • CAST_TO_I8
  • CAST_TO_I16
  • CAST_TO_U8
  • CAST_TO_U16
  • CAST_TO_U32
  • CAST_TO_U64
  • CAST_TO_F32

Bitwise Operators

  • BIT_SHIFT_LEFT
  • BIT_SHIFT_RIGHT

Implementation Notes

The ANSI operators are implemented as first-class JIT expression operators and lower directly into generated CUDA/PTX code paths. Special care is taken to:

  • Preserve overflow detection semantics across signed arithmetic operations
  • Propagate validity masks consistently for nullable execution
  • Support decimal scale adjustments and precision enforcement in-device
  • Avoid unnecessary host-side branching or fallback execution
  • Maintain compatibility with Spark ANSI mode behavior
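
Validity propagation for a binary operator can be pictured as a word-wise AND of the input masks. The sketch below is a host-side model that assumes 32-bit mask words with one bit per row (least significant bit first) and the default rule that a result is null when either input is null. The helper names are hypothetical and do not correspond to libcudf functions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side model of validity masks: one bit per row, packed into
// 32-bit words, least significant bit first. A set bit means "valid".
using mask_word = std::uint32_t;

bool is_valid(std::vector<mask_word> const& mask, std::size_t row) {
  return ((mask[row / 32] >> (row % 32)) & 1u) != 0;
}

// Output validity for a binary operator under default null propagation:
// a row is valid only when both inputs are valid. Masks must have equal length.
std::vector<mask_word> combine_validity(std::vector<mask_word> const& lhs,
                                        std::vector<mask_word> const& rhs) {
  std::vector<mask_word> out(lhs.size());
  for (std::size_t i = 0; i < lhs.size(); ++i) {
    out[i] = lhs[i] & rhs[i];
  }
  return out;
}
```

Operators such as COALESCE or the TRY_* family do not follow this plain AND rule, since they can produce valid outputs from null inputs or null outputs from valid inputs.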

TRY_* operators convert runtime arithmetic failures, such as overflow or division by zero, into null outputs instead of raising execution errors.
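
A minimal host-side sketch of that failure-to-null behavior, using `std::optional` for the nullable result. The function names are hypothetical and only 32-bit integers are covered; the generated device code is not shown in this issue.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical model of TRY-style arithmetic: failures become null (std::nullopt)
// rather than errors.

std::optional<std::int32_t> try_add(std::int32_t a, std::int32_t b) {
  // Compute the exact sum in 64 bits, then check that it fits in 32 bits.
  std::int64_t const wide = static_cast<std::int64_t>(a) + b;
  if (wide > std::numeric_limits<std::int32_t>::max() ||
      wide < std::numeric_limits<std::int32_t>::min()) {
    return std::nullopt;
  }
  return static_cast<std::int32_t>(wide);
}

std::optional<std::int32_t> try_div(std::int32_t a, std::int32_t b) {
  // Division by zero and INT32_MIN / -1 are the two failing cases.
  if (b == 0) { return std::nullopt; }
  if (a == std::numeric_limits<std::int32_t>::min() && b == -1) { return std::nullopt; }
  return a / b;
}
```

Because a failing row only clears its own result, the remaining rows of a batch still produce values, which is the practical difference from the error-raising ANSI variants.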

Work Breakdown

The original prototype in #22224 was intentionally split into smaller reviewable units:

Future Work

Potential follow-on work includes:

  • Additional ANSI comparison and predicate operators
  • Expanded decimal intrinsic support
  • LTO-IR/PTX codegen for operator fusion
  • Consolidation of duplicated AST/JIT operator semantics
