Skip to content

Opt-in overflow tracking for decimal fixed-point arithmetic#22356

Draft
Avinash-Raj wants to merge 7 commits into
rapidsai:mainfrom
Avinash-Raj:avi/decimal-overflow
Draft

Opt-in overflow tracking for decimal fixed-point arithmetic#22356
Avinash-Raj wants to merge 7 commits into
rapidsai:mainfrom
Avinash-Raj:avi/decimal-overflow

Conversation

@Avinash-Raj
Copy link
Copy Markdown
Contributor

@Avinash-Raj Avinash-Raj commented May 2, 2026

Summary

Introduces an opt-in path for detecting overflow in decimal fixed-point arithmetic, at two layers:

  1. Value-type level – new numeric::decimal32_safe / decimal64_safe / decimal128_safe aliases backed by a new fixed_point<Rep, Rad, overflow_tracking::on> instantiation. These carry a sticky per-value bool that is set whenever a constructor, arithmetic operator, rescaled(), or float→decimal conversion would overflow the underlying integer representation. The flag propagates through +, -, *, % and rescaled(), so any expression built on the value-level operators (binaryop transforms, scans, reductions
    expressed through value ops) carries it for free.

  2. Column / binaryop level – new public API cudf::binary_operation_safe(..., rmm::device_scalar<uint32_t>& overflow_flag, ...) for the seven base-10 decimal arithmetic operators (ADD, SUB, MUL, DIV, MOD, PMOD, PYMOD). It updates a single device-side flag via atomicMax whenever any active (non-null) row overflows during arithmetic or rescale to the output scale. Designed for callers (e.g. velox-cudf) that need a one-shot “did any row overflow?” signal without per-row storage.

Motivation

Existing decimal32 / decimal64 / decimal128 silently wrap on overflow. Downstream engines that need stricter semantics today have to either implement their own checked arithmetic or rebuild libcudf. This PR keeps the default behavior unchanged but lets opt-in callers ask the same libcudf build for overflow-aware decimals.

@Avinash-Raj Avinash-Raj requested review from a team as code owners May 2, 2026 07:58
@github-actions github-actions Bot added libcudf Affects libcudf (C++/CUDA) code. CMake CMake build issue labels May 2, 2026
@Avinash-Raj Avinash-Raj changed the title Add optional sticky fixed-point overflow tracking (CUDF_TRACK_FIXED_POINT_OVERFLOW) [Draft] Add optional sticky fixed-point overflow tracking (CUDF_TRACK_FIXED_POINT_OVERFLOW) May 2, 2026
@Avinash-Raj Avinash-Raj marked this pull request as draft May 2, 2026 07:59
@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 2, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@Avinash-Raj Avinash-Raj changed the title [Draft] Add optional sticky fixed-point overflow tracking (CUDF_TRACK_FIXED_POINT_OVERFLOW) Add optional sticky fixed-point overflow tracking (CUDF_TRACK_FIXED_POINT_OVERFLOW) May 2, 2026
@bdice
Copy link
Copy Markdown
Contributor

bdice commented May 2, 2026

I would favor adding new operators/functions rather than a compile-time flag — we want to be able to ship one build of cuDF that works universally.

@devavret
Copy link
Copy Markdown
Contributor

devavret commented May 4, 2026

@bdice I'd imagine this is not limited to scalar functions like binaryops and such. We'd also need to detect overflow in aggregations like groupby and reduce. And those would require overloading the existing +,* operators.

What's your opinion on adding a new type maybe called safe_decimal.

Spark should also have a concept of arithmetic exception for overflows. How do they support this with libcudf decimal?

@mattgara
Copy link
Copy Markdown

mattgara commented May 4, 2026

What's your opinion on adding a new type maybe called safe_decimal.

Inline with this thinking, we could extend cudf's fixed_point to implement this:

enum class overflow_tracking { off, on };

template <typename Rep, Radix Rad,
          overflow_tracking Track = overflow_tracking::off>
class fixed_point{ ...

and use if constexpr (Track == overflow_tracking::on) to branch the implementation based on this configuration. Effectively, this should be compile-time semantically equivalent to having a safe_decimal, but with less code duplication.

@ttnghia
Copy link
Copy Markdown
Contributor

ttnghia commented May 4, 2026

This is related to ANSI support that I've recently filed: #21676
We can use some kind of input enum like mentioned above, and custom return type as my suggestion.

@Avinash-Raj
Copy link
Copy Markdown
Contributor Author

Ok, replaced the CMake CUDF_TRACK_FIXED_POINT_OVERFLOW flag with an opt-in overflow_tracking NTTP on numeric::fixed_point and add decimal*_safe / type_id::DECIMAL*_SAFE so velox-cudf gets per-element overflow detection without requiring a separate libcudf build.

@pmattione-nvidia
Copy link
Copy Markdown
Contributor

You might want to put overflow tracking into the floating <--> decimal conversion code as well (include/cudf/fixed_point/detail/floating_conversion.hpp). E.g. there are the guarded_left_shift() and guarded_right_shift() functions, and there are potential overflows in convert_floating_to_integral_shifting(), shift_to_decimal_pospow(), and shift_to_decimal_negpow(). All potential overflow sites are mentioned explicitly in the code comments in these functions.

@Avinash-Raj Avinash-Raj changed the title Add optional sticky fixed-point overflow tracking (CUDF_TRACK_FIXED_POINT_OVERFLOW) Opt-in overflow tracking for decimal fixed-point arithmetic May 11, 2026
@davidwendt
Copy link
Copy Markdown
Contributor

Could this be done without modifying the fixed-point class? The operator overloads are not really required for this to work since the dispatcher logic creates specialized code paths automatically. And somehow the overflow-awareness is communicated through to the appropriate operation/function and would generally require a special code path either way.

It seems the operators could be just free functions and not member functions or member operators. The return type would include an overflow flag along with the result. Similar to how the https://github.com/rapidsai/cudf/blob/main/cpp/include/cudf/fixed_point/detail/floating_conversion.hpp is a separate file handling specific operations for fixed-point.

The new free functions would be called as part of the specialized overflow-aware aggregation types for reduce, groupby, etc.
This would better isolate the specialized operation functions away from the current fixed-point implementations.

@Avinash-Raj
Copy link
Copy Markdown
Contributor Author

@davidwendt I have addressed your comment by adding free functions that return the operation result and has_overflow_occurred as a tuple, without making or reverting any changes to the fixed_point class.

Copy link
Copy Markdown
Contributor

@davidwendt davidwendt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems the gtests only execute on decimal64.

* @return true if the shift would overflow `Rep`, false otherwise
*/
template <typename Rep, Radix Rad, typename T>
CUDF_HOST_DEVICE inline constexpr bool shift_overflows(T const& val, scale_type const& scale)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this function be moved to safe_arithmetic.hpp? Or does it need to be public?
Perhaps move it or create a new header file.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we could move the shift_overflows function to safe_arithmetic.hpp since that is the only place where it is called.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! the relevant function gets moved to safe_arithmetic.hpp and also added safe arithmetic tests to deal with decimal64 overflows.

@mhaseeb123
Copy link
Copy Markdown
Member

Is this PR and #22261 related or overlapping?

return guarded_left_shift(static_cast<UnsignedRep>(shifting_rep), pow2);
}

/**
Copy link
Copy Markdown
Contributor

@pmattione-nvidia pmattione-nvidia May 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duplicating the functionality by adding new high level functions for all of this code is brutal; we shouldn't have to duplicate all of this code logic. It's too easy for one function to get updated but not the other, etc. I would instead add a template parameter bool for whether to check or not. Then that bool can be passed to e.g. checked_narrow_cast() (potentially renaming) and if it's constexpr-false we just do the straight up cast.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition I feel the overflow bool should be returned and not passed in.

Copy link
Copy Markdown
Contributor Author

@Avinash-Raj Avinash-Raj May 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like you suggested, instead of duplicating the high-level functions, I used compile-time branching inside each function based on the CheckOverflow template parameter. For example,

safe_arithmetic.hpp

template <typename Fixed,
          typename Floating,
          CUDF_ENABLE_IF(cuda::std::is_floating_point_v<Floating> &&
                         cudf::is_fixed_point<Fixed>())>
CUDF_HOST_DEVICE inline safe_result<typename Fixed::rep, Fixed::rad>
safe_convert_floating_to_fixed(Floating floating, scale_type scale)
{
  using Rep = typename Fixed::rep;
  if constexpr (Fixed::rad == Radix::BASE_10) {
    auto const [value, overflow] = convert_floating_to_integral<Rep, true>(floating, scale);
    return safe_result<Rep, Fixed::rad>{Fixed{scaled_integer<Rep>{value, scale}}, overflow};
  } else {
    Rep const value = static_cast<Rep>(shift<Rep, Fixed::rad>(floating, scale));
    return safe_result<Rep, Fixed::rad>{Fixed{scaled_integer<Rep>{value, scale}}, false};
  }
}

floating_conversion.hpp

/**
 * @brief Perform floating-point -> integer decimal conversion
 *
 * @tparam Rep The type of integer we are converting to, to store the decimal value
 * @tparam CheckOverflow Whether to return overflow detection with the converted value
 * @tparam FloatingType The type of floating-point object we are converting from
 * @param floating The floating point value to convert
 * @param scale The desired base-10 scale factor: decimal value = returned value * 10^scale
 * @return Integer representation of the floating-point value, or `{value, overflow}` when
 * `CheckOverflow` is true
 */
template <typename Rep,
          bool CheckOverflow = false,
          typename FloatingType,
          CUDF_ENABLE_IF(cuda::std::is_floating_point_v<FloatingType>)>
CUDF_HOST_DEVICE inline auto convert_floating_to_integral(FloatingType const& floating,
                                                          scale_type const& scale)
{
  // Extract components of the floating point number
  using converter        = floating_converter<FloatingType>;
  auto const integer_rep = converter::bit_cast_to_integer(floating);
  if (converter::is_zero(integer_rep)) { return maybe_with_overflow<CheckOverflow>(Rep{0}, false); }

  // Note that the significand here is an unsigned integer with sizeof(FloatingType)
  auto const is_negative                  = converter::get_is_negative(integer_rep);
  auto const [significand, floating_pow2] = converter::get_significand_and_pow2(integer_rep);

  // Add half a bit if truncating to yield expected value, see function for discussion.
  auto const pow10 = static_cast<int>(scale);
  auto const [base2_value, pow2] =
    add_half_if_truncates(floating, significand, floating_pow2, pow10);

  // Apply the powers of 2 and 10 to convert to decimal.
  auto const [magnitude_u, overflow] =
    convert_floating_to_integral_shifting<Rep, FloatingType, CheckOverflow>(
      base2_value, pow10, pow2);

  if constexpr (!CheckOverflow) {
    // Reapply the sign and return
    // NOTE: Cast can overflow!
    auto const signed_magnitude = static_cast<Rep>(magnitude_u);
    return is_negative ? -signed_magnitude : signed_magnitude;
  } else {
    // Reapply sign with saturation on representational overflow.
    using UnsignedRep = cuda::std::make_unsigned_t<Rep>;
    auto const umax   = static_cast<UnsignedRep>(cuda::std::numeric_limits<Rep>::max());

    if (!is_negative) {
      if (magnitude_u > umax) {
        return cuda::std::pair<Rep, bool>{cuda::std::numeric_limits<Rep>::max(), true};
      }
      return cuda::std::pair<Rep, bool>{static_cast<Rep>(magnitude_u), overflow};
    }

    // Negative range has one extra representable value for two's complement.
    // magnitude == max+1 maps to min.
    auto const umin_mag = umax + UnsignedRep{1};
    if (magnitude_u > umin_mag) {
      return cuda::std::pair<Rep, bool>{cuda::std::numeric_limits<Rep>::min(), true};
    }
    if (magnitude_u == umin_mag) {
      return cuda::std::pair<Rep, bool>{cuda::std::numeric_limits<Rep>::min(), overflow};
    }

    return cuda::std::pair<Rep, bool>{static_cast<Rep>(-static_cast<Rep>(magnitude_u)), overflow};
  }
}

I hope this is the right way, correct me if I'm wrong.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is we should not have both shift_to_decimal_pospow_checked() and shift_to_decimal_pospow(). It should just be one function with template bool on whether to check or not. Same for all of the others. Duplicating all of this code will just cause maintenance nightmares.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed and it got fixed on floating_conversion.hpp.

Copy link
Copy Markdown
Contributor Author

@Avinash-Raj Avinash-Raj May 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidwendt “In addition, I feel the overflow bool should be returned and not passed in.” If my understanding is correct, you don’t want to pass a sticky overflow flag, rmm::device_scalar<std::uint32_t>& overflow_flag, to the high-level safe functions (refer below). Instead, the corresponding overflow value (bool) should be returned as the second output parameter. Is that correct?

template <typename LhsType, typename RhsType>
std::unique_ptr<column> binary_operation_safe(LhsType const& lhs,
                                              RhsType const& rhs,
                                              binary_operator op,
                                              data_type output_type,
                                              rmm::device_scalar<std::uint32_t>& overflow_flag,
                                              rmm::cuda_stream_view stream,
                                              rmm::device_async_resource_ref mr)
{

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Should this not return a bool per row?
I would've expected the result to be 2 columns. One for the fixed-point value and one of bool values.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CMake CMake build issue libcudf Affects libcudf (C++/CUDA) code.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants