`Decimal`: use `UInt128` significand to speed up operations by xwu · Pull Request #2022 · swiftlang/swift-foundation

xwu · 2026-06-03T21:21:23Z

This PR introduces a new internal computed property (called _significand to distinguish itself from _mantissa) of type UInt128, which allows us to perform arithmetic operations bypassing VariableLengthInteger.

Although conceptually low-hanging fruit, fully threading the changes through the implementation represents an overhaul of some scale but with performance gains to match. The resulting implementations (written by hand) are fortunately eminently readable. Latent bugs are addressed along the way, substantially improving the precision of mathematical operations on Decimal.

Motivation:

#1754 demonstrated that making VariableLengthInteger non-allocating (and not really variable) dramatically improves performance. While improved, however, Decimal operations are still by no means optimized for performance. Sadly, this state of affairs encourages the erroneous impression that decimal floating-point is intrinsically far slower than alternative numeric representations.

The prior PR was a fantastic and inspiring first move. However, absent context about other advances in Swift, LLM-driven efforts overlook that performing arithmetic limb-by-limb (which is what VariableLengthInteger encapsulates) is no longer necessary for implementing basic operations, as the 128-bit mantissa can be bitwise copied into a UInt128 so that we can leverage more performant compiler primitives.
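As a rough illustration of that idea, the sketch below packs eight 16-bit limbs into a single UInt128 and back. The `Limbs` alias and both function names are hypothetical stand-ins, not the PR's code, and the least-significant-first limb order is an assumption about the storage layout.

```swift
// Requires Swift 6 or later for UInt128 (and a recent OS on Apple platforms).

// Hypothetical stand-in for the mantissa storage: eight 16-bit limbs,
// assumed here to be ordered least significant first.
typealias Limbs = (UInt16, UInt16, UInt16, UInt16, UInt16, UInt16, UInt16, UInt16)

/// Packs the limbs into one 128-bit integer.
func packSignificand(_ m: Limbs) -> UInt128 {
    let limbs = [m.0, m.1, m.2, m.3, m.4, m.5, m.6, m.7]
    var result: UInt128 = 0
    // Walk from the most significant limb down, shifting earlier limbs up.
    for limb in limbs.reversed() {
        result = (result << 16) | UInt128(limb)
    }
    return result
}

/// Splits a 128-bit integer back into eight 16-bit limbs.
func unpackSignificand(_ value: UInt128) -> Limbs {
    // Extract the i-th limb by shifting it down and truncating to 16 bits.
    func limb(_ i: Int) -> UInt16 {
        UInt16(truncatingIfNeeded: value >> (16 * i))
    }
    return (limb(0), limb(1), limb(2), limb(3), limb(4), limb(5), limb(6), limb(7))
}
```

Once the value is in a UInt128, ordinary integer operators apply directly, which is the point being made above. On a little-endian platform the shift-and-or loop should describe the same bit pattern as a plain copy, but that equivalence is an assumption of this sketch rather than something the PR text states.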

Modifications:

This PR replaces VariableLengthInteger operations with UInt128 operations, rewriting comparison, addition (and subtraction), multiplication, and division. Normalization is also rewritten to remove the last consumer of VariableLengthInteger, but it is also now only called by the NSDecimalNormalize shim.
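To give a feel for what an operation looks like once the significand is a UInt128, here is a minimal sketch of adding two non-negative values that already share an exponent. `ToyDecimal` and `addSameExponent` are hypothetical names for illustration only. The sketch simply returns nil on overflow, where a real implementation would rescale and round instead.

```swift
// Requires Swift 6 or later for UInt128.

// Hypothetical toy model: value = significand * 10^exponent, sign omitted.
struct ToyDecimal: Equatable {
    var significand: UInt128
    var exponent: Int
}

/// Adds two values that share an exponent.
/// Returns nil if the exponents differ or the 128-bit sum overflows;
/// this sketch does not attempt to refit an overflowing result.
func addSameExponent(_ a: ToyDecimal, _ b: ToyDecimal) -> ToyDecimal? {
    guard a.exponent == b.exponent else { return nil }
    // One full-width addition replaces a limb-by-limb carry loop.
    let (sum, overflow) = a.significand.addingReportingOverflow(b.significand)
    guard !overflow else { return nil }
    return ToyDecimal(significand: sum, exponent: a.exponent)
}
```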

Along the way, latent bugs are either annotated or fixed altogether--see added tests. For example:

The existing implementation truncates the 'refitted' mantissa in the case of arithmetic overflow during addition, which is not correct for the documented default .plain rounding mode (it also makes no attempt to behave correctly for other rounding modes). The revised implementation now respects rounding mode.
The existing implementation exhibits unexpected behavior when multiplying two values with small exponents that should lead to an underflow result. (Reading the code suggests there should be a runtime trap, but in the REPL there's just a very large arbitrary result.) The revised implementation now correctly throws underflow.
The existing implementation always rounds towards zero (i.e., truncates) for division. The revised implementation now respects rounding mode (crucially, the documented default rounding mode, .plain).
The existing implementation normalizes dividend and divisor by an arbitrary criterion chosen in 1999, which has been associated with bugs; code comments reference rdar://problem/5197585 and rdar://problem/2354750. The revised implementation now scales the dividend's significand appropriately to fill 128 bits.
The existing implementation produces a NaN value during normalization if the smaller of the two inputs has a finite, negative value that truncates to zero. The revised implementation now respects rounding mode and, if rounding up such a negative value, produces zero rather than spurious NaN.
In the existing implementation, legacy NSDecimal* functions other than Add never signal loss of precision, as such information was neither consistently computed nor plumbed through. The revised implementation now indicates loss of precision whenever an inexact result is returned.
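The difference between truncating and respecting the rounding mode, together with reporting loss of precision, can be sketched on a bare significand. The `ToyRounding` enum and `dividedByTen` function below are hypothetical and cover only two modes. The sketch assumes that .plain rounds to nearest with halves rounded up in magnitude.

```swift
// Requires Swift 6 or later for UInt128.

// Hypothetical subset of rounding modes, applied to an unsigned magnitude.
enum ToyRounding {
    case down   // truncate toward zero
    case plain  // round to nearest; halves round up in magnitude (assumed)
}

/// Drops one decimal digit from a significand.
/// Returns the rounded quotient and whether any nonzero digit was discarded.
func dividedByTen(_ significand: UInt128, rounding: ToyRounding) -> (quotient: UInt128, inexact: Bool) {
    let (q, r) = significand.quotientAndRemainder(dividingBy: 10)
    let inexact = r != 0
    switch rounding {
    case .down:
        return (q, inexact)
    case .plain:
        // q is at most UInt128.max / 10, so q + 1 cannot overflow.
        return (r >= 5 ? q + 1 : q, inexact)
    }
}
```

With a significand of 125, truncation yields 12 while the plain mode yields 13, and both report an inexact result. A significand of 120 yields 12 in either mode with no loss of precision.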

Result:

Using benchmarks added in #1754, this PR results in a ~~~350%~~ ~500% boost in addition performance, a ~~~750%~~ ~950% boost in multiplication performance, and a ~7000% boost in division performance as measured by throughput.

And, as described above, arithmetic operations now have improved precision and latent bugs have been fixed. VariableLengthInteger is removed entirely.

----------------------------------------------------------------------------------------------------------------------------
Decimal add metrics
----------------------------------------------------------------------------------------------------------------------------

╒══════════════════════════════════════════╤═══════════╤═══════════╤═══════════╤═══════════╤═══════════╤═══════════╤═══════════╤═══════════╕
│          Throughput (# / s) (M)          │        p0 │       p25 │       p50 │       p75 │       p90 │       p99 │      p100 │   Samples │
╞══════════════════════════════════════════╪═══════════╪═══════════╪═══════════╪═══════════╪═══════════╪═══════════╪═══════════╪═══════════╡
│                   main                   │        13 │        13 │        13 │        13 │        13 │        13 │        13 │        26 │
├──────────────────────────────────────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤
│               Current_run                │        80 │        78 │        78 │        77 │        76 │        71 │        69 │       154 │
├──────────────────────────────────────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤
│                    Δ                     │        67 │        65 │        65 │        64 │        63 │        58 │        56 │       128 │
├──────────────────────────────────────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤
│              Improvement %               │       515 │       500 │       500 │       492 │       485 │       446 │       431 │       128 │
╘══════════════════════════════════════════╧═══════════╧═══════════╧═══════════╧═══════════╧═══════════╧═══════════╧═══════════╧═══════════╛


----------------------------------------------------------------------------------------------------------------------------
Decimal divide metrics
----------------------------------------------------------------------------------------------------------------------------

╒══════════════════════════════════════════╤═══════════╤═══════════╤═══════════╤═══════════╤═══════════╤═══════════╤═══════════╤═══════════╕
│          Throughput (# / s) (M)          │        p0 │       p25 │       p50 │       p75 │       p90 │       p99 │      p100 │   Samples │
╞══════════════════════════════════════════╪═══════════╪═══════════╪═══════════╪═══════════╪═══════════╪═══════════╪═══════════╪═══════════╡
│                   main                   │         1 │         1 │         1 │         1 │         1 │         1 │         1 │         2 │
├──────────────────────────────────────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤
│               Current_run                │        74 │        73 │        73 │        72 │        72 │        71 │        71 │       145 │
├──────────────────────────────────────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤
│                    Δ                     │        73 │        72 │        72 │        71 │        71 │        70 │        70 │       143 │
├──────────────────────────────────────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤
│              Improvement %               │      7300 │      7200 │      7200 │      7100 │      7100 │      7000 │      7000 │       143 │
╘══════════════════════════════════════════╧═══════════╧═══════════╧═══════════╧═══════════╧═══════════╧═══════════╧═══════════╧═══════════╛


----------------------------------------------------------------------------------------------------------------------------
Decimal multiply metrics
----------------------------------------------------------------------------------------------------------------------------

╒══════════════════════════════════════════╤═══════════╤═══════════╤═══════════╤═══════════╤═══════════╤═══════════╤═══════════╤═══════════╕
│          Throughput (# / s) (M)          │        p0 │       p25 │       p50 │       p75 │       p90 │       p99 │      p100 │   Samples │
╞══════════════════════════════════════════╪═══════════╪═══════════╪═══════════╪═══════════╪═══════════╪═══════════╪═══════════╪═══════════╡
│                   main                   │         8 │         8 │         8 │         8 │         8 │         8 │         8 │        17 │
├──────────────────────────────────────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤
│               Current_run                │        85 │        84 │        84 │        83 │        82 │        80 │        80 │       166 │
├──────────────────────────────────────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤
│                    Δ                     │        77 │        76 │        76 │        75 │        74 │        72 │        72 │       149 │
├──────────────────────────────────────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┼───────────┤
│              Improvement %               │       962 │       950 │       950 │       938 │       925 │       900 │       900 │       149 │
╘══════════════════════════════════════════╧═══════════╧═══════════╧═══════════╧═══════════╧═══════════╧═══════════╧═══════════╧═══════════╛

Testing:

All 33 existing unit tests for Decimal pass (with modifications to account for now-corrected rounding with division and improved precision--see comments below). Additional unit tests are added for corrected behavior.
All 5 existing benchmark tests show improved performance compared to the current baseline as described above.

xwu · 2026-06-05T15:51:08Z

@swift-ci test macOS

xwu · 2026-06-05T15:53:07Z

cc @stephentyrone :)

xwu · 2026-06-10T19:59:55Z

            let result = try lhs._multiply(by: rhs, roundingMode: .plain)
            lhs = result
+        } catch _CalculationError.underflow {
+            lhs = .zero


Note that this may be formally tantamount to a policy change as compared to NSDecimal* guarantees, but is also probably the more (only?) reasonable behavior.

The prior implementation never threw .underflow, so there is no actual precedent for this specific operation. In practice, that implementation also had sufficient issues with correctness, precision, and not respecting rounding mode that I'm not sure users could rely upon it to produce zero or NaN (or sometimes a totally unspecified arbitrarily large result—see above).

It is already the behavior in existing code with respect to at least some operations to underflow to zero:

swift-foundation/Sources/FoundationEssentials/Decimal/Decimal+Conformances.swift

Lines 288 to 289 in 5da00f0

if actual == .underflow {
    self = 0
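For readers following the thread, the flush-to-zero pattern under discussion can be sketched in isolation. Everything below is a hypothetical toy: `ToyCalculationError`, `ToyValue`, and both functions are illustrative names. The exponent bounds of -128 and 127 are an assumption based on an 8-bit signed exponent. The toy throws as soon as the exponent leaves that range, without the rescaling a real implementation would attempt first.

```swift
// Requires Swift 6 or later for UInt128.

// Hypothetical error type standing in for the internal calculation error.
enum ToyCalculationError: Error { case overflow, underflow }

// Hypothetical toy model: value = significand * 10^exponent, sign omitted.
struct ToyValue: Equatable {
    var significand: UInt128
    var exponent: Int
    static let zero = ToyValue(significand: 0, exponent: 0)
}

/// Multiplies two toy values, throwing when the result does not fit.
func multiply(_ lhs: ToyValue, by rhs: ToyValue) throws -> ToyValue {
    let (product, overflow) = lhs.significand.multipliedReportingOverflow(by: rhs.significand)
    // A real implementation would rescale and round here instead of giving up.
    if overflow { throw ToyCalculationError.overflow }
    let exponent = lhs.exponent + rhs.exponent
    if exponent < -128 { throw ToyCalculationError.underflow } // assumed lower bound
    if exponent > 127 { throw ToyCalculationError.overflow }   // assumed upper bound
    return ToyValue(significand: product, exponent: exponent)
}

/// Multiplies in place, turning underflow into zero and rethrowing other errors.
func multiplyFlushingUnderflow(_ lhs: inout ToyValue, by rhs: ToyValue) throws {
    do {
        lhs = try multiply(lhs, by: rhs)
    } catch ToyCalculationError.underflow {
        lhs = .zero
    }
}
```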

xwu · 2026-06-12T17:13:08Z

@swift-ci test macOS

Decimal: use UInt128 significand to speed up comparison and addition

db6ed66

xwu changed the title ~~Decimal: use UInt128 significand to speed up comparison and addition~~ Decimal: use UInt128 significand to speed up operations Jun 3, 2026

Xiaodi Wu added 4 commits June 3, 2026 18:55

Decimal: use UInt128 significand to speed up multiplication

7ced311

Decimal: use UInt128 significand to speed up division

fd1f0ef

Decimal: optimize full-width division by constant

64cafcd

Decimal: re-implement addition without calling _normalize

4a26c4a

xwu force-pushed the decimal-performance branch from 152a2d4 to 4a26c4a Compare June 4, 2026 14:54

Xiaodi Wu added 3 commits June 4, 2026 13:37

Decimal: use UInt128 significand for _normalize, improving precision …

54ef277

…when lossy

Decimal: remove VariableLengthInteger entirely

8d9e7db

Decimal: refine division to omit calling _normalize

810c0be

xwu commented Jun 4, 2026

Xiaodi Wu added 3 commits June 4, 2026 20:20

Decimal: unify significand rounding, eliminate spurious over/underflo…

d71d00a

…w, fix off-by-one exponent in constants, plumb through loss-of-precision

Decimal: add tests for corrected latent bugs

09bd889

[fixup] Decimal: improve comments and whitespace, limit access level …

96aa7b9

…of UInt128 helper extensions

xwu marked this pull request as ready for review June 5, 2026 15:50

xwu requested a review from a team as a code owner June 5, 2026 15:50

stephentyrone reviewed Jun 5, 2026

Comment thread Sources/FoundationEssentials/Decimal/Decimal+Conformances.swift Outdated

stephentyrone reviewed Jun 5, 2026

Comment thread Sources/FoundationEssentials/Decimal/Decimal+Math.swift

stephentyrone reviewed Jun 5, 2026

Comment thread Sources/FoundationEssentials/Decimal/Decimal+Math.swift

xwu commented Jun 6, 2026

Comment thread Sources/FoundationEssentials/Decimal/Decimal+Math.swift Outdated

Decimal: add comments for readability

a4e1441

xwu force-pushed the decimal-performance branch from 42c3295 to a4e1441 Compare June 6, 2026 13:23

Decimal: add a same-exponent fast path for addition

1b06b97