Add checked math to FixedDecimals; default to overflow behavior #85

NHDaly · 2023-12-07T00:12:32Z

Description

Following the behavior of operations on Integers, FD operations will now overflow by default, and can be used in checked_* checked-math operations to get OverflowErrors on overflow.

Fixes #12.

Decisions

- FD arithmetic operations wrap by default on overflow.
  - Except for the division operations div, /, fld, fld1, cld, rem, and mod, which all throw on overflow.
- checked_* operations support FD, Int combinations, just like how checked_add can do Int8 + Int64.
- We introduced checked_rdiv for /.
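
For readers less familiar with the Base integer behavior these decisions mirror, here is a small sketch that uses only Base integers. It makes no FixedPointDecimals calls, so nothing in it should be read as the package's API.

```julia
# Plain integer arithmetic wraps silently on overflow.
wrapped = Int8(100) + Int8(100)          # 200 does not fit in Int8

# The checked_* variants throw OverflowError instead of wrapping.
threw = try
    Base.checked_add(Int8(100), Int8(100))
    false
catch err
    err isa OverflowError
end

# checked_* also accepts mixed integer widths by promoting first.
mixed = Base.checked_add(Int8(100), Int64(100))

println((wrapped, threw, mixed))
```

The PR applies the same split to FD values: the ordinary operators wrap, and the checked_* functions throw.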

Questions (now closed)

What should the behavior be for truncating division?
- e.g. Should we allow FD{Int8,2}(1) ÷ FD{Int8,2}(0.5) to overflow?
  - (100 ÷ 50 = 2. 2 * 100 = 200. 200 doesn't fit in Int8.)
  - In my current implementation, this overflows to FixedDecimal{Int8,2}(-0.56). Are we okay with that?
- Base.div() already throws for Integers, for divide-by-zero and typemin(Int) ÷ -1, so we could consider letting it throw in these cases too. But it seems weird for all the other operations to overflow by default but for division not to. So I'm inclined to let it overflow/underflow as well.
Should the checked_* operations support FD, non-FD pairs? Like checked_mul(FD{Int8,2}(1), 2)?
- We support that for regular * so i'm inclined to say yes.
We don't have a mechanism for checked / division... This is because the checked_* functions in Base only apply to integers, and there isn't a / for integers! :( So what should we do here?
- We could introduce our own, like FixedPointDecimals.checked_decimal_division?
  - I'm going to try that for now... but i'm very open to other suggestions.
Anything else I'm missing?

NHDaly · 2023-12-07T00:13:00Z

CC: @mcmcgrath13, @Drvi

NHDaly · 2023-12-09T01:16:45Z

Okay, I think this is ready for review, since I need a touch of help deciding on the last open question! :) Thanks

NHDaly · 2023-12-09T01:24:29Z

test/FixedDecimal.jl

+ @testset "division" begin
+     # TODO(PR): Is this the expected value?
+     @test typemax(T) / T(0.5) == FD2(-0.2)
+     @test typemin(T) / T(0.5) == FD2(0)
+ end
+
+ @testset "truncating division" begin
+     # TODO(PR): Is this the expected value?
+     @test typemax(T) ÷ T(0.5) == T(-0.6)
+     @test typemin(T) ÷ T(0.5) == T(0.6)
+     @test typemax(T) ÷ eps(T) == T(-1)
+     @test typemin(T) ÷ eps(T) == T(0)
+ end
+
+ @testset "fld / cld" begin
+     # TODO(PR): Is this the expected value?
+     @test fld(typemax(T), T(0.5)) == T(-0.6)
+     @test fld(typemin(T), T(0.5)) == T(-0.4)
+     @test fld(typemax(T), eps(T)) == T(-1)
+     @test fld(typemin(T), eps(T)) == T(0)
+
+     # TODO(PR): Is this the expected value?
+     @test cld(typemax(T), T(0.5)) == T(0.4)
+     @test cld(typemin(T), T(0.5)) == T(0.6)
+     @test cld(typemax(T), eps(T)) == T(-1)
+     @test cld(typemin(T), eps(T)) == T(0)
+ end


(Sorry, reposting the question since I edited the tests to be FD{Int8,1} instead of FD{Int,2}):

@Drvi / @mcmcgrath13 / @omus: This is the last open question in this PR I think: What should the value of overflowing division and truncating division be?

I think that they should be the same, differing only in their rounding modes, but currently they are not.
Ideally x ÷ y would be the same as round(x / y, RoundToZero), which I think it currently is without overflow, but it certainly is not after overflow.
In particular, I think that trunc-divide should always return a whole number, even if the operation overflowed?

Gosh, or maybe we should just leave all the division operators always throwing and never wrapping?? It's complicated!

What I have done so far in this PR is: trunc-divide the inner integers (so div(typemax(Int8), 5), in this case), then multiply that by C (10), which overflows, and then leave that overflow alone:

julia> typemax(Int8) ÷ Int8(5)
25

julia> (typemax(Int8) ÷ Int8(5)) * Int8(10)
-6

But now I actually think that the right thing to do is to perform nontruncating division, and then truncate the result??
So typemax(FD{Int8,1}) ÷ FD{Int8,1}(0.5) would be 0 (since -0.2 rounds to 0) and fld would be -1?

What do you all think?

Thanks!

Hmm, this is tricky. I find it hard to even define criteria by which I'd evaluate the different approaches, because the result of the overflowing operation is probably not useful no matter how hard one tries to define its semantics.

But if I think about overflows in multiplication / addition, here is roughly what I expect:
a) Behind the scenes, a "correct number" is produced
b) If the "correct number" is too big for the storage type, it wraps around

I'm not sure how economical it is to produce the "correct number" only for it to wrap around. Maybe there are ways to speed things up at the cost of some UB (and maybe there is a place for unsafe_div, not sure), but I think it makes sense and provides a reasonable mental model for diagnosing weird results. So for example:

julia> div(FD{Int8,2}(0.5), FD{Int8,2}(0.33))
# (50 / 33) * 100 = 151.515... (overflows max of 127) -> round to 100 (no longer overflows) -> convert to FD (% Int8, no change)
FixedDecimal{Int8,2}(1.00)

julia> div(FD{Int8,2}(0.5), FD{Int8,2}(0.2))
# (50 / 20) * 100 = 250 (overflows max of 127) -> round to 200 (still overflows) -> convert to FD (% Int8, we get -56)
FixedDecimal{Int8,2}(-0.56)

in your example:

typemax(FD{Int8,1}) ÷ FD{Int8,1}(0.5)
# (127 / 5) * 10 = 254 (overflows max of 127) -> round to 250 (still overflows) -> convert to FD (% Int8, we get -6)
FixedDecimal{Int8,1}(-0.6)
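
The "produce the correct number, truncate, then wrap" model can be sketched on the raw stored integers. The helper name `trunc_then_wrap` and its signature are hypothetical, chosen for illustration only; it is not the package's implementation and it uses nothing beyond Base.

```julia
# Hypothetical helper, not part of FixedPointDecimals. It operates on the raw
# stored integers of an FD{Int8,f}, where C == 10^f.
function trunc_then_wrap(x::Int8, y::Int8, C::Int)
    whole = Int(x) ÷ Int(y)   # quotient with the fractional part dropped
    scaled = whole * C        # re-scale to FD storage; may exceed Int8 range
    return scaled % Int8      # wrap into the storage type
end

a = trunc_then_wrap(Int8(50), Int8(33), 100)   # raw 100, read as 1.00
b = trunc_then_wrap(Int8(50), Int8(20), 100)   # raw -56, read as -0.56
c = trunc_then_wrap(Int8(127), Int8(5), 10)    # raw -6, read as -0.6
println((a, b, c))
```

Because the wrap happens last, the result can carry a fractional part even though the operation is a truncating division.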

Yeah, it's so tricky! :/ I agree, i'm not even sure if overflow makes sense for truncating division.

I like your formalization, thanks. I think it's quite similar to what the code is currently doing, which is why we're getting -0.6. 👍

But the more that I think about it, i'm wondering about swapping the order of the last two operations?:

You wrote:

(127 / 5) * 10 = 254 (overflows max of 127) -> round to 250 (still overflows) -> convert to FD (% Int8, we get -6)

(50 / 20) * 100 = 250 (overflows max of 127) -> round to 200 (still overflows) -> convert to FD (% Int8, we get -56)

(50 / 33) * 100 = 151.515... (overflows max of 127) -> round to 100 (no longer overflows) -> convert to FD (% Int8, no change)

Whereas I think I'd like to do this:

(127 / 5) * 10 = 254 (overflows max of 127) -> convert to FD (% Int8, gives -2) -> round `-0.2` to `-0`.

(50 / 20) * 100 = 250 (overflows max of 127) -> convert to FD (% Int8, gives -6) -> round `-0.06` to `-0`.

(50 / 33) * 100 = 151.515... (overflows max of 127) -> convert to FD (% Int8, gives -105) -> round `-1.05` to `-1`.
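
The alternative ordering (wrap first, then truncate) can be sketched the same way on raw stored integers. `wrap_then_trunc` is a hypothetical name used only for this illustration. For simplicity the scaled quotient is truncated at the last stored digit, which is an assumption of the sketch and not necessarily how the package rounds /.

```julia
# Hypothetical helper, not part of FixedPointDecimals. Raw stored integers of
# an FD{Int8,f}, with C == 10^f (C must itself fit in Int8).
function wrap_then_trunc(x::Int8, y::Int8, C::Int)
    scaled = (Int(x) * C) ÷ Int(y)   # scaled quotient, computed without overflow
    wrapped = scaled % Int8          # wrap into the storage type first
    c8 = Int8(C)
    return (wrapped ÷ c8) * c8       # then drop the fractional digits
end

a = wrap_then_trunc(Int8(127), Int8(5), 10)    # 254 wraps to -2, truncates to 0
b = wrap_then_trunc(Int8(50), Int8(20), 100)   # 250 wraps to -6, truncates to 0
c = wrap_then_trunc(Int8(50), Int8(33), 100)   # 151 wraps to -105, truncates to -100
println((a, b, c))
```

Every result is a multiple of C, so the returned value always represents a whole number, which is the invariant argued for here.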

This way, we preserve the invariant that div is supposed to drop the fractional part of the division. From the docs:

help?> div
search: div divrem DivideError splitdrive code_native @code_native

  div(x, y)
  ÷(x, y)

  The quotient from Euclidean (integer) division. Generally equivalent to a mathematical operation x/y without a fractional part.

  See also: cld, fld, rem, divrem.

So I think any div() implementation that can return a fractional result is wrong. It seems like it should always be safe to do Int(div(x, y))?

But I do wonder about just throwing in this case instead 😅

I also like that I think my approach gives the same answer as / before the truncation, which is what I think I'd expect. It's just the truncating version of /, and they both overflow in the same way.

I'm not very well informed in this area, but I think I'd expect

So I think any div() implementation that can return a fractional result is wrong. It seems like it should always be safe to do Int(div(x, y))?

to be true. Especially since the docs say that div will drop the fractional part, right?

Hmm.
One issue here is that we are combining

- wrap-around behavior, which is an "integral" phenomenon
- truncation to an integer, which is a "fractional-number" phenomenon

The former is a limitation of the storage type and is an implementation detail for a FD. There is no precedent for floating point numbers to manifest wrap-around behavior (they just increase scale and eventually call it ~~a day~~ an Inf), while for fixed point... who knows? For floating point arithmetic IIUC, the mental model of "produce the correct number behind the scenes and then round to nearest representable" does capture their semantics.

Which brings me to another point which can help us decide: what is truncating division? Is it a single, atomic operation? Or a combination of two separate operations (the / and the trunc)? For a) it would make sense to wrap at the end, for b) it would make sense to do what you suggest. One argument for a) is that the user can always implement b) by composing the two operations manually (but maybe this would be confusing to users? Freedom for one user is possible confusion for the other...).

I guess the nearest sibling FixedDecimals have in this regard are Rationals, which also use integers to implement fractional numbers. They apparently recognize this is not a well-defined situation and throw:

julia> Rational{Int8}(127, 1) // Rational{Int8}(1, 2)
ERROR: OverflowError: 127 * 2 overflowed for type Int8

Note that an alternative to fixed point decimals are floating point decimals (e.g. https://github.com/JuliaMath/DecFP.jl). IIUC, they could be a viable substitute for fixed point decimals, and they have some advantages too:

- They are well-defined by an IEEE standard
- They have one fewer type param
- Being floating point, they're able to adapt to what your scale is (and if you are not surpassing the number of significant digits within that scale, they are precise, I think)

But...:

- Their precision is lower than for fixed decimals (7, 16, and 34 digits for 32, 64 and 128 bits respectively)

OKAY, following the behavior of Rational, I have changed this PR to always throw OverflowError on overflow during division operations.

With this, the tests are passing, and i think this PR is finally ready to go! Thanks for the great discussion.
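
As a rough illustration of "throw instead of wrap" for truncating division, here is a sketch on raw stored integers. `checked_trunc_div_raw` is a hypothetical name and this is not the code merged in the PR; it assumes that widening the storage type is enough to hold the intermediate result, which holds for the small types used in this thread.

```julia
# Hypothetical helper, not the package's implementation. x and y are the raw
# stored integers of an FD{T,f}, and C == 10^f.
function checked_trunc_div_raw(x::T, y::T, C::Integer) where {T<:Integer}
    W = widen(T)                       # e.g. Int16 for Int8
    scaled = div(W(x), W(y)) * W(C)    # whole-number quotient, re-scaled
    if scaled < typemin(T) || scaled > typemax(T)
        throw(OverflowError("$x ÷ $y overflowed for type $T"))
    end
    return scaled % T                  # safe: the value is known to fit
end

ok = checked_trunc_div_raw(Int8(50), Int8(33), 100)   # raw 100, read as 1.00

threw = try
    checked_trunc_div_raw(Int8(127), Int8(5), 10)     # 250 does not fit in Int8
    false
catch err
    err isa OverflowError
end
println((ok, threw))
```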

I'll leave this thread open here for anyone who reads the PR in the future.

Drvi

I think this looks pretty good, but I still need to spend more time on it.

One meta comment about "inexact" errors: I think they can also happen as a result of multiplication or division, and it might be desirable for the user to make those throw as well. E.g. FD{Int8,2}(1.1) * FD{Int8,2}(1.01) = FD{Int8,2}(1.11) while the correct result is 1.111. It might be that the user doesn't care for the results to be more precise than 2 decimal places, but there is no straightforward way to check... not sure how big of an issue this is for financial applications.

In the Python decimal module one can set up a "trap" for inexact results, e.g.:

>>> import decimal
>>> r = decimal.Decimal("1.1") * decimal.Decimal("1.11111111111111111111111111111111111111111111111111111")
>>> r
Decimal('1.222222222222222222222222222')


>>> decimal.getcontext().traps[decimal.Inexact] = True
>>> r = decimal.Decimal("1.1") * decimal.Decimal("1.11111111111111111111111111111111111111111111111111111")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
decimal.Inexact: [<class 'decimal.Inexact'>]
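
A comparable check can be sketched in Julia on the raw stored integers: multiply in a wider type, then look at the remainder that rescaling would discard. The helper `mul_with_exact_flag` is hypothetical and exists only for this illustration; it truncates where the package would round, and it does not exist in FixedPointDecimals.

```julia
# Hypothetical helper, not part of FixedPointDecimals. x and y are raw stored
# integers of an FD{Int8,f}, and C == 10^f.
function mul_with_exact_flag(x::Int8, y::Int8, C::Int)
    q, r = divrem(Int(x) * Int(y), C)   # r holds the digits that get dropped
    return (q, r == 0)
end

inexact = mul_with_exact_flag(Int8(110), Int8(101), 100)  # 1.10 * 1.01 = 1.111
exact   = mul_with_exact_flag(Int8(50), Int8(20), 100)    # 0.50 * 0.20 = 0.10
println((inexact, exact))
```

A caller that wants trap-like behavior could throw when the flag is false, while the default path simply ignores it.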

src/FixedPointDecimals.jl

Drvi · 2023-12-11T13:05:55Z

Drvi · 2023-12-11T13:15:08Z

src/FixedPointDecimals.jl

+Base.:*(x::Integer, y::FD{T, f}) where {T, f} = reinterpret(FD{T, f}, *(promote(x, y.i)...))
+Base.:*(x::FD{T, f}, y::Integer) where {T, f} = reinterpret(FD{T, f}, *(promote(x.i, y)...))


I think when the Integer is a BigInt, and T is not, the promote would allocate another bigint which might not be needed because there are usually specialized methods for BigInt x Integer that avoid the allocation.

So maybe i should just leave it without the promote() and let * do the promotion internally if needed? I'll try that
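
To make the two options concrete, here is a toy sketch with a stand-in wrapper type. `ToyFD`, `mul_promote`, and `mul_direct` are hypothetical names for illustration and are unrelated to the package's internals. Both variants return the same value; the difference is only that the promoting version first converts the Int operand to a BigInt.

```julia
# Toy stand-in for a fixed-decimal wrapper; not the package's FD type.
struct ToyFD{T<:Integer}
    i::T
end

# Variant 1: promote both operands to a common type before multiplying.
mul_promote(x::Integer, y::ToyFD{T}) where {T} = ToyFD{T}(*(promote(x, y.i)...))

# Variant 2: let * pick its own method, e.g. a specialized Int * BigInt method.
mul_direct(x::Integer, y::ToyFD{T}) where {T} = ToyFD{T}(x * y.i)

y = ToyFD(big(7))
p = mul_promote(3, y)
d = mul_direct(3, y)
println((p.i, d.i))
```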

Drvi · 2023-12-11T13:16:05Z

src/FixedPointDecimals.jl

+Base.checked_rem(x::FD, y::FD) = Base.checked_rem(promote(x, y)...)
+Base.checked_mod(x::FD, y::FD) = Base.checked_mod(promote(x, y)...)
+
+Base.checked_add(x::FD, y) = Base.checked_add(promote(x, y)...)


It would also be good to audit here whether promote is a good idea when one of the inputs is a BigInt.

Currently, I think that this package just relies on promotion to do arithmetic on BigInts, which I agree is causing unnecessary allocs:

julia> @which FD{BigInt,2}(2) + 2
+(x::Number, y::Number)
     @ Base promotion.jl:410

julia> @code_typed FD{BigInt,2}(2) + 2
CodeInfo(
1 ─ %1 = invoke Base.GMP.MPZ.set_si(10::Int64)::BigInt
│   %2 = invoke Base.GMP.bigint_pow(%1::BigInt, 2::Int64)::BigInt
│   %3 = invoke Base.GMP.MPZ.mul_si(%2::BigInt, y::Int64)::BigInt
│   %4 = Base.getfield(x, :i)::BigInt
│   %5 = invoke Base.GMP.MPZ.add(%4::BigInt, %3::BigInt)::BigInt
│   %6 = %new(FixedDecimal{BigInt, 2}, %5)::FixedDecimal{BigInt, 2}
└──      return %6
) => FixedDecimal{BigInt, 2}

julia> @code_typed optimize=false FD{BigInt,2}(2) + 2
CodeInfo(
1 ─ %1 = Base.:+::Core.Const(+)
│   %2 = Base.promote(x, y)::Tuple{FixedDecimal{BigInt, 2}, FixedDecimal{BigInt, 2}}
│   %3 = Core._apply_iterate(Base.iterate, %1, %2)::FixedDecimal{BigInt, 2}
└──      return %3
) => FixedDecimal{BigInt, 2}

I'm just going to file this as a future improvement and move on, since I feel bad about how long this PR has lagged for.

Filed: #87.

NHDaly · 2023-12-13T05:22:28Z

Interesting notes on inexact! I think this is the same thing that @davidwzhao was referring to on slack as well... is that right david?

i don't know if our current API will support opting-in and opting-out on those errors? Does checked_mul throw for the InexactErrors in the current implementation in this PR? should it?

I do kind of think truncating to 1.11 for FD{2} is the expected behavior for this package 🤔

davidwzhao · 2023-12-13T06:58:32Z

@NHDaly i think this is the same, yes! I think it's a bit more complicated than the truncating case though, e.g., the following also throws an InexactError even though the decimal precision is correct:

julia> FixedDecimal{Int8, 2}(1.0) / FixedDecimal{Int8, 2}(0.5)
ERROR: InexactError: trunc(Int8, 200)
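
The error above has the same shape as a plain Base integer conversion failing, which is separate from wrapping arithmetic. A minimal Base-only sketch of that distinction (no FixedPointDecimals calls involved):

```julia
# Wrapping reinterpretation of 200 as an Int8: no error, the value wraps.
wrapped = 200 % Int8

# Checked conversion of the same value: throws InexactError because 200
# is outside the Int8 range.
threw = try
    Int8(200)
    false
catch err
    err isa InexactError
end
println((wrapped, threw))
```

So an InexactError here signals a value that does not fit the storage type, not a loss of decimal digits.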

Drvi · 2023-12-13T10:10:53Z

Re InexactErrors: I don't want to derail this PR further, I knew this was going to be a rabbit hole:) But it would be nice, if there was a global switch that could change all the _round_to_nearest calls to an equivalent that uses the RoundThrows for checked operations. We already support this rounding mode in parsing -- i.e. anytime there is anything to round (the remainder) but a series of zeros, we'd throw with this mode. Then the user can have "really" checked operations as an opt-in option.

NHDaly

Okay this is ready for a final review! Please take another look!!

davidwzhao

Looks good to me, thank you!! I'm not super familiar with this kind of code though, so it may be a good idea to get a second review

NHDaly · 2023-12-19T01:16:47Z

Unfortunately, @Drvi has just gone out for holiday leave..
Lemme see if I can find another reviewer. Otherwise, based on his previous review, your LGTM, and our test suite, I think I would feel good to merge. Thanks!

NHDaly · 2023-12-19T16:49:16Z

Had a chat internally and we feel good about merging. Thanks for the reviews, @davidwzhao and @Drvi! :) Merging now. thanks!

Let all the FixedDecimals operations overflow, matching Int overflow

c81bb83

NHDaly mentioned this pull request Dec 7, 2023

Handling Overflow #12

Closed

Add checked_* methods for FixedDecimals

9da03cd

NHDaly force-pushed the nhd-checked-math branch from eda2525 to 9da03cd Compare December 7, 2023 00:14

NHDaly commented Dec 7, 2023

NHDaly added 2 commits December 7, 2023 10:34

Fix dispatch for checked_mul and checked_div. Fix checked_div

f19c147

implementation

Add checked / operation for FDs: checked_decimal_division

ea46f72

NHDaly force-pushed the nhd-checked-math branch from 831b4ff to ea46f72 Compare December 7, 2023 17:34

NHDaly added 5 commits December 7, 2023 10:45

Add README sections around overflow, conversions and inexact errors

c465e6f

Add the missing checked_ operations: fld,cld,rem,mod,abs,neg. Add tests

fc1d927

Fix up tests

e134ee4

Add corner cases tests

2efb468

Fix OverflowError for older versions of julia

7793ffc

NHDaly commented Dec 9, 2023

Add fld/cld tests, but they still seem wrong, and abs/neg

30aef4f

NHDaly force-pushed the nhd-checked-math branch from 667c162 to 30aef4f Compare December 9, 2023 01:10

NHDaly commented Dec 9, 2023

NHDaly marked this pull request as ready for review December 9, 2023 01:16

Change overflow tests to FD{Int8,1} to make the results easier to think about

07bf40f

NHDaly commented Dec 9, 2023

Add promotions tests

137c8b4

NHDaly force-pushed the nhd-checked-math branch from 804710b to 137c8b4 Compare December 9, 2023 08:35

Drvi reviewed Dec 11, 2023

NHDaly added 2 commits December 18, 2023 16:37

Rename checked_rdiv; reexport checked*

e1dd56d

Enable nightly CI

d119dce

NHDaly and others added 4 commits December 18, 2023 16:55

Apply suggestions from code review

41a69fd

Co-authored-by: Tomáš Drvoštěp <tomas.drvostep@gmail.com>

Improve perf: don't force-promote inside *()

3270ad5

This allows BigInt * Int to avoid allocating another BigInt.

Make division operations always throw on overflow!

4cb2d5d

Also update the README

Add overflow tests for fld1 and div(::RoundMode)

0cf9092

NHDaly requested review from davidwzhao and Drvi December 19, 2023 00:24

Cleanup

8eefc80

NHDaly commented Dec 19, 2023

NHDaly added 2 commits December 18, 2023 17:34

This is a breaking release!

34a9306

compat to README

73410da

davidwzhao approved these changes Dec 19, 2023

NHDaly merged commit e769be4 into master Dec 19, 2023
14 checks passed

NHDaly deleted the nhd-checked-math branch December 19, 2023 16:49
