Closed Bug 744965 Opened 12 years ago Closed 6 years ago

mozilla::NumberEqualsInt32 shouldn't rely on undefined behavior

Categories

(Core :: MFBT, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla60
Tracking Status
firefox60 --- fixed

People

(Reporter: Waldo, Assigned: Waldo)

References

(Blocks 1 open bug)

Details

(Keywords: csectype-undefined, Whiteboard: [-fsanitize=float-cast-overflow] )

Attachments

(4 files)

static MOZ_ALWAYS_INLINE int
MOZ_DOUBLE_IS_INT32(double d, int32_t* i)
{
  return !MOZ_DOUBLE_IS_NEGATIVE_ZERO(d) && d == (*i = (int32_t)d);
}

It's not legal (it's undefined behavior) to cast a floating point value to an integral type, if the floating point value -- truncated -- isn't within that integral type's range.  So the above cast isn't cool, and it should be replaced with something that's permitted by the language.

The cast happens to be "safe" if you assume it'll produce some garbage value but won't have any other effects.  It appears that compilers we care about do something like this, so it's apparently not a critical thing to fix.  But relying on this, especially in the face of PGO compilers and such, seems like a trap waiting to be sprung.  We should fix this.
Unfortunately the wine-gecko folks still compile this part of Geko using the old, default 387 instruction set for the x86 platform, instead of requiring SSE2.  The requirement is good for portability, but bad for speed and unintentional exceptions.

In this case the cast of an out-of-range number actually sets the "Invalid operation" bit in the 387 floating point status register.  Normally that exception is masked out, so it's not harmful.  But sometimes, perhaps because the application has deliberately chosen to receive its own exceptions, the exception is _not_ masked out, and then MOZ_DOUBLE_IS_INT32 should avoid generating it.

A few minutes ago I attached a proposed patch to the corresponding bugzilla entry for wine-gecko:

    http://bugs.winehq.org/show_bug.cgi?id=30465

See comments there.  It simply adds "d <= INT32_MAX && d >= INT32_MIN &&" before the "d ==" requirement, thereby avoiding the illegal cast.  However, as you say, in practice it is unnecessary slowdown on many other platforms than 387-x86, e.g. SSE2-x86 and x86_64 (in the default 64 bit environment).  But maybe the slowdown is minor for those platforms?
Attached patch Possible patchSplinter Review
With the bug 798179 work fresh in my mind, doing this by inspecting the underlying double bits seemed fairly easy, so I hacked it out quickly over the weekend.  There might be a tweak or two to do to this to cut down on the number of bit ops, conditions, etc. and code size, but I think it's almost there.

The real question is whether this is any better than what the compiler will do, if we exclude the edge cases and just let the compiler do a double-to-int32_t cast.  This is what v8 does (couldn't find similar code in Nitro, so dunno about it):

http://code.google.com/p/v8/source/browse/branches/bleeding_edge/src/type-info.h?spec=svn15187&r=15172#102

I don't know which approach produces faster code, and I don't really have time to compile some test programs and experiment right now.  Maybe in a spontaneous weekend hacking session, but almost certainly no sooner.
https://developer.mozilla.org/en-US/docs/Building_SpiderMonkey_with_UBSan

js> this[1e81]

/Users/jruderman/trees/mozilla-central/js/src/obj-ubsan5/js:0x100f6a81b: runtime error: value 1e+81 is outside the range of representable values of type 'int'

(The error is actually printed 3 times, which might indicate this function is being called more times than needed.)
Whiteboard: [-fsanitize=float-cast-overflow]
Clang doesn't seem to have a builtin for float-to-int conversion.  (It does have builtins for many other things that are undefined behavior in C/C++.)

http://clang.llvm.org/docs/LanguageExtensions.html#builtin-functions

I guess you could use inline asm to get the CPU instruction you want.  But that might not be worth it.
Summary: MOZ_DOUBLE_IS_INT32 shouldn't rely on undefined behavior → mozilla::NumberEqualsInt32 shouldn't rely on undefined behavior
I also have a bitwise implementation, that I'll post as another patch after this.  Some moderate testing of how it does on 1e7 randomly generated int32_ts (using XorShift128PlusRNG) suggests this implementation is about twice as fast as bit-level tactics, tho, so this look like what we want.  The other patch we can pocket just in case it's useful in the future for some reason.
Attachment #8951483 - Flags: review?(nfroyd)
Assignee: nobody → jwalden+bmo
Status: NEW → ASSIGNED
When I ran this test against the prior two patches, I got numbers starting with ~14 doing the FP comparison approach, and I got numbers starting ~28-30 doing the bit-level tactics.  That seems far apart enough to suggest one is clearly better than the other, even if there's a lot of noise in this looping such that the measurement probably includes a lot of other random overhead besides that of the number-check we really care about.
The implementation here should allow us to trivially implement mozilla::Number{Is,Equals}Int64 when we decide we need it.  I have not added those implementations here because 1) we don't have anyone demanding it quite yet, and 2) more pressingly, it'd require writing a bunch more verification tests, and I'd rather not spend time on something that's feature-creep with respect to this bug.

The implementation approach here should also translate without too much difficulty to Number{IsEquals}Uint* if we ever want it.
Comment on attachment 8951483 [details] [diff] [review]
Implement mozilla::NumberEqualsInt32 by is-finite && in-floating-point-range comparisons

Review of attachment 8951483 [details] [diff] [review]:
-----------------------------------------------------------------

The algorithm seems pretty straightforward; I have some questions about details.

::: mfbt/FloatingPoint.h
@@ +344,5 @@
> +  static_assert(sizeof(SignedInteger) >= sizeof(int),
> +                "this function *might* require some finessing for signed types "
> +                "subject to integral promotion before it can be used on them");
> +
> +  MOZ_MAKE_MEM_UNDEFINED(aInteger, sizeof(*aInteger));

Nice.

@@ +352,5 @@
> +  // These are the usual INT32_MIN/INT32_MAX-style values.
> +  constexpr SignedInteger MaxIntValue =
> +    SignedInteger((1ULL << (SignedIntegerWidth - 1)) - 1);
> +  constexpr SignedInteger MinValue =
> +    -MaxIntValue - 1;

Just use std::numeric_limits for these, perhaps?

@@ +374,5 @@
> +    ? MaxIntValue
> +    : SignedInteger((uint64_t(1) << (SignedIntegerWidth - 1)) -
> +                    (uint64_t(1) << (SignedIntegerWidth - 2 - ExponentShift)));
> +
> +  if (IsFinite(aValue) &&

Maybe we should just test for finiteness early on in the function and immediately return false, just to make things slightly clearer?  Early returns being good and all that.

@@ +416,3 @@
>  /**
> + * If |aValue| is identical to some |int32_t| value, set |*aInt32| to that value
> + * and return true.  Otherwise return false, leaving |*aInt32| in indeterminate

Nit: "in an indeterminate state", I think?

@@ +422,5 @@
> + * be 0, use NumberEqualsInt32 below.
> + */
> +template<typename T>
> +static MOZ_ALWAYS_INLINE bool
> +NumberIsInt32(T aValue, int32_t* aInt32)

The body of this function (via inlining on detail::NumberIsSignedInteger) is way, way larger than the few instructions we had for the previous implementation.  Does NumberIsInt32 need to be MOZ_ALWAYS_INLINE, with the attendant code bloat?  Does NumberIsInt32 even need to be `static`, since we additionally don't want every compilation unit to have its own copy of this function?

Actually, do the detail:: functions need to be MOZ_ALWAYS_INLINE and `static`, for the same reasons?

@@ +430,5 @@
> +
> +/**
> + * If |aValue| is equal to some int32_t value (where -0 and +0 are considered
> + * equal), set |*aInt32| to that value and return true.  Otherwise return false,
> + * leaving |*aInt32| in indeterminate state.

Nit: "in an indeterminate state", I think?

@@ +438,4 @@
>   */
>  template<typename T>
>  static MOZ_ALWAYS_INLINE bool
>  NumberEqualsInt32(T aValue, int32_t* aInt32)

Same comments as regards MOZ_ALWAYS_INLINE and static here as well.
Attachment #8951483 - Flags: review?(nfroyd) → review+
Pushed by jwalden@mit.edu:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1fcc972d445b
Implement mozilla::NumberEqualsInt32 in a way that doesn't depend on undefined behavior casting an out-of-range floating point number to int32_t.  r=froydnj
Well, this is unfortunate.  It looks like in at least some configurations, this happens:

  Assertion failure: pow(2.0, -1075.0) == 0.0

Testing with MSVC on my Windows box, I see it compiles that to be equal to pow(2.0, -1074.0) == 5e-324.

Now, it should be the case that 2**-1075 is halfway between +0.0 and 2**-1074 as the smallest exactly representable value.  Could rounding mode be coming into play here?

Unless I'm missing something, it doesn't seem to.  If I print out |fegetround()| in an affected binary, I get 0, which also appears to be the value of |FE_TONEAREST| -- round to nearest representable value, or the one with lowest bit zero if there are two.  The call here *should* fall into this case, and lowest bit zero means the result should be +0.0.

So, um.  Maybe we're compiling cppunit tests without all the right compiler/linker flags on Windows, and double math isn't compiling the IEEE-754 way we want it to?  Will ask around about MSVC flags for this when I head in today, I guess.
Flags: needinfo?(jwalden+bmo)
Blocks: 1440184
Pushed by jwalden@mit.edu:
https://hg.mozilla.org/integration/mozilla-inbound/rev/31ea2689701e
Implement mozilla::NumberEqualsInt32 in a way that doesn't depend on undefined behavior casting an out-of-range floating point number to int32_t.  r=froydnj
https://hg.mozilla.org/mozilla-central/rev/31ea2689701e
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: