465308 - TM: Interpreter and JIT disagree on (0 | ((0x60000009) * 0x60000009))

For moh: We have 2 numbers that we know are within the int32 range. We know that the result of the multiplication will be clipped into the int32 range. We are looking for a way to do the multiplication using integer math, instead of floating point math. We incorrectly were optimizing like this, which we had to remove: assuming int32(a) && int32(b) then int32(double(a) * double(b)) => int32(a) * int32(b) [This produces incorrect results, including for the test case in the title.] I was wondering whether there is a way to multiply a and b without FPU involvement but still maintaining the same overflow semantics than converting to double and then doing an FPU multiplication.

Moh Haghighat

Comment 6

•

17 years ago

Thanks Andreas. Will get back to you soon.

Moh Haghighat

Comment 7

•

17 years ago

With SSE2 instructions, we can get the desired result without doing FPU multiplication. For example, the assignment statement in the code segment: int x, y, z; void f () { z = ((double) x) * ((double) y); } can be implemented as: cvtsi2sd xmm1, DWORD PTR [_x] cvtsi2sd xmm0, DWORD PTR [_y] mulsd xmm1, xmm0 cvttsd2si eax, xmm1 mov DWORD PTR [_z], eax This is, in fact, what ICC 11.0 generates for the above code.

Boris Zbarsky [:bzbarsky]

Comment 8

•

17 years ago

Either I'm misreading the documentation for cvttsd2si, or that doesn't quite do what we want here, does it? cvttsd2si documentation seems to say: If a converted result is larger than the maximum singed doubleword integer, the floating-point invalid exception is raised, and if this exception is masked the indefinite integer value (80000000H) is returned. That's correct for the C example cited in comment 5 and comment 7, but the relevant example here really seems to be: int x, y; /* assign them some values */ long long z; int w; z = ((double)x) * ((double)y); w = (int)z; At least that's what's consistent with the JS interpreter output.

Moh Haghighat

Comment 9

•

17 years ago

According to the ECMA specs: http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf 11.5.1 Applying the * Operator ... In the remaining cases, where neither an infinity or NaN is involved, the product is computed and rounded to the nearest representable value using IEEE 754 round-to-nearest mode. If the magnitude is too large to represent, the result is then an infinity of appropriate sign. This seems to suggest that the expected result would be as if the computations were carried in doubles, right? This, of course, is different from the results in 64 bit (e.g., long long). If carrying the results in 64-bit integer is compliant with the standards, then an integer sequence for finding the result is also possible, but I still think the FPU implementation is the right one, and so is the sequence with cvttsd2si.

Andreas Gal :gal

Assignee

Comment 10

•

17 years ago

David and I were pondering about the following optimization: When at recording time no overflow is observed, emit imul followed by a LIR_ov guard. If we every observe an overflow, we side-exit, re-compile and make sure we don't do this optimization again for that pc location.

Boris Zbarsky [:bzbarsky]

Comment 11

•

17 years ago

Yeah, the expected result of * is basically a double multiply, but the cast to integer performed before doing '|', as defined in that same spec (section 9.5), says to truncate towards zero to get an integral-valued double (might be out of int range, but that's OK), then reduce it mod 2^32 (to a positive number, always), then effectively cast the resulting unsigned 32-bit into a signed 32-bit int. Comment 10 sounds like it could work. Another option might be to guard on the numbers being below 2^16 and doing the same thing. That involves two guards instead of one; not sure how the expense compares to LIR_ov.

Brendan Eich [:brendan]

Comment 12

•

17 years ago

(In reply to comment #11) > Comment 10 sounds like it could work. Agreed -- let's try it! > Another option might be to guard on the > numbers being below 2^16 and doing the same thing. That involves two guards > instead of one; not sure how the expense compares to LIR_ov. An or feeding a shift right by 16 feeding a single guard would be better, but let's not :-P. /be

Moh Haghighat

Comment 13

•

17 years ago

Sorry for the confusion. I had not got the problem right. Are we looking for an effcient implementation of "JavaScript ToInt32(x*y)" where x and y are JavaScript numbers, respresented in doubles with the additional knowledge that they are both representable as int32? Also, what's the status of the latest ToInt32 implementaion? Do we inline and trace into js_DoubleToECMAInt32 and js_ValueToInt32 or just call them from the JIT'ed code? thanks

Andreas Gal :gal

Assignee

Comment 14

•

17 years ago

Yes, basically we want to optimize ToInt32(ToInt32(x) * ToInt32(y)). We call into DoubleToECMAInt32. We don't want to split traces over it since the types remain the same. A builtin (inline or a call) is more effective in this case. The call is a bit expensive for us atm. We could reduce that cost through either inlining the code (now we have branches), or use better call semantics (don't spill all registers, use xmm to pass values).

Mike Beltzner [:beltzner, not reading bugmail]

Comment 15

•

17 years ago

Fixed in the merge pushed by vlad on Nov 18 14:11:14 2008 -0800: http://hg.mozilla.org/mozilla-central/rev/e8ed5d4bf531

Status: REOPENED → RESOLVED

Closed: 17 years ago → 17 years ago

Resolution: --- → FIXED

Brian Crowder

Comment 16

•

17 years ago

Is this one really ready to be closed?

Andreas Gal :gal

Assignee

Comment 17

•

17 years ago

Yeah, the bug is fixed. If we come up with a way to optimize it we will file a new bug.

Bob Clary [:bc] (inactive)

Comment 18

•

17 years ago

Checking in js1_8/regress/regress-465308.js; /cvsroot/mozilla/js/tests/js1_8/regress/regress-465308.js,v <-- regress-465308.js initial revision: 1.1 done

Flags: in-testsuite+

Flags: in-litmus-

Shawn Wilsher :sdwilsh

Comment 19

•

17 years ago

http://hg.mozilla.org/mozilla-central/rev/d96e397541b0

Mike Beltzner [:beltzner, not reading bugmail]

Updated

•

17 years ago

Keywords: fixed1.9.1

Bob Clary [:bc] (inactive)

Comment 20

•

17 years ago

v 1.9.1, 1.9.2

Status: RESOLVED → VERIFIED

Keywords: fixed1.9.1 → verified1.9.1