TM: v8-crypto calls js_DoubleToInt32 too often because Math.floor returns a double

RESOLVED FIXED

Status

()

--
enhancement
RESOLVED FIXED
8 years ago
8 years ago

People

(Reporter: billm, Assigned: billm)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(1 attachment, 2 obsolete attachments)

Code generated by the method JIT for v8-crypto almost never converts doubles to ints. But code generated by the tracer is doing billions of conversions. We should make the tracer as fast as the method JIT.
Created attachment 478872 [details] [diff] [review]
fix

This patch fixes two problems.

(1) v8-crypto makes a lot of calls to Math.floor. The traceable native for Math.floor is annotated to return a double, since it can return NaN or infinity. Therefore, all the code in the trace following the floor call is specialized for doubles, even though the double is actually an integer.

(2) v8-crypto occasionally does integer multiplications that result in zero sometimes. To avoid problems with -0, the tracer guards against muliplies resulting in zero and switches to a double path.

This patch fixes these two problems. It generates special code for floor, ceil, and round. The results of floor, ceil, and round are treated as integers, and we have a guard to go off trace if the result doesn't fit in an integer. (A special bit pattern is used to pass back the result in such cases.)

It also adds a more specific test for multiplications. Rather than just guard on the result being zero, it also checks if either operand was negative, which is also a necessary condition for getting -0 as a result.

On my machine, just using -j, I get the following times for v8-crypto:
  without patch: 324ms
  with patch: 201ms
Using -m -j, I get these times:
  without patch: 213ms
  with patch: 200ms

So, it's not much help with -m -j, but it's still an improvement. And it helps with the trace tuning.
Assignee: general → wmccloskey
Status: NEW → ASSIGNED
Attachment #478872 - Flags: review?(nnethercote)
Bill, how was JM avoiding this?  Because it was calling the jsval version of the math functions and those used integer jsvals as their return values in the non-failure cases?
Also, does this help with the tracer issue described in bug 597814 too?
(In reply to comment #2)
> Bill, how was JM avoiding this?  Because it was calling the jsval version of
> the math functions and those used integer jsvals as their return values in the
> non-failure cases?

Yes, the fast native version basically does |result->setNumber(floor(...))|. And setNumber calls JSDOUBLE_IS_INT32 on its argument to test if it can use an integer tag.

(In reply to comment #3)
> Also, does this help with the tracer issue described in bug 597814 too?

That's a separate issue, I think.
Comment on attachment 478872 [details] [diff] [review]
fix

I'd like this bug split into two, one for each of the issues.  This
will make it clearer how important each of the two issues are.  The title
currently reflects issue (1), so you might as well move issue (2) into the
separate bug.  

I'm giving you an r+ for the parts of the patch that relate to issue (1),
assuming you fix the things mentioned below.  There are also some comments
about the parts of the patch related to issue (2) that you might as well
address before posting a patch to the separate bug.  I'm happy to review the
patch for part (2).


>         if (needsNegZeroCheck) {
>             // make sure we don't lose a -0
>             JS_ASSERT(v == LIR_muli);
>             if (!exit)
>                 exit = snapshot(OVERFLOW_EXIT);
>-            guard(false, lir->insEqI_0(result), exit);
>+
>+            LIns* test_zero = lir->insEqI_0(result);
>+            if ((d0->isImmI() && d0->immI() < 0)
>+                || (d1->isImmI() && d1->immI() < 0))
>+            {
>+                guard(false, test_zero, exit);
>+            } else {
>+                if (test_zero->isImmI()) {
>+                    if (test_zero->immI())
>+                        guardNonNeg(d0, d1, exit);
>+                } else {
>+                    LIns* branch = lir->insBranch(LIR_jf, test_zero, NULL);
>+                    guardNonNeg(d0, d1, exit);
>+                    branch->setTarget(lir->ins0(LIR_label));
>+                }
>+            }
>         }
>         break;

(Issue (2))  I found this whole block hard to follow.  The control-flow is complex.  Some
comments would help.


>+static const int INT_FAIL_MAGIC = 0x0badf00d;
>+
>+static jsint FASTCALL
>+ceilReturningInt(jsdouble x)
>+{
>+    jsdouble r = js_math_ceil_impl(x);
>+    int32_t i;
>+    if (JSDOUBLE_IS_INT32(r, &i)) return i;
>+    else return INT_FAIL_MAGIC;
>+}

AFAICT it's possible that a program might call ceil/floor/round on the value
0xbadf00d in which case you'll get a false positive and leave the trace
unnecessarily, right?  Please add a comment about this.


>+JS_DEFINE_CALLINFO_1(static, INT32, ceilReturningInt, DOUBLE, 0, 0)
>+JS_DEFINE_CALLINFO_1(static, INT32, floorReturningInt, DOUBLE, 0, 0)
>+JS_DEFINE_CALLINFO_1(static, INT32, roundReturningInt, DOUBLE, 0, 0)

I guess you haven't worked out what those last two args mean, right?  The
second last one is a boolean that should be 1 if the function is pure (no
side-effects, etc), as these are.  The last one is the AccSet describing the
memory access regions stored to by the function;  for these functions it
should be ACCSET_NONE.  See the big "Aliasing" comment in nanojit/LIR.h for
more info about access regions;  they're important, you should understand
them if you are venturing into jstracer.cpp.
Attachment #478872 - Flags: review?(nnethercote) → review+
(In reply to comment #5)
> 
> >+JS_DEFINE_CALLINFO_1(static, INT32, ceilReturningInt, DOUBLE, 0, 0)
> >+JS_DEFINE_CALLINFO_1(static, INT32, floorReturningInt, DOUBLE, 0, 0)
> >+JS_DEFINE_CALLINFO_1(static, INT32, roundReturningInt, DOUBLE, 0, 0)
> 
> I guess you haven't worked out what those last two args mean, right?

Actually, I'm interested to know how/why you put "0, 0)" there, because the thought of people getting these values wrong is something that worries me, because it can lead to very subtle bugs, and I'd like to come up with ways to make it harder to get wrong.
Lots of old code/patches has 0, 0 there from back when they used to both be booleans.

Can we make AccSet a struct so 0 doesn't cast to it?
(In reply to comment #7)
> Lots of old code/patches has 0, 0 there from back when they used to both be
> booleans.

In the current tip code the only 0, 0 case I can find is functionProbe, for which I opened bug 600153.

We could make AccSet a struct, though it's a bit of a pain.  I'm surprised Bill used 0, 0 when every single example CallInfo except functionProbe has an ACCSET_<something> as its last question.  Hence my question about why he put 0, 0 there.
(In reply to comment #1)
> It also adds a more specific test for multiplications. Rather than just guard
> on the result being zero, it also checks if either operand was negative, which
> is also a necessary condition for getting -0 as a result.
> 

I don't know if it's worth it, but for negative zero you need 1) a zero 2) a negative number. If you know that lhs/rhs is positive (> 0), the result can't be -0. If one operand is zero, you can check the other for a negative number, etc. That's also what JM does (bug 584770). It avoids a negative zero check for common cases like x * 2. But again, I don't know if it's worth it.
Hm sorry, I just realized that jstracer already does that, kind of. Must have been added recently or I didn't notice it last time I glanced over it :)
Created attachment 479119 [details] [diff] [review]
new and improved!

Here's a new patch that just fixes the Math.floor issue.

I was about to add a comment for the MAGIC thing, and then I decided to stop being lazy and benchmark. It turns out that using an extra pointer argument to return whether the result fits in an integer is no slower than the MAGIC business. I'm asking for review again in case I got the stack allocation wrong somehow.
Attachment #478872 - Attachment is obsolete: true
Attachment #479119 - Flags: review?(nnethercote)
(In reply to comment #6)
> Actually, I'm interested to know how/why you put "0, 0)" there, because the
> thought of people getting these values wrong is something that worries me,
> because it can lead to very subtle bugs, and I'd like to come up with ways to
> make it harder to get wrong.

I just copied those lines from functionProbe, which must have been the first one I saw. I guess I just assumed that the zeros were unimportant.
Summary: TM: v8-crypto calls js_DoubleToInt32 too often → TM: v8-crypto calls js_DoubleToInt32 too often because Math.floor returns a double
(In reply to comment #12)
> 
> I just copied those lines from functionProbe, which must have been the first
> one I saw. I guess I just assumed that the zeros were unimportant.

Oh man, so you unluckily copies the one example out of dozens that was wrong :/
Comment on attachment 479119 [details] [diff] [review]
new and improved!

Looks good!  The insAlloc() call looks fine.

>+static jsint FASTCALL
>+ceilReturningInt(jsdouble x, int32_t *fits)
>+{
>+    jsdouble r = js_math_ceil_impl(x);
>+    int32_t i;
>+    if (JSDOUBLE_IS_INT32(r, &i)) {
>+        *fits = 1;
>+        return i;
>+    } else {
>+        *fits = 0;
>+        return 0;
>+    }
>+}
>+
>+JS_DEFINE_CALLINFO_2(static, INT32, ceilReturningInt, DOUBLE, INT32PTR, 1, ACCSET_NONE)

I was under the impression that 'int32' is preferred to 'int32_t' in
SpiderMonkey, but grep tells me the latter is almost twice as common.  So
it's probably ok.

The general idiom in SpiderMonkey is functions like ceilReturningInt() to
indicate success/failure via a 'bool' return value, and then the output
value is passed by reference, ie:

  static bool FASTCALL
  ceilReturningInt(jsdouble x, int32_t *out)

And the return type in the CALLINFO will be "BOOL".

r=me with that changed.  And can you re-measure to see what the improvement
is with just this issue fixed?  Thanks.
Attachment #479119 - Flags: review?(nnethercote) → review+
(In reply to comment #14)
> I was under the impression that 'int32' is preferred to 'int32_t' in
> SpiderMonkey, but grep tells me the latter is almost twice as common.  So
> it's probably ok.
> 
> The general idiom in SpiderMonkey is functions like ceilReturningInt() to
> indicate success/failure via a 'bool' return value, and then the output
> value is passed by reference, ie:
> 
>   static bool FASTCALL
>   ceilReturningInt(jsdouble x, int32_t *out)
> 
> And the return type in the CALLINFO will be "BOOL".
> 
> r=me with that changed.  And can you re-measure to see what the improvement
> is with just this issue fixed?  Thanks.

OK, thanks, I'll make these changes.

Unfortunately, it's not possible to separate the effect of the two changes. Either one of the problems causes a bunch of variables to be typed as doubles, which then leads to a bunch of js_DoubleToInt32 calls later on when these variables are used in bitops. Even if I fix one problem, the vars will still be polluted by the other one and the issue remains. So both patches are needed to get any measurable improvement.
(In reply to comment #15)
> 
> Unfortunately, it's not possible to separate the effect of the two changes.
> Either one of the problems causes a bunch of variables to be typed as doubles,
> which then leads to a bunch of js_DoubleToInt32 calls later on when these
> variables are used in bitops. Even if I fix one problem, the vars will still be
> polluted by the other one and the issue remains. So both patches are needed to
> get any measurable improvement.

I see.  Should be fine, it's ok to fix a problem in two stages.
Out of curiosity, would future integration with JS type inference (bug 557407) avoid the first issue automatically? It looks like that patch hasn't been worked on in a while, so I don't know if there are any solid plans to add it.
> would future integration with JS type inference (bug 557407) avoid the first
> issue automatically?

If it treats "finite double" as a distinct type from "arbitrary double", then possibly yes (because then we could tell that floor won't return NaN or Infinity given the input).
Comment on attachment 479119 [details] [diff] [review]
new and improved!

>+ceilReturningInt(jsdouble x, int32_t *fits)
>+{
>+    jsdouble r = js_math_ceil_impl(x);
>+    int32_t i;
>+    if (JSDOUBLE_IS_INT32(r, &i)) {
>+        *fits = 1;
>+        return i;
>+    } else {

else-after-return non-sequitur.

>+    if (JSDOUBLE_IS_INT32(r, &i)) {
>+        *fits = 1;
>+        return i;
>+    } else {

Ditto.

>+    if (JSDOUBLE_IS_INT32(r, &i)) {
>+        *fits = 1;
>+        return i;
>+    } else {

Again.

>@@ -11225,20 +11288,33 @@ TraceRecorder::callNative(uintN argc, JS
>     JSFunction* fun = GET_FUNCTION_PRIVATE(cx, funobj);
>     Native native = fun->u.n.native;
> 
>     switch (argc) {
>       case 1:
>         if (vp[2].isNumber() && mode == JSOP_CALL) {
>             if (native == js_math_ceil || native == js_math_floor || native == js_math_round) {
>                 LIns* a = get(&vp[2]);
>+                int32_t fits;
>                 if (isPromote(a)) {
>                     set(&vp[0], a);
>                     pendingSpecializedNative = IGNORE_NATIVE_CALL_COMPLETE_CALLBACK;
>                     return RECORD_CONTINUE;
>+                } else if (native == js_math_floor) {

Again -- the if (native == ...) if-else-if chain should just start on the next line.

/be
>+ceilReturningInt(jsdouble x, int32_t *fits)

Use int32 not int32_t -- the _t noise-suffix is creeping in but still way outnumbered.

/be
Paging bhackett re: comment 17 and comment 18.

/be
The type inference treats the result of Math.floor and similar functions as integers.  If a call to Math.floor is passed infinity/NaN or a finite value which doesn't fit in an int32, the fast native informs the inference and types are updated for the result of that call and everything downstream.  This is the same way which integer overflow is handled.  It's not really possible to statically determine a call to Math.floor can never overflow.
Created attachment 484109 [details] [diff] [review]
slight update
Attachment #479119 - Attachment is obsolete: true
Attachment #484109 - Flags: review?(nnethercote)
Attachment #484109 - Flags: review?(nnethercote) → review+

Comment 25

8 years ago
http://hg.mozilla.org/mozilla-central/rev/817760a6153a
Status: ASSIGNED → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.