626398 - A decision to abort in the tracer gets lost

Reporter

Description

•

13 years ago

User-Agent:       Mozilla/5.0 (X11; Linux i686; rv:2.0b10pre) Gecko/20110114 Firefox/4.0b10pre
Build Identifier: 

In bug #625494 I stumbled on a bug. I got a "unexpected constantly false guard detected" on a guard. Now first I thought it was problem with my patch, but after examination the error is reproducible without my patch in a lookupswitch:

Testcase:
function g(n) {
  switch (n) {
    case 0:
    case "1": 
	print("case "+n+":  "+tracemonkey.onTrace)
	break;
    default:
	print("default: "+tracemonkey.onTrace)
	break;
  }
}

for (let i = 0; i != 30; i++) {
  g(i%4/2);
}

Output when run with -j:
case 0:  false
default: false
default: false
default: false
case 0:  false
default: false
default: false
default: false
Assertion failure: unexpected constantly false guard detected, at jstracer.cpp:4343
Aborted

The problem arise on the following guard:

guard(true, w.name(w.eqd(v_ins, w.immd(d)), "guard(switch on numeric)"), BRANCH_EXIT);

So only when the trace get's recorded and the input argument is 0 (non constant) and the type of the argument in de tracerecord is double.

I think there is a wrong check in the guard that makes the "eqd" a constant, but it really isn't.


Reproducible: Always

Hannes Verschore [:h4writer]

Reporter

Comment 1

•

13 years ago

Attached patch Patch (obsolete) — Details — Splinter Review

Ok I have found the problem. The problem is in the tracerecorder of \, so in the alu on LIR_divd. 

There is this guard:

/* Don't lose a -0. */
guard(false, w.eqi0(result), exit);

Now this also guards when result is a normal integer 0. This is wrong. The right check is:

/* Don't lose a -0. */
guard(false, w.ltiN(result, 0), exit);

Attachment #504707 - Flags: review?

Hannes Verschore [:h4writer]

Reporter

Comment 2

•

13 years ago

Attached patch Patch (obsolete) — Details — Splinter Review

Sorry was a bit to fast. The previous patch also guarded all negative integers. But it should only guard negative 0. So this is the updated patch.

Attachment #504707 - Attachment is obsolete: true

Attachment #504711 - Flags: review?

Attachment #504707 - Flags: review?

Boris Zbarsky [:bzbarsky]

Updated

•

13 years ago

Attachment #504711 - Flags: review? → review?(nnethercote)

Hannes Verschore [:h4writer]

Reporter

Comment 3

•

13 years ago

Attached patch Patch using MaybeBranch (obsolete) — Details — Splinter Review

Well in "modd" there is also a check on negative zero. There a MaybeBranch is used. So I copied that. Not sure what is better? So I provide here the MaybeBranch method.

Attachment #504773 - Flags: review?

Boris Zbarsky [:bzbarsky]

Comment 4

•

13 years ago

haytjes, you want to request review from someone specific.  I'd recomment ":njn".

Status: UNCONFIRMED → NEW

Ever confirmed: true

Jason Orendorff [:jorendorff]

Comment 5

•

13 years ago

It seems like we are emitting several guards here (in this LIR_divd code) that could be constantly false. The easiest thing to do would be: before emitting such a guard, check that the current values would pass the guard. If not, give up recording.

Nicholas Nethercote [inactive]

Assignee

Comment 6

•

13 years ago

(In reply to comment #5)
> It seems like we are emitting several guards here (in this LIR_divd code) that
> could be constantly false. The easiest thing to do would be: before emitting
> such a guard, check that the current values would pass the guard. If not, give
> up recording.

I suspect that's not enough.  Nanojit has become smarter about detecting tests that always succeed/fail by looking at earlier guards.  Eg. if you have:

  xt c

then you know that afterwards 'c' must be false.

Hannes Verschore [:h4writer]

Reporter

Comment 7

•

13 years ago

@bz: k, I'll do that next time, ty for info.

@njn: I went through the code again of LIR_divd and seems like the code don't emit several guards that could be constantly false. (Like reported in comment 5) For the guard on line 8366 there is a check on the current values on line 8327 (jstracer.cpp). So the only possible constantly false check is the "0" guard. That should be a -0 check. So my patch is still valid.
Also your comment about not being enough. It will be enough. The problem arose because the guard made sure v_ins couldn't/shouldn't be (+/-)0. But the recorder was recording when v_ins was 0. In the switchop a guard is set for the case it is currently recording. So in this case 0. The guard made sure v_ins should be 0. There the contradiction came ("v_ins not 0" and "v_ins must be 0").

Hannes Verschore [:h4writer]

Reporter

Updated

•

13 years ago

Attachment #504711 - Attachment is obsolete: true

Attachment #504711 - Flags: review?(nnethercote)

Hannes Verschore [:h4writer]

Reporter

Updated

•

13 years ago

Attachment #504773 - Flags: review? → review?(nnethercote)

Jason Orendorff [:jorendorff]

Comment 8

•

13 years ago

(In reply to comment #6)
> >                      [...] The easiest thing to do would be: before emitting
> > such a guard, check that the current values would pass the guard. If not, give
> > up recording.
> 
> I suspect that's not enough.  Nanojit has become smarter about detecting tests
> that always succeed/fail by looking at earlier guards.

But a condition that is true at record time can't be statically false, no matter how smart Nanojit is.

(What the TraceRecorder records is the interpreter's actions and decision points. So the static assumptions baked into the trace must be true of the record time state at every point, or we have a tracer bug.

This rule is baked into the tracing jit very deeply. For example, suppose we trace `x = ""`. After that we'll emit code that assumes x is a string, but we don't actually track this static assumption anywhere in the TraceRecorder. Rather each time we need to emit code involving x, we look at the record-time value of x to see what type it is. The static assumptions are "stored" in the interpreter state.)

Bob Clary [:bc] (inactive)

Comment 9

•

13 years ago

I just saw this assertion with http://www.instalki.pl/programy/download/Windows/odczyt_edycja_pdf/Adobe_Reader.html on 64bit Linux but I haven't been able to reproduce it locally. I assume the appearance of this assert in crash testing is due to yesterday's merge.

Summary: Tracing a double 0 gives "unexpected constantly false guard detected" → Assertion failure: unexpected constantly false guard detected Tracing a double 0

Hannes Verschore [:h4writer]

Reporter

Comment 10

•

13 years ago

(In reply to comment #9)
> I just saw this assertion with
> http://www.instalki.pl/programy/download/Windows/odczyt_edycja_pdf/Adobe_Reader.html
> on 64bit Linux but I haven't been able to reproduce it locally. I assume the
> appearance of this assert in crash testing is due to yesterday's merge.

I can't reproduce it on that site. And it is very well possible that the problem is somewhere else then in this bug. If you find a (reproducible) testcase, it may help.

Andreas Gal :gal

Comment 11

•

13 years ago

We should take this after nick reviewed it.

Bob Clary [:bc] (inactive)

Comment 12

•

13 years ago

I tried to reproduce but could not on the original url. I could reproduce it regularly when spidering the site from the original url but it kept happening at different urls. I assume it is ad related, but I haven't been able to get a reproducible chunk of page to reduce yet. Sorry.

Nicholas Nethercote [inactive]

Assignee

Comment 13

•

13 years ago

bc, this assertion can come up in a range of circumstances, it's probably a different case that you're seeing to the one haytjes is seeing.

Nicholas Nethercote [inactive]

Assignee

Comment 14

•

13 years ago

I can't reproduce the problem in comment 0 -- my shell complains about there being no such object called 'tracemonkey'.

As for this guard:

> /* Don't lose a -0. */
> guard(false, w.eqi0(result), exit);

I can see that it's more pessimistic than necessary -- it'll side-exit if the result is 0 or -0.  But I don't think that's a correctness problem.  Eg. this program runs fine:

function f(i) {
    var x = 0 / -i;
    if (x === 0 && (1/x) === -Infinity)
        print("negzero");
    else
        print("other");
}

for (var i = -20; i < 10; i++) {
    f(i);
}

Jeff Walden [:Waldo]

Comment 15

•

13 years ago

(In reply to comment #14)
> my shell complains about there being no such object called 'tracemonkey'.

Run with -j to get that object.  No -j, no tracemonkey.

Nicholas Nethercote [inactive]

Assignee

Comment 16

•

13 years ago

(In reply to comment #15)
> 
> Run with -j to get that object.  No -j, no tracemonkey.

Ah, yes.  Thanks.  But now I can't reproduce the assertion failure, on 32-bit or 64-bit.

Hannes Verschore [:h4writer]

Reporter

Comment 17

•

13 years ago

(In reply to comment #16)
> Ah, yes.  Thanks.  But now I can't reproduce the assertion failure, on 32-bit
> or 64-bit.

Yeah, the testcase doesn't work anymore. This is because of the fix in bug #631788. In this fix the tableswitch doesn't assert anymore when the guard value==0 is done twice. (because of abortIfAlwaysExits).

So the symptom of this bug is gone. Not the real issue. I tried to find another example but didn't succeed yet...

(In reply to comment #14)

> I can't reproduce the problem in comment 0 -- my shell complains about there
> being no such object called 'tracemonkey'.
> 
> As for this guard:
> 
> > /* Don't lose a -0. */
> > guard(false, w.eqi0(result), exit);
> 
> I can see that it's more pessimistic than necessary -- it'll side-exit if the
> result is 0 or -0.  But I don't think that's a correctness problem.  Eg. this
> program runs fine:
> 
> function f(i) {
>     var x = 0 / -i;
>     if (x === 0 && (1/x) === -Infinity)
>         print("negzero");
>     else
>         print("other");
> }
> 
> for (var i = -20; i < 10; i++) {
>     f(i);
> }

The bug isn't a correctness bug. It will always give the right result. The issue is the 'speed' (minor):

Currently:
1) a trace get's recorded that contains: i/3 or i/4 or ...
2) a guard will get added that guards if result is '0' or '-0'
3) the trace get's run a few times

Result:
If a 0 is encountered the code will guard and go to the interpreter till it has run enough times and record a new trace.

Expected:
2) guard only when result is '-0'

Expected result:
If a 0 is encountered the code just run further in the trace. No new trace should get recorded.

Nicholas Nethercote [inactive]

Assignee

Comment 18

•

13 years ago

Comment on attachment 504773 [details] [diff] [review]
Patch using MaybeBranch

Seems fine, esp. since it mirrors the code for LIR_modi.  I'll do some profiling with Sunspider and V8 before landing.  Thanks, and sorry I took so long to process this bug.

Attachment #504773 - Flags: review?(nnethercote) → review+

Nicholas Nethercote [inactive]

Assignee

Updated

•

13 years ago

Summary: Assertion failure: unexpected constantly false guard detected Tracing a double 0 → Trace code for integer div side-exits on 0 or -0 result, but doesn't need to on 0

Whiteboard: see comment 14 and 17

Hannes Verschore [:h4writer]

Reporter

Comment 19

•

13 years ago

Let me adjust my statement. the symptom of this bug is not gone, the symptom is just something else. So the testcase is still valid. I give a better testcase here:

function g(n) {
  switch (n) {
    case 0:
    print("case   : "+n);
    break;
    default:
    print("default: "+n)
    break;
  }
}

for (let i = 0; i != 30; i+=2) {
  g(i%4/2);
}


Ran with -j:
case   : 0
default: 1
case   : 0
default: 1
case   : 0
default: 1
case   : 0
default: 1
case   : 0
case   : 1
case   : 0
case   : 1
case   : 0
case   : 1
case   : 0


Expected result:
case   : 0
default: 1
case   : 0
default: 1
case   : 0
default: 1
case   : 0
default: 1
case   : 0
default: 1
case   : 0
default: 1
case   : 0
default: 1
case   : 0


Explanation:

The trace starts after 'x' iterations. Variable i is 0 at that moment.
1) LIR_divd gets recorded. So a guard on i==0 is added. This happens when the actual value of i equals 0 !!!
2) switchop gets recorded. So a guard on i==0 is added. Because of (1) this check is discarded. 

The trace then runs with i equals 1
1) The first guard does nothing. i equals 1
2) Here there should have been a guard, but there isn't one. So the code for i is 0 is ran instead of the code when i equals 1.

@njn: I noticed you updated the code of alu to tryToDemote. Is it because of that this patch isn't committed yet? I'll reapply the patch to the newest trunk.

Hannes Verschore [:h4writer]

Reporter

Comment 20

•

13 years ago

Attached patch Updated patch using MaybeBranch (obsolete) — Details — Splinter Review

Attachment #504773 - Attachment is obsolete: true

Attachment #520583 - Flags: review?(nnethercote)

Nicholas Nethercote [inactive]

Assignee

Comment 21

•

13 years ago

Attached patch patch — Details — Splinter Review

haytjes, your explanation isn't quite correct but it led me to the correct explanation.

This one's my fault, I screwed up in bug 631788 -- I forgot to check the result of a call to guard().  So what was happening was that the tracer decided to abort -- because in the DIV we exited if the div result was zero, and in the SWITCH we exited if the div result was non-zero, so we clearly always exit -- but the decision to abort was lost due to the missing check, so we kept compiling.

Assignee: general → nnethercote

Attachment #520583 - Attachment is obsolete: true

Status: NEW → ASSIGNED

Attachment #520591 - Flags: review?

Attachment #520583 - Flags: review?(nnethercote)

Nicholas Nethercote [inactive]

Assignee

Updated

•

13 years ago

Blocks: 631788

Summary: Trace code for integer div side-exits on 0 or -0 result, but doesn't need to on 0 → A decision to abort in the tracer gets lost

Whiteboard: see comment 14 and 17 → see comment 21

Nicholas Nethercote [inactive]

Assignee

Updated

•

13 years ago

Attachment #520591 - Flags: review? → review?(dmandelin)

Hannes Verschore [:h4writer]

Reporter

Comment 22

•

13 years ago

Owh, my fault. In that case I switch back to my first statement. The symptom is more or less gone. But the patch is actually still valid for this bug.

When guarding in a trace, the same check has to be done on the current value it is tracing on. 

So in this case there is a check for -0:

        if (jsint(r) != r || JSDOUBLE_IS_NEGZERO(r))
            goto undemotable;

And the corresponding guard should be:

        /* Don't lose a -0. */
        if (MaybeBranch mbr = w.jf(w.eqi0(result))) {
            guard(false, w.ltiN(d0, 0), exit);
            w.label(mbr);
        }

Currently the guard is

        guard(false, w.eqi0(result), exit);

But that not only takes -0, but also 0 in account. And there is no check if the value is '0'. So potentially we create a trace on '0', but that would guard when running. Therefor the guard should get specialized to only take -0 in account.

Nicholas Nethercote [inactive]

Assignee

Comment 23

•

13 years ago

To clarify:  are you just reiterating what comment 14 said (the guard is more pessimistic than necessary, but is correct in its current form), or something else?  If it's the former, I can merge my correctness patch with your patch in order to generate slightly better code.

Hannes Verschore [:h4writer]

Reporter

Comment 24

•

13 years ago

(In reply to comment #23)
> To clarify:  are you just reiterating what comment 14 said (the guard is more
> pessimistic than necessary, but is correct in its current form), or something
> else?  If it's the former, I can merge my correctness patch with your patch in
> order to generate slightly better code.

Yeah, I was just reiterating comment 14.

David Mandelin [:dmandelin]

Updated

•

13 years ago

Attachment #520591 - Flags: review?(dmandelin) → review+

Nicholas Nethercote [inactive]

Assignee

Updated

•

13 years ago

Attachment #504773 - Flags: review+ → review-

Nicholas Nethercote [inactive]

Assignee

Comment 25

•

13 years ago

haytjes, this code isn't valid for div:

        /* Don't lose a -0. */
        guard(false, w.eqi0(result), exit);
        if (MaybeBranch mbr = w.jf(w.eqi0(result))) {
            guard(false, w.ltiN(d0, 0), exit);
            w.label(mbr);
        }

It will do the wrong thing for 0 / -1, the result of which is -0 and so requires a side-exit.

The reason that code is valid for mod is that the sign of a mod operation in JavaScript depends solely on the LHS.  This isn't true for div.

Hannes Verschore [:h4writer]

Reporter

Comment 26

•

13 years ago

Sry, my code is indeed wrong. I should have checked the RHS instead of the LHS. For certainty I took the ECMA262 with me to find when a -0 can occur.

1) LHS is 0 and the RHS has the opposite sign.
So in this case RHS < 0.

2) LHS is -0 and the RHS has the opposite sign.
LHS shouldn't be -0, because that is not an integer.

3) inf/inf
LHS nor RHS should be integer, so can't be inf.

So the patch should be:
        /* Don't lose a -0. */
        guard(false, w.eqi0(result), exit);
        if (MaybeBranch mbr = w.jf(w.eqi0(result))) {
            guard(false, w.ltiN(d1, 0), exit);
            w.label(mbr);
        }

At the moment I'm not an my normal computer, so I can't build it to test and provide a patch, sry.

Nicholas Nethercote [inactive]

Assignee

Comment 27

•

13 years ago

haytjes, I just pushed the missed-abort fix, but I didn't optimize the guard.  I did this because this bug has been so messy, with various incorrect diagnoses, incorrect patches, and an early inappropriate r+.  (Not that I'm complaining;  these things happen sometimes and I'm as much to blame as anyone :)  And fixing the correctness bug is more important IMHO than a slight optimization.

So I'm being really conservative and just fix the correctness issue right now.
If you still want to optimize the guard, can you file a follow-up bug and CC me?

http://hg.mozilla.org/tracemonkey/rev/7081b38c04b7

Thanks for your persistence on this bug, and sorry for all the delays on my end.

Whiteboard: see comment 21 → see comment 21, fixed-in-tracemonkey

Chris Leary [:cdleary] (not checking bugmail)

Comment 28

•

13 years ago

http://hg.mozilla.org/mozilla-central/rev/7081b38c04b7

Status: ASSIGNED → RESOLVED

Closed: 13 years ago

Resolution: --- → FIXED

Patch 13 years ago Hannes Verschore [:h4writer] 398 bytes, patch		Details \| Diff \| Splinter Review
Patch 13 years ago Hannes Verschore [:h4writer] 429 bytes, patch		Details \| Diff \| Splinter Review
Patch using MaybeBranch 13 years ago Hannes Verschore [:h4writer] 841 bytes, patch	n.nethercote : review-	Details \| Diff \| Splinter Review
Updated patch using MaybeBranch 13 years ago Hannes Verschore [:h4writer] 475 bytes, patch		Details \| Diff \| Splinter Review
patch 13 years ago Nicholas Nethercote [inactive] 1.56 KB, patch	dmandelin : review+	Details \| Diff \| Splinter Review