Closed Bug 558814 Opened 14 years ago Closed 14 years ago

nanojit: handle const conditions for LIR_jt/LIR_jf

Categories

(Core Graveyard :: Nanojit, defect)

defect
Not set
critical

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: marcia, Unassigned)

References

()

Details

(Keywords: crash, regression, testcase, Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey)

Attachments

(3 files, 4 obsolete files)

Seen while running Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.3a5pre) Gecko/20100412 Minefield/3.7a5pre. Clean profile with no extensions.

STR:
1. Load the site in the URL. Crash

100% reproducible on the Mac and Windows 7. https://crash-stats.mozilla.com/report/index/4645a91e-46ad-448a-a9f0-cf8e32100412. There are a few other crashes in crash stats as well.
I can work on finding a regression range.
Crash does not happen with a clean profile using Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3.
Crash does not happen on x86-64/Linux:

Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.3a5pre) Gecko/20100412 Minefield/3.7a5pre

I'll try my Mac.
Running a debug i386/Mac build I get this assertion failure:

Assertion failed: cond->isCmp() (/Users/njn/moz/ws0/js/src/nanojit/Nativei386.cpp:1416)

I'd guess changeset 38082:7a62d623d36a is the culprit, but I'm not certain.

I have a fairly good idea how to fix this.
Attached file testcase
The problem is that Nanojit allows a LIR_jt/LIR_jf to have a const condition early on in the pipeline (eg. ValidateWriter allows it), but then the backends assume/assert that the condition is not const.

There are three possible parts to the fix.

- ExprFilter should optimize away conditional jumps with const conditions, just like it does with conditional guards.

- Then, we have to decide if this is enough -- in other words, is it safe to assume that ExprFilter is always run?  Probably not.  In which case we need to handle a const condition being passed to asm_branch().  This could be done in each backend, but it's probably better to do it in Assembler.

- Then, do we want to detect const conditions on jumps?  We currently do so in TM for const conditions on guards (in LIR.cpp):

 #ifdef JS_TRACER
                    // We're emitting a guard that will always fail. Any code
                    // emitted after this guard is dead code. We could
                    // silently optimize out the rest of the emitted code, but
                    // this could indicate a performance problem or other bug,
                    // so assert in debug builds.
                    NanoAssertMsg(0, "Constantly false guard detected");
 #endif

  If we do want to detect these, we need to know where the const condition came from.  Turns out it's from a dense array access -- if you read from an array using the array's own length as the index, the 'index < length' test gets optimised to false.  The attached file demonstrates.  I think this example is not a "performance problem or bug" and so perhaps we should not try to detect const conditions on jumps.  But if we do want to detect such cases, we'll need to handle the index==length case in jstracer.cpp.
Assignee: general → nnethercote
Status: NEW → ASSIGNED
Assignee: nnethercote → nobody
Component: JavaScript Engine → Nanojit
QA Contact: general → nanojit
Summary: Crash in Firefox 3.7a5pre Crash Report [@ nanojit::Assembler::findRegFor(nanojit::LIns*, int) ] → nanojit: handle const conditions for LIR_jt/LIR_jf
This feels a bit ridiculous, and so I'm inclined to not add that assertion detecting always-taken jumps in ExprFilter.
Attachment #438674 - Flags: feedback?
This is the ExprFilter change.  There's no assertion checking for the always-taken case.
Attachment #438676 - Flags: feedback?(edwsmith)
Attachment #438676 - Attachment is patch: true
Attachment #438676 - Attachment mime type: application/octet-stream → text/plain
I think we should avoid generating always-taken conditional jumps in the tracer regardless of what NJ does, as we do for conditional guards.
The 3rd piece of the puzzle.
Attachment #438683 - Flags: feedback?(edwsmith)
Marcia, any of the three patches should fix the crash :)

I'm inclined to land the two NJ patches, and skip the TM patch.  Anyone disagree?
I agree - the most general fix is to handle jt/jf+const in the Assembler.

For ExprFilter to have something to return in the "not taken" cases (jt(false) or jf(true)), we might want LIR_jn (never jump).  Otherwise ExprFilter has to return NULL, which requires special casing downstream, particularly where jump instructions are patched.
the "formal" way to deal with this is LIR_true and LIR_false condition constants, which are conditions and not value types.

condition-to-int:  cmov(c) ? 1 : 0
int-to-condition:  eq, lt, etc
const folding:     1 > 0 => LIR_true instead of LIR_imml(1), etc.

overkill?
Comment on attachment 438676 [details] [diff] [review]
NJ: optimize jumps with const conditions

The change looks correct, and should work fine in TR, we already have code higher up in the pipeline that essentially does the same thing (optimize to LIR_j or NULL, including checking for null).

its out of scope for this bug, but interested in your thoughts about LIR_jn (scope creep) or condition-constants LIR_true/false (more scope creep).
Attachment #438676 - Flags: feedback?(edwsmith) → feedback+
Comment on attachment 438683 [details] [diff] [review]
NJ: handle const conditions in jumps and guards in Assembler

Cool.
 * keeps the hair out of the cpu backends
 * agnostic about upstream optimizations or not
 
we might want test cases to ensure backends handle asm_jmp(LIR_jt/jf) correctly by ignoring the opcode (we want a jump in all cases, ins is passed along to get access to the target and/or to allow using ins for register allocation or as a table key). 

maybe add an assert at the top of Assembler::asm_jmp():
NanoAssert(ins->isop(LIR_j) || ins->isop(LIR_jt) && ins->oprnd1()->isconstval(1) || ins->isop(LIR_jf) && ins->oprnd1()->isconstval(0))
Attachment #438683 - Flags: feedback?(edwsmith) → feedback+
(In reply to comment #5)
> Created an attachment (id=438667) [details]
> testcase

Strange - I can't get this testcase to crash or assert 32-bit Mac shell.. Anything special that might have been needed to be done?
(In reply to comment #11)
> 
> For ExprFilter to have something to return in the "not taken" cases (jt(false)
> or jf(true)), we might want LIR_jn (never jump).  Otherwise ExprFilter has to
> return NULL, which requires special casing downstream, particularly where jump
> instructions are patched.

A better name for LIR_jn is LIR_nop :)  Actually I've been thinking about having different-sized nops:  nop1/nop2/nop3/nop4.  This would allow us to overwrite instructions in an existing LIR buffer.  I had some example in the past where this would have been useful, but I can't remember it now.


(In reply to comment #12)
> the "formal" way to deal with this is LIR_true and LIR_false condition
> constants, which are conditions and not value types.
> 
> condition-to-int:  cmov(c) ? 1 : 0
> int-to-condition:  eq, lt, etc
> const folding:     1 > 0 => LIR_true instead of LIR_imml(1), etc.
> 
> overkill?

More generally, the idea would be to formalize bool types, rather than having the current weird "bools are ints but there are some restrictions" situation.
Bools would still be value types, though.  But we'd have to add opcodes for boolean AND and OR at the least (and the 'b' suffix is taken for byte-sized values, dammit!)


(In reply to comment #14)
> (From update of attachment 438683 [details] [diff] [review])
> Cool.
>  * keeps the hair out of the cpu backends
>  * agnostic about upstream optimizations or not
> 
> we might want test cases to ensure backends handle asm_jmp(LIR_jt/jf) correctly
> by ignoring the opcode (we want a jump in all cases, ins is passed along to get
> access to the target and/or to allow using ins for register allocation or as a
> table key). 

It's a good idea, I'm not sure how to do it -- lirasm currently doesn't handle labels :(


> maybe add an assert at the top of Assembler::asm_jmp():
> NanoAssert(ins->isop(LIR_j) || ins->isop(LIR_jt) &&
> ins->oprnd1()->isconstval(1) || ins->isop(LIR_jf) &&
> ins->oprnd1()->isconstval(0))

Will do.
(In reply to comment #15)
> (In reply to comment #5)
> > Created an attachment (id=438667) [details] [details]
> > testcase
> 
> Strange - I can't get this testcase to crash or assert 32-bit Mac shell..
> Anything special that might have been needed to be done?

I just did "js -j a.js".
> More generally, the idea would be to formalize bool types, rather than having
> the current weird "bools are ints but there are some restrictions" situation.
> Bools would still be value types, though.  But we'd have to add opcodes for
> boolean AND and OR at the least (and the 'b' suffix is taken for byte-sized
> values, dammit!)

It would also clarify something -- currently 0 means false and 1 means true but it's not clear what other values mean, or if they shouldn't happen.
 
> > maybe add an assert at the top of Assembler::asm_jmp():
> > NanoAssert(ins->isop(LIR_j) || ins->isop(LIR_jt) &&
> > ins->oprnd1()->isconstval(1) || ins->isop(LIR_jf) &&
> > ins->oprnd1()->isconstval(0))

Eg. here:  should the LIR_jt part be '!ins->oprnd1()->isconstval(0)'?

Just to clarify:  I'm definitely in favour of making the type system stronger.  We might need some int-to-bool and bool-to-int conversions to, not sure.
(In reply to comment #8)
> I think we should avoid generating always-taken conditional jumps in the tracer
> regardless of what NJ does, as we do for conditional guards.

Thinking about this some more, NJ doesn't optimize them fully, eg. in this sequence:

    jf 0 foo
    ...
   foo:
  
The '...' is never executed but NJ doesn't remove it.  So that's an argument in favour of keeping the assertion.
Attached patch new TM patch (obsolete) — Splinter Review
Same as the previous TM patch, but with a test case too.
Attachment #438674 - Attachment is obsolete: true
Attachment #438940 - Flags: review?(dvander)
Attachment #438674 - Flags: feedback?
Attached patch new NJ patchSplinter Review
This is the two NJ patches merged, with an extra assertion at the top of asm_jmp(), and a JS_TRACER-only assertion if always-taken conditional jumps are encountered, to match the handling of guards.
Attachment #438676 - Attachment is obsolete: true
Attachment #438683 - Attachment is obsolete: true
Attachment #438941 - Flags: review?(edwsmith)
Attached patch new TM patch, v2Splinter Review
minor tweak
Attachment #438942 - Flags: review?(dvander)
I don't want to permit generating dead branches from the tracer. Thats always a bug. I don't mind if NJ learns how to optimize it, but we should still have an assert checking that we never emit something NJ would optimize.
Comment on attachment 438941 [details] [diff] [review]
new NJ patch

tested on TR, looks ok.

regarding the assert (isconst(1) vs !isconst(0)) I'm inclined to leave it tight (as-is) since any constant value should be the result of one of the condition-generating opcodes (LIR_eql, etc), which must be 0 or 1.  if we loosen this up like C rules, we should do it consistently everywhere in nanojit.

not that it matters yet, but requiring 0 or 1 simplifies doing boolean algebra on boolean values.
Attachment #438941 - Flags: review?(edwsmith) → review+
Attachment #438942 - Flags: review?(dvander) → review+
(In reply to comment #24)
> 
> not that it matters yet, but requiring 0 or 1 simplifies doing boolean algebra
> on boolean values.

0 and 1 is fine, I just realized recently that it wasn't specified clearly anywhere.

If we implement bug 559265 this question becomes moot, which is a point in favour of doing it.
(In reply to comment #17)
> (In reply to comment #15)
> > (In reply to comment #5)
> > > Created an attachment (id=438667) [details] [details] [details]
> > > testcase
> > 
> > Strange - I can't get this testcase to crash or assert 32-bit Mac shell..
> > Anything special that might have been needed to be done?
> 
> I just did "js -j a.js".

Thanks, njn, I can reproduce now.

(Not yet the smallest) regression range for TM:

e804d07cf96b seems to not work properly (i.e. crash):

changeset:   39558:e804d07cf96b
user:        Nicholas Nethercote
date:        Tue Mar 23 16:07:19 2010 -0700
summary:     Bug 517910 - NJ: add more alias-set annotations to LIR so as to improve CSEing of loads (TM-specific part).  r=gal,dvander.

ebee0290c5b3 does not seem to compile on a TM 32-bit Mac opt shell:

changeset:   39557:ebee0290c5b3
user:        Nicholas Nethercote
date:        Tue Mar 23 16:00:38 2010 -0700
summary:     Update nanojit-import-rev stamp.

changeset:   39556:0db87fb46cb2
user:        Nicholas Nethercote
date:        Tue Mar 23 15:49:12 2010 -0700
summary:     Follow-up assertion failure fix for bug 517910.  r=me.

31596ada8bfd does not seem to compile on a TM 32-bit Mac opt shell:

changeset:   39555:31596ada8bfd
user:        Nicholas Nethercote
date:        Tue Mar 23 15:28:41 2010 -0700
summary:     Update nanojit-import-rev stamp.

changeset:   39554:c4c2174afcca
user:        Nicholas Nethercote
date:        Tue Mar 23 15:05:47 2010 -0700
summary:     Bug 517910 - NJ: add more alias-set annotations to LIR so as to improve CSEing of loads.  r=edwsmith.

656054962a9e seems to work properly:

changeset:   39553:656054962a9e
user:        Edwin Smith
date:        Tue Mar 23 15:09:52 2010 -0400
summary:     Fix PPC bustage for bug 507089 (r=me)


Someone more knowledgeable might be able to identify the true cause..
Changesets 39556:0db87fb46cb2 and 39554:c4c2174afcca also do not compile to a TM 32-bit Mac opt shell.
Attachment #438940 - Attachment is obsolete: true
Attachment #438940 - Flags: review?(dvander)
(In reply to comment #26)
>
> Someone more knowledgeable might be able to identify the true cause..

Those are all patches for bug 517910, so they really constitute a single change.
(In reply to comment #29)
> (In reply to comment #26)
> >
> > Someone more knowledgeable might be able to identify the true cause..
> 
> Those are all patches for bug 517910, so they really constitute a single
> change.

Adding dependency to bug 517910.
Blocks: 517910
Keywords: regression, testcase
Nicholas: Will this patch take care of this crash that happened to me today: https://crash-stats.mozilla.com/report/index/bp-b4f72059-76b7-4f9c-aa8b-b65a22100416? If I need to file a new bug let me know. Thanks.
(In reply to comment #32)
> Nicholas: Will this patch take care of this crash that happened to me today:
> https://crash-stats.mozilla.com/report/index/bp-b4f72059-76b7-4f9c-aa8b-b65a22100416?

I'm pretty sure it will.  The fix has landed in the TraceMonkey repository but hasn't yet been merged to mozilla-central.  Once that happens hopefully you won't see these again!
Sayre, this is now the #4 crash for 3.7a5pre.  Is it worth cherry-picking the patches to mozilla-central now, rather than waiting for the next general merge?
http://hg.mozilla.org/mozilla-central/rev/49bb5513776d
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Verified fixed using Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.3a5pre) Gecko/20100426 Minefield/3.7a5pre as well as the latest Windows nightly. I verified by trying the link in the URL as well as the URL in Comment 16.
Status: RESOLVED → VERIFIED
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: