Closed Bug 1186006 Opened 5 years ago Closed 4 years ago

Crash at js::jit::LRecoverInfo::appendResumePoint js::jit::LRecoverInfo::init js::jit::LRecoverInfo::New js::jit::LIRGeneratorShared::getRecoverInfo js::jit::LIRGeneratorShared::buildSnapshot

Categories

(Core :: JavaScript Engine: JIT, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla48
Tracking Status
firefox48 --- fixed

People

(Reporter: cbook, Unassigned)

References

(Blocks 1 open bug, )

Details

(Keywords: crash)

Attachments

(4 files)

Found by bughunter and reproduced on a windows 7 trunk debug build

steps to reproduce:
-> Load http://w110.bcn.cat/portal/site/MuseuPicasso/menuitem.61a8b00f5dbc36aca4b2e31220348a0c/?vgnextoid=3d09f87a9950e410VgnVCM1000001947900aRCRD&vgnextchannel=0000000417706709VgnV6CONT00000000000RCRD&vgnextfmt=formatDetall&lang=ca_ES

-->> js::jit::LRecoverInfo::appendResumePoint js::jit::LRecoverInfo::init js::jit::LRecoverInfo::New js::jit::LIRGeneratorShared::getRecoverInfo js::jit::LIRGeneratorShared::buildSnapshot
Flags: needinfo?(nicolas.b.pierron)
fyi, I was testing something where I needed a reproducible crash and found that I could reproduce this in Nightly/42 opt/debug on Linux x86_64. Looking at crash-stats, it's a pretty low volume crasher.
Based on the stack, it seems that we have a NULL resume point, which is not supposed to happen unless we have new split-edge blocks, or join blocks without resume-points (as suggested by Bug 1106171).

This is probably related to a new phase added to the compilation pipeline.
I tried adding an assertion, but unfortunately I get too many false positive created by SplitCriticalEdgesForBlock.  I think I have an idea which would let us remove tons of false positive created by this function, by inheriting the resume point from the successor instead of the predecessor.

I hope either the patch will fix this issue, or the assertion will highlight from where this issue is coming from.
This patch removes the propagation of the resume point from the
predecessors, which was done during the lowering.

Instead, this patch copy the resume point of the successor when the
split-edge block is allocated.  Note, this is different than the default
creation of basic block which are inheriting their resume point from their
predecessor.  Copying from the successor is on the other hand trivial, as
the successor has a resume point, and if it has Phi, then we should just
pick the operands from the current edge.

The difference, is that taking the predecessor one will resume before the
branch (as we did before, with some corner cases), and taking the successor
will resume after the branch (with no corner cases).
Attachment #8656575 - Flags: review?(bhackett1024)
Comment on attachment 8656575 [details] [diff] [review]
Add a copy of the successor resume point to the split-edge blocks.

Review of attachment 8656575 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/MIRGraph.cpp
@@ +341,5 @@
> +
> +        // A split edge is used to simplify the graph to avoid having a
> +        // predecessor with multiple successors as well as a successor with
> +        // multiple predecessors.  As instructions can be moved in this
> +        // split-edge block, we need to give this instruction a resume point. We

s/instruction/block/

@@ +345,5 @@
> +        // split-edge block, we need to give this instruction a resume point. We
> +        // have 2 options, either we take the last resume point of the
> +        // predecessor (which implies iterating backward over instructions), or
> +        // we copy the entry resume points of the successor (which implies
> +        // filtering phi to keep inputs from the current edge).

The last part of this comment is confusing.  Why not just describe the approach which is actually being used?
Attachment #8656575 - Flags: review?(bhackett1024) → review+
(In reply to Brian Hackett (:bhackett) from comment #5)
> Comment on attachment 8656575 [details] [diff] [review]
> Add a copy of the successor resume point to the split-edge blocks.
> 
> Review of attachment 8656575 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/jit/MIRGraph.cpp
> @@ +341,5 @@
> > +
> > +        // A split edge is used to simplify the graph to avoid having a
> > +        // predecessor with multiple successors as well as a successor with
> > +        // multiple predecessors.  As instructions can be moved in this
> > +        // split-edge block, we need to give this instruction a resume point. We
> 
> s/instruction/block/
> 
> @@ +345,5 @@
> > +        // split-edge block, we need to give this instruction a resume point. We
> > +        // have 2 options, either we take the last resume point of the
> > +        // predecessor (which implies iterating backward over instructions), or
> > +        // we copy the entry resume points of the successor (which implies
> > +        // filtering phi to keep inputs from the current edge).
> 
> The last part of this comment is confusing.  Why not just describe the
> approach which is actually being used?

is this now ready for checkin ?
This patch found issues in our tests where the assertions does not hold, we should investigate more and make additional patches to fix these tests:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=6a558d10f5d5


PROCESS | 2542 | Assertion failure: block->entryResumePoint(), at /builds/slave/try-l64-d-00000000000000000000/build/src/js/src/jit/Lowering.cpp:4355
PROCESS-CRASH | /webvtt/rendering/cues-with-video/processing-model/bidi/u002E_u2028_u05D0.html | application crashed [@ js::jit::LIRGenerator::updateResumeState(js::jit::MBasicBlock*)]
TEST-UNEXPECTED-CRASH | /webvtt/rendering/cues-with-video/processing-model/bidi/u002E_u2028_u05D0.html | expected FAIL
This patch still is an intermittent failure on debug builds of Try (Wr) [1]
and I am unable to reproduce it locally, nor on the JS Shell.

If you some spare room for fuzzing, I would be interested in any signature
which looks like an assertion failure like:

  Assertion failure: block->entryResumePoint(), at .../js/src/jit/Lowering.cpp:4372
  SEGV at js::jit::LIRGenerator::updateResumeState(js::jit::MBasicBlock*)

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=995c7cf14ce3
Attachment #8677506 - Flags: review+
Attachment #8677506 - Flags: feedback?(choller)
Depends on: 1218470
Comment on attachment 8677506 [details] [diff] [review]
(fuzzing) Add a copy of the successor resume point to the split-edge blocks.

I take it that jsfunfuzz fuzzing is requested too.
Attachment #8677506 - Flags: feedback?(gary)
Depends on: 1218994
Comment on attachment 8677506 [details] [diff] [review]
(fuzzing) Add a copy of the successor resume point to the split-edge blocks.

Didn't find anything out of the ordinary these few days.
Attachment #8677506 - Flags: feedback?(gary) → feedback+
Comment on attachment 8677506 [details] [diff] [review]
(fuzzing) Add a copy of the successor resume point to the split-edge blocks.

I fuzzed this for a while but according to nbp, LangFuzz wasn't able to find the intermittent failure he was looking for. Clearing feedback request.
Attachment #8677506 - Flags: feedback?(choller)
nbp: is this ready for checkin ?
Depends on: 1238592
Ok, I finally found test cases to reproduce this issue (see Bug 1238592 and Bug 1221872).

I will fix this assertion tomorrow.
To make it short, the previous patch changed the way we create the split
edge blocks, to ensure that we have a resume point by making a copy of the
successor resume point.

This patch, instead of adding a resume point to the fixup block, flag fixup
blocks as unreachable, such that we can assert that unreachable blocks
should have no instructions once we reach the lowering phase.

This helps us remove the complex resume point propagation that we used to
have in the lowering phase, and add stricter assertions than what we used to
have in a few cases.
Attachment #8724058 - Flags: review?(sunfish)
Comment on attachment 8724058 [details] [diff] [review]
Assert that all blocks have an entry resume point, excepts for unreachable one added by the Value Numbering.

Review of attachment 8724058 [details] [diff] [review]:
-----------------------------------------------------------------

The more we can tie the special-casing to OSR, the more we can hopefully keep it from spreading too far. Please (a) mention OSR in comments for code dealing with these OSR-specific special cases, and (b) test for graph.osrBlock() in assertions whenever possible.

::: js/src/jit/Lowering.cpp
@@ +4591,5 @@
>  
>  void
>  LIRGenerator::updateResumeState(MBasicBlock* block)
>  {
> +    MOZ_ASSERT_IF(!mir()->compilingAsmJS() && !block->unreachable(), block->entryResumePoint());

This assert can be made stricter by also testing whether block->graph().osrBlock() when the block is unreachable().

@@ +4606,5 @@
>  
>      definePhis();
>  
> +    // See fixup blocks added by Value Numbering.
> +    MOZ_ASSERT_IF(block->unreachable(), *block->begin() == block->lastIns());

Can we also add a MOZ_ASSERT_IF(block->unreachable(), block->graph().osrBlock())? Also, please mention OSR in the comment.

::: js/src/jit/RangeAnalysis.cpp
@@ -2239,5 @@
>      JitSpew(JitSpew_Range, "Doing range propagation");
>  
>      for (ReversePostorderIterator iter(graph_.rpoBegin()); iter != graph_.rpoEnd(); iter++) {
>          MBasicBlock* block = *iter;
> -        MOZ_ASSERT(!block->unreachable());

Can we keep this assert, but change it to !block->unreachable() || graph_.osrBlock() ?

@@ +3383,5 @@
>          if (!block->unreachable())
>              continue;
>  
> +        // Filter out unreachable fake entries.
> +        if (block->numPredecessors() == 0)

Can this MOZ_ASSERT(graph.osrBlock())? Either way, the comment should mention "OSR" as being associated with this special case.
Attachment #8724058 - Flags: review?(sunfish) → review+
https://hg.mozilla.org/mozilla-central/rev/d95fd1fbee52
https://hg.mozilla.org/mozilla-central/rev/130026ae6a1e
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Flags: needinfo?(nicolas.b.pierron)
Depends on: 1257089
Depends on: 1298354
Depends on: 1319888
You need to log in before you can comment on or make changes to this bug.