Closed Bug 1247909 Opened 4 years ago Closed 4 years ago

Assertion failure: !producer->isDiscarded(), at js/src/jit/IonAnalysis.cpp:2235 or Crash [@ remove] with PGO

Categories

(Core :: JavaScript Engine, defect, critical)

x86_64
Linux
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla48
Tracking Status
firefox48 --- fixed

People

(Reporter: decoder, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: assertion, regression, testcase, Whiteboard: [jsbugmon:update])

Crash Data

Attachments

(1 file)

The following testcase crashes on mozilla-central revision 576a6dcde5b6 (build with --enable-optimize --enable-posix-nspr-emulation --enable-valgrind --enable-gczeal --disable-tests --enable-debug, run with --ion-eager --ion-pgo=on):

function test() {
  libdir(startTest("", c(""),
    test([{ 0 : c(), 0 : toString("", c(), [], tab([])) }])
  ));
  function f() {};
}
test();



Backtrace:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff4aba700 (LWP 18512)]
0x0000000000698056 in CheckOperand (consumer=consumer@entry=0x7ffff69bf638, use=<optimized out>, usesBalance=usesBalance@entry=0x7ffff4ab99e0) at js/src/jit/IonAnalysis.cpp:2235
#0  0x0000000000698056 in CheckOperand (consumer=consumer@entry=0x7ffff69bf638, use=<optimized out>, usesBalance=usesBalance@entry=0x7ffff4ab99e0) at js/src/jit/IonAnalysis.cpp:2235
#1  0x00000000006be12c in js::jit::AssertBasicGraphCoherency (graph=...) at js/src/jit/IonAnalysis.cpp:2316
#2  0x0000000000686a36 in js::jit::OptimizeMIR (mir=mir@entry=0x7ffff69b81c0) at js/src/jit/Ion.cpp:1528
#3  0x000000000068b8b7 in js::jit::CompileBackEnd (mir=mir@entry=0x7ffff69b81c0) at js/src/jit/Ion.cpp:1993
#4  0x0000000000a0cfdd in js::HelperThread::handleIonWorkload (this=this@entry=0x7ffff6933a00) at js/src/vm/HelperThreads.cpp:1276
#5  0x0000000000a0eca7 in js::HelperThread::threadLoop (this=0x7ffff6933a00) at js/src/vm/HelperThreads.cpp:1603
#6  0x0000000000ac0e01 in nspr::Thread::ThreadRoutine (arg=0x7ffff692e280) at js/src/vm/PosixNSPR.cpp:45
#7  0x00007ffff7bc4182 in start_thread (arg=0x7ffff4aba700) at pthread_create.c:312
#8  0x00007ffff6cb3fbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
rax	0x0	0
rbx	0x7ffff69bf428	140737330803752
rcx	0x7ffff6ca53cd	140737333842893
rdx	0x0	0
rsi	0x7ffff6f7a9d0	140737336814032
rdi	0x7ffff6f791c0	140737336807872
rbp	0x7ffff4ab9980	140737298274688
rsp	0x7ffff4ab9980	140737298274688
r8	0x7ffff4aba700	140737298278144
r9	0x6372732f736a2f6c	7165916604736876396
r10	0x7ffff4ab9740	140737298274112
r11	0x7ffff6c27960	140737333328224
r12	0x7ffff69bf648	140737330804296
r13	0x7ffff69bf638	140737330804280
r14	0x29	41
r15	0x3	3
rip	0x698056 <CheckOperand(js::jit::MNode const*, js::jit::MUse const*, int32_t*)+198>
=> 0x698056 <CheckOperand(js::jit::MNode const*, js::jit::MUse const*, int32_t*)+198>:	movl   $0x8bb,0x0
   0x698061 <CheckOperand(js::jit::MNode const*, js::jit::MUse const*, int32_t*)+209>:	callq  0x4a4f50 <abort()>
Needinfo from :nbp due to --ion-pgo.
Flags: needinfo?(nicolas.b.pierron)
Whiteboard: [jsbugmon:update,bisect] → [jsbugmon:update]
JSBugMon: Bisection requested, result:
=== Treeherder Build Bisection Results by autoBisect ===

The "good" changeset has the timestamp "20151112020845" and the hash "6ab5b8056202fa1b1efcd0faf03fa9c16d8403a6".
The "bad" changeset has the timestamp "20151112025748" and the hash "5ddb8631dbccb9cca635eeb2c9252fe2ccaff4bf".

Likely regression window: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=6ab5b8056202fa1b1efcd0faf03fa9c16d8403a6&tochange=5ddb8631dbccb9cca635eeb2c9252fe2ccaff4bf
Some --ion-pgo=on issues can be existing bugs which are slightly harder to trigger, flag as sec-want until we figure this out.
Group: javascript-core-security
Keywords: sec-want
Crash Signature: [@ remove]
Summary: Assertion failure: !producer->isDiscarded(), at js/src/jit/IonAnalysis.cpp:2235 with PGO → Assertion failure: !producer->isDiscarded(), at js/src/jit/IonAnalysis.cpp:2235 or Crash [@ remove] with PGO
Ok, I am able to easily reproduce this issue both under gdb, and without gdb.  The problem likely comes during the third Ion compilation, and depends on on the fact that we already compiled asynchronously and linked the code once.

Unfortunately, I was unable to reproduce (over 24k runs) the following under rr, with the chaos mode of rr-4.1.0-101-g9d54c08 (master).
Flags: needinfo?(roc)
The problem comes from the fact that MBasicBlock::safeInsertTop failed to settle after the last constant used by the entryResumePoint of the first basic block of an inlined function.  The reason it failed is that we generate the following sequence:

  block1:
    resumepoint mode=At (caller in block0) constant75 constant76 constant30 constant78
  constant75 = constant undefined
  constant76 = constant undefined
  functionenvironment77 = functionenvironment constant73
  constant78 = constant undefined
  constant79 = constant function f at 0xff64138f800
  lambda80 = lambda functionenvironment77 constant79

Thus, when we convert this basic block to a bailing basic block, we settle on MFunctionEnvironment, and remove all the instructions which are after, leaving us with a resume point, which is needed for the newly added MBail, but which has operands which got removed from the MIRGraph.
(In reply to Nicolas B. Pierron [:nbp] from comment #4)
> Ok, I am able to easily reproduce this issue both under gdb, and without
> gdb.  The problem likely comes during the third Ion compilation, and depends
> on on the fact that we already compiled asynchronously and linked the code
> once.
> 
> Unfortunately, I was unable to reproduce (over 24k runs) the following under
> rr, with the chaos mode of rr-4.1.0-101-g9d54c08 (master).

This was because rr was always only reporting 1 core (since that's what you basically get).

I fixed rr chaos mode to report a random number of cores, which makes this reproducible in rr.
Flags: needinfo?(roc)
Removing the status flag until we find the source of the issue, as the current test case depends on a feature which is not landed yet.
I recommend to keep this bug as s-s as it is part of a category of issues that we did not to pay much attention before.  I suggest we keep it marked s-s until we figure if any of the variants of this issue is exploitable, as part of Bug 1257929.
Flags: needinfo?(nicolas.b.pierron)
Comment on attachment 8732279 [details] [diff] [review]
Move MFunctionEnvironment after the entry resume points operands.

Review of attachment 8732279 [details] [diff] [review]:
-----------------------------------------------------------------

Don't think there is a reason to have this sec-sensitive. Since constants are used at uses, it is not an issue without PGO.
Seems that for PGO this makes a difference. I wouldn't be surprised that we see similar bugs surface.

This also means we no longer allow constants to be defined after their uses (except for Phi).
Can we assert on this?

I.e. in AssertBasicGraphConsistency:
- test that every operand to a block entryresumepoint is
-- defined before the block.
-- phi (not in loopheader), where the operands are defined before the block.
-- phi (not in loopheader), the operand is defined inside the block is part of 'safeInsertTop' (to make sure constants are defined first). (This will require some refactor to get the logic out of safeInsertTop)
- test that for a resumepoint in an instruction the operands are defined before the instruction.

We might already assert on some of the above, but definitely not all, given this bug.
Attachment #8732279 - Flags: review?(hv1989) → review+
(In reply to Hannes Verschore [:h4writer] from comment #10)
> This also means we no longer allow constants to be defined after their uses
> (except for Phi).

I don't think we ever allowed that for Phi, only entry resume points are the exceptions.
We already have assertions to ensure that definitions are dominating the uses.
Keywords: sec-want
Group: javascript-core-security
https://hg.mozilla.org/mozilla-central/rev/1cbb64847cc9
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
You need to log in before you can comment on or make changes to this bug.