Closed Bug 1456524 Opened 2 years ago Closed 2 years ago

Assertion failure: !cycleEnd_, at js/src/jit/MoveResolver.h:286 with OOM

Categories

(Core :: JavaScript Engine: JIT, defect, P1, critical)

ARM
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla62
Tracking Status
firefox-esr52 --- wontfix
firefox-esr60 --- wontfix
firefox60 --- wontfix
firefox61 --- fixed
firefox62 --- fixed

People

(Reporter: decoder, Assigned: sstangl)

References

(Blocks 2 open bugs)

Details

(4 keywords, Whiteboard: [jsbugmon:update])

Attachments

(2 files, 1 obsolete file)

The following testcase crashes on mozilla-central revision 26e53729a109 (build with --enable-posix-nspr-emulation --enable-valgrind --enable-gczeal --disable-tests --disable-profiling --enable-debug --without-intl-api --enable-optimize --target=i686-pc-linux-gnu --enable-simulator=arm, run with --fuzzing-safe --ion-offthread-compile=off --ion-eager):

oomTest(function() {
  eval(`
    function f() {
      var n;
      for (var i = 0; i < 18; ++i) {
	if (k) {}
      }
      return [k, n];
    }
    var [a, b] = f();
  `);
});


Backtrace:

received signal SIGSEGV, Segmentation fault.
#0  0x084897d1 in js::jit::MoveResolver::PendingMove::setCycleEnd (this=<optimized out>, this=<optimized out>, cycleSlot=<optimized out>) at js/src/jit/MoveResolver.h:286
#1  js::jit::MoveResolver::resolve (this=<optimized out>) at js/src/jit/MoveResolver.cpp:271
#2  0x082d8233 in js::jit::CodeGenerator::visitMoveGroup (this=0xf3d7e000, group=0xf3db31e0) at js/src/jit/CodeGenerator.cpp:3330
#3  0x08320a63 in js::jit::CodeGenerator::generateBody (this=0xf3d7e000) at js/src/jit/CodeGenerator.cpp:5775
#4  0x0832141a in js::jit::CodeGenerator::generate (this=0xf3d7e000) at js/src/jit/CodeGenerator.cpp:10119
#5  0x0834a3ed in js::jit::GenerateCode (mir=0xf3dac0f8, lir=0xf3daeb38) at js/src/jit/Ion.cpp:1913
#6  0x083a8816 in js::jit::CompileBackEnd (mir=0xf3dac0f8) at js/src/jit/Ion.cpp:1935
#7  0x083a98e9 in js::jit::IonCompile (cx=<optimized out>, cx@entry=0xf6e1d800, script=<optimized out>, baselineFrame=baselineFrame@entry=0x0, osrPc=0x0, recompile=false, optimizationLevel=js::jit::OptimizationLevel::Normal) at js/src/jit/Ion.cpp:2215
#8  0x083a9f43 in js::jit::Compile (cx=cx@entry=0xf6e1d800, script=script@entry=..., osrFrame=osrFrame@entry=0x0, osrPc=0x0, forceRecompile=false) at js/src/jit/Ion.cpp:2425
#9  0x083aa070 in js::jit::CanEnterIon (cx=0xf6e1d800, state=...) at js/src/jit/Ion.cpp:2509
#10 0x083ccacc in js::jit::MaybeEnterJit (cx=0xf6e1d800, state=...) at js/src/jit/Jit.cpp:140
#11 0x081e239f in js::RunScript (cx=<optimized out>, state=...) at js/src/vm/Interpreter.cpp:402
#12 0x081e2a05 in js::InternalCallOrConstruct (cx=<optimized out>, args=..., construct=js::NO_CONSTRUCT) at js/src/vm/Interpreter.cpp:489
#13 0x081e2cf0 in InternalCall (cx=cx@entry=0xf6e1d800, args=...) at js/src/vm/Interpreter.cpp:516
#14 0x081e2e6f in js::CallFromStack (cx=0xf6e1d800, args=...) at js/src/vm/Interpreter.cpp:522
#15 0x082be1a4 in js::jit::DoCallFallback (cx=<optimized out>, frame=0xf5bffca8, stub_=0xf3da2030, argc=0, vp=0xf5bffc78, res=...) at js/src/jit/BaselineIC.cpp:2380
#16 0x085f6410 in js::jit::Simulator::softwareInterrupt (this=0xf6e58000, instr=0xf6e78164) at js/src/jit/arm/Simulator-arm.cpp:2707
#17 0x085f6c16 in js::jit::Simulator::decodeType7 (this=0xf6e58000, instr=0xf6e78164) at js/src/jit/arm/Simulator-arm.cpp:3876
#18 0x085f4b02 in js::jit::Simulator::instructionDecode (this=0xf6e58000, instr=0xf6e78164) at js/src/jit/arm/Simulator-arm.cpp:4856
#19 0x085f882a in js::jit::Simulator::execute<false> (this=0xf6e58000) at js/src/jit/arm/Simulator-arm.cpp:4911
#20 js::jit::Simulator::callInternal (this=0xf6e58000, entry=0x20469800 "\360O-\351\004\320M\342\020\212-\355\r\200\240\341h\220\235\345\r\260\240\341t\240\235", <incomplete sequence \345>) at js/src/jit/arm/Simulator-arm.cpp:4991
#21 0x085f8a49 in js::jit::Simulator::call (this=<optimized out>, entry=<optimized out>, argument_count=<optimized out>) at js/src/jit/arm/Simulator-arm.cpp:5074
#22 0x083cbdb0 in EnterJit (cx=<optimized out>, cx@entry=0xf6e1d800, state=..., code=0x21d21670 "\004\340-\345\a") at js/src/jit/Jit.cpp:101
#23 0x083ccaef in js::jit::MaybeEnterJit (cx=0xf6e1d800, state=...) at js/src/jit/Jit.cpp:163
#24 0x081e239f in js::RunScript (cx=<optimized out>, state=...) at js/src/vm/Interpreter.cpp:402
#25 0x081e4cd9 in js::ExecuteKernel (cx=<optimized out>, script=..., envChainArg=..., newTargetValue=..., evalInFrame=..., result=<optimized out>) at js/src/vm/Interpreter.cpp:700
#26 0x0821a61a in EvalKernel (cx=<optimized out>, v=..., v@entry=..., evalType=evalType@entry=DIRECT_EVAL, caller=..., env=..., pc=0xf57b4275 "{\001", vp=...) at js/src/builtin/Eval.cpp:323
#27 0x0821acc0 in js::DirectEval (cx=<optimized out>, v=..., vp=...) at js/src/builtin/Eval.cpp:433
#28 0x082be7ed in js::jit::DoCallFallback (cx=<optimized out>, frame=0xf5bffdd8, stub_=0xf6e89050, argc=1, vp=0xf5bffd98, res=...) at js/src/jit/BaselineIC.cpp:2364
#29 0x085f6410 in js::jit::Simulator::softwareInterrupt (this=0xf6e58000, instr=0xf6e78164) at js/src/jit/arm/Simulator-arm.cpp:2707
#30 0x085f6c16 in js::jit::Simulator::decodeType7 (this=0xf6e58000, instr=0xf6e78164) at js/src/jit/arm/Simulator-arm.cpp:3876
#31 0x085f4b02 in js::jit::Simulator::instructionDecode (this=0xf6e58000, instr=0xf6e78164) at js/src/jit/arm/Simulator-arm.cpp:4856
#32 0x085f882a in js::jit::Simulator::execute<false> (this=0xf6e58000) at js/src/jit/arm/Simulator-arm.cpp:4911
#33 js::jit::Simulator::callInternal (this=0xf6e58000, entry=0x20469800 "\360O-\351\004\320M\342\020\212-\355\r\200\240\341h\220\235\345\r\260\240\341t\240\235", <incomplete sequence \345>) at js/src/jit/arm/Simulator-arm.cpp:4991
#34 0x085f8a49 in js::jit::Simulator::call (this=<optimized out>, entry=<optimized out>, argument_count=<optimized out>) at js/src/jit/arm/Simulator-arm.cpp:5074
#35 0x083cbdb0 in EnterJit (cx=<optimized out>, cx@entry=0xf6e1d800, state=..., code=0x20479e00 "\004\340-\345\a") at js/src/jit/Jit.cpp:101
#36 0x083ccaef in js::jit::MaybeEnterJit (cx=0xf6e1d800, state=...) at js/src/jit/Jit.cpp:163
#37 0x081e239f in js::RunScript (cx=<optimized out>, state=...) at js/src/vm/Interpreter.cpp:402
#38 0x081e2a05 in js::InternalCallOrConstruct (cx=<optimized out>, args=..., construct=js::NO_CONSTRUCT) at js/src/vm/Interpreter.cpp:489
#39 0x081e2cf0 in InternalCall (cx=cx@entry=0xf6e1d800, args=...) at js/src/vm/Interpreter.cpp:516
#40 0x081e2eaa in js::Call (cx=0xf6e1d800, fval=..., thisv=..., args=..., rval=...) at js/src/vm/Interpreter.cpp:535
#41 0x08638a89 in JS_CallFunction (cx=<optimized out>, obj=..., fun=..., args=..., rval=...) at js/src/jsapi.cpp:2948
#42 0x084c94d1 in OOMTest (cx=<optimized out>, argc=<optimized out>, vp=<optimized out>) at js/src/builtin/TestingFunctions.cpp:1721
[...]
#65 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at js/src/shell/js.cpp:9234
eax	0x0	0
ebx	0xffff9c54	-25516
ecx	0xf7d9f864	-136710044
edx	0x0	0
esi	0xf3db3448	-203738040
edi	0xf3db3448	-203738040
ebp	0xffff9c88	4294941832
esp	0xffff9bf0	4294941680
eip	0x84897d1 <js::jit::MoveResolver::resolve()+2081>
=> 0x84897d1 <js::jit::MoveResolver::resolve()+2081>:	movl   $0x0,0x0
   0x84897db <js::jit::MoveResolver::resolve()+2091>:	ud2
Whiteboard: [jsbugmon:update,bisect] → [jsbugmon:update]
JSBugMon: Bisection requested, result:
=== Treeherder Build Bisection Results by autoBisect ===

The "good" changeset has the timestamp "20160115010341" and the hash "32a8c6a3be186bbc1f39da147eb09b087ed322e3".
The "bad" changeset has the timestamp "20160115014842" and the hash "df444117c7bea0a407387dca31ed54c3598b054a".

Likely regression window: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=32a8c6a3be186bbc1f39da147eb09b087ed322e3&tochange=df444117c7bea0a407387dca31ed54c3598b054a
Flags: needinfo?(jcoppeard)
Priority: -- → P1
Comment 1 is blaming me for changing the oomTest function.  This looks like a JIT issue.
Component: JavaScript Engine → JavaScript Engine: JIT
Flags: needinfo?(jcoppeard)
ARM simulator stuff -> Setting needinfo? from Sean as a start.
Flags: needinfo?(sstangl)
I took a look at this with rr.

The crash happens because we enter the MoveResolver with the following pending set:

> PendingMove( from_: MoveOperand { kind_: REG (1), to_: MoveOperand { kind_: REG (0) }
> PendingMove( from_: MoveOperand { kind_: REG (1), to_: MoveOperand { kind_: REG (0) }
> PendingMove( from_: MoveOperand { kind_: REG (0), to_: MoveOperand { kind_: REG (1) }

We cannot have redundant moves, so the algorithm crashes.

The cause of the redundant move (the first one in the list) is that the MoveResolver is not in a clean state when we enter LMoveGroup::visit() -- the MoveResolver's pending_ group contains data from an earlier call that for some reason didn't get resolved, and so it leaks into this invocation.

I'll investigate the cause and add stronger assertions around the MoveResolver. It's somewhat scary to share state as an optimization without asserting that the state is sane!

I'll also look into committing some form of debug statements, because I wind up having to re-write debug printing code every time I look at the MoveResolver. I think this is the fourth time in recent memory -- might as well commit it.
Fixes the bug on ARM, MIPS32, and MIPS64 platforms.

This bug occurred because:

1. As an optimization, we attempted to share MoveResolver state.
2. MoveResolver invariants were not specified, and were non-obvious.
3. As an optimization, in case of OOM, we avoided calling MoveResolver::resolve(), which maintained invariants.

As a result, state from one "borrower" of the shared-state MoveResolver could leak into a future "borrow," resulting in the MoveResolver processing nonsense data and crashing.

This patch also fixes two related bugs that can occur in different OOM conditions.

That bug occurred because the MoveResolver was unclear about its own invariants -- in extremely rare failure conditions, MoveResolver::resolve() would not maintain them, again causing state leak.

Neither of these bugs would have occurred if we didn't attempt to share MoveResolver state. If we do share state, then we need to explicitly assert what the entrance invariants are, so that the error is obvious to the fuzzers. This patch also adds such an assertion.

Passes jit-tests locally.
Assignee: nobody → sstangl
Flags: needinfo?(sstangl)
Attachment #8974170 - Flags: review?(jdemooij)
Comment on attachment 8974170 [details] [diff] [review]
0001-Bug-1456524-Maintain-MoveResolver-invariants.-r.patch

Review of attachment 8974170 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!

(Nit: maybe change your diff settings to 8 lines of context. 3 lines before and after makes it harder to review).

::: js/src/jit/MoveResolver.cpp
@@ +178,5 @@
>  }
>  #endif
>  
> +// Resolves the pending_ list to a list in orderedMoves_.
> +// Upon return from this function, the pending_ list must be cleared.

Nit: we could use mozilla::MakeScopeExit maybe? Something like:

auto clearPending = MakeScopeExit([this]() {
    pending_.clear();
});
Attachment #8974170 - Flags: review?(jdemooij) → review+
Same patch, carrying r+, with more MakeScopeExit.
Attachment #8974170 - Attachment is obsolete: true
Attachment #8975994 - Flags: review+
Keywords: checkin-needed
Pushed by apavel@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ef7430b93e7c
Maintain MoveResolver invariants. r=jandem
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/ef7430b93e7c
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
Is this something we need to consider backporting to 61, or can it ride the 62 train?
Flags: needinfo?(sstangl)
Attached patch Backport to 61Splinter Review
This is a rare crash and it's OK to ride the trains, if you choose. But the fix is very small and it's easy to backport, so I've done so and asked for approval.

Approval Request Comment
[Feature/Bug causing the regression]: Unknown, likely old.
[User impact if declined]: Rare crashes on OOM on ARM and MIPS devices.
[Is this code covered by automated tests?]: No (requires OOM conditions -- there's a testcase we could land, but it doesn't seem necessary and takes a while to run).
[Has the fix been verified in Nightly?]: Yes.
[Needs manual test from QE? If yes, steps to reproduce]: No.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: No.
[Why is the change risky/not risky?]: Small change that forcibly clears a buffer, when calling code assumed that the function always clears the buffer.
[String changes made/needed]: None.
Flags: needinfo?(sstangl)
Attachment #8976701 - Flags: review+
Attachment #8976701 - Flags: approval-mozilla-beta?
Comment on attachment 8976701 [details] [diff] [review]
Backport to 61

Ahem, that should read "Backport to 61". It applies to Beta.
Attachment #8976701 - Attachment description: Backport to 60 → Backport to 61
Comment on attachment 8976701 [details] [diff] [review]
Backport to 61

Fix for a possible ARM crash in OOM conditions. Approved for 61.0b7.
Attachment #8976701 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Flags: qe-verify-
Duplicate of this bug: 1368577
You need to log in before you can comment on or make changes to this bug.