ARM64: Crash in wasm/builtin.js

RESOLVED FIXED in Firefox 67

Status

defect
P1
normal
RESOLVED FIXED
6 months ago
5 months ago

People

(Reporter: sstangl, Assigned: nbp)

Tracking

(Blocks 1 bug, {crash})

unspecified
mozilla67
ARM64
Unspecified

Firefox Tracking Flags

(firefox-esr60 wontfix, firefox64 wontfix, firefox65 wontfix, firefox66 wontfix, firefox67 fixed)

Details

(Whiteboard: [arm64:m3])

Attachments

(2 attachments)

When run on ARM64 hardware, this test fails: wasm/builtin.js

The arguments passed are as follows:

--ion-eager --ion-offthread-compile=off --more-compartments wasm/builtin.js
--ion-eager --ion-offthread-compile=off --ion-check-range-analysis --ion-extra-checks --no-sse3 --no-threads wasm/builtin.js

The crash signature is as follows:

Thread 1 "js" received signal SIGSEGV, Segmentation fault.
JSFunction::flags (this=0x200000000) at /home/sstangl/dev/gecko-dev/js/src/vm/JSFunction.h:236
236	  uint16_t flags() const { return flags_; }
(gdb) bt
#0  JSFunction::flags (this=0x200000000) at /home/sstangl/dev/gecko-dev/js/src/vm/JSFunction.h:236
#1  0x0000aaaaaabfa1b8 in JSFunction::hasScript (this=0x200000000) at /home/sstangl/dev/gecko-dev/js/src/vm/JSFunction.h:288
#2  0x0000aaaaaabfa104 in JSFunction::hasUncompletedScript (this=0x200000000) at /home/sstangl/dev/gecko-dev/js/src/vm/JSFunction.h:596
#3  0x0000aaaaaabf9dd4 in JSFunction::nonLazyScript (this=0x200000000) at /home/sstangl/dev/gecko-dev/js/src/vm/JSFunction.h:601
#4  0x0000aaaaabaec6b0 in js::jit::ScriptFromCalleeToken (token=0x200000001) at /home/sstangl/dev/gecko-dev/js/src/jit/JitFrames.h:68
#5  0x0000aaaaabaec5d4 in js::jit::BailoutFrameInfo::BailoutFrameInfo (this=0xffffffffa508, activations=..., bailout=0xffffffffa890)
    at /home/sstangl/dev/gecko-dev/js/src/jit/arm64/Bailouts-arm64.cpp:43
#6  0x0000aaaaac34d4dc in js::jit::Bailout (sp=0xffffffffa890, bailoutInfo=0xffffffffa888)
    at /home/sstangl/dev/gecko-dev/js/src/jit/Bailouts.cpp:41
#7  0x0000303398d90494 in ?? ()
#8  0x000000ffffffffab in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

[arm64:m3] because we should fix reproducible test crashes before letting ARM64 Fennec Nightly ride the trains to Beta.

Keywords: crash
Whiteboard: [arm64:m3]
Assignee: nobody → nicolas.b.pierron
Status: NEW → ASSIGNED

This test case is a minimal version of the one in the test suite. It hits a SEGV while iterating over the stack to produce an error message:

(gdb) bt
#0  JSScript::hasIonScript (this=0xf2b5b997d2917817) at /home/nicolas/mozilla/wksp-0/js/src/vm/JSScript.h:2219
#1  0x0000aaaaabac43f4 in js::jit::JSJitFrameIter::checkInvalidation (this=0xfffffffe00c8, ionScriptOut=0xfffffffdf648) at /home/nicolas/mozilla/wksp-0/js/src/jit/JSJitFrameIter.cpp:62
#2  0x0000aaaaabac4e54 in js::jit::JSJitFrameIter::ionScript (this=0xfffffffe00c8) at /home/nicolas/mozilla/wksp-0/js/src/jit/JSJitFrameIter.cpp:225
#3  0x0000aaaaabac4f40 in js::jit::JSJitFrameIter::machineState (this=0xfffffffe00c8) at /home/nicolas/mozilla/wksp-0/js/src/jit/JSJitFrameIter.cpp:181
#4  0x0000aaaaabacdcf4 in js::jit::InlineFrameIterator::resetOn (this=0xfffffffe0128, iter=0xfffffffe00c8) at /home/nicolas/mozilla/wksp-0/js/src/jit/JitFrames.cpp:1992
#5  0x0000aaaaab2b0148 in js::FrameIter::nextJitFrame (this=0xfffffffe0070) at /home/nicolas/mozilla/wksp-0/js/src/vm/Stack.cpp:797
#6  0x0000aaaaab2afb88 in js::FrameIter::settleOnActivation (this=0xfffffffe0070) at /home/nicolas/mozilla/wksp-0/js/src/vm/Stack.cpp:704
#7  0x0000aaaaab2b04e0 in js::FrameIter::FrameIter (this=0xfffffffe0070, cx=0xaaaaadb942d0, debuggerEvalOption=js::FrameIter::FOLLOW_DEBUGGER_EVAL_PREV_LINK, principals=0x0) at /home/nicolas/mozilla/wksp-0/js/src/vm/Stack.cpp:767
#8  0x0000aaaaab0b63b4 in js::NonBuiltinFrameIter::NonBuiltinFrameIter (this=0xfffffffe0070, cx=0xaaaaadb942d0, principals=0x0) at /home/nicolas/mozilla/wksp-0/js/src/vm/Stack.h:2211
#9  0x0000aaaaab052984 in PopulateReportBlame (cx=0xaaaaadb942d0, report=0xfffffffe0628) at /home/nicolas/mozilla/wksp-0/js/src/vm/JSContext.cpp:266
#10 0x0000aaaaab053704 in js::ReportErrorNumberVA (cx=0xaaaaadb942d0, flags=0, callback=0xaaaaaabe3e94 <js::shell::my_GetErrorMessage(void*, unsigned int)>, userRef=0x0, errorNumber=7, argumentsType=js::ArgumentsAreUTF8, ap=...) at /home/nicolas/mozilla/wksp-0/js/src/vm/JSContext.cpp:822
#11 0x0000aaaaab44a034 in JS_ReportErrorNumberUTF8VA (cx=0xaaaaadb942d0, errorCallback=0xaaaaaabe3e94 <js::shell::my_GetErrorMessage(void*, unsigned int)>, userRef=0x0, errorNumber=7, ap=...) at /home/nicolas/mozilla/wksp-0/js/src/jsapi.cpp:4859
#12 0x0000aaaaab437b7c in JS_ReportErrorNumberUTF8 (cx=0xaaaaadb942d0, errorCallback=0xaaaaaabe3e94 <js::shell::my_GetErrorMessage(void*, unsigned int)>, userRef=0x0, errorNumber=7) at /home/nicolas/mozilla/wksp-0/js/src/jsapi.cpp:4849
#13 0x0000aaaaaabf53ac in AssertEq (cx=0xaaaaadb942d0, argc=2, vp=0xfffffffe09d8) at /home/nicolas/mozilla/wksp-0/js/src/shell/js.cpp:2787
#14 0x00000a4042902860 in ?? ()

The problem here seems to be related to either the stack consumption or the computation of the frame size when generating the exit frame.

(gdb) p *this
$9 = {current_ = 0xfffffffe0b50 "\b",
      type_ = js::jit::FrameType::IonJS,
      resumePCinCurrentFrame_ = 0x1ae3e703981c "\003\310@\371|8",
      frameSize_ = 128,
      cachedSafepointIndex_ = 0x0,
      activation_ = 0xfffffffe0d78}
(gdb) p this->activation_->packedExitFP_ 
$10 = (uint8_t *) 0xfffffffe0ac0 "\034\230\003\347\343\032"
(gdb) x/88a 0xfffffffe0ac0

(this->activation_->packedExitFP_)
(exit frame)
  0xfffffffe0ac0: 0x1ae3e703981c  0x8020 — This should be either 0x9020 or 0x8040
  0xfffffffe0ad0: 0x2     0xfffe3303befa4fc0
  0xfffffffe0ae0: 0xfff9800000000000      0x7ff8000000000000
  0xfffffffe0af0: 0x379fffe000000000      0xaaaaade06018
  0xfffffffe0b00: 0xfffffffe0aa8  0xfffffffe0a58
  0xfffffffe0b10: 0xffff7fc00000  0x7fc00000
  0xfffffffe0b20: 0x0     0x7fc0000000000001
  0xfffffffe0b30: 0xfffffffe0b61  0xaaaaadcc6ee0
  0xfffffffe0b40: 0x2     0x1ae3e7038a70
(this->current_)
  0xfffffffe0b50: 0x8     0x8
(jit frame)
  0xfffffffe0b60: 0x1ae3e7038a10  0x7020
  0xfffffffe0b70: 0x1     0xfff9800000000000
  0xfffffffe0b80: 0xfffe2a699b001720      0x7ff8000000000000
  0xfffffffe0b90: 0x7ff8000000000000      0x7fc00000befc7a60
  0xfffffffe0ba0: 0x200000001     0xfffe2a699b001d80
  0xfffffffe0bb0: 0x2a699b001c40  0x2a699b001de0
  0xfffffffe0bc0: 0xfffe2a699b001de0      0x2a699b001de0
  0xfffffffe0bd0: 0x3303befc7a60  0x3303bef8d040
(entry frame)
  0xfffffffe0be0: 0x1ae3e6f15b4c  0x5043
  0xfffffffe0bf0: 0x3303befbca62  0x0
  0xfffffffe0c00: 0x0     0xfffffffe1698
  0xfffffffe0c10: 0x0     0xfffffffe1698
  0xfffffffe0c20: 0x0     0xfffffffe1698
  0xfffffffe0c30: 0xfffffffe0c50  0xaaaaaaec1ba0 <mozilla::detail::PoisonObject<JS::AutoAssertNoGC>(JS::AutoAssertNoGC*)+36>
  0xfffffffe0c40: 0xdeadd11d      0xdeadd00d
(c++ stack)
  0xfffffffe0c50: 0x0     0x0
  0xfffffffe0c60: 0x0     0x0
  0xfffffffe0c70: 0x0     0x0
  0xfffffffe0c80: 0x0     0x0
  0xfffffffe0c90: 0xaaaaabac920c <EnterJit(JSContext*, js::RunState&, unsigned char*)+1304>       0xfffffffe10d8
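
The annotation on the exit frame descriptor can be checked with a little arithmetic. The following is only a sketch: the bit layout is an assumption inferred from the observed value 0x8020 (frame size 128, header size 16, IonJS), not copied from JitFrames.h, and the helper names are hypothetical. It shows that 0x8020 makes the iterator land on 0xfffffffe0b50 (this->current_), 16 bytes short of the real JIT frame at 0xfffffffe0b60, whereas 0x9020 or 0x8040 would both land on it.

```cpp
#include <cstdint>

// Assumed layout, inferred from the observed descriptor 0x8020 and NOT
// copied from JitFrames.h: bits 0-3 hold the frame type, bits 4-6 the
// header size in 8-byte words, bit 7 a flag, bits 8+ the frame size.
constexpr uint64_t kWordSize = 8;
constexpr uint64_t kHeaderSizeShift = 4;
constexpr uint64_t kHeaderSizeMask = 0x7;
constexpr uint64_t kFrameSizeShift = 8;
constexpr uint64_t kIonJS = 0;  // assumed numeric value of FrameType::IonJS

// Hypothetical stand-in for MakeFrameDescriptor under the layout above.
constexpr uint64_t makeDescriptor(uint64_t frameSize, uint64_t type,
                                  uint64_t headerSize) {
  return (frameSize << kFrameSizeShift) |
         ((headerSize / kWordSize) << kHeaderSizeShift) | type;
}

// Address at which a frame iterator would expect the caller's frame header:
// skip the exit frame header, then skip the caller's frame contents.
constexpr uint64_t callerFrame(uint64_t exitFP, uint64_t descriptor) {
  return exitFP +
         ((descriptor >> kHeaderSizeShift) & kHeaderSizeMask) * kWordSize +
         (descriptor >> kFrameSizeShift);
}

constexpr uint64_t kExitFP = 0xfffffffe0ac0;    // packedExitFP_ in the dump
constexpr uint64_t kJitFrame = 0xfffffffe0b60;  // "(jit frame)" in the dump

// The arguments seen in MakeFrameDescriptor reproduce the bad descriptor.
static_assert(makeDescriptor(128, kIonJS, 16) == 0x8020, "observed value");
// The bad descriptor points at this->current_, 16 bytes too low.
static_assert(callerFrame(kExitFP, 0x8020) == 0xfffffffe0b50, "current_");
// Either suggested value would point at the real JIT frame.
static_assert(callerFrame(kExitFP, 0x9020) == kJitFrame, "larger frame");
static_assert(callerFrame(kExitFP, 0x8040) == kJitFrame, "larger header");
```

Under these assumptions the descriptor is short by exactly 16 bytes, which is why either a frame size of 0x90 or a header size of 32 would make the walk succeed.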

Instrumenting the frame descriptor to include an 8-bit unique identifier (no. 79) and matching the arguments showed that this frame descriptor was produced by the following call stack:

#0  js::jit::MakeFrameDescriptor (frameSize=128, type=js::jit::FrameType::IonJS, headerSize=16) at /home/nicolas/mozilla/wksp-0/js/src/jit/JitFrames.h:282
#1  0x0000aaaaab8c9948 in js::jit::MacroAssembler::pushStaticFrameDescriptor (this=0xaaaaaddf39a0, type=js::jit::FrameType::IonJS, headerSize=16) at /home/nicolas/mozilla/wksp-0/js/src/jit/MacroAssembler-inl.h:220
#2  0x0000aaaaab9870c0 in js::jit::MacroAssembler::buildFakeExitFrame (this=0xaaaaaddf39a0, scratch=...) at /home/nicolas/mozilla/wksp-0/js/src/jit/MacroAssembler-inl.h:254
#3  0x0000aaaaab94844c in js::jit::CodeGenerator::visitCallNative (this=0xaaaaaddf3960, call=0xaaaaadd6e3f8) at /home/nicolas/mozilla/wksp-0/js/src/jit/CodeGenerator.cpp:4367
#4  0x0000aaaaab94e3e8 in js::jit::CodeGenerator::generateBody (this=0xaaaaaddf3960) at /home/nicolas/mozilla/wksp-0/js/src/jit/CodeGenerator.cpp:5963
#5  0x0000aaaaab9700c8 in js::jit::CodeGenerator::generate (this=0xaaaaaddf3960) at /home/nicolas/mozilla/wksp-0/js/src/jit/CodeGenerator.cpp:10268
#6  0x0000aaaaab9e695c in js::jit::GenerateCode (mir=0xaaaaadcebd60, lir=0xaaaaadd5bf48) at /home/nicolas/mozilla/wksp-0/js/src/jit/Ion.cpp:1747
#7  0x0000aaaaab9e6a4c in js::jit::CompileBackEnd (mir=0xaaaaadcebd60) at /home/nicolas/mozilla/wksp-0/js/src/jit/Ion.cpp:1768
#8  0x0000aaaaaba48eb4 in js::jit::IonCompile (cx=0xaaaaadb942d0, script=0x3066d8cbca60, baselineFrame=0xfffffffe0ba8, osrPc=0xaaaaadcc329c "\343\201B\a+\377\377\377\346\v\003", recompile=false, optimizationLevel=js::jit::OptimizationLevel::Normal)
    at /home/nicolas/mozilla/wksp-0/js/src/jit/Ion.cpp:2086

So the stack is probably not being accounted for properly on ARM64.

I investigated how masm.framePushed() evolves in the CodeGenerator and how the stack pointer evolves in the generated code. The problem does not seem to be an unaccounted stack manipulation in this code:

void CodeGenerator::visitCallNative(LCallNative* call) {

  : $sp = (void *) 0xfffffffe0ad8
  : /x $x28 = 0xfffffffe0ae0
  // masm.framePushed_ == 0x70
  masm.checkStackAlignment();
; => 0x356df82727c0:      mov     x16, x28
;    0x356df82727c4:      tst     x16, #0x7
;    0x356df82727c8:      b.eq    0x356df82727d0  // b.none
;    0x356df82727cc:      brk     #0x0
;    0x356df82727d0:      mov     x16, xzr
;    0x356df82727d4:      mov     x16, sp
;    0x356df82727d8:      tst     x16, #0x7
;    0x356df82727dc:      b.eq    0x356df82727e4  // b.none
;    0x356df82727e0:      brk     #0x0
;    0x356df82727e4:      mov     x16, xzr

  // unusedStack == 0
  masm.adjustStack(unusedStack);

  : $sp = (void *) 0xfffffffe0ad8
  : /x $x28 = 0xfffffffe0ae0
  masm.Push(ObjectValue(*target->rawJSFunction()));
;    0x356df82727e8:      ldr     x16, 0x356df8272be4
;    0x356df82727ec:      sub     sp, x28, #0x8
;    0x356df82727f0:      str     x16, [x28, #-8]!

  : $sp = (void *) 0xfffffffe0ad8
  : /x $x28 = 0xfffffffe0ad8
  // masm.framePushed_ == 0x78
  masm.loadJSContext(argContextReg);
;    0x356df82727f4:      mov     x0, #0x42d0                     // #17104
;    0x356df82727f8:      movk    x0, #0xadb9, lsl #16
;    0x356df82727fc:      movk    x0, #0xaaaa, lsl #32

  : $sp = (void *) 0xfffffffe0ad8
  : /x $x28 = 0xfffffffe0ad8
  masm.move32(Imm32(call->numActualArgs()), argUintNReg);
;    0x356df8272800:      mov     w1, #0x2                        // #2

  masm.moveStackPtrTo(argVpReg);
;    0x356df8272804:      mov     x2, x28

  masm.Push(argUintNReg);
;    0x356df8272808:      sub     sp, x28, #0x8
;    0x356df827280c:      str     x1, [x28, #-8]!

  : $sp = (void *) 0xfffffffe0ad0
  : /x $x28 = 0xfffffffe0ad0
  // masm.framePushed_ == 0x80
  uint32_t safepointOffset = masm.buildFakeExitFrame(tempReg);
    pushStaticFrameDescriptor(FrameType::IonJS, ExitFrameLayout::Size());
;    0x356df8272810:      mov     w16, #0x24f0                    // instrumented frame descriptor 0x802__0 + 0x___4f_ identifier.
;    0x356df8272814:      movk    w16, #0x80, lsl #16
;    0x356df8272818:      sub     sp, x28, #0x8
;    0x356df827281c:      str     x16, [x28, #-8]!

  : $sp = (void *) 0xfffffffe0ac8
  : /x $x28 = 0xfffffffe0ac8
    uint32_t retAddr = pushFakeReturnAddress(scratch);
;    0x356df8272820:      adr     x3, 0x356df827282c
;    0x356df8272824:      sub     sp, x28, #0x8
;    0x356df8272828:      str     x3, [x28, #-8]!

  : $sp = (void *) 0xfffffffe0ac0
  : /x $x28 = 0xfffffffe0ac0
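
The bookkeeping in the trace above can be replayed as a small model. This is a sketch with hypothetical names (StackModel, push), using only the x28 and framePushed_ values printed in the trace. It checks that every push moves x28 and framePushed_ in lockstep, so x28 + framePushed_ stays at 0xfffffffe0b50 throughout, while the real JIT frame header sits at 0xfffffffe0b60. The 16-byte error is therefore already present on entry to visitCallNative.

```cpp
#include <cstdint>

// Hypothetical model of the two quantities that must stay in sync.
struct StackModel {
  uint64_t psp;          // x28, the pseudo stack pointer used by JIT code
  uint64_t framePushed;  // what the assembler believes has been pushed
};

// One 8-byte push: x28 moves down, framePushed moves up.
constexpr StackModel push(StackModel s) {
  return {s.psp - 8, s.framePushed + 8};
}

// Frame base implied by the assembler's accounting.
constexpr uint64_t impliedBase(StackModel s) { return s.psp + s.framePushed; }

constexpr StackModel kEntry{0xfffffffe0ae0, 0x70};      // entry to visitCallNative
constexpr StackModel kAfterCallee = push(kEntry);       // masm.Push(ObjectValue(...))
constexpr StackModel kAfterArgc = push(kAfterCallee);   // masm.Push(argUintNReg)

// Values match the trace after each push.
static_assert(kAfterCallee.psp == 0xfffffffe0ad8, "x28 after first push");
static_assert(kAfterCallee.framePushed == 0x78, "framePushed after first push");
static_assert(kAfterArgc.psp == 0xfffffffe0ad0, "x28 after second push");
static_assert(kAfterArgc.framePushed == 0x80, "framePushed after second push");

// buildFakeExitFrame pushes two more words: descriptor and return address.
static_assert(kAfterArgc.psp - 16 == 0xfffffffe0ac0, "exit frame pointer");

// The implied frame base never changes inside visitCallNative...
static_assert(impliedBase(kEntry) == impliedBase(kAfterArgc), "in lockstep");
// ...but it is 16 bytes below the real JIT frame header from the dump.
constexpr uint64_t kJitFrame = 0xfffffffe0b60;
static_assert(kJitFrame - impliedBase(kEntry) == 16, "already off on entry");
```
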

Thus, the next suspect is the function entry, which might push extra stack arguments without registering them in the masm.framePushed() counter.

OK, instrumenting the CodeGenerator::generateBody function to break after the first instruction that produces a bad stack offset helped me identify the issue, which only shows up later, in the next call to LCallNative.

I identified the issue: MIonToWasmCall is not restoring the stack pointer correctly. I will reverse engineer the code generated for this instruction and try to figure out where the stack difference comes from.
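
To illustrate why an unrestored stack pointer after one call corrupts a later frame descriptor, here is a toy model. Everything in it is an assumption made for illustration: the Regs struct, the callee behaviour, and the starting addresses are hypothetical, and the 16-byte drift is simply the discrepancy observed in the dump above. It does not describe the generated code for MIonToWasmCall or the actual patch.

```cpp
#include <cstdint>

// Hypothetical register model: ARM64 JIT code addresses its frame through
// x28 (an emulated stack pointer) rather than through the real sp.
struct Regs {
  uint64_t sp;   // real stack pointer
  uint64_t psp;  // x28, the emulated stack pointer
};

constexpr uint64_t kDrift = 16;  // discrepancy observed in the dump

// Hypothetical callee: returns with x28 16 bytes lower than at the call.
constexpr Regs calleeReturns(Regs atCall) {
  return {atCall.sp, atCall.psp - kDrift};
}

// Call sequence that trusts the callee to leave x28 intact.
constexpr Regs callWithoutRestore(Regs atCall) { return calleeReturns(atCall); }

// Call sequence that re-establishes x28 from the value it had at the call.
constexpr Regs callWithRestore(Regs atCall) {
  return {calleeReturns(atCall).sp, atCall.psp};
}

// Frame base implied by x28 plus the assembler's framePushed counter.
constexpr uint64_t impliedBase(Regs r, uint64_t framePushed) {
  return r.psp + framePushed;
}

// Hypothetical state before the call, chosen so the base is 0xfffffffe0b60.
constexpr Regs kBefore{0xfffffffe0af0, 0xfffffffe0af0};
constexpr uint64_t kFramePushed = 0x70;

static_assert(impliedBase(kBefore, kFramePushed) == 0xfffffffe0b60,
              "consistent before the call");
static_assert(impliedBase(callWithoutRestore(kBefore), kFramePushed) ==
                  0xfffffffe0b50,
              "drift leaks into every later frame computation");
static_assert(impliedBase(callWithRestore(kBefore), kFramePushed) ==
                  0xfffffffe0b60,
              "restoring x28 keeps the accounting valid");
```

In this model the counter itself is never wrong. The register it is measured against is, which would explain why the failure only surfaces at the next LCallNative that builds an exit frame.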

Priority: P2 → P1
Duplicate of this bug: 1522302
Pushed by npierron@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d7989f40291e
ARM64: Ensure that the emulated stack pointer is restored when returning from WASM. r=bbouvier,sstangl
Depends on: 1526923
Depends on: 1526959
Status: ASSIGNED → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla67