Closed
Bug 1315596
Opened 8 years ago
Closed 6 years ago
Assertion failure: !minimalBundle(bundle) in real-world wasm test case
Categories
(Core :: JavaScript Engine: JIT, defect, P1)
Tracking
()
RESOLVED
DUPLICATE
of bug 1325450
People
(Reporter: alonzakai, Assigned: bhackett1024)
References
Details
(Keywords: stale-bug)
Attachments
(5 files)
25.70 KB,
application/octet-stream
|
Details | |
77.00 KB,
application/javascript
|
Details | |
1020 bytes,
patch
|
Details | Diff | Splinter Review | |
146.96 KB,
text/plain
|
Details | |
2.35 KB,
patch
|
sunfish
:
review+
jcristau
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
User Agent: Mozilla/5.0 (X11; Linux i686; rv:52.0) Gecko/20100101 Firefox/52.0 Build ID: 20161106030203 Steps to reproduce: Running the attached .js file will hang in loading the wasm. This is from binaryen2.test_i64 in the emscripten test suite. Actual results: The hang appears in the shell, from latest trunk.
Comment 2•8 years ago
|
||
That seems like something we want to have fixed in this release, right Benjamin? Note: merge is next week. As a result there is not a lot of time left.
Flags: needinfo?(bbouvier)
Priority: -- → P1
Comment 3•8 years ago
|
||
Indeed we'd like to have this fixed as soon as possible. That being said, I can't repro with a debug build on x64 nor x86. Alon, which version of the tree are you using? Which specific flags do you use for configure / compiling?
Flags: needinfo?(bbouvier) → needinfo?(azakai)
This is on latest inbound of last night (321349:4b1de4471e4b) but before updating I saw it on an earlier commit too, from at least a week ago, so it's not a super-recent thing. This is with --enable-optimize --disable-debug. 32-bit linux, building with the system compiler (gcc 4.8.2).
This is on a slow machine with 2 cores btw, maybe timing related. Meanwhile, looks like it's not a true hang, as it does allocate memory very slowly. Eventually (minutes) it OOMs. gdb says main thread is in #0 0xb7fdd428 in __kernel_vsyscall () #1 0xb7faad4b in pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/pthread_cond_wait.S:187 #2 0x080fbbe2 in wait (lock=..., this=0xb7a04334) at /mozilla-inbound/js/src/threading/posix/ConditionVariable.cpp:118 #3 js::ConditionVariable::wait_for (this=0xb7a04334, lock=..., a_rel_time=...) at /mozilla-inbound/js/src/threading/posix/ConditionVariable.cpp:134 #4 0x08496eb1 in js::GlobalHelperThreadState::wait (this=0xb7a04240, locked=..., which=which@entry=js::GlobalHelperThreadState::CONSUMER, timeout=...) at /mozilla-inbound/js/src/vm/HelperThreads.cpp:786 #5 0x08619edb in js::wasm::ModuleGenerator::finishOutstandingTask (this=this@entry=0xbfffcf34) at /mozilla-inbound/js/src/wasm/WasmGenerator.cpp:220 and a bunch of threads look like #0 0xb7fdd428 in __kernel_vsyscall () #1 0xb7faad4b in pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/pthread_cond_wait.S:187 #2 0x080fbbe2 in wait (lock=..., this=0xb7a04364) at /mozilla-inbound/js/src/threading/posix/ConditionVariable.cpp:118 #3 js::ConditionVariable::wait_for (this=this@entry=0xb7a04364, lock=..., a_rel_time=...) at /mozilla-inbound/js/src/threading/posix/ConditionVariable.cpp:134 #4 0x0849a7ee in wait (timeout=..., which=js::GlobalHelperThreadState::PRODUCER, locked=..., this=0xb7a04240) at /mozilla-inbound/js/src/vm/HelperThreads.cpp:786 #5 js::HelperThread::threadLoop (this=0xb7a20000) at /mozilla-inbound/js/src/vm/HelperThreads.cpp:1864 while the running thread is #0 js::jit::BacktrackingAllocator::minimalBundle (this=this@entry=0xb5ffaf80, bundle=bundle@entry=0xb4453e40, pfixed=pfixed@entry=0xb5ffac0c) at /mozilla-inbound/js/src/jit/BacktrackingAllocator.cpp:2438 #1 0x086c2932 in js::jit::BacktrackingAllocator::computeSpillWeight (this=this@entry=0xb5ffaf80, bundle=0xb4453e40) at /mozilla-inbound/js/src/jit/BacktrackingAllocator.cpp:2498 #2 0x086c2a9e in js::jit::BacktrackingAllocator::maximumSpillWeight (this=this@entry=0xb5ffaf80, bundles=...) at /mozilla-inbound/js/src/jit/BacktrackingAllocator.cpp:2556 #3 0x086c7ca2 in js::jit::BacktrackingAllocator::tryAllocateRegister (this=this@entry=0xb5ffaf80, r=..., bundle=bundle@entry=0xab7bf890, success=success@entry=0xb5ffad44, pfixed=pfixed@entry=0xb5ffad40, conflicting=...) at /mozilla-inbound/js/src/jit/BacktrackingAllocator.cpp:1457 #4 0x086c7f20 in tryAllocateRegister (conflicting=..., pfixed=<optimized out>, success=0xb5ffad44, bundle=<optimized out>, r=..., this=<optimized out>) at /mozilla-inbound/js/src/jit/BacktrackingAllocator.cpp:1392 #5 js::jit::BacktrackingAllocator::tryAllocateNonFixed (this=this@entry=0xb5ffaf80, bundle=bundle@entry=0xab7bf890, requirement=..., hint=..., success=success@entry=0xb5ffad44, pfixed=pfixed@entry=0xb5ffad40, conflicting=...) at /mozilla-inbound/js/src/jit/BacktrackingAllocator.cpp:1223 #6 0x086dd884 in js::jit::BacktrackingAllocator::processBundle (this=this@entry=0xb5ffaf80, mir=0xb5ffb4cc, bundle=bundle@entry=0xab7bf890) at /mozilla-inbound/js/src/jit/BacktrackingAllocator.cpp:1293 #7 0x086dda5f in js::jit::BacktrackingAllocator::go (this=this@entry=0xb5ffaf80) at /mozilla-inbound/js/src/jit/BacktrackingAllocator.cpp:837 #8 0x081e2d5b in js::jit::GenerateLIR (mir=0xb5ffb4cc) at /mozilla-inbound/js/src/jit/Ion.cpp:1954 #9 0x0863f158 in js::wasm::IonCompileFunction (task=task@entry=0xb4344a0c) at /mozilla-inbound/js/src/wasm/WasmIonCompile.cpp:3769 #10 0x086459ae in js::wasm::CompileFunction (task=task@entry=0xb4344a0c) at /mozilla-inbound/js/src/wasm/WasmIonCompile.cpp:3791 #11 0x08499308 in js::HelperThread::handleWasmWorkload (this=this@entry=0xb7a2075c, locked=...) at /mozilla-inbound/js/src/vm/HelperThreads.cpp:1425 #12 0x0849a78a in js::HelperThread::threadLoop (this=0xb7a2075c) at /mozilla-inbound/js/src/vm/HelperThreads.cpp:1872 #13 0x084aaadc in callMain<0u> (this=0xb7a19098) at /mozilla-inbound/js/src/threading/Thread.h:234 #14 js::detail::ThreadTrampoline<void (&)(void*), js::HelperThread*>::Start (aPack=0xb7a19098) at /mozilla-inbound/js/src/threading/Thread.h:227 #15 0xb7fa6f70 in start_thread (arg=0xb5ffcb40) at pthread_create.c:312 #16 0xb7d71bee in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:129
Comment 6•8 years ago
|
||
I can actually reproduce with an optimized build, but not a debug build (be it debug only, or debug&optimized). This suggests an uninitialized variable somewhere... I got back to just after 0xd in spidermonkey (before, the given test case would not validate), and I can reproduce even at this time. Alon, any chance you could provide the same binary compiled with 0xc, please? It won't be an apple-to-apple comparison anymore, but with some luck, it would still reproduce. In the meanwhile, I will try to have a look at compiler warnings and Coverity issues under wasm/ and jit/.
Comment 7•8 years ago
|
||
This small patch helps triggering the issue with a debug build. This is an infinite loop in regalloc, caused because we're trying to split a bundle that's already minimal.
Comment 8•8 years ago
|
||
Assertion failure: !minimalBundle(bundle), at /mozilla-inbound/js/src/jit/BacktrackingAllocator.cpp:1320 Jan, this is an instance of an infinite loop in the regalloc, and you've fixed one recently in bug 1309903. Moreover, it seems related to a WasmCallI64, so the same range of things you've touched in this other bug. Maybe you have an idea of what's going on here? Here's the output of IONFLAGS=regalloc, if that can help. (note you need the previous patch to reproduce on a debug build).
Flags: needinfo?(azakai) → needinfo?(jdemooij)
Updated•8 years ago
|
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: Hang in loading wasm file → Assertion failure: !minimalBundle(bundle) in real-world wasm test case
Comment 9•8 years ago
|
||
WasmCallI64 is a red herring, the actual problem is here: ... [138,139 SubI64] [def v58<g>:tied(0)] [def v59<g>:tied(1)] [use v23:r?] [use v24:r?] [use v28:r] [use v29:r] [140,141 MulI64] [def v61<g>:%eax] [def v62<g>:%edx] [temp v60<g>] [use v28:%eax] [use v29:%edx] [use v23:r] [use v24:r] [142,143 WasmStoreI64] [use v2:r] [use v58:r] [use v59:r] ... We're trying to allocate a register for the MulI64's use of v24. v24 has range [140,142) All registers are in use, so the register allocator tries some things: (1) Evict v13 [46,47). This frees up eax \o/ but eax is also needed for MulI64's fixed use/def, so this doesn't get us anywhere. (2) Evict v29 [132,141). This frees up edx \o/ but like eax, it doesn't let us make progress. Then we reach MAX_ATTEMPTS (== 2), so we give up and try to split v24, but this fails because v24 is minimal. Bumping MAX_ATTEMPTS to 3 "fixes" this, because at that point ecx is the register with the lowest spill weight and that one we *can* take after splitting v2. bhackett, what do you think?
Flags: needinfo?(jdemooij) → needinfo?(bhackett1024)
Assignee | ||
Comment 10•8 years ago
|
||
(In reply to Jan de Mooij [:jandem] from comment #9) > WasmCallI64 is a red herring, the actual problem is here: > > ... > [138,139 SubI64] [def v58<g>:tied(0)] [def v59<g>:tied(1)] [use v23:r?] [use > v24:r?] [use v28:r] [use v29:r] > [140,141 MulI64] [def v61<g>:%eax] [def v62<g>:%edx] [temp v60<g>] [use > v28:%eax] [use v29:%edx] [use v23:r] [use v24:r] > [142,143 WasmStoreI64] [use v2:r] [use v58:r] [use v59:r] > ... > > We're trying to allocate a register for the MulI64's use of v24. v24 has > range [140,142) > > All registers are in use, so the register allocator tries some things: > > (1) Evict v13 [46,47). This frees up eax \o/ but eax is also needed for > MulI64's fixed use/def, so this doesn't get us anywhere. > > (2) Evict v29 [132,141). This frees up edx \o/ but like eax, it doesn't let > us make progress. > > Then we reach MAX_ATTEMPTS (== 2), so we give up and try to split v24, but > this fails because v24 is minimal. > > Bumping MAX_ATTEMPTS to 3 "fixes" this, because at that point ecx is the > register with the lowest spill weight and that one we *can* take after > splitting v2. > > bhackett, what do you think? Hmm, I don't think this fix would hold up in general. I think the basic problem here is that the regalloc is wasting its time evicting bundles which won't allow it to actually successfully allocate the current bundle, due to fixed register uses/defs of the evicted bundle which overlap the current bundle. Under the 'if (!aliasedConflicting.empty()) { ... }' branch in BacktrackingAllocator::tryAllocateRegister, we could check for fixed uses/defs in the |aliasedConflicting| bundles and avoid updating |conflicting| if any of them overlap the current bundle.
Flags: needinfo?(bhackett1024)
Assignee | ||
Comment 11•8 years ago
|
||
Assignee: nobody → bhackett1024
Attachment #8809133 -
Flags: review?(sunfish)
Comment 12•8 years ago
|
||
Looks like binaryen2.test_i64_precise hits what looks like the same issue. If another testcase would help let me know and I'll attach it.
Assignee | ||
Comment 13•8 years ago
|
||
(In reply to Alon Zakai (:azakai) from comment #12) > Looks like binaryen2.test_i64_precise hits what looks like the same issue. > If another testcase would help let me know and I'll attach it. Does the patch in comment 11 fix the issue?
Comment 14•8 years ago
|
||
Yes, it does, nice.
Comment 15•7 years ago
|
||
Comment on attachment 8809133 [details] [diff] [review] patch Review of attachment 8809133 [details] [diff] [review]: ----------------------------------------------------------------- Just a suggestion: the new code here would work really well refactored into a separate function, so that it can just do a return once it's found a fixed overlap instead of testing hasFixedOverlap at each loop level. And it'd keep tryAllocateRegister tidier.
Attachment #8809133 -
Flags: review?(sunfish) → review+
Comment 16•7 years ago
|
||
Pushed by bhackett@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/1e14d92676fb Don't evict bundles which won't help with allocating the target bundle, r=sunfish.
Comment 17•7 years ago
|
||
Since this seems to appear pretty easily with wasm, could we land this on 52 (which should be the first wasm and branched to m-a monday)?
Flags: needinfo?(bhackett1024)
Assignee | ||
Comment 18•7 years ago
|
||
Comment on attachment 8809133 [details] [diff] [review] patch Approval Request Comment [Feature/regressing bug #]: fairly old, but only seems to affect wasm [User impact if declined]: infinite loop / assertion failure on affected pages [Risks and why]: low risk
Flags: needinfo?(bhackett1024)
Attachment #8809133 -
Flags: approval-mozilla-aurora?
Comment 19•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/1e14d92676fb
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox53:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Comment 21•7 years ago
|
||
Comment on attachment 8809133 [details] [diff] [review] patch jit register allocation tweak needed by wasm, take in aurora52
Attachment #8809133 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Updated•7 years ago
|
status-firefox52:
--- → affected
Comment 22•7 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-aurora/rev/bc93261f3106
Comment 23•7 years ago
|
||
This was backed out for causing bug 1338642.
Status: RESOLVED → REOPENED
status-firefox54:
--- → affected
status-firefox-esr52:
--- → disabled
Resolution: FIXED → ---
Target Milestone: mozilla53 → ---
Comment hidden (obsolete) |
Assignee | ||
Comment 26•6 years ago
|
||
I think that bug 1325450 added a better and more robust fix than the patch in this bug. See also bug 1338642 comment 2.
Flags: needinfo?(bhackett1024)
Updated•6 years ago
|
Status: REOPENED → RESOLVED
Closed: 7 years ago → 6 years ago
Resolution: --- → DUPLICATE
You need to log in
before you can comment on or make changes to this bug.
Description
•