Closed Bug 959328 Opened 10 years ago Closed 8 years ago

JS_ASSERT(!ranges_.empty()) failing at js/src/jit/LiveRangeAllocator.h:267

Categories

Product/Component: Core :: JavaScript Engine: JIT
Type: defect
Platform: ARM, Gonk (Firefox OS)
Priority: Not set
Severity: normal

Tracking

Status: RESOLVED WONTFIX
tracking-b2g: backlog

People

Reporter: botond
Assignee: Unassigned

Details

Keywords: regression

When running a debug build of B2G master, I am getting an assertion failure every time I try to use the keyboard app. Here is the stack trace:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 744.832]
0x41ccc1fc in js::jit::LiveInterval::start (this=0x44ffeb30) at /home/botond/dev/mozilla/central/js/src/jit/LiveRangeAllocator.h:267
267             JS_ASSERT(!ranges_.empty());
(gdb) i s
#0  0x41ccc1fc in js::jit::LiveInterval::start (this=0x44ffeb30) at /home/botond/dev/mozilla/central/js/src/jit/LiveRangeAllocator.h:267
#1  js::jit::LiveRangeAllocator<js::jit::LinearScanVirtualRegister, true>::findFirstSafepoint (this=0x44ffeb30) at /home/botond/dev/mozilla/central/js/src/jit/LiveRangeAllocator.h:721
#2  js::jit::LinearScanAllocator::populateSafepoints (this=0x44ffeb30) at /home/botond/dev/mozilla/central/js/src/jit/LinearScan.cpp:499
#3  0x41cd009c in js::jit::LinearScanAllocator::go (this=0x44ffeb30) at /home/botond/dev/mozilla/central/js/src/jit/LinearScan.cpp:1272
#4  0x41c63a2c in js::jit::GenerateLIR (mir=0x4681d170) at /home/botond/dev/mozilla/central/js/src/jit/Ion.cpp:1436
#5  0x41c63c2e in js::jit::CompileBackEnd (mir=0x4681d170, maybeMasm=0x0) at /home/botond/dev/mozilla/central/js/src/jit/Ion.cpp:1527
#6  0x41c6a6d0 in IonCompile (cx=0x404c75e0, script=..., osrFrame=<value optimized out>, osrPc=0x0, constructing=<value optimized out>, executionMode=js::SequentialExecution) at /home/botond/dev/mozilla/central/js/src/jit/Ion.cpp:1776
#7  Compile (cx=0x404c75e0, script=..., osrFrame=<value optimized out>, osrPc=0x0, constructing=<value optimized out>, executionMode=js::SequentialExecution) at /home/botond/dev/mozilla/central/js/src/jit/Ion.cpp:1979
#8  0x41c6aca0 in js::jit::CompileFunctionForBaseline (cx=0x404c75e0, script=..., frame=0x44ffef70, isConstructing=false) at /home/botond/dev/mozilla/central/js/src/jit/Ion.cpp:2145
#9  0x4203728c in EnsureCanEnterIon (cx=0x404c75e0, stub=<value optimized out>, frame=0x44ffef70, infoPtr=0x44ffef2c) at /home/botond/dev/mozilla/central/js/src/jit/BaselineIC.cpp:768
#10 DoUseCountFallback (cx=0x404c75e0, stub=<value optimized out>, frame=0x44ffef70, infoPtr=0x44ffef2c) at /home/botond/dev/mozilla/central/js/src/jit/BaselineIC.cpp:934
#11 0x4366f6b4 in ?? ()
#12 0x4366f6b4 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

The assertion failure occurs while compiling this line: https://github.com/mozilla-b2g/gaia/blob/master/apps/keyboard/js/imes/latin/predictions.js#L891

Please let me know if there is any other information I can provide.
I fear it's another dupe like bug 959126
Flags: needinfo?(nicolas.b.pierron)
Oh, and this is a regression. I have been running debug builds of B2G and using them in the same way and not triggering this assertion. Before today, I last pulled Gecko on Jan 3, so this appeared sometime since then.
Keywords: regression
I'm also seeing a crash that looks related when using the browser. This is basically making a debug build of B2G unusable:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1367.1388]
0xb5f0eeb4 in back (this=<optimized out>) at ../../dist/include/mozilla/Vector.h:384
384           MOZ_ASSERT(!entered);
(gdb) i s
#0  0xb5f0eeb4 in back (this=<optimized out>) at ../../dist/include/mozilla/Vector.h:384
#1  start (this=<optimized out>) at ../../../js/src/jit/LiveRangeAllocator.h:268
#2  findFirstSafepoint (startFrom=<optimized out>, interval=<optimized out>, this=<optimized out>) at ../../../js/src/jit/LiveRangeAllocator.h:721
#3  js::jit::LinearScanAllocator::populateSafepoints (this=0xb267f3d8) at ../../../js/src/jit/LinearScan.cpp:499
#4  0xb5f12c78 in js::jit::LinearScanAllocator::go (this=0xb267f3d8) at ../../../js/src/jit/LinearScan.cpp:1272
#5  0xb5eb3ff2 in js::jit::GenerateLIR (mir=0xb1910150) at ../../../js/src/jit/Ion.cpp:1436
#6  0xb5eb40ae in js::jit::CompileBackEnd (mir=0xb1910150, maybeMasm=0x0) at ../../../js/src/jit/Ion.cpp:1527
#7  0xb6074012 in js::WorkerThread::handleIonWorkload (this=0xb3d9ba58, state=...) at ../../../js/src/jsworkers.cpp:785
#8  0xb6074a12 in js::WorkerThread::threadLoop (this=0xb3d9ba58) at ../../../js/src/jsworkers.cpp:1024
#9  0xb488e458 in _pt_root (arg=0xb2986200) at ../../../../../nsprpub/pr/src/pthreads/ptthread.c:205
#10 0xb6e6fa5c in __thread_entry (func=0xb488e3b9 <_pt_root>, arg=0xb2986200, tls=0xb267ff00) at bionic/libc/bionic/pthread_create.cpp:92
#11 0xb6e6fbd8 in pthread_create (thread_out=0xbec76594, attr=<optimized out>, start_routine=0x78, arg=0xb2986200) at bionic/libc/bionic/pthread_create.cpp:201
#12 0xb2986200 in ?? ()
Cannot access memory at address 0x0
#13 0xb2986200 in ?? ()
Cannot access memory at address 0x0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Can somebody bisect this?
(In reply to Botond Ballo [:botond] from comment #3)
> I'm also seeing a crash that looks related when using the browser. This is
> basically making a debug build of B2G unusable:

For debugging, you can disable ion & asmjs.
Push a preference file to /system/b2g/defaults/pref/user.js which contains:

pref("javascript.options.asmjs", false);
pref("javascript.options.ion.content", false);
Flags: needinfo?(nicolas.b.pierron)
Flags: needinfo?(nicolas.b.pierron)
I still see the crash in comment #3.

STR:
  1. Build the 1.3 branch of B2G in debug mode.
  2. Launch the Marketplace app.

Before the Marketplace app finishes launching, I get a crash with the backtrace in comment #3. Happens every time, on both Nexus 4 and Buri.
I was able to reproduce it, but I do not understand anything that I am getting out of gdb …

(gdb) p ranges_
Cannot access memory at address 0x8

--> I guess this might be caused by the clobber of the register which contains "this".

(gdb) up
#1  js::jit::LiveRangeAllocator<js::jit::LinearScanVirtualRegister, true>::findFirstSafepoint (this=0xbef73110) at /root/B2G/gecko/js/src/jit/LiveRangeAllocator.h:721
(gdb) p interval
$26 = <value optimized out>

The code is interval->start(), and interval is an immutable pointer argument.

(gdb) up
#2  js::jit::LinearScanAllocator::populateSafepoints (this=0xbef73110) at /root/B2G/gecko/js/src/jit/LinearScan.cpp:496
(gdb) p reg->getInterval(0)
$27 = (js::jit::LiveInterval *) 0x462a4648

So this should be the value of "this" in the previous frame

(gdb) p reg->getInterval(0)->ranges_
$29 = {<mozilla::VectorBase<js::jit::LiveInterval::Range, 1u, js::jit::IonAllocPolicy, js::Vector<js::jit::LiveInterval::Range, 1u, js::jit::IonAllocPolicy> >> = {<js::jit::IonAllocPolicy> = {alloc_ = @0x46287010}, static sElemIsPod = false, 
    static sMaxInlineBytes = 1024, static sInlineCapacity = 1, static sInlineBytes = 8, mBegin = 0x462a4668, mLength = 1, mCapacity = 1, mReserved = 1, storage = {u = {bytes = "\004\000\000\000\016\000\000", _ = 60129542148}}, entered = false, 
    static sMaxInlineStorage = <optimized out>}, <No data fields>}

But then, I do not understand why "ranges_.empty()" would be true, knowing that the "mLength" is 1.

I will try to recompile with more debug info.
I'm running into this crash as well and it is blocking me from debugging bug 969483.
Flags: needinfo?(nicolas.b.pierron)
See Also: → 993317
I can't reproduce the problem after applying the patch in bug 958432, which removes two assertions. The removed assertions are in the function into which the failing assertion in question was inlined. It looks like gcc's optimizations make all failed assertions jump to the same point, so gdb is unable to tell exactly which assertion failed. The disassembly also shows that there are several jumps to that failure address, where 0x123 is written to address 0 and abort() is then called. This is what JS_ASSERT() does when it fails.

The problem in bug 958432 only existed within

0593b5d7f1f497fd9f3378ace390ca82855af671
Thu Jan 9 12:10:14 2014 +0100

to

6e3e14f3054b1fbe7744b241dd41a3a2b9404aef
Mon Jan 13 20:45:01 2014 +0100

So I'd like to confirm with nbp and Kartikaya: are you testing with the 1.3 or 1.3t branch, or with m-c checked out between those two commits?
Flags: needinfo?(nicolas.b.pierron)
Flags: needinfo?(bugmail.mozilla)
I haven't run into this recently, but I haven't been doing debug builds recently either. Considering that I posted the comment above on Feb 14, I almost certainly was reproducing it on an m-c build outside that range, since I'm usually not more than a week out of sync.
Flags: needinfo?(bugmail.mozilla)
(In reply to Ting-Yuan Huang from comment #9)
> I can't reproduce the problem after applying the patch in bug 958432, which
> removes two assertions. The removed assertions are in the function into
> which the failing assertion in question was inlined. It looks like gcc's
> optimizations make all failed assertions jump to the same point, so gdb is
> unable to tell exactly which assertion failed. The disassembly also shows
> that there are several jumps to that failure address, where 0x123 is
> written to address 0 and abort() is then called. This is what JS_ASSERT()
> does when it fails.

Have you tried using MOZ_ASSERT instead and giving it a payload to identify which of the assertions is causing the failure?
Flags: needinfo?(nicolas.b.pierron)
I haven't tried MOZ_ASSERT*, but I compiled the function in question with __attribute__((optimize(0))) and found that it always failed where bug 958432 failed.

* It seems that JS_ASSERT is defined as MOZ_ASSERT?
http://dxr.mozilla.org/mozilla-central/source/js/public/Utility.h#51
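
For readers unfamiliar with the per-function trick mentioned above, here is a minimal sketch of what it can look like. The macro name SKETCH_NO_OPT and the function firstIndexAtOrAfter are hypothetical and are not taken from the Gecko tree. The attribute is GCC-specific, so the sketch compiles it away on other compilers. GCC's documentation describes this attribute as a debugging aid only, which matches how it is used in this bug.

```cpp
#include <cstddef>
#include <vector>

// GCC-only: ask the compiler to build just this function at optimization
// level 0, so that each source line keeps its own code and gdb can attribute
// a crash to the right line. Other compilers get an empty macro.
#if defined(__GNUC__) && !defined(__clang__)
#  define SKETCH_NO_OPT __attribute__((optimize(0)))
#else
#  define SKETCH_NO_OPT
#endif

// Hypothetical function standing in for "the function in question".
// Returns the index of the first position >= from, or positions.size()
// if there is none.
SKETCH_NO_OPT
std::size_t firstIndexAtOrAfter(const std::vector<int>& positions, int from) {
    std::size_t i = 0;
    while (i < positions.size() && positions[i] < from)
        ++i;
    return i;
}
```

The rest of the translation unit keeps its normal optimization level, which is why this is less disruptive than rebuilding the whole engine unoptimized.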
(In reply to Ting-Yuan Huang from comment #12)
> I haven't tried MOZ_ASSERT*, but I compiled the function in question with
> __attribute__((optimize(0))) and found that it always failed where
> bug 958432 failed.

So this should be fixed then?

> * It seems that JS_ASSERT is defined as MOZ_ASSERT?
> http://dxr.mozilla.org/mozilla-central/source/js/public/Utility.h#51

Yes, but MOZ_ASSERT has a second variant which accepts a string as the reason for the failure.
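
As an illustration of the idea of giving each assertion its own reason string, here is a small self-contained sketch. Everything in it is an assumption made for illustration: SKETCH_ASSERT, SketchInterval, firstSafepointAfter and failureReason are hypothetical names, the interval is reduced to a std::vector, and the macro throws an exception instead of crashing the process so that the reported reason can be inspected. It is not the MOZ_ASSERT implementation.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical assertion macro carrying a reason string. A real engine
// assertion aborts; this sketch throws so the reason can be observed.
#define SKETCH_ASSERT(cond, reason)                      \
    do {                                                 \
        if (!(cond))                                     \
            throw std::logic_error(std::string(reason)); \
    } while (0)

struct Range {
    int from;
    int to;
};

// Simplified stand-in for a live interval holding a list of ranges.
struct SketchInterval {
    std::vector<Range> ranges;

    int start() const {
        SKETCH_ASSERT(!ranges.empty(), "start: interval has no ranges");
        return ranges.back().from;
    }
};

// Caller with its own assertion. Once start() is inlined here, both checks
// can end up sharing one failure path, so the reason string is what tells
// them apart.
int firstSafepointAfter(const SketchInterval* interval) {
    SKETCH_ASSERT(interval != nullptr, "firstSafepointAfter: null interval");
    return interval->start();
}

// Helper that reports which assertion fired, if any.
std::string failureReason(const SketchInterval* interval) {
    try {
        firstSafepointAfter(interval);
        return "no failure";
    } catch (const std::logic_error& e) {
        return e.what();
    }
}
```

With distinct strings, the failing check can be identified even if the optimizer merges the failure branches, as long as the string reaches the log or the crash report.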
(In reply to Nicolas B. Pierron [:nbp] from comment #13)
> So this should be fixed then?

Yes, but not on 1.3/1.3T yet. I suspect this bug is actually identical to bug 958432, but Kartikaya's comment suggests that there could be other problems.
 
> > * It seems that JS_ASSERT is defined as MOZ_ASSERT?
> > http://dxr.mozilla.org/mozilla-central/source/js/public/Utility.h#51
> 
> Yes, but MOZ_ASSERT has a second variant which accepts a string as the
> reason for the failure.

Oh, I see. You mean supplying a different argument to each assertion? I think the conclusion will be the same as with __attribute__((optimize(0))).
Triage: let's not block the tarako release on this. If we have a safe solution, let's evaluate whether we can uplift it to 1.3T. Thanks.
blocking-b2g: 1.3T? → backlog
I reliably hit this when launching the music app on tarako. It seems to happen during the media scanning sequence.
Easy to reproduce on tarako. We should fix this.
Flags: needinfo?(nihsanullah)
Ting-Yuan, if I follow the comments correctly, is the fix to remove the two assertions on 1.3/1.3T now?
Flags: needinfo?(nihsanullah)
Yes, I believe so.
blocking-b2g: backlog → ---
This assertion does not exist anymore.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX