Bug 782659 - SIGILL when profiling on Android
Status: RESOLVED FIXED
Product: Core
Classification: Components
Component: Gecko Profiler
Version: Trunk
Platform: ARM Android
Importance: -- normal
Target Milestone: mozilla17
Assigned To: Alex Crichton [:acrichto]
Blocks: 778979
Reported: 2012-08-14 08:15 PDT by Vladimir Vukicevic [:vlad] [:vladv]
Modified: 2012-08-23 03:50 PDT

Attachments
Don't assert when no registers are available on ARM (3.83 KB, patch)
2012-08-21 17:55 PDT, Alex Crichton [:acrichto]
alex: review+
vladimir: feedback+

Description Vladimir Vukicevic [:vlad] [:vladv] 2012-08-14 08:15:20 PDT
When profiling on Android, we now hit a SIGILL shortly after pages start loading, perhaps in JS execution.

STR: open http://google.com/, hit Pull in the profiler addon to restart with profiling enabled.  Shortly after the browser restarts and the page loads, it crashes with SIGILL.

In a debugger (which you can attach if you open about:blank, restart with profiling of about:blank enabled, and then open a real page):

Program received signal SIGILL, Illegal instruction.
[Switching to Thread 3632]
0x65428828 in ?? ()
(gdb) x/8i $pc-12
   0x6542881c:	movw	r3, #16384	; 0x4000
   0x65428820:	movt	r3, #26855	; 0x68e7
   0x65428824:	adds	r2, r2, r3
=> 0x65428828:	cdppl	0, 13, cr8, cr13, cr0, {0}
   0x6542882c:	str	r6, [r2, #3378]	; 0xd32
   0x65428830:	ldr	r10, [sp, #28]
   0x65428834:	ldr	r9, [r10, #64]	; 0x40
   0x65428838:	ldr	r7, [r10, #-24]
(gdb) where
#0  0x65428828 in ?? ()
#1  0x6d1c2430 in InlineAddTypeProperty (cx=0x7050d430, obj=<optimized out>, id=<optimized out>, type=...)
    at /home/vladimir/proj/mozilla-central/js/src/jsinfer.cpp:2957
#2  0x6aa00198 in ?? ()
#3  0x6aa00198 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) x/12x $pc-16
0x65428818:	0xe1b02202	0xe3043000	0xe34638e7	0xe0922003
0x65428828:	0x5edd8000	0xe5826d32	0xe59da01c	0xe59a9040
0x65428838:	0xe51a7018	0xe51a6010	0xe51a5008	0xe59ff024

The surrounding instructions look reasonable; only the one at $pc is bogus.  This is m-c 75cdb3f932c6 with the patches from bug 778724 applied -- not sure if that makes a difference, but Cc'ing bhackett just in case.  Will try a normal nightly shortly.
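
For anyone wanting to double-check the "one bogus instruction" reading without a device attached, here is a rough Python sketch that re-decodes the words from the `x/12x $pc-16` dump. It only looks at the ARM condition field (bits 31..28) and the generic CDP field layout; the helper names (`cond_name`, `is_cdp`, `decode_cdp`, `odd_ones`) are made up for this illustration and are not part of gdb or the tree.

```python
# Instruction words copied from the `x/12x $pc-16` dump in the description.
BASE = 0x65428818
WORDS = [
    0xe1b02202, 0xe3043000, 0xe34638e7, 0xe0922003,
    0x5edd8000, 0xe5826d32, 0xe59da01c, 0xe59a9040,
    0xe51a7018, 0xe51a6010, 0xe51a5008, 0xe59ff024,
]

# ARM condition codes, indexed by bits 31..28 of the instruction word.
COND = ["eq", "ne", "cs", "cc", "mi", "pl", "vs", "vc",
        "hi", "ls", "ge", "lt", "gt", "le", "al", "nv"]


def cond_name(word):
    """Return the condition-code mnemonic for an ARM-mode instruction word."""
    return COND[(word >> 28) & 0xF]


def is_cdp(word):
    """True if bits 27..24 are 0b1110 and bit 4 is clear (coprocessor data op)."""
    return (word >> 24) & 0xF == 0xE and (word >> 4) & 1 == 0


def decode_cdp(word):
    """Split a CDP word into the fields gdb prints for `cdp`."""
    return {
        "coproc": (word >> 8) & 0xF,
        "opc1": (word >> 20) & 0xF,
        "crd": (word >> 12) & 0xF,
        "crn": (word >> 16) & 0xF,
        "crm": word & 0xF,
        "opc2": (word >> 5) & 0x7,
    }


def odd_ones(words, base=BASE):
    """List (address, word) pairs whose condition field is not AL."""
    return [(base + 4 * i, w) for i, w in enumerate(words)
            if cond_name(w) != "al"]


if __name__ == "__main__":
    for addr, word in odd_ones(WORDS):
        print(hex(addr), hex(word), cond_name(word), is_cdp(word))
```

Every word in the dump except the one at the faulting pc carries the AL condition, and that one word matches the coprocessor field layout gdb shows as `cdppl 0, 13, cr8, cr13, cr0, {0}`, which is consistent with it being a mis-emitted word rather than intended code.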
Comment 1 Vladimir Vukicevic [:vlad] [:vladv] 2012-08-15 20:56:33 PDT
Not caused by bug 778724.  A nightly build is totally unusable when you start it with profiling enabled.
Comment 2 Vladimir Vukicevic [:vlad] [:vladv] 2012-08-15 21:27:04 PDT
From nightlies, 08-06 works, 08-07 is busted.  I suspect bug 778979.
Comment 3 Vladimir Vukicevic [:vlad] [:vladv] 2012-08-16 12:16:30 PDT
Ignore the previous two comments; they were bogus and related to awesomebar issues.  SIGILL issue still remains.
Comment 4 Vladimir Vukicevic [:vlad] [:vladv] 2012-08-17 08:33:10 PDT
Note: this is happening even without the patches in bug 778724; just clean m-c build.
Comment 5 Vladimir Vukicevic [:vlad] [:vladv] 2012-08-17 08:53:56 PDT
Definitely related to js profiling.  Manually disabling the "js" feature in TableTicker.cpp makes this problem go away.
Comment 6 Benoit Girard (:BenWa) 2012-08-17 12:35:29 PDT
This is likely caused by bug 778979, would be nice to get that confirmed.
Comment 7 Vladimir Vukicevic [:vlad] [:vladv] 2012-08-17 12:44:22 PDT
Yeah, my original thinking that it *wasn't* due to that bug was because I was looking at the wrong behaviour (busted awesomebar behaviour with profiling enabled).
Comment 8 Vladimir Vukicevic [:vlad] [:vladv] 2012-08-20 14:54:56 PDT
I've definitely confirmed that this is caused by bug 778979.  A build of the rev just before it landed works fine; immediately after I see SIGILLs on google.com, SIGSEGV on browserquest, and similar.

I'd actually suggest that we back out bug 778979 until we can fix it on mobile.
Comment 9 Alex Crichton [:acrichto] 2012-08-21 17:55:36 PDT
Created attachment 654040
Don't assert when no registers are available on ARM

Turns out this was a problem where the register allocated for a call was something bogus, causing odd errors.

vlad, can you confirm this fixes the problem for Fennec?
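
As a generic illustration of why a bogus register id turns into an illegal instruction (this is NOT the code from the patch, just a sketch under stated assumptions): the ARM-mode `BLX <Rm>` encoding with condition AL is `0xE12FFF30 | Rm`, with Rm in the low four bits. If an allocator hands back a sentinel instead of a real register and the emitter ORs it in unchecked, the whole word is clobbered. `INVALID_REG`, `encode_blx_unchecked` and `encode_blx_checked` are hypothetical names used only here.

```python
BLX_AL = 0xE12FFF30   # ARM-mode BLX <Rm>, condition AL; Rm goes in bits 3..0
INVALID_REG = -1      # hypothetical "no register available" sentinel


def encode_blx_unchecked(rm):
    """OR the register id straight into the opcode, with no validation."""
    return (BLX_AL | rm) & 0xFFFFFFFF


def encode_blx_checked(rm):
    """Refuse to emit anything unless rm names one of r0..r15."""
    if not 0 <= rm <= 15:
        raise ValueError("no valid register for call target: %r" % (rm,))
    return BLX_AL | rm


if __name__ == "__main__":
    print(hex(encode_blx_unchecked(3)))            # a well-formed blx r3
    print(hex(encode_blx_unchecked(INVALID_REG)))  # every bit set, not a blx at all
```

With a valid id the two encoders agree; with the sentinel the unchecked one emits a word whose condition field and opcode bits no longer describe a branch, so the failure shows up at run time in generated code rather than where the register was picked.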
Comment 10 Vladimir Vukicevic [:vlad] [:vladv] 2012-08-22 09:26:19 PDT
Comment on attachment 654040
Don't assert when no registers are available on ARM

Yup, seems to work great!

Any chance this could have caused bug 784687 on desktop?
Comment 11 Alex Crichton [:acrichto] 2012-08-22 09:29:44 PDT
Comment on attachment 654040
Don't assert when no registers are available on ARM

r+ from bhackett on IRC yesterday
Comment 12 Alex Crichton [:acrichto] 2012-08-22 09:42:00 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/2a9b3c766512
Comment 13 Ed Morley [:emorley] 2012-08-23 03:50:30 PDT
https://hg.mozilla.org/mozilla-central/rev/2a9b3c766512
