Last Comment Bug 696386 - Crash on Quit [@ JS_ResumeRequest]
: Crash on Quit [@ JS_ResumeRequest]
Status: VERIFIED FIXED
[native-crash:P1]
: crash, crashreportid, regression, reproducible
Product: Firefox for Android
Classification: Client Software
Component: General (show other bugs)
: unspecified
: ARM Android
: P1 critical (vote)
: ---
Assigned To: Brad Lassey [:blassey] (use needinfo?)
:
Mentors:
https://crash-stats.mozilla.com/repor...
: 698106 698349 698813 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2011-10-21 08:28 PDT by Aaron Train [:aaronmt]
Modified: 2012-01-09 10:25 PST (History)
12 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
fixed
11+


Attachments
crash log (61.90 KB, text/plain)
2011-11-23 03:18 PST, Cristian Nicolae (:xti)
no flags Details

Description Aaron Train [:aaronmt] 2011-10-21 08:28:41 PDT
STR: Menu -> Quit

--
20111021040222
http://hg.mozilla.org/projects/birch/rev/6113f767a431
Samsung Galaxy SII (Android 2.3.4)
Comment 1 Aaron Train [:aaronmt] 2011-10-21 09:01:03 PDT
c7fd6631-2120-40c3-9c91-74f662111020
afc1fa50-ce70-4e49-b7c1-5b97b2111020
33676155-850f-404c-8431-c7f852111021
1b586ea0-53cb-4915-8778-5a11c2111021
Comment 3 Doug Turner (:dougt) 2011-10-23 21:28:36 PDT
backed out the change:
    http://hg.mozilla.org/projects/birch/rev/1b7748fbdf19
Comment 4 Aaron Train [:aaronmt] 2011-10-24 11:47:50 PDT
Mozilla/5.0 (Android; Linux armv7l; rv:10.0a1) Gecko/20111024 Firefox/10.0a1 Fennec/10.0a1
Comment 5 Brad Lassey [:blassey] (use needinfo?) 2011-10-28 15:29:30 PDT
backed out the backout
Comment 6 Brad Lassey [:blassey] (use needinfo?) 2011-10-28 15:29:56 PDT
*** Bug 698106 has been marked as a duplicate of this bug. ***
Comment 7 Brad Lassey [:blassey] (use needinfo?) 2011-10-28 15:43:02 PDT
it looks like the context being passed to JS_ResumeRequest is null.

Chris, would it be reasonable to null test the context before calling cx->thread()? or would that just be masking a bigger problem?
Comment 8 Aaron Train [:aaronmt] 2011-10-31 06:24:31 PDT
*** Bug 698349 has been marked as a duplicate of this bug. ***
Comment 9 Naoki Hirata :nhirata (please use needinfo instead of cc) 2011-10-31 14:02:47 PDT
I saw a java crash in socorro Birch Build ID	20111024040254:

https://crash-stats.mozilla.com/report/index/b0fbf30f-93f7-46f1-8c86-deff32111031

App Notes 	java.lang.NullPointerException
at org.mozilla.gecko.GeckoApp.rememberLastScreen(GeckoApp.java:425)
at org.mozilla.gecko.GeckoApp.quit(GeckoApp.java:460)
at org.mozilla.gecko.GeckoApp.onOptionsItemSelected(GeckoApp.java:391)
at android.app.Activity.onMenuItemSelected(Activity.java:2205)
at com.android.internal.policy.impl.PhoneWindow.onMenuItemSelected(PhoneWindow.java:773)
at com.android.internal.view.menu.MenuItemImpl.invoke(MenuItemImpl.java:143)
at com.android.internal.view.menu.MenuBuilder.performItemAction(MenuBuilder.java:855)
at com.android.internal.view.menu.IconMenuView.invokeItem(IconMenuView.java:532)
at com.android.internal.view.menu.IconMenuItemView.performClick(IconMenuItemView.java:122)
at android.view.View$PerformClick.run(View.java:9157)
at android.os.Handler.handleCallback(Handler.java:587)
at android.os.Handler.dispatchMessage(Handler.java:92)
at android.os.Looper.loop(Looper.java:130)
at org.mozilla.gecko.GeckoApp$10.run(GeckoApp.java:677
Comment 10 Brad Lassey [:blassey] (use needinfo?) 2011-10-31 14:04:35 PDT
(In reply to Naoki Hirata :nhirata from comment #9)
> I saw a java crash in socorro Birch Build ID	20111024040254:

How did you relate that crash report to this bug?
Comment 11 Naoki Hirata :nhirata (please use needinfo instead of cc) 2011-10-31 14:36:22 PDT
I probably should have asked first:

The GeckoApp.quit goes to GeckoApp.rememberLastScreen. and then crashed with a java.lang.NullPointerException at org.mozilla.gecko.GeckoApp.rememberLastScreen.  My guess is that it crashed due to trying to remember the last screen before quitting being null.

2 other bugs were closed out as dups of this one: bug 698349 and bug 698106; I saw your note about checking for null, so I stuck the comment in here.
Comment 12 Chris Leary [:cdleary] (not checking bugmail) 2011-10-31 15:13:13 PDT
(In reply to Brad Lassey [:blassey] from comment #7)
> Chris, would it be reasonable to null test the context before calling
> cx->thread()? or would that just be masking a bigger problem?

Sounds like that would be masking a bigger problem. Is this still occurring?
Comment 13 Brad Lassey [:blassey] (use needinfo?) 2011-10-31 15:18:11 PDT
(In reply to Chris Leary [:cdleary] from comment #12)
> (In reply to Brad Lassey [:blassey] from comment #7)
> > Chris, would it be reasonable to null test the context before calling
> > cx->thread()? or would that just be masking a bigger problem?
> 
> Sounds like that would be masking a bigger problem. Is this still occurring?

It sounds like QA can reproduce easily
Comment 14 David Mandelin [:dmandelin] 2011-10-31 16:53:18 PDT
(In reply to Brad Lassey [:blassey] from comment #7)
> it looks like the context being passed to JS_ResumeRequest is null.
> 
> Chris, would it be reasonable to null test the context before calling
> cx->thread()? or would that just be masking a bigger problem?

We are moving toward single-threaded runtimes in JS, which means JS_THREADSAFE code will come out, including the bodies of JS_XXXRequest, so those functions won't be needed any more.

The problem doesn't look like a null cx to me. According to the linked crashreports, it's crashing in this call to JS_ResumeRequest (in XPCJSContextStack::Pop):

        XPCJSContextInfo & e = mStack[idx];
        NS_ASSERTION(!e.frame || e.cx, "Shouldn't have frame without a cx!");
        NS_ASSERTION(!e.suspendDepth || e.cx, "Shouldn't have suspendDepth without a cx!");
        if(e.cx)
        {
            if(e.suspendDepth)
            {
                JS_ResumeRequest(e.cx, e.suspendDepth);


e.cx can't be null at this point, because of the if. The crash report I looked at gives a crash address of 0x1e1d2f6e, which seems pretty random. I would wonder whether that's a value that was pushed earlier, or if the value got overwritten. Since this is a shutdown crash, I might vaguely guess that a good cx gets pushed earlier, but that something destroys that cx before the pop. That would suggest something wrong about the shutdown sequencing, which sounds bad for the xpconnect stack, even ignoring the request problem.
Comment 15 Aaron Train [:aaronmt] 2011-11-01 11:03:22 PDT
*** Bug 698813 has been marked as a duplicate of this bug. ***
Comment 16 David Mandelin [:dmandelin] 2011-11-01 11:46:08 PDT
After seeing a comment in another bug, I realized this looks like a pop-the-empty stack bug. Anyway, it's mostly likely a misuse of the XPCJSContextStack. I think bz or mrbkap would be most likely to be able to help you with that.
Comment 17 Boris Zbarsky [:bz] (TPAC) 2011-11-01 12:34:44 PDT
Hmm.  Popping the empty stack should be hard, since the push and pop is handled by an RAII helper object..

Is it possible that during the running of the event the XPCJSContextStack goes away or something?

Blake, https://crash-stats.mozilla.com/report/index/c7fd6631-2120-40c3-9c91-74f662111020 has a usable stack.
Comment 18 Doug Turner (:dougt) 2011-11-01 18:54:05 PDT
we are killing gecko all wrong.  brian will attach a patch to bug 696315
Comment 19 Cristian Nicolae (:xti) 2011-11-23 03:18:48 PST
Created attachment 576440 [details]
crash log

This crash occurred while I was switching between several opened tabs.  Please see the attached log file.

https://crash-stats.mozilla.com/report/index/bp-30c48263-f56b-42d9-b856-cae852111123

--
Mozilla/5.0 (Android;Linux armv7l;rv:11.0a1)Gecko/20111122
Firefox/11.0a1 Fennec/11.0a1
Devices: Samsung Galaxy Nexus S
OS: Android 2.3.4
Comment 20 Brad Lassey [:blassey] (use needinfo?) 2011-11-30 21:27:23 PST
no reports since 11/12
Comment 21 Brad Lassey [:blassey] (use needinfo?) 2011-11-30 21:28:26 PST
(In reply to Cristian Nicolae (:xti) from comment #19)
> Created attachment 576440 [details]
> crash log
> 
> This crash occurred while I was switching between several opened tabs. 
> Please see the attached log file.
This bug is about a crash on quit
Comment 22 Aaron Train [:aaronmt] 2011-12-01 06:25:37 PST
Samsung Nexus S (Android 2.3.6)
20111201040252
http://hg.mozilla.org/projects/birch/rev/d71c91775f9b

Note You need to log in before you can comment on or make changes to this bug.