Closed Bug 1273770 Opened 8 years ago Closed 8 years ago

AsyncShutdown timeout in session store

Categories

(Firefox :: Session Restore, defect)

41 Branch
x86_64
All
defect
Not set
critical

Tracking

RESOLVED DUPLICATE of bug 1255977
Tracking Status
firefox47 --- affected
firefox48 --- affected
firefox49 --- affected
firefox-esr45 --- affected
firefox50 + fixed
firefox51 + fixed
firefox52 --- fixed

People

(Reporter: ting, Unassigned)

Details

(Keywords: crash)

Crash Data

This bug was filed from the Socorro interface and is 
report bp-e4a574a0-6dd8-464c-be13-9b6ff2160516.
=============================================================

This is the #7 crash on Windows for Nightly 20160515030241. There were 9 crashes from 9 installations.

Stack:
mozglue.dll!mozalloc_abort(const char * const msg) Line 33	C++
xul.dll!NS_DebugBreak(unsigned int aSeverity, const char * aStr, const char * aExpr, const char * aFile, int aLine) Line 403	C++
xul.dll!nsDebugImpl::Abort(const char * aFile, int aLine) Line 147	C++
xul.dll!XPTC__InvokebyIndex() Line 99	Unknown
xul.dll!XPCWrappedNative::CallMethod(XPCCallContext & ccx, XPCWrappedNative::CallMode) Line 1367	C++
xul.dll!XPC_WN_CallMethod(JSContext * cx, unsigned int argc, JS::Value * vp) Line 1128	C++
xul.dll!js::InternalCallOrConstruct(JSContext * cx, const JS::CallArgs & args, js::MaybeConstruct construct) Line 480	C++
xul.dll!Interpret(JSContext * cx, js::RunState & state) Line 2831	C++
xul.dll!js::RunScript(JSContext * cx, js::RunState & state) Line 426	C++
xul.dll!js::InternalCallOrConstruct(JSContext * cx, const JS::CallArgs & args, js::MaybeConstruct construct) Line 501	C++
xul.dll!js::fun_call(JSContext * cx, unsigned int argc, JS::Value * vp) Line 1192	C++
xul.dll!js::InternalCallOrConstruct(JSContext * cx, const JS::CallArgs & args, js::MaybeConstruct construct) Line 480	C++
xul.dll!js::CrossCompartmentWrapper::call(JSContext * cx, JS::Handle<JSObject *> wrapper, const JS::CallArgs & args) Line 309	C++
xul.dll!js::Proxy::call(JSContext * cx, JS::Handle<JSObject *> proxy, const JS::CallArgs & args) Line 400	C++
xul.dll!js::InternalCallOrConstruct(JSContext * cx, const JS::CallArgs & args, js::MaybeConstruct construct) Line 468	C++
xul.dll!js::jit::DoCallFallback(JSContext * cx, js::jit::BaselineFrame * frame, js::jit::ICCall_Fallback * stub_, unsigned int argc, JS::Value * vp, JS::MutableHandle<JS::Value> res) Line 5973	C++
000003bfac925a47()	Unknown
Crash Signature: [@ mozalloc_abort | xul.dll@0x7c7428 | xul.dll@0xb905b] → [@ mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort] [@ mozalloc_abort | xul.dll@0x7c7428 | xul.dll@0xb905b]
Summary: Crash in mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort(char const*, int) → Crash in mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort
This is a pretty generic crash stack, but it looks like in this case we retain control (presumably it's our own code that calls NS_DebugBreak). I bet we could get more information into the crash dump for this case, if someone had time to mess with it.
Component: JavaScript Engine → XPConnect
Crash Signature: [@ mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort] [@ mozalloc_abort | xul.dll@0x7c7428 | xul.dll@0xb905b] → [@ mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort | XPTC__InvokebyIndex] [@ mozalloc_abort | xul.dll@0x7c7428 | xul.dll@0xb905b]
This seems more like an XPCOM reflection bug, maybe the same issue as that thing that was exposed by the docshell change a week or two ago.
Component: XPConnect → XPCOM
Andrew can you help us figure out if this is also an issue in beta? This crash signature is showing up as a topcrash in 48 beta 1.
Flags: needinfo?(continuation)
I'm on PTO. Maybe Nathan could help.
Flags: needinfo?(continuation) → needinfo?(nfroyd)
(In reply to Andrew McCreight (PTO-ish) [:mccr8] from comment #2)
> This seems more like an XPCOM reflection bug, maybe the same issue as that
> thing that was exposed by the docshell change a week or two ago.

I guess this is referring to bug 1275719?  That bug wasn't about a crash, but merely failing to load a webextension.  I mean, the underlying issue that was addressed by that bug certainly exists in 49, but I have a hard time seeing how that would cause a crash.  We could try uplifting bug 1275719 and see if that fixes anything, but that'd be a stab in the dark...

I'm guessing the stack in comment 0 came from minidump examination?  The stack in the linked crash report is completely busted.  If that's the case, then I guess the offending line is:

https://dxr.mozilla.org/mozilla-central/source/xpcom/reflect/xptcall/md/win32/xptcinvoke_asm_x86_64.asm#97

which I don't think I believe, as I seriously doubt that either:

- That line is calling NS_DebugBreak; or
- NS_DebugBreak gets called via SIGSEGV on that line or similar, as there ought to be some other frames in between, I think.
Flags: needinfo?(nfroyd)
Yeah, this is still happening on branches that have bug 1275719, so that can't be it.
Crash Signature: [@ mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort | XPTC__InvokebyIndex] [@ mozalloc_abort | xul.dll@0x7c7428 | xul.dll@0xb905b] → [@ mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort | XPTC__InvokebyIndex] [@ mozalloc_abort | xul.dll@0x7c7428 | xul.dll@0xb905b] [@ Abort | mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort | XPTC__InvokebyIndex] [@ Abort | mozalloc_abort | Abort…
I looked over calls to abort in JS, and it looks like maybe only one is the debug abort:
  http://dev.searchfox.org/mozilla-central/source/toolkit/components/asyncshutdown/AsyncShutdown.jsm#979

I see a user comment that says "hang on quit" which would be consistent with that. It looks like AbortMessage should tell us who registered the async shutdown observer. But it sounds like this is a sort of shutdown hang that the browser stopped.
Depends on: 1286005
I tried searching for the signature, then doing a facet on the abort message, but unfortunately the abort message includes the PID so that does not do anything useful. I filed bug 1286005 for fixing that.

Skimming through a couple of pages of the reports, it looks like an even mix of Places and Session Store.
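
For anyone who wants to group exported reports by hand in the meantime, here is a minimal sketch of stripping the PID before counting. It assumes the abort message starts with a bracketed numeric PID such as `[4312] `; that prefix shape, the helper names, and the sample messages (including `resource://example/Other.jsm`) are illustrative assumptions, not taken from real reports.

```javascript
// Assumption: the abort message may begin with "[<pid>] ".
// Strip that prefix so identical aborts from different processes compare equal.
function normalizeAbortMessage(message) {
  return message.replace(/^\[\d+\]\s*/, "").trim();
}

// Count reports per normalized abort message, most frequent first.
function facetByAbortMessage(messages) {
  const counts = new Map();
  for (const message of messages) {
    const key = normalizeAbortMessage(message);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Made-up sample data for illustration only.
const sampleMessages = [
  "[4312] ###!!! ABORT: file resource:///modules/sessionstore/SessionStore.jsm, line 1465",
  "[988] ###!!! ABORT: file resource:///modules/sessionstore/SessionStore.jsm, line 1465",
  "[77] ###!!! ABORT: file resource://example/Other.jsm, line 1",
];

const facets = facetByAbortMessage(sampleMessages);
for (const [message, count] of facets) {
  console.log(count, message);
}
```

With the PID removed, the two SessionStore.jsm aborts fall into the same bucket instead of showing up as separate facet values.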
Win7, x64, FF50.0a1:
[@ Abort | mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort | XPTC__InvokebyIndex ]
https://crash-stats.mozilla.com/report/index/c8c42922-ade3-4fae-99ba-ba1bc2160715


https://crash-stats.mozilla.com/signature/?product=Firefox&signature=Abort%20%7C%20mozalloc_abort%20%7C%20NS_DebugBreak%20%7C%20nsDebugImpl%3A%3AAbort%20%7C%20XPTC__InvokebyIndex

Operating System
Operating System 	Count 	Percentage
Windows 10 		245 	60.5%
Windows 7 		104 	25.7%
Windows 8.1 		 53 	13.1%
Windows 8 		  3 	 0.7%

Product
Product 	Version 	Count 	Percentage 	Installations
Firefox 	50.0a1 		181 	44.7% 		234
Firefox 	47.0.1 		 55 	13.6% 		 50
Firefox 	49.0a2 		 41 	10.1% 		 48
Firefox 	47.0 		 36 	 8.9% 		 34
Firefox 	48.0b10 	 31 	 7.7% 		 24
Firefox 	48.0b9 		 15 	 3.7% 		 20
Firefox 	48.0b7 		  9 	 2.2% 		  6
Firefox 	45.2.0esr 	  6 	 1.5% 		  8
Firefox 	46.0.1 		  4 	 1.0% 		  3
Firefox 	47.0b3 		  3 	 0.7% 		  2
Firefox 	44.0b1 		  2 	 0.5% 		  1
Firefox 	45.0.1 		  2 	 0.5% 		  2
Firefox 	47.0a1 		  2 	 0.5% 		  1
Firefox 	48.0b1 		  2 	 0.5% 		  2
Firefox 	48.0b5 		  2 	 0.5% 		  2
Firefox 	41.0a2 		  1 	 0.2% 		  1
Firefox 	42.0 		  1 	 0.2% 		  1
Firefox 	42.0a1 		  1 	 0.2% 		  1
Firefox 	45.0.1esr 	  1 	 0.2% 		  1
Firefox 	45.0.2 		  1 	 0.2% 		  1
Firefox 	45.0a1 		  1 	 0.2% 		  1
Firefox 	45.0a2 		  1 	 0.2% 		  1
Firefox 	45.0b1 		  1 	 0.2% 		  1
Firefox 	45.0b99 	  1 	 0.2% 		  1
Firefox 	45.1.1esr 	  1 	 0.2% 		  1
Firefox 	46.0a2 		  1 	 0.2% 		  1
Firefox 	46.0b1 		  1 	 0.2% 		  1
Firefox 	47.0b2 		  1 	 0.2% 		  1
Firefox 	48.0b3 		  1 	 0.2% 		  1

Architecture
Architecture 	Count 	Percentage
amd64 		405 	100.0%


[Tracking Requested - why for this release (FF50.0a1)]:
181 crashes in the last 7 days with 234 installations.
OS: Windows 10 → Windows
Hardware: Unspecified → x86_64
Summary: Crash in mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort → Crash [@ mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort ]
Version: Trunk → 49 Branch
Version: 49 Branch → 41 Branch
Tracking 50+ so we can keep an eye on this crash.
Win7, x64, FF49.0a2:
[@ Abort | mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort | XPTC__InvokebyIndex ]
https://crash-stats.mozilla.com/report/index/d5994084-6032-44b0-95be-b84b12160801
I hope this is useful: I noticed that after the update to version 45.2 this crash did not happen, so whatever is causing it might be the same thing that was added to the 45.2 ESR version.

That update also caused crashes with my ATI Radeon video card.
On Nightly right now, it looks like about 37% of these crashes are from session store. The second most common cause is only 2%. (You can look at this by faceting on Abort Message.)
Oh, that was across all crashes. If I just look at Nightly, then session store is about 81% of these crashes.
Component: XPCOM → Session Restore
Product: Core → Firefox
Summary: Crash [@ mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort ] → AsyncShutdown timeout in session store
[Tracking Requested - why for this release]:
(In reply to Marcia Knous [:marcia - use ni] from comment #11)
> Tracking 50+ so we can keep an eye on this crash.
51, too?

https://crash-stats.mozilla.com/report/index/50b85fbf-2671-4f90-af09-3ef8a2160816
Firefox 51.0a1 Crash Report [@ Abort | mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort | XPTC__InvokebyIndex ]
(Win7, 64bit)
AbortMessage = ###!!! ABORT: file resource:///modules/sessionstore/SessionStore.jsm, line 1465

...but I guess SS is just the trigger and the bug is in the core...

Crashing Thread (0)
Frame 	Module 	Signature 	Source
0 	mozglue.dll 	mozalloc_abort(char const* const) 	memory/mozalloc/mozalloc_abort.cpp:33
1 	xul.dll 	NS_DebugBreak 	xpcom/base/nsDebugImpl.cpp:405
2 	xul.dll 	nsDebugImpl::Abort(char const*, int) 	xpcom/base/nsDebugImpl.cpp:146
3 	xul.dll 	XPTC__InvokebyIndex 	xpcom/reflect/xptcall/md/win32/xptcinvoke_asm_x86_64.asm:97
4 		@0x124e0e3f 	
5 	xul.dll 	XPCJSRuntime::Get() 	js/xpconnect/src/xpcprivate.h:416
6 	xul.dll 	XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*) 	js/xpconnect/src/XPCWrappedNativeJSOps.cpp:1128
7 	xul.dll 	js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp:453
8 	xul.dll 	Interpret 	js/src/vm/Interpreter.cpp:2881
9 	xul.dll 	js::RunScript(JSContext*, js::RunState&) 	js/src/vm/Interpreter.cpp:399
10 	xul.dll 	js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp:471
11 	xul.dll 	js::fun_call(JSContext*, unsigned int, JS::Value*) 	js/src/jsfun.cpp:1251
12 	xul.dll 	js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp:453
13 	xul.dll 	js::CrossCompartmentWrapper::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) 	js/src/proxy/CrossCompartmentWrapper.cpp:329
14 	xul.dll 	js::Proxy::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) 	js/src/proxy/Proxy.cpp:401
15 	xul.dll 	js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp:441
16 	xul.dll 	js::jit::DoCallFallback 	js/src/jit/BaselineIC.cpp:5981
17 		@0x245bfc578f2
I just ran into this and it definitely seemed to be related to a hang on quit. It was on Firefox 48.0, Mac.
OS: Windows → All
Hello!

This just bit me recently...

Thanks!
Alex

Report ID 	Date Submitted
bp-328aec9a-1232-4b54-aa26-4f7f02160816
	16/08/2016	12:11 p.m.
See Also: → 1295663
This bug is still hanging around at the top of nightly crash stats. ni on the triage lead to see if there is anything we can do to make progress.
Flags: needinfo?(mdeboer)
Alex's crash shows the stack originating from SQLite.jsm, but it concerns the same code: `AsyncShutdown.profileBeforeChange.addBlocker`.
In SessionStore.jsm, it's `AsyncShutdown.quitApplicationGranted.addBlocker`.

Since David wrote this module, originally, I think he'd be the best person to ask: why is this causing a crash in core? Is `addBlocker` using some js-ctypes magic combo that might trigger a crash?
Flags: needinfo?(mdeboer) → needinfo?(dteller)
(In reply to alex_mayorga from comment #19)
> Report ID 	Date Submitted
> bp-328aec9a-1232-4b54-aa26-4f7f02160816
> 	16/08/2016	12:11 p.m.

This specific report has nothing to do with Session Restore; it's an async shutdown crash in Places.

All async shutdown crashes have the same stack. The thing to look at is the AsyncShutdownTimeout property in the Metadata tab of crash-stats.
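
As a rough sketch of how the AsyncShutdownTimeout property could be read once copied out of a report: the snippet below assumes the value is a JSON string with a `phase` and a `conditions` array whose entries carry a `name` and a `state`. That layout, the helper name `summarizeTimeout`, and the sample value (including the blocker name) are assumptions for illustration, not data from an actual crash.

```javascript
// Hypothetical annotation value. The field layout (phase, conditions[].name,
// conditions[].state) is an assumption, and the blocker name is made up.
const sampleAnnotation = JSON.stringify({
  phase: "quit-application-granted",
  conditions: [
    { name: "example: session store flush", state: { total: 3, current: 1 } },
  ],
});

// Return the shutdown phase and the blockers that were still pending.
// Falls back to an empty summary if the text is not valid JSON.
function summarizeTimeout(annotationText) {
  let parsed;
  try {
    parsed = JSON.parse(annotationText);
  } catch (e) {
    return { phase: null, blockers: [] };
  }
  const conditions = Array.isArray(parsed.conditions) ? parsed.conditions : [];
  return {
    phase: parsed.phase || null,
    blockers: conditions.map(c => ({ name: c.name, state: c.state })),
  };
}

const summary = summarizeTimeout(sampleAnnotation);
console.log(summary.phase, summary.blockers.map(b => b.name));
```

The blocker names, rather than the stack, are what tell Places timeouts apart from Session Restore timeouts.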
(In reply to Mike de Boer [:mikedeboer] from comment #21)
> Since David wrote this module, originally, I think he'd be the best person
> to ask: why is this causing a crash in core? Is `addBlocker` using some
> js-ctypes magic combo that might trigger a crash?

I can probably try to answer. While I didn't write the async shutdown code, I am using it in Places and looking into similar crashes.

When you register a promise with addBlocker, you basically sign a contract that the promise won't take more than 1 minute to resolve on shutdown. If you fail to resolve the promise for whatever reason, such as a bug causing an exception before you invoke resolve, Async Shutdown times out and forces an application crash. Its stack is the one you see, which is sort of useless because it's the same for any async shutdown consumer.
However, the crash report has metadata attached. One of the fields is AsyncShutdownTimeout, a "progress" object that you pass along with your shutdown promise and are expected to fill with data useful for figuring out why the timeout happened.
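
To make the contract concrete, here is a toy model of it. This is not the AsyncShutdown.jsm implementation: `ToyBarrier`, `poll`, and the blocker names are invented for illustration, and elapsed time is passed in by hand instead of using real promises and timers so the example stays deterministic. The 60000 ms limit mirrors the 1 minute mentioned above.

```javascript
// Toy model of a shutdown barrier; NOT the real AsyncShutdown code.
class ToyBarrier {
  constructor(timeoutMs) {
    this.timeoutMs = timeoutMs;
    this.blockers = [];
  }

  // Register a blocker. fetchState is a callback that returns progress
  // information for diagnostics. The returned function marks the blocker done.
  addBlocker(name, fetchState = () => null) {
    const blocker = { name, fetchState, done: false };
    this.blockers.push(blocker);
    return () => { blocker.done = true; };
  }

  // Simulated shutdown loop step: elapsedMs is how long shutdown has waited.
  poll(elapsedMs) {
    const stuck = this.blockers.filter(b => !b.done);
    if (stuck.length === 0) return { status: "done" };
    if (elapsedMs < this.timeoutMs) return { status: "waiting" };
    // Timeout: collect the state of every unfinished blocker for the report.
    const report = JSON.stringify({
      conditions: stuck.map(b => ({ name: b.name, state: b.fetchState() })),
    });
    return { status: "timeout", report };
  }
}

const barrier = new ToyBarrier(60000);
const progress = { flushed: 0, total: 3 };
const finishFast = barrier.addBlocker("fast-blocker", () => "nothing pending");
barrier.addBlocker("slow-blocker", () => ({ ...progress })); // never finishes

finishFast();
progress.flushed = 1;
const early = barrier.poll(1000);
const late = barrier.poll(60000);
console.log(early.status, late.status, late.report);
```

In this model the report produced at timeout names only the blocker that never finished, together with whatever progress it recorded, which is the role the "progress" object plays in triage.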

Hope this helps; otherwise, you can set the flag on David again.
Flags: needinfo?(dteller)
I wish to add that, in this context, the only alternative to crashing Firefox is freezing Firefox forever, which is much worse for a variety of reasons.

To triage these bugs, the stack is useless; please use the metadata field "AsyncShutdown Timeout" instead.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → DUPLICATE