Closed Bug 867396 Opened 13 years ago Closed 13 years ago

Failure of NS_DispatchToMainThread() can lead to assertion

Categories

(Firefox OS Graveyard :: General, defect, P1)

ARM
Gonk (Firefox OS)
defect

Tracking

(blocking-b2g:leo+, b2g18 affected)

RESOLVED INVALID
1.1 CS (11may)
blocking-b2g leo+
Tracking Status
b2g18 --- affected

People

(Reporter: alhadp, Assigned: tzimmermann)

Details

(Whiteboard: [CR 471640] u=fx-os-user c=may-6-17 p=1)

Attachments

(1 file)

If NS_DispatchToMainThread() fails for an nsIRunnable, we end up releasing the runnable right there on the (non-main) thread, as would be expected. The dtor for the runnable ends up calling GetXPConnect(), which asserts - saying that we need to be on main thread. This issue usually is seen after nightly runs of stress tests (which could have caused a memory allocation within NS_DispatchToMainThread() to fail). Here are the steps our test team used: 1.Play FM in background. 2.Turn on BT and enable auto answer. 3.Connect the GPS cable and open the GPS test application. 4.Run a script which will do airplane on/off, MO SMS and Mo call sequentially 5.After the night run, mini dumps are generated in the phone Call stack: Thread 15 (crashed) 0 libxul.so!nsXPConnect::GetXPConnect [nsXPConnect.cpp : 141 + 0x2] r4 = 0x445967c0 r5 = 0x00000000 r6 = 0x440c8f40 r7 = 0x44cffd28 r8 = 0x440c8f40 r9 = 0x48289260 r10 = 0x00000001 fp = 0x00000002 sp = 0x44cffc28 lr = 0x4037636f pc = 0x410d1e64 Found by: given as instruction pointer in context 1 libxul.so!XPCJSRuntime::Get [nsXPConnect.cpp : 243 + 0x3] r4 = 0x445967c0 r5 = 0x00000000 r6 = 0x440c8f40 r7 = 0x44cffd28 r8 = 0x440c8f40 r9 = 0x48289260 r10 = 0x00000001 fp = 0x00000002 sp = 0x44cffc40 pc = 0x410d1eb3 Found by: call frame info 2 libxul.so!nsXPCWrappedJS::Release [XPCWrappedJS.cpp : 188 + 0x3] r4 = 0x445967c0 r5 = 0x00000000 r6 = 0x440c8f40 r7 = 0x44cffd28 r8 = 0x440c8f40 r9 = 0x48289260 r10 = 0x00000001 fp = 0x00000002 sp = 0x44cffc48 pc = 0x410e59ff Found by: call frame info 3 libxul.so!nsXPTCStubBase::Release [xptcall.cpp : 33 + 0x7] r4 = 0x44bc4bf0 r5 = 0x00000000 r6 = 0x440c8f40 r7 = 0x44cffd28 r8 = 0x440c8f40 r9 = 0x48289260 r10 = 0x00000001 fp = 0x00000002 sp = 0x44cffc70 pc = 0x413ad923 Found by: call frame info 4 libqc_b2g_ril.so!nsCOMPtr<nsIRunnable>::nsCOMPtr [nsCOMPtr.h : 408 + 0x5] r4 = 0x44bc4bf0 r5 = 0x00000000 r6 = 0x440c8f40 r7 = 0x44cffd28 r8 = 0x440c8f40 r9 = 0x48289260 r10 = 0x00000001 fp = 0x00000002 sp = 0x44cffc78 pc = 0x423c885f Found by: call frame info
(leo+, blocking product stability)
blocking-b2g: --- → leo+
Whiteboard: [CR 471640]
Target Milestone: --- → 1.1 CS (11may)
Assignee: nobody → tzimmermann
Whiteboard: [CR 471640] → [CR 471640] u=fx-os-user c=may-6-17 p=0
Whiteboard: [CR 471640] u=fx-os-user c=may-6-17 p=0 → [CR 471640] u=fx-os-user c=may-6-17 p=1
(bumping priority to reflect that this bug was the cause of over half the crashes from the weekend stability run)
Severity: normal → critical
Priority: -- → P1
(In reply to Michael Vines [:m1] [:evilmachines] from comment #2) > (bumping priority to reflect that this bug was the cause of over half the > crashes from the weekend stability run) You you have a link to the stack traces or some logs? Thanks.
Flags: needinfo?(mvines)
Hrmp, pretty sure this is not a Gecko bug. Sorry for the noise.
Status: NEW → RESOLVED
blocking-b2g: leo+ → ---
Closed: 13 years ago
Flags: needinfo?(mvines)
Resolution: --- → INVALID
Ok, maybe a Gecko bug after all? If you believe the stack trace, we managed to do this: XPCWrappedJS.cpp:188 [1] -> nsXPConnect.cpp:243 [2] -> nsXPConnect.cpp:141 [3] -> MOZ_CRASH() But before we execute [1], there's a check to ensure that we're allowed to be off the main thread that we pass [4]. Yet nsXPConnect.cpp:141 then induces a crash because were off the main thread. XPConnect bug? [1] http://mxr.mozilla.org/mozilla-b2g18/source/js/xpconnect/src/XPCWrappedJS.cpp#188 [2] http://mxr.mozilla.org/mozilla-b2g18/source/js/xpconnect/src/nsXPConnect.cpp#243 [3] http://mxr.mozilla.org/mozilla-b2g18/source/js/xpconnect/src/nsXPConnect.cpp#140 [4] http://mxr.mozilla.org/mozilla-b2g18/source/js/xpconnect/src/XPCWrappedJS.cpp#171
Status: RESOLVED → REOPENED
Resolution: INVALID → ---
ni?tzimmermann for comment 5 Thomas, are you still the right person to check this base on comment 5?
Flags: needinfo?(tzimmermann)
Could this be happening late in shutdown? NS_DispatchToMainThread should almost never fail unless we're totally out of memory or else the thread manager has been shut down...
(In reply to ben turner [:bent] from comment #7) > Could this be happening late in shutdown? NS_DispatchToMainThread should > almost never fail unless we're totally out of memory or else the thread > manager has been shut down... I can double check some of the logs tomorrow but I don't think we are triggering a Gecko shutdown here. USS for the main process was floating at about 70MB at the time of the crash in one of the logs I looked at today (with plenty of apps in the background), so we're not out of memory either.
blocking-b2g: --- → leo+
We need full stacks for the various threads (at least thread 15 and the main thread) to understand what's happening here.
Attached file full minidump
Thread 0 0 libxul.so!XPCWrappedNative::GetLock [xpcprivate.h : 2748 + 0x0] r0 = 0x4f2b34f0 r1 = 0x00000000 r2 = 0x00000000 r3 = 0x50e0eb60 r4 = 0x4f2b34f0 r5 = 0x4f2b350c r6 = 0xbea69148 r7 = 0x435a6358 r8 = 0x4ee6d280 r9 = 0xbea68f10 r10 = 0xbea69148 fp = 0x51a44244 sp = 0xbea68ea8 lr = 0x410f02c7 pc = 0x410ee1a6 Found by: given as instruction pointer in context
Are you sure that NS_DispatchToMainThread is failing? Those stacks indicate we're not in any situation where it should fail. I suspect your code looks something like class DoStuff : public nsRunnable { public: DoStuff(already_AddRefed<nsIFoo> aFoo) : mFoo (aFoo) { } NS_IMETHOD Run() { NS_ASSERTION(NS_IsMainThread(), "Wrong thread!"); // Do stuff with mFoo, which can only be used on the main thread. return NS_OK; } private: nsCOMPtr<nsIFoo> mFoo; }; nsCOMPtr<nsIFoo> foo; ... nsCOMPtr<nsIRunnable> runnable = new DoStuff(foo.forget()); NS_DispatchToMainThread(runnable); In this case mFoo is cleared by DoStuff::~DoStuff. You have no guarantee here that DoStuff's destructor will run on the main thread though. If the main thread executes DoStuff before the other thread returns from NS_DispatchToMainThread and gets around to running runnable's destructor then ~DoStuff will run on the other thread. The solution to this is to release the reference to the interface when you are on the main thread.
Ah crap. Yep that's the bug. Thank you!
Status: REOPENED → RESOLVED
Closed: 13 years ago13 years ago
Resolution: --- → INVALID
Flags: needinfo?(tzimmermann)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: