Closed
Bug 1004669
Opened 11 years ago
Closed 11 years ago
Intermittent Windows & OS X shutdown timeout with MediaShutdownManager::Shutdown() on the stack
Categories
(Core :: Audio/Video, defect)
Core
Audio/Video
Tracking
()
RESOLVED
FIXED
mozilla32
People
(Reporter: emorley, Assigned: jwwang)
References
Details
(Keywords: intermittent-failure, Whiteboard: [qa-] )
Attachments
(3 files, 1 obsolete file)
4.91 KB,
patch
|
cpearce
:
review+
rlin
:
feedback+
|
Details | Diff | Splinter Review |
4.88 KB,
patch
|
jwwang
:
review+
Sylvestre
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
5.04 KB,
patch
|
jwwang
:
review+
Sylvestre
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
Breaking out from bug 918759, which has become a dumping ground.
On OS X and Windows, we're seeing intermittent failures where we hang during shutdown - and the induced crash shows MediaShutdownManager on the stack.
Looking at the timeline, I believe bug 938107 caused this regression, since:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=918759&entireHistory=true&tree=trunk
...shows a spike at 2013-12-16/2013-12-17 (note the OS X instances prior to that we're bug 957693 and bug 960605 aiui).
Example logs:
https://tbpl.mozilla.org/php/getParsedLog.php?id=38356405&full=1&branch=try#error0
Windows 7 32-bit try debug test crashtest on 2014-04-23 17:12:28 PDT for push 97491fef6385
17:35:31 INFO - Thread 0
17:35:31 INFO - 0 ntdll.dll + 0x470b4
17:35:31 INFO - eip = 0x772270b4 esp = 0x002af6e8 ebp = 0x002af754 ebx = 0x00000000
17:35:31 INFO - esi = 0x00000050 edi = 0x00000000 eax = 0x005c1d28 ecx = 0x650e6d3c
17:35:31 INFO - edx = 0x650e6d46 efl = 0x00200246
17:35:31 INFO - Found by: given as instruction pointer in context
17:35:31 INFO - 1 kernel32.dll + 0x4baf2
17:35:31 INFO - eip = 0x759cbaf3 esp = 0x002af75c ebp = 0x002af76c
17:35:31 INFO - Found by: previous frame's frame pointer
17:35:31 INFO - 2 kernel32.dll + 0x4baa1
17:35:31 INFO - eip = 0x759cbaa2 esp = 0x002af774 ebp = 0x002af780
17:35:31 INFO - Found by: previous frame's frame pointer
17:35:31 INFO - 3 nss3.dll!_PR_MD_WAIT_CV [w95cv.c:97491fef6385 : 248 + 0x10]
17:35:31 INFO - eip = 0x67a119de esp = 0x002af788 ebp = 0x002af79c
17:35:31 INFO - Found by: previous frame's frame pointer
17:35:31 INFO - 4 nss3.dll!_PR_WaitCondVar [prucv.c:97491fef6385 : 172 + 0x24]
17:35:31 INFO - eip = 0x67a0db1c esp = 0x002af7a4 ebp = 0x002af7b8
17:35:31 INFO - Found by: call frame info
17:35:31 INFO - 5 nss3.dll!PR_WaitCondVar [prucv.c:97491fef6385 : 525 + 0xb]
17:35:31 INFO - eip = 0x67a0df36 esp = 0x002af7c0 ebp = 0x002af7d8
17:35:31 INFO - Found by: call frame info
17:35:31 INFO - 6 nss3.dll!PR_Wait [prmon.c:97491fef6385 : 294 + 0xf]
17:35:31 INFO - eip = 0x67a0116d esp = 0x002af7e0 ebp = 0x002af7f8
17:35:31 INFO - Found by: call frame info
17:35:31 INFO - 7 xul.dll!mozilla::ReentrantMonitor::Wait(unsigned int) [BlockingResourceBase.cpp:97491fef6385 : 320 + 0xb]
17:35:31 INFO - eip = 0x63771999 esp = 0x002af800 ebp = 0x002af820
17:35:31 INFO - Found by: call frame info
17:35:31 INFO - 8 xul.dll!nsEventQueue::GetEvent(bool,nsIRunnable * *) [nsEventQueue.cpp:97491fef6385 : 63 + 0x9]
17:35:31 INFO - eip = 0x637e3499 esp = 0x002af828 ebp = 0x002af83c
17:35:31 INFO - Found by: call frame info
17:35:31 INFO - 9 xul.dll!nsThread::ProcessNextEvent(bool,bool *) [nsThread.cpp:97491fef6385 : 693 + 0x23]
17:35:31 INFO - eip = 0x637eeb99 esp = 0x002af844 ebp = 0x002af8a4
17:35:31 INFO - Found by: call frame info
17:35:31 INFO - 10 xul.dll!NS_ProcessNextEvent(nsIThread *,bool) [nsThreadUtils.cpp:97491fef6385 : 263 + 0xc]
17:35:31 INFO - eip = 0x637867ae esp = 0x002af8ac ebp = 0x002af8b8
17:35:31 INFO - Found by: call frame info
17:35:31 INFO - 11 xul.dll!mozilla::SharedThreadPool::SpinUntilShutdown() [SharedThreadPool.cpp:97491fef6385 : 78 + 0xc]
17:35:31 INFO - eip = 0x648addec esp = 0x002af8c0 ebp = 0x002af8d8
17:35:31 INFO - Found by: call frame info
17:35:31 INFO - 12 xul.dll!mozilla::MediaShutdownManager::Shutdown() [MediaShutdownManager.cpp:97491fef6385 : 134 + 0x4]
17:35:31 INFO - eip = 0x648b0a0f esp = 0x002af8cc ebp = 0x002af8d8
17:35:31 INFO - Found by: call frame info
17:35:31 INFO - 13 xul.dll!mozilla::MediaShutdownManager::Observe(nsISupports *,char const *,wchar_t const *) [MediaShutdownManager.cpp:97491fef6385 : 102 + 0x7]
17:35:31 INFO - eip = 0x648b4005 esp = 0x002af8d8 ebp = 0x002af8d8
17:35:31 INFO - Found by: call frame info
17:35:31 INFO - 14 xul.dll!nsObserverList::NotifyObservers(nsISupports *,char const *,wchar_t const *) [nsObserverList.cpp:97491fef6385 : 96 + 0x19]
17:35:31 INFO - eip = 0x637bf6b0 esp = 0x002af8e0 ebp = 0x002af8f8
17:35:31 INFO - Found by: call frame info
17:35:31 INFO - 15 xul.dll!nsObserverService::NotifyObservers(nsISupports *,char const *,wchar_t const *) [nsObserverService.cpp:97491fef6385 : 302 + 0xd]
17:35:31 INFO - eip = 0x637bf7cf esp = 0x002af900 ebp = 0x002af914
17:35:31 INFO - Found by: call frame info
17:35:31 INFO - 16 xul.dll!mozilla::ShutdownXPCOM(nsIServiceManager *) [nsXPComInit.cpp:97491fef6385 : 777 + 0x16]
17:35:31 INFO - eip = 0x637868e0 esp = 0x002af91c ebp = 0x002af950
17:35:31 INFO - Found by: call frame info
...
And:
https://tbpl.mozilla.org/php/getParsedLog.php?id=38290528&full=1&branch=mozilla-inbound#error1
Rev4 MacOSX Snow Leopard 10.6 mozilla-inbound debug test mochitest-1 on 2014-04-22 18:06:19 PDT for push 4fca4b6565d8
19:13:51 INFO - Thread 0 (crashed)
19:13:51 INFO - 0 libSystem.B.dylib + 0x3ba6a
19:13:51 INFO - rbx = 0x00007fff7022bcc0 r12 = 0x000000010012bbf0
19:13:51 INFO - r13 = 0x000000010012bb78 r14 = 0x0000000000000001
19:13:51 INFO - r15 = 0x0000000000000000 rip = 0x00007fff819a0a6a
19:13:51 INFO - rsp = 0x00007fff5fbfe238 rbp = 0x00007fff5fbfe2b0
19:13:51 INFO - Found by: given as instruction pointer in context
19:13:51 INFO - 1 libSystem.B.dylib + 0x3f880
19:13:51 INFO - rip = 0x00007fff819a4881 rsp = 0x00007fff5fbfe240
19:13:51 INFO - rbp = 0x00007fff5fbfe2b0
19:13:51 INFO - Found by: stack scanning
19:13:51 INFO - 2 libSystem.B.dylib + 0xdb7f2
19:13:51 INFO - rip = 0x00007fff81a407f3 rsp = 0x00007fff5fbfe268
19:13:51 INFO - rbp = 0x00007fff5fbfe2b0
19:13:51 INFO - Found by: stack scanning
19:13:51 INFO - 3 libSystem.B.dylib + 0x3a12c
19:13:51 INFO - rip = 0x00007fff8199f12d rsp = 0x00007fff5fbfe280
19:13:51 INFO - rbp = 0x00007fff5fbfe2b0
19:13:51 INFO - Found by: stack scanning
19:13:51 INFO - 4 libnss3.dylib!PR_Wait + 0x142
19:13:51 INFO - rip = 0x0000000100525163 rsp = 0x00007fff5fbfe2c0
19:13:51 INFO - rbp = 0x00007fff5fbfe320
19:13:51 INFO - Found by: stack scanning
19:13:51 INFO - 5 XUL!_ZZL13nsEscapeCountPKc12nsEscapeMaskPmE8hexChars + 0x291b
19:13:51 INFO - rip = 0x00000001055e929c rsp = 0x00007fff5fbfe2e8
19:13:51 INFO - rbp = 0x00007fff5fbfe320
19:13:51 INFO - Found by: stack scanning
19:13:51 INFO - 6 libSystem.B.dylib + 0x6699
19:13:51 INFO - rip = 0x00007fff8196b69a rsp = 0x00007fff5fbfe2f0
19:13:51 INFO - rbp = 0x00007fff5fbfe320
19:13:51 INFO - Found by: stack scanning
19:13:51 INFO - 7 XUL!mozilla::ReentrantMonitor::Wait(unsigned int) [BlockingResourceBase.cpp:4fca4b6565d8 : 319 + 0xb]
19:13:51 INFO - rip = 0x0000000101812f53 rsp = 0x00007fff5fbfe330
19:13:51 INFO - rbp = 0x00007fff5fbfe360
19:13:51 INFO - Found by: stack scanning
19:13:51 INFO - 8 XUL!nsEventQueue::GetEvent(bool, nsIRunnable**) [ReentrantMonitor.h:4fca4b6565d8 : 199 + 0xc]
19:13:51 INFO - rbx = 0x000000010012b8d0 r12 = 0x00000001055e929c
19:13:51 INFO - r13 = 0x00000001055e92b4 r14 = 0x00007fff5fbfe3f8
19:13:51 INFO - r15 = 0x00000001055e9ef7 rip = 0x00000001018ab3c9
19:13:51 INFO - rsp = 0x00007fff5fbfe370 rbp = 0x00007fff5fbfe3a0
19:13:51 INFO - Found by: call frame info
19:13:51 INFO - 9 XUL!nsThread::ProcessNextEvent(bool, bool*) [nsThread.h:4fca4b6565d8 : 97 + 0x16]
19:13:51 INFO - rbx = 0x0000000000000000 r12 = 0x000000010012b850
19:13:51 INFO - r13 = 0x000000010012d8a8 r14 = 0x0000000000000001
19:13:51 INFO - r15 = 0x000000010012b890 rip = 0x00000001018aeca0
19:13:51 INFO - rsp = 0x00007fff5fbfe3b0 rbp = 0x00007fff5fbfe480
19:13:51 INFO - Found by: call frame info
19:13:51 INFO - 10 XUL!NS_ProcessNextEvent(nsIThread*, bool) [nsThreadUtils.cpp:4fca4b6565d8 : 263 + 0xc]
19:13:51 INFO - rbx = 0x0000000000000001 r12 = 0x00000001563bd880
19:13:51 INFO - r13 = 0x000000010012d8a8 r14 = 0x0000000151ab5b78
19:13:51 INFO - r15 = 0x00000001055dbda6 rip = 0x000000010181da53
19:13:51 INFO - rsp = 0x00007fff5fbfe490 rbp = 0x00007fff5fbfe4a0
19:13:51 INFO - Found by: call frame info
19:13:51 INFO - 11 XUL!mozilla::SharedThreadPool::SpinUntilShutdown() [SharedThreadPool.cpp:4fca4b6565d8 : 78 + 0x11]
19:13:51 INFO - rbx = 0x0000000151ab5b60 r12 = 0x00000001563bd880
19:13:51 INFO - r13 = 0x000000010012d8a8 r14 = 0x0000000151ab5b78
19:13:51 INFO - r15 = 0x00000001055dbda6 rip = 0x0000000103102c8c
19:13:51 INFO - rsp = 0x00007fff5fbfe4b0 rbp = 0x00007fff5fbfe4c0
19:13:51 INFO - Found by: call frame info
19:13:51 INFO - 12 XUL!mozilla::MediaShutdownManager::Shutdown() [MediaShutdownManager.cpp:4fca4b6565d8 : 134 + 0x4]
19:13:51 INFO - rbx = 0x0000000151ab5b60 r12 = 0x00000001563bd880
19:13:51 INFO - r13 = 0x000000010012d8a8 r14 = 0x0000000151ab5b78
19:13:51 INFO - r15 = 0x00000001055dbda6 rip = 0x0000000103102b85
19:13:51 INFO - rsp = 0x00007fff5fbfe4d0 rbp = 0x00007fff5fbfe4f0
19:13:51 INFO - Found by: call frame info
19:13:51 INFO - 13 XUL!mozilla::MediaShutdownManager::Observe(nsISupports*, char const*, char16_t const*) [MediaShutdownManager.cpp:4fca4b6565d8 : 102 + 0x7]
19:13:51 INFO - rbx = 0x00000001055dbda6 r12 = 0x00000001563bd880
19:13:51 INFO - r13 = 0x000000010012d8a8 r14 = 0x0000000151ab5b60
19:13:51 INFO - r15 = 0x00000001055dbda6 rip = 0x0000000103102a91
19:13:51 INFO - rsp = 0x00007fff5fbfe500 rbp = 0x00007fff5fbfe510
19:13:51 INFO - Found by: call frame info
19:13:51 INFO - 14 XUL!nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*) [nsObserverList.cpp:4fca4b6565d8 : 96 + 0xe]
19:13:51 INFO - rbx = 0x0000000000000005 r12 = 0x00000001563bd880
19:13:51 INFO - r13 = 0x000000010012d8a8 r14 = 0x0000000000000000
19:13:51 INFO - r15 = 0x00000001055dbda6 rip = 0x00000001018645f8
19:13:51 INFO - rsp = 0x00007fff5fbfe520 rbp = 0x00007fff5fbfe550
19:13:51 INFO - Found by: call frame info
19:13:51 INFO - 15 XUL!nsObserverService::NotifyObservers(nsISupports*, char const*, char16_t const*) [nsObserverService.cpp:4fca4b6565d8 : 302 + 0x10]
19:13:51 INFO - rbx = 0x0000000100144ae0 r12 = 0x00000001055dbda6
19:13:51 INFO - r13 = 0x0000000100144b08 r14 = 0x0000000000000000
19:13:51 INFO - r15 = 0x000000010012d8a8 rip = 0x0000000101865ba4
19:13:51 INFO - rsp = 0x00007fff5fbfe560 rbp = 0x00007fff5fbfe590
19:13:51 INFO - Found by: call frame info
19:13:51 INFO - 16 XUL!mozilla::ShutdownXPCOM(nsIServiceManager*) [nsXPComInit.cpp:4fca4b6565d8 : 776 + 0xd]
19:13:51 INFO - rbx = 0x0000000000000000 r12 = 0x0000000000000000
19:13:51 INFO - r13 = 0x00007fff5fbfe9e8 r14 = 0x000000010012d8a8
19:13:51 INFO - r15 = 0x0000000000000000 rip = 0x0000000101825c0c
19:13:51 INFO - rsp = 0x00007fff5fbfe5a0 rbp = 0x00007fff5fbfe610
19:13:51 INFO - Found by: call frame info
...
Reporter | ||
Comment 1•11 years ago
|
||
This intermittent failure is #10 on OrangeFactor - cpearce or roc, could you take a look?
Flags: needinfo?(cpearce)
Reporter | ||
Updated•11 years ago
|
Flags: needinfo?(roc)
Reporter | ||
Comment 2•11 years ago
|
||
Comment hidden (Legacy TBPL/Treeherder Robot) |
This is going to have to be cpearce I think
Flags: needinfo?(roc)
Comment 5•11 years ago
|
||
I'll add this to my queue.
This could caused by *anything* in the media stack, so it will take some time to find a solution.
Flags: needinfo?(cpearce)
Comment 6•11 years ago
|
||
If we block waiting for the mediadecoder shutdown like this in release builds, and we hang like this, this could make the user unable to restart Firefox. Which is bad.
What we can do here is in opt (and thus release but not debug) builds is stick a timeout in SharedThreadPool::SpinUntilShutdown(), and give up if we don't complete shutdown before the timeout expires. Then we're less likely to cause grief with users, but we'll still get data to help us solve this.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 9•11 years ago
|
||
It looks like some leaks preventing sPools->Count() from reaching 0 in SharedThreadPool::Release and ShutdownPoolsEvent is not dispatched. SharedThreadPool::SpinUntilShutdown waits until sPools is destroyed which never happens.
Reporter | ||
Comment 10•11 years ago
|
||
Thank you for looking at this :-)
Assignee | ||
Comment 11•11 years ago
|
||
MediaTaskQueue::Dispatch doesn't take the memory ownership of aRunnable when mIsShutdown is true. (https://hg.mozilla.org/try/file/20ca234fd62b/content/media/MediaTaskQueue.cpp#l33) And there will be a leak if passing |new nsRunnable(xxx)| and the memory ownership isn't transferred. (https://hg.mozilla.org/try/file/20ca234fd62b/content/media/MediaDecoderStateMachine.cpp#l1610)
nsThread::DispatchInternal also has such a problem which can return early before transferring the memory ownership. (https://hg.mozilla.org/try/file/20ca234fd62b/xpcom/threads/nsThread.cpp#l453)
Should we consider |thread->Dispatch(new nsRunnable())| a bad idiom or parameter as raw pointer being bad for the memory ownership is unclear?
Attachment #8420621 -
Flags: review?(cpearce)
Attachment #8420621 -
Flags: feedback?(rlin)
Assignee | ||
Comment 12•11 years ago
|
||
try: https://tbpl.mozilla.org/?tree=Try&rev=a46e3a038a03
Most are green...
Comment 13•11 years ago
|
||
(In reply to JW Wang [:jwwang] from comment #11)
> Should we consider |thread->Dispatch(new nsRunnable())| a bad idiom or
> parameter as raw pointer being bad for the memory ownership is unclear?
Yes! This is a bad idiom. I hit this last week in new code I was writing. We should prevent these silly mistakes.
Comment 14•11 years ago
|
||
Comment on attachment 8420621 [details] [diff] [review]
1004669_fix_leaks.patch
Review of attachment 8420621 [details] [diff] [review]:
-----------------------------------------------------------------
Good catch, but we can tweak the interface to make it explicit.
::: content/media/MediaTaskQueue.cpp
@@ +26,5 @@
> MOZ_COUNT_DTOR(MediaTaskQueue);
> }
>
> nsresult
> +MediaTaskQueue::Dispatch(RefPtr<nsIRunnable> aRunnable)
Please make this:
nsresult
MediaTaskQueue::Dispatch(TemporaryRef<nsIRunnable> aRunnable)
{
RefPtr<nsIRunnable> runnable(aRunnable);
// ...
Then the caller must explicitly call event.forget() and pass ownership of a reference to the task queue, though it may retain another reference it if wishes.
Attachment #8420621 -
Flags: review?(cpearce) → review-
Assignee | ||
Comment 15•11 years ago
|
||
Sometimes I am confused with when to use nsRefPtr or RefPtr. It appears to me RefPtr+TemporaryRef are easier to use and read than nsRefPtr+already_AddRefed. Should we also replace nsRefPtr in MediaDecoderStateMachine?
Comment 16•11 years ago
|
||
(In reply to JW Wang [:jwwang] from comment #15)
> Sometimes I am confused with when to use nsRefPtr or RefPtr. It appears to
> me RefPtr+TemporaryRef are easier to use and read than
> nsRefPtr+already_AddRefed. Should we also replace nsRefPtr in
> MediaDecoderStateMachine?
I prefer mozilla::RefPtr. I started a thread in m.d.platform to clarify:
https://groups.google.com/forum/#!topic/mozilla.dev.platform/Xsuwy4h7lNE
Assignee | ||
Comment 17•11 years ago
|
||
Use TemporaryRef to prevent loss of memory ownership.
Assignee: nobody → jwwang
Attachment #8420621 -
Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8420621 -
Flags: feedback?(rlin)
Attachment #8420744 -
Flags: feedback?(rlin)
Comment 18•11 years ago
|
||
Comment on attachment 8420744 [details] [diff] [review]
1004669_fix_leaks.patch
Look good to me. :)
Attachment #8420744 -
Flags: feedback?(rlin) → feedback+
Assignee | ||
Updated•11 years ago
|
Attachment #8420744 -
Flags: review?(cpearce)
Updated•11 years ago
|
Attachment #8420744 -
Flags: review?(cpearce) → review+
Assignee | ||
Comment 19•11 years ago
|
||
Keywords: checkin-needed
Comment 20•11 years ago
|
||
Keywords: checkin-needed
Comment 21•11 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
Comment 22•11 years ago
|
||
Given how this bug started (spun off from a big dumping ground bug), I'm not entirely sure what all branches are affected by this. AFAICT, both Aurora and Beta are affected? JW, can you please confirm and nominate this patch for whatever branches need it? :)
status-firefox30:
--- → affected
status-firefox31:
--- → affected
status-firefox32:
--- → fixed
status-firefox-esr24:
--- → unaffected
Flags: needinfo?(jwwang)
Assignee | ||
Comment 23•11 years ago
|
||
[Approval Request Comment]
Bug caused by (feature/regressing bug #): 962385
User impact if declined: browser might hang if shutdown when playing music
Testing completed (on m-c, etc.): try: https://tbpl.mozilla.org/?tree=Try&rev=f4a1a4a84069
Risk to taking this patch (and alternatives if risky): low
String or IDL/UUID changes made by this patch: none
Attachment #8421592 -
Flags: review+
Attachment #8421592 -
Flags: approval-mozilla-aurora?
Flags: needinfo?(jwwang)
Assignee | ||
Comment 24•11 years ago
|
||
[Approval Request Comment]
Bug caused by (feature/regressing bug #): 962385
User impact if declined: browser might hang if shutdown when playing music
Testing completed (on m-c, etc.): try: https://tbpl.mozilla.org/?tree=Try&rev=37893fcdde60
Risk to taking this patch (and alternatives if risky): low
String or IDL/UUID changes made by this patch: none
Attachment #8421595 -
Flags: review+
Attachment #8421595 -
Flags: approval-mozilla-beta?
Updated•11 years ago
|
Attachment #8421592 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Updated•11 years ago
|
Attachment #8421595 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 25•11 years ago
|
||
Updated•11 years ago
|
Whiteboard: [qa-]
Comment 26•11 years ago
|
||
Updated•11 years ago
|
status-b2g-v1.4:
--- → fixed
status-b2g-v2.0:
--- → fixed
Reporter | ||
Comment 27•11 years ago
|
||
(Tweaking summary to avoid future TBPL false positives, now this is fixed)
Summary: Intermittent *Windows & OS X* TEST-UNEXPECTED-FAIL | Shutdown | application timed out after 330 seconds with no output (with MediaShutdownManager::Shutdown() on the stack) → Intermittent Windows & OS X shutdown timeout with MediaShutdownManager::Shutdown() on the stack
You need to log in
before you can comment on or make changes to this bug.
Description
•