Very slow shutdown on test_mediaDecoding.html

RESOLVED FIXED in Firefox 31

Status

()

defect
P1
normal
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: cpearce, Assigned: padenot)

Tracking

29 Branch
mozilla33
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite -

Firefox Tracking Flags

(firefox31 fixed, firefox32 fixed, firefox33 fixed, b2g-v2.0 fixed, b2g-v2.1 fixed)

Details

(Whiteboard: [qa-])

Attachments

(1 attachment, 1 obsolete attachment)

I noticed that test_mediaDecoding.html has a very slow shutdown time when its run by itself. Usually it takes about one minute from mochitest finish until Gecko has shutdown. XPCOM shutdown is waiting for threads to shutdown, and there's a cubeb thread alive, so I'd guess that is the culprit.

This happens on my Win8.1 box, and in my Ubuntu 12.04 VMWare VM.

Main thread:

ntdll.dll!_NtWaitForSingleObject@12()	Unknown
KernelBase.dll!_WaitForSingleObjectEx@12()	Unknown
KernelBase.dll!_WaitForSingleObject@8()	Unknown
nss3.dll!_PR_MD_WAIT_CV(_MDCVar * cv=0x0110f384, _MDLock * lock=0x0110f220, unsigned int timeout=4294967295) Line 248	C
nss3.dll!_PR_WaitCondVar(PRThread * thread=0x0129d000, PRCondVar * cvar=0x0110f310, PRLock * lock=0x0110f204, unsigned int timeout=4294967295) Line 172	C
nss3.dll!PR_WaitCondVar(PRCondVar * cvar=0x0110f310, unsigned int timeout=4294967295) Line 525	C
nss3.dll!PR_Wait(PRMonitor * mon=0x0110f200, unsigned int ticks=4294967295) Line 294	C
xul.dll!mozilla::ReentrantMonitor::Wait(unsigned int interval=4294967295) Line 320	C++
xul.dll!mozilla::ReentrantMonitorAutoEnter::Wait(unsigned int interval=4294967295) Line 200	C++
xul.dll!nsEventQueue::GetEvent(bool mayWait=true, nsIRunnable * * result=0x00d1f914) Line 64	C++
xul.dll!nsThread::nsChainedEventQueue::GetEvent(bool mayWait=true, nsIRunnable * * event=0x00d1f914) Line 98	C++
xul.dll!nsThread::ProcessNextEvent(bool mayWait=true, bool * result=0x00d1f973) Line 707	C++
xul.dll!NS_ProcessNextEvent(nsIThread * thread=0x010fecc0, bool mayWait=true) Line 263	C++
xul.dll!nsThread::Shutdown() Line 559	C++
xul.dll!nsThreadManager::Shutdown() Line 139	C++
xul.dll!mozilla::ShutdownXPCOM(nsIServiceManager * servMgr=0x01113a44) Line 813	C++
xul.dll!NS_ShutdownXPCOM(nsIServiceManager * servMgr=0x01113a44) Line 732	C++
xul.dll!ScopedXPCOMStartup::~ScopedXPCOMStartup() Line 1202	C++
xul.dll!ScopedXPCOMStartup::`scalar deleting destructor'(unsigned int)	C++
xul.dll!XREMain::XRE_main(int argc=5, char * * argv=0x012923e0, const nsXREAppData * aAppData=0x00d1fcb4) Line 4113	C++


This looks the cubeb thread just waiting around:

ntdll.dll!_NtWaitForSingleObject@12()	Unknown
KernelBase.dll!_WaitForSingleObjectEx@12()	Unknown
KernelBase.dll!_WaitForSingleObject@8()	Unknown
gkmedias.dll!winmm_buffer_thread(void * user_ptr=0x1be48668) Line 175	C
msvcr100d.dll!_callthreadstartex() Line 314	C
msvcr100d.dll!_threadstartex(void * ptd=0x1b746850) Line 297	C
kernel32.dll!@BaseThreadInitThunk@12()	Unknown
ntdll.dll!__RtlUserThreadStart()	Unknown
ntdll.dll!__RtlUserThreadStart@8()	Unknown

Also a MediaManager thread that's not running any events, waiting for more to be posted to it's queue. I don't know what that thread does.
We're also leaking a ReentrantMonitor on when we run just test_mediaDecoding.html:


 1:08.20 == BloatView: ALL (cumulative) LEAK AND BLOAT STATISTICS, default process 29512
 1:08.20 
 1:08.20      |<----------------Class--------------->|<-----Bytes------>|<----------------Objects---------------->|<--------------References-------------->|
 1:08.20                                               Per-Inst   Leaked    Total      Rem      Mean       StdDev     Total      Rem      Mean       StdDev
 1:08.20    0 TOTAL                                          33       32   674158        1 ( 6864.91 +/-  9604.82)  1326608        0 ( 3573.33 +/-  6815.54)
 1:08.20  280 ReentrantMonitor                               32       32      809        1 (   97.78 +/-    31.24)        0        0 (    0.00 +/-     0.00)
 1:08.21 
 1:08.21 nsTraceRefcnt::DumpStatistics: 1046 entries
 1:08.21 TEST-INFO | leakcheck | leaked 1 ReentrantMonitor (32 bytes)
 1:08.21 TEST-UNEXPECTED-FAIL | leakcheck | 32 bytes leaked (ReentrantMonitor)
 1:08.21 runtests.py | Running tests: end.
 1:08.22 TEST-UNEXPECTED-FAIL | leakcheck | 32 bytes leaked (ReentrantMonitor)
(In reply to Chris Pearce (:cpearce) from comment #0)
> Usually it takes about one minute from mochitest finish
> until Gecko has shutdown.

nsThreadPools with default settings can take up to one minute to close their idle threads.  Perhaps this is similar to bug 926522 comment 13, which was worked around in bug 936317.
Priority: -- → P1
Blocks: 1026325
I don't see a MediaManager thread, but I do see a "MediaBu~oder #1" thread (and the same main thread stack).

Reverting https://hg.mozilla.org/mozilla-central/rev/169de60f0b1e removes the ReentrantMonitor leak, but still leaves the pause of up to a minute.
No longer blocks: 1026325
Depends on: 1026325
With 169de60f0b1e reverted (at least), the slow shutdown doesn't always happen, suggesting a race condition.
I'd guess what's happening is that because SharedThreadPool relies on MediaShutdownManager's shutdown observer to destroy it on shutdown, and MediaShutdownManager's shutdown observer is only added if there's an active MediaDecoder, this shutdown hang only happens intermittently, since there's a chance a MediaDecoder is created after this test runs.

There are two options here:
1. The MediaBufferDecoder could null out its mThreadPool reference when it's finished with using the thread pool. SharedThreadPools will shut themselves down automatically when their refcount reaches 0.
2. The MediaShutdownManager singleton could trak live SharedThreadPools and keep its shutdown observer active when any SharedThreadPools are active.

Doing 1. is easier, and is roughly what used to happen before 169de60f0b1e.
I suspect the pause even with 169de60f0b1e reverted is due to MediaBufferDecoder::Shutdown() not being called when MediaBufferDecoder is destructed when the AudioContext is collected without its Shutdown() method is called.
Karl, you were right about comment 6, here is a patch that implements you idea,
and fixes the shutdown hang.
Attachment #8442087 - Flags: review?(karlt)
Assignee: nobody → paul
Status: NEW → ASSIGNED
I think I prefer doing this in MediaBufferDecoder.
The Shutdown() methods of MediaBufferDecoder and AudioContext have different purposes.  MediaBufferDecoder must finalize its thread pool before it is destroyed and if the window is closed before it is destroyed (because final GC/CC may not happen in opt builds).  AudioContext::Shutdown() is there just to stop the graph and break internal references when the window is closed; these would have already happened if the AudioContext destructor is run.
Attachment #8442250 - Flags: review?(paul)
Comment on attachment 8442087 [details] [diff] [review]
Make sure that the AudioContext has been Shutdown() when it is destroyed. r=

But this should be OK, if you prefer?
Attachment #8442087 - Flags: review?(karlt) → review+
Comment on attachment 8442250 [details] [diff] [review]
finalize thread pool when MediaBufferDecoder is destroyed

Review of attachment 8442250 [details] [diff] [review]:
-----------------------------------------------------------------

This is better indeed, as it is functionally equivalent while doing less.
Attachment #8442250 - Flags: review?(paul) → review+
Attachment #8442087 - Attachment is obsolete: true
https://hg.mozilla.org/mozilla-central/rev/25101d87fd9b
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
Comment on attachment 8442250 [details] [diff] [review]
finalize thread pool when MediaBufferDecoder is destroyed

[Approval Request Comment]
Bug caused by (feature/regressing bug #): old bug in Web Audio

User impact if declined: 
Up to 1 minute pause on shutdown in some situations.  i.e. if a realtime AudioContext is collected soon after decodeAudioData() is used, and shortly before the user quits the application.  This is much less likely than what was happening in bug 1026325 and a finite shutdown time, but the patch is a simple change to cover the remaining bug.

Testing completed (on m-c, etc.): on m-c, locally

Risk to taking this patch (and alternatives if risky): very low risk.

String or IDL/UUID changes made by this patch: none
Attachment #8442250 - Flags: approval-mozilla-beta?
Attachment #8442250 - Flags: approval-mozilla-aurora?
Attachment #8442250 - Flags: approval-mozilla-beta?
Attachment #8442250 - Flags: approval-mozilla-beta+
Attachment #8442250 - Flags: approval-mozilla-aurora?
Attachment #8442250 - Flags: approval-mozilla-aurora+
Whiteboard: [qa-]
You need to log in before you can comment on or make changes to this bug.