Last Comment Bug 679375 - nsSound can cause GC to be reentered
: nsSound can cause GC to be reentered
Status: RESOLVED FIXED
: crash
Product: Core
Classification: Components
Component: Widget: Win32 (show other bugs)
: unspecified
: x86 Windows XP
: -- critical (vote)
: mozilla9
Assigned To: Brian R. Bondy [:bbondy]
:
: Jim Mathies [:jimm]
Mentors:
Depends on: 498079
Blocks:
  Show dependency treegraph
 
Reported: 2011-08-16 09:12 PDT by neil@parkwaycc.co.uk
Modified: 2011-09-05 05:02 PDT (History)
5 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
WinDbg stack backtrace (8.36 KB, text/plain)
2011-08-16 09:18 PDT, neil@parkwaycc.co.uk
no flags Details
Fixed GC reentered crash by dispatching to main thread for thread shutdown (1.19 KB, patch)
2011-08-29 06:25 PDT, Brian R. Bondy [:bbondy]
no flags Details | Diff | Splinter Review
Fixed GC reentered crash by dispatching to main thread for thread shutdown v2 (1.17 KB, patch)
2011-08-29 10:11 PDT, Brian R. Bondy [:bbondy]
neil: review+
Details | Diff | Splinter Review
Alternate patch which uses the SoundReleaser to shutdown the mPlayerThread thread (4.90 KB, patch)
2011-08-30 08:58 PDT, Brian R. Bondy [:bbondy]
no flags Details | Diff | Splinter Review
Alternate patch which uses the SoundReleaser to shutdown the mPlayerThread thread v2 (4.97 KB, text/plain)
2011-08-30 18:46 PDT, Brian R. Bondy [:bbondy]
no flags Details
Alternate patch which uses the SoundReleaser to shutdown the mPlayerThread thread v2' (4.97 KB, patch)
2011-08-30 18:53 PDT, Brian R. Bondy [:bbondy]
no flags Details | Diff | Splinter Review
Alternate patch which uses the SoundReleaser to shutdown the mPlayerThread thread v3 (4.79 KB, patch)
2011-09-02 19:52 PDT, Brian R. Bondy [:bbondy]
neil: review+
Details | Diff | Splinter Review

Description neil@parkwaycc.co.uk 2011-08-16 09:12:09 PDT
nsSound can get released as part of garbage collection. nsSound::~nsSound tries to synchronously shut down its player thread. This works by spinning an event loop. But that means that it's possible to, say, fire a GC timer event. Reentering GC is disallowed and causes the application to crash.
Comment 1 neil@parkwaycc.co.uk 2011-08-16 09:18:06 PDT
Created attachment 553502 [details]
WinDbg stack backtrace
Comment 2 Brian R. Bondy [:bbondy] 2011-08-26 20:10:09 PDT
> But that means that it's possible to, say, fire a GC timer event.

How can I fire a GC timer event?
Comment 3 Ben Turner (not reading bugmail, use the needinfo flag!) 2011-08-26 21:22:44 PDT
Well, by "GC timer event" I think he means "any event that could trigger a GC". We have lots of those.

I haven't really looked into it but one simple solution is probably to just join the thread asynchronously (e.g. dispatch a runnable to the main thread that shuts down the sound thread).
Comment 4 neil@parkwaycc.co.uk 2011-08-27 03:09:53 PDT
Well, in my case, it was an actual timer whose purpose is to call GC.
Comment 5 Brian R. Bondy [:bbondy] 2011-08-27 06:11:14 PDT
> Well, in my case, it was an actual timer whose purpose is to call GC.

I guess I'm interested in how to manually call GC.  For example I guess I could release an extra ref on the object early but I think you mean something more explicit.  Maybe if you could point me to an MDC doc or source file?
Comment 6 Ben Turner (not reading bugmail, use the needinfo flag!) 2011-08-27 09:45:15 PDT
As long as you're on the main thread you can call nsJSContext::GarbageCollect() and nsJSContext::CycleCollect().
Comment 7 Brian R. Bondy [:bbondy] 2011-08-27 09:55:15 PDT
great, thank you!
Comment 8 Brian R. Bondy [:bbondy] 2011-08-29 06:25:14 PDT
Created attachment 556531 [details] [diff] [review]
Fixed GC reentered crash by dispatching to main thread for thread shutdown
Comment 9 Ben Turner (not reading bugmail, use the needinfo flag!) 2011-08-29 10:06:14 PDT
Comment on attachment 556531 [details] [diff] [review]
Fixed GC reentered crash by dispatching to main thread for thread shutdown

Sorry for the drive-by but I'd suggest continuing to null out mPlayerThread for sanity.
Comment 10 Brian R. Bondy [:bbondy] 2011-08-29 10:11:19 PDT
Created attachment 556593 [details] [diff] [review]
Fixed GC reentered crash by dispatching to main thread for thread shutdown v2
Comment 11 Brian R. Bondy [:bbondy] 2011-08-29 10:12:30 PDT
> Sorry for the drive-by but I'd suggest continuing to null out mPlayerThread for sanity.

No problem, done.
Comment 12 neil@parkwaycc.co.uk 2011-08-29 11:15:21 PDT
You could probably improve that further by transferring the reference from the nsSound object directly to the new object, although offhand I'm not sure what the ideal way of doing that is (swap and forget are two options for instance).
Comment 13 neil@parkwaycc.co.uk 2011-08-29 16:33:35 PDT
It's not clear to me why we're joining the sound thread from the destructor. Shouldn't we join the sound thread from the sound releaser?
Comment 14 neil@parkwaycc.co.uk 2011-08-29 16:42:49 PDT
(In reply to comment #13)
> It's not clear to me why we're joining the sound thread from the destructor.
> Shouldn't we join the sound thread from the sound releaser?
(and why so many sound threads anyway?)

(In reply to comment #12)
> (swap and forget are two options for instance).
I don't like swap, because it doesn't allow you to construct the new nsCOMPtr. But I'm not sure whether it's better to pass the nsSound object and forget its member or forget its member and pass the already_AddRefed<nsIThread> object to the constructor.
Comment 15 Brian R. Bondy [:bbondy] 2011-08-29 21:47:31 PDT
> and why so many sound threads anyway?

That's a great question.
It's not my code but a lot of that threading code doesn't seem right to me in Win32's nsSound. 

How it currently works:
0. nsSound::PlaySystemSound or nsSound::PlayEventSound is called
1. It calls shutdown() right away on the existing nsSound::mPlayerThread thread if one exists (shutdown() = Join and not accept new dispatches) 
2. Create a new nsSoundPlayer (which is a runnable event), it calls AddRef on the nsSound object to ensure the nsSound destructor won't be hit
3. Create a new thread and store it in nsSound::mPlayerThread
4. Dispatch the runnable event (nsSoundPlayer) to the thread we just created (nsSound::mPlayerThread).
5. Inside the nsSoundPlayer thread we dispatch back to the main thread which calls release on the nsSound.

::PlaySoundW should already be async if you pass SND_ASYNC ( http://msdn.microsoft.com/en-us/library/dd743680(v=vs.85).aspx ) which we do.
So it would seem that all this threading is pointless.  But I looked in the hg logs and seen Bug 498079.
Inside Bug 498079 we did all of this threading because of a reported performance problem. 
I think the reasoning ended up being that some sound cards have bugs and cause non async handling contrary to MSDN documentation.
I'm not sure if that is valid or not though, but I'll assume it is. 

Shouldn't @mozilla.org/sound;1 be an XPCOM service and not instantiate a new object each time we may have sound by the way?

Even if we fixup the nsSound to have a single thread, instead of one per call, it would make no difference currently in our usage of nsISound. 
We always create a new object for each sound we play, so there would still be 1 thread per sound.

It doens't make sense to me to have a whole thread startup and shutdown every time a sound MAY be played. 
And on Windows most of these sounds are defaulted to off, yet we will still create a new thread anyway in case they are turned on.
Currently it will create a new thread just for opening a menu even know the sound is turned off by default for menu opens.
Open task manager and set the update frequency to high and show thread counts.  Then expand and collapse the top left menu, you will see it startup and shutdown a new thread for each time you click the firefox menu.

I would like to propose a shared single Win32 Widget thread.  
Currently it could be used by each instance of nsSound, JumpList icon caching, and I think also LSPAnnotate.
This would eliminate the startup / teardown of a thread for trivial things and also consolodate 2 threads into one and make it so we have a placeholder for making more things async in the future Win32 widget code.

Thoughts?
Comment 16 Brian R. Bondy [:bbondy] 2011-08-29 22:03:30 PDT
> I would like to propose...

Or perhaps propose that we make @mozilla.org/sound;1 a service and update the Win32 nsSound to have only one thread which gets created in the constructor and gets shutdown in the destructor.   Or better yet gets created when we first need it so that it doesn't cause any slowdown to startup.
Comment 17 Brian R. Bondy [:bbondy] 2011-08-29 22:10:38 PDT
Just found this, so perhaps there is nothing to do here?
Bug 520417
Comment 18 Masayuki Nakano [:masayuki] (Mozilla Japan) 2011-08-29 22:14:36 PDT
I think the playing sound thread should be independent from other work. If it were in trouble, the other jobs would be stopped too. I think nsSound should be a singleton class (service), see also bug 520417.
Comment 19 Brian R. Bondy [:bbondy] 2011-08-29 22:17:33 PDT
Ya after thinking about it more I agree with your Comment 18 and my Comment 16 more.  The last update to Bug 520417 was 2010-03-01, is it still being worked on?
Comment 20 Masayuki Nakano [:masayuki] (Mozilla Japan) 2011-08-29 22:26:51 PDT
No. If you would take it, please :-)

Actually, the bug's range is too wide. So, if you wanted to make nsSound a singleton service, you should do it in this bug.
Comment 21 Brian R. Bondy [:bbondy] 2011-08-29 22:30:41 PDT
Ok thanks for the info Masayuki.
Neil, I suggest Comment 16, but please let me know your thoughts.
Comment 22 neil@parkwaycc.co.uk 2011-08-30 01:59:31 PDT
Comment on attachment 556593 [details] [diff] [review]
Fixed GC reentered crash by dispatching to main thread for thread shutdown v2

I guess for the purposes of this bug we should go with this patch as a stopgap.
Comment 23 Brian R. Bondy [:bbondy] 2011-08-30 08:58:06 PDT
Created attachment 556864 [details] [diff] [review]
Alternate patch which uses the SoundReleaser to shutdown the mPlayerThread thread

As per Neil's idea.
Comment 24 neil@parkwaycc.co.uk 2011-08-30 15:32:29 PDT
Comment on attachment 556864 [details] [diff] [review]
Alternate patch which uses the SoundReleaser to shutdown the mPlayerThread thread

>+  // Each event on mPlayerThread holds a reference to nsISound so we will
>+  // only reach here if the thread is already done.
>+  mPlayerThread.forget();
Not sure what you're trying to do here...

>-    mPlayerThread = nsnull;
>+    mPlayerThread.forget();
Why this change?
Comment 25 Brian R. Bondy [:bbondy] 2011-08-30 15:50:51 PDT
Maybe I am misunderstanding the use of forget()? I thought it nulls out the pointer?

> forget()
> 764           // return the value of mRawPtr and null out mRawPtr. 

>+  // Each event on mPlayerThread holds a reference to nsISound so we will
>+  // only reach here if the thread is already done.
>+  mPlayerThread.forget();
> Not sure what you're trying to do here...

The whole comment and line of code can be removed.  I normally wouldn't NULL out anything in a destructor but as per Comment 3 I thought it was wanted. 

>-    mPlayerThread = nsnull;
>+    mPlayerThread.forget();
> Why this change?

I thought you preferred it that way because of Comment 12.

Let me know how to proceed, thanks for the review.
Comment 26 neil@parkwaycc.co.uk 2011-08-30 16:05:31 PDT
(In reply to Brian R. Bondy from comment #25)
> Maybe I am misunderstanding the use of forget()? I thought it nulls out the pointer?
It does, but it also returns the previous value in a form that is intended for assignment into another nsCOMPtr, or into an outparam. (It leaks if you don't.)
Comment 27 Brian R. Bondy [:bbondy] 2011-08-30 16:25:25 PDT
ah thanks for explaining, makes sense that it has to keep a reference since it is returning it.  I'll update both cases in Comment 24 to nsnull.
Comment 28 Brian R. Bondy [:bbondy] 2011-08-30 18:46:59 PDT
Created attachment 557063 [details]
Alternate patch which uses the SoundReleaser to shutdown the mPlayerThread thread v2

Implemented review comments (.forget() to = nsnull)
Comment 29 Brian R. Bondy [:bbondy] 2011-08-30 18:53:05 PDT
Created attachment 557065 [details] [diff] [review]
Alternate patch which uses the SoundReleaser to shutdown the mPlayerThread thread v2'
Comment 30 neil@parkwaycc.co.uk 2011-09-01 13:42:53 PDT
Comment on attachment 557065 [details] [diff] [review]
Alternate patch which uses the SoundReleaser to shutdown the mPlayerThread thread v2'

>+  // Each event on mPlayerThread holds a reference to to the nsSound, so
>+  // the nsSound destructor will not be called if there are events left on
>+  // the thread.  So we do not need to call shutdown() on the mPlayerThread.
>+  mPlayerThread = nsnull;
It just occurs to me that we can probably do better than this - we should be able to assert that mPlayerThread is already null, since the player thread dispatches an event that nulls it out. (You don't actually need to explicitly null it out anyway, because its destructor will do that.)
Comment 31 Brian R. Bondy [:bbondy] 2011-09-02 19:52:58 PDT
Created attachment 558023 [details] [diff] [review]
Alternate patch which uses the SoundReleaser to shutdown the mPlayerThread thread v3
Comment 32 neil@parkwaycc.co.uk 2011-09-03 14:34:15 PDT
Comment on attachment 558023 [details] [diff] [review]
Alternate patch which uses the SoundReleaser to shutdown the mPlayerThread thread v3

No assertions seen locally :-)
Comment 33 Brian R. Bondy [:bbondy] 2011-09-03 16:53:26 PDT
Pushed to try:
https://tbpl.mozilla.org/?tree=Try&usebuildbot=1&rev=93eda5827ea0
Comment 34 Brian R. Bondy [:bbondy] 2011-09-03 17:32:52 PDT
/me ducksed, cancelled, and re-pushed to try with Win32 only to avoid Neil's thwapping:
https://tbpl.mozilla.org/?tree=Try&usebuildbot=1&rev=0c3517cfa0c0
Comment 35 Brian R. Bondy [:bbondy] 2011-09-04 11:42:22 PDT
Pushed to mozilla-inbound:
http://hg.mozilla.org/integration/mozilla-inbound/rev/9f853ef4b664
Comment 36 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2011-09-05 05:02:09 PDT
http://hg.mozilla.org/mozilla-central/rev/9f853ef4b664

Note You need to log in before you can comment on or make changes to this bug.