Deadlock trying to play a sound @ mozilla::MediaDecoder::Play on Windows

NEW
Unassigned

Status

()

Core
Audio/Video: Playback
--
critical
3 years ago
2 years ago

People

(Reporter: neil@parkwaycc.co.uk, Unassigned)

Tracking

({hang})

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(2 attachments)

(Reporter)

Description

3 years ago
Created attachment 8533372 [details]
Stack backtraces

I had just lost a match when the browser apparently deadlocked on me.
If I've understood the stacks correctly, it looks like we're trying to (re-)play a media element that is still shutting down the audio sink.  The audio sink shutdown seems to be blocked in wasapi_stream_stop waiting on its render thread to exit.

The main thread is waiting on the decoder monitor in MediaDecoder::Play(), the audio thread holds the decoder monitor in AudioLoop and calls wasapi_stream_stop which is waiting on libcubeb's render thread to stop, and the only running render thread is waiting for one of its events to be signalled.  I'm not sure I see the full picture here, since wasapi_stream_stop should've signalled the thread's shutdown event, so either the signal has been lost or the running render thread is from a different stream.

Neil, how easily can you reproduce this?  I noticed you filed against Win10, do you happen to know if the deadlock is exclusive to 10?
(Reporter)

Comment 2

3 years ago
(In reply to Matthew Gregan from comment #1)
> Neil, how easily can you reproduce this?  I noticed you filed against Win10,
> do you happen to know if the deadlock is exclusive to 10?

I don't know how to reproduce this. I filed against Win10 because that's what I was running at the time. I also use the site from Win7 but unfortunately I forgot to build with symbols so I don't know whether I've had a similar hang there. (I've just rebuilt so that should it happen on Win7 I'll be able to let you know.)
I just hit this on a Nightly build on playing YouTube. For some reason I hit a decode error in the middle of a video, and that shutdown the state machine and its AudioSink, and the main thread hung waiting for the decoder monitor while calling GetBuffered().

I also have *thousands* of priority 13 threads stuck in wasapi_stream_render_loop(). That seems very, very wrong. I had been twiddling with media.volume_scale to see if I could increase the volume (I couldn't), and reloaded the video a few times. Not sure if that would help get us to there.

I captured a minidump if you're interested. I'll attach my thread stacks for reference.

Why is the AudioSink holding the decoder monitor while it shuts down the cubeb_stream? Could it safely not?

Could it be that in order to shutdown, wasapi needs to post an event to the win32 main thread?
Flags: needinfo?(kinetik)
Created attachment 8564494 [details]
Stack traces
(In reply to Chris Pearce (:cpearce) from comment #3)
> I also have *thousands* of priority 13 threads stuck in
> wasapi_stream_render_loop(). 

6046 threads stuck in wasapi_stream_render_loop to be specific. A small number of them are priority -3 as well, though I doubt it matters.
(In reply to Chris Pearce (:cpearce) from comment #3)
> I also have *thousands* of priority 13 threads stuck in
> wasapi_stream_render_loop(). That seems very, very wrong. I had been
> twiddling with media.volume_scale to see if I could increase the volume (I
> couldn't), and reloaded the video a few times. Not sure if that would help
> get us to there.

The render thread is created on stream start and destroyed on stream stop.  The error handling in the WASAPI version of those functions is lacking, and if something higher up is making mismatched start/stop calls it looks like it may be possible to end up with multiple (leaked) render threads.  On stream destroy, one of the leaked render threads could be stealing the shutdown event wakeup.

> Why is the AudioSink holding the decoder monitor while it shuts down the
> cubeb_stream? Could it safely not?

I think the only reason is that the MSDM may call GetPosition so GetPosition needs to be safe with respect to shutdown.  It looks like that can be achieved by clearing mAudioStream under the lock and shutting down with the locked dropped.

> Could it be that in order to shutdown, wasapi needs to post an event to the
> win32 main thread?

I don't think so, but since COM is slightly involved with WASAPI, I guess it's possible.
Flags: needinfo?(kinetik)
(In reply to Matthew Gregan [:kinetik] from comment #6)
> The render thread is created on stream start and destroyed on stream stop. 
> The error handling in the WASAPI version of those functions is lacking, and
> if something higher up is making mismatched start/stop calls it looks like
> it may be possible to end up with multiple (leaked) render threads.  On
> stream destroy, one of the leaked render threads could be stealing the
> shutdown event wakeup.

After inspecting the code, debugging, and testing a bit, I can't see how this can happen and I couldn't reproduce the thread leak.

> I think the only reason is that the MSDM may call GetPosition so GetPosition
> needs to be safe with respect to shutdown.  It looks like that can be
> achieved by clearing mAudioStream under the lock and shutting down with the
> locked dropped.

I've made this change in bug 1133600.
(In reply to Matthew Gregan [:kinetik] from comment #7)
> After inspecting the code, debugging, and testing a bit, I can't see how
> this can happen and I couldn't reproduce the thread leak.

So, I *can* provoke this with the recent changes for handling output device changes, but nothing in cpearce's STR mentioned device changes.
Forgot to mention: the assert I added as part of bug 1132257 will at least make that case immediately fatal, rather than leaking threads and eventually deadlocking.

Comment 10

2 years ago
Still able to reprodue?
Severity: normal → critical
Flags: needinfo?(neil)
Flags: needinfo?(cpearce)
I could never reliably repro this. I haven't seen it recently on Win7 and Win8 FWIW, but since I couldn't reliably or regularly repro I don't think that means much.
Flags: needinfo?(cpearce)
(Reporter)

Comment 12

2 years ago
I never had STR, this was just a bug I hit dogfooding trunk and I filed it just in case.
Flags: needinfo?(neil)
Component: Audio/Video → Audio/Video: Playback
Is anyone still seeing this issue?

Comment 14

2 years ago
(In reply to Anthony Jones (:kentuckyfriedtakahe, :k17e) from comment #13)
> Is anyone still seeing this issue?

sounds like not :)
but perhaps related to bug 520417?
Summary: Deadlock trying to play a sound → Deadlock trying to play a sound @ mozilla::MediaDecoder::Play on Windows
You need to log in before you can comment on or make changes to this bug.