Open Bug 1654635 Opened 5 years ago Updated 5 years ago

MFTDecoder::Output appears to get stuck and breaks playback

Categories

(Core :: Audio/Video: Playback, defect)

People

(Reporter: bryce, Unassigned)

Details

While trying to play a twitch.tv video, I had issues with it starting. I captured some profile runs and noticed a number of decoder threads just sitting there janking on the GPU process. I am still able to play VPx and AV1 videos, but any h264 video fails.

Profiles:
https://share.firefox.dev/2ZPnG03 - While trying twitch playback
https://share.firefox.dev/3eVBPgC - Afterwards

The first profile has some weird numbers: a 10 minute jank is shown on web content 8/8 in a profile less than 1 minute long. Edit: it's been pointed out that some of these numbers may be a product of when the threads were created, e.g. the massive times on the decoder thread jank indicate they've been stalled for many hours. Either way, I assume we can trust the janking of the decoders. I suspect the problem on my machine was not triggered by twitch; rather, my decoders were exhausted before then and twitch is just where I noticed. This is almost certainly the case, as I recall having issues with similar symptoms hours earlier.

https://share.firefox.dev/39knskG - A later profile showing the same decoders still stuck an hour or two later. Same threads (30, 39, 40, 41) on the GPU process; thread IDs are stable.

Looking at the threads with Process Hacker (I had to set my symbol path in Process Hacker to the following to get correct moz symbols: SRV*C:\Symbols*http://msdl.microsoft.com/download/symbols;SRV*C:\Symbols*https://symbols.mozilla.org/):

tid 30612 stack (Decoder 41)

0, ntdll.dll!NtWaitForMultipleObjects+0x14
1, KernelBase.dll!WaitForMultipleObjectsEx+0x107
2, msmpeg2vdec.dll!DllUnregisterServer+0xf0998
3, msmpeg2vdec.dll!DllUnregisterServer+0xef306
4, msmpeg2vdec.dll!DllGetClassObject+0x3d1a
5, msmpeg2vdec.dll!DllGetClassObject+0xc44b
6, mozed49f20f-3de9-4f30-a937-5ffb1d5a5cd0!mozilla::MFTDecoder::Output+0x82
7, mozed49f20f-3de9-4f30-a937-5ffb1d5a5cd0!mozilla::WMFVideoMFTManager::Output+0x6f
8, mozed49f20f-3de9-4f30-a937-5ffb1d5a5cd0!mozilla::WMFMediaDataDecoder::ProcessOutput+0x51
9, mozed49f20f-3de9-4f30-a937-5ffb1d5a5cd0!mozilla::WMFMediaDataDecoder::ProcessDecode+0xfa
10, mozed49f20f-3de9-4f30-a937-5ffb1d5a5cd0!mozilla::detail::ProxyRunnable<mozilla::MozPromise<nsTArray<RefPtr<mozilla::MediaData> >,mozilla::MediaResult,1>,RefPtr<mozilla::MozPromise<nsTArray<RefPtr<mozilla::MediaData> >,mozilla::MediaResult,1> > (mozilla::TheoraDecoder::*)(mozilla::MediaRawData *),mozilla::TheoraDecoder,mozilla::MediaRawData *>::Run+0x2f
11, mozed49f20f-3de9-4f30-a937-5ffb1d5a5cd0!mozilla::TaskQueue::Runner::Run+0x10b
12, mozed49f20f-3de9-4f30-a937-5ffb1d5a5cd0!nsThreadPool::Run+0x686
13, mozed49f20f-3de9-4f30-a937-5ffb1d5a5cd0!nsThread::ProcessNextEvent+0x18e3
14, mozed49f20f-3de9-4f30-a937-5ffb1d5a5cd0!mozilla::ipc::MessagePumpForNonMainThreads::Run+0x181
15, mozed49f20f-3de9-4f30-a937-5ffb1d5a5cd0!MessageLoop::RunHandler+0x28
16, mozed49f20f-3de9-4f30-a937-5ffb1d5a5cd0!MessageLoop::Run+0x51
17, mozed49f20f-3de9-4f30-a937-5ffb1d5a5cd0!nsThread::ThreadFunc+0x12d
18, moz7b38cc08-4e23-4e0b-afa0-a235fa5127ee!_PR_NativeRunThread+0x144
19, moz7b38cc08-4e23-4e0b-afa0-a235fa5127ee!pr_root+0xa
20, ucrtbase.dll!thread_start<unsigned int (__cdecl*)(void *),1>+0x42
21, kernel32.dll!BaseThreadInitThunk+0x14
22, mozf92c2cbe-c87c-4fee-8778-d16e069fceff!patched_BaseThreadInitThunk+0x21
23, ntdll.dll!RtlUserThreadStart+0x21
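
For reference, the symbol path quoted before the stack above is a `;`-separated list of `SRV*<local cache>*<server URL>` entries, one per symbol server. A minimal sketch of assembling that string follows; `build_symbol_path` is a hypothetical helper introduced here for illustration only, not part of Process Hacker or any Mozilla tooling.

```python
def build_symbol_path(cache_dir, servers):
    """Join one SRV*<cache>*<server> entry per symbol server with ';'."""
    return ";".join(f"SRV*{cache_dir}*{server}" for server in servers)


# Values copied from the symbol path quoted in this report.
SYMBOL_PATH = build_symbol_path(
    r"C:\Symbols",
    [
        "http://msdl.microsoft.com/download/symbols",
        "https://symbols.mozilla.org/",
    ],
)
print(SYMBOL_PATH)
```

Debugging tools built on dbghelp generally also accept a path in this format through the `_NT_SYMBOL_PATH` environment variable, so the same string should be reusable outside Process Hacker.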

All of our stalled threads have this same stack, which is consistent with the profiler. Process Hacker indicates the threads are each waiting on 2 objects, but I'm not sure how to identify what's being waited on.
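
The observation that every stalled thread shares this stack can be checked mechanically when stacks are copied out of Process Hacker as text. The sketch below assumes the `index, module!symbol+0xoffset` line format shown in the stack above; `parse_stack`, `innermost_frame_in` and `same_stack` are hypothetical helper names, and the sample holds only the first eight frames of the reported stack.

```python
import re
from collections import Counter

# Matches frames such as "2, msmpeg2vdec.dll!DllUnregisterServer+0xf0998".
FRAME_RE = re.compile(r"^\s*(\d+),\s*([^!]+)!(.+)\+0x([0-9a-fA-F]+)\s*$")

# First eight frames of the stack reported for tid 30612 (Decoder 41).
SAMPLE = """\
0, ntdll.dll!NtWaitForMultipleObjects+0x14
1, KernelBase.dll!WaitForMultipleObjectsEx+0x107
2, msmpeg2vdec.dll!DllUnregisterServer+0xf0998
3, msmpeg2vdec.dll!DllUnregisterServer+0xef306
4, msmpeg2vdec.dll!DllGetClassObject+0x3d1a
5, msmpeg2vdec.dll!DllGetClassObject+0xc44b
6, mozed49f20f-3de9-4f30-a937-5ffb1d5a5cd0!mozilla::MFTDecoder::Output+0x82
7, mozed49f20f-3de9-4f30-a937-5ffb1d5a5cd0!mozilla::WMFVideoMFTManager::Output+0x6f
"""


def parse_stack(text):
    """Return (index, module, symbol, offset) tuples, innermost frame first."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            index, module, symbol, offset = match.groups()
            frames.append((int(index), module, symbol, int(offset, 16)))
    return frames


def innermost_frame_in(frames, module_prefix):
    """Return the first (innermost) frame whose module starts with the prefix."""
    for frame in frames:
        if frame[1].startswith(module_prefix):
            return frame
    return None


def same_stack(a, b):
    """True when two parsed stacks agree on module, symbol and offset per frame."""
    return [f[1:] for f in a] == [f[1:] for f in b]


frames = parse_stack(SAMPLE)
modules = Counter(module for _, module, _, _ in frames)
entry = innermost_frame_in(frames, "moz")
print(modules)
print(entry)
```

Running `same_stack` over the parsed stack of each stalled thread would confirm whether they are all blocked at the same offsets inside msmpeg2vdec.dll, and `innermost_frame_in(frames, "moz")` picks out the last Mozilla call before control enters the Microsoft decoder.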

Killing the thread that decoder 30 is on didn't unblock the others (https://share.firefox.dev/30E0WQ4). I still see the other threads waiting on 2 objects in Process Hacker, and videos still cannot be played.

Killing all but decoder 41 doesn't help either: https://share.firefox.dev/30BqCge

Even after killing all the decoder threads I can't play h264 videos. I'd guess something else is left in an inconsistent state. There are a huge number of threads with start points in msmpeg2vdec still alive in the GPU process. Maybe there's a deadlock among these threads touching MS libs, but I don't know at this stage how to trace the wait chain.

After killing the GPU process I'm able to start playing video again.
