Shutdown hang intermittent failures with browser wmfme crash tests
Categories
(Core :: Audio/Video: Playback, defect)
Tracking
()
People
(Reporter: yannis, Unassigned)
References
(Depends on 1 open bug, Blocks 1 open bug)
Details
(Whiteboard: [win:stability])
The wmfme crash tests, such as dom/media/test/browser/wmfme/browser_wmfme_crash.js, produce intermittent failures where we can observe a shutdown hang, at least on Windows 11. These failures occur on treeherder (see bug 1831236), and I can reproduce the browser_wmfme_crash.js shutdown hang on my (Windows 11) machine in roughly 1 out of 3 runs after compiling a debug build (it can probably also happen in release builds, but I have not verified that).
My understanding is that the main process creates an IPC channel between a VideoBridgeChild and a VideoBridgeParent. This happens in UtilityAudioDecoderChild::CreateVideoBridge(); the VideoBridgeChild will live in an MF Media Engine CDM utility process while the VideoBridgeParent will live in the GPU process. In the test we intentionally kill the utility process and hope that video playback will recover by initiating a new video bridge IPC channel between the GPU process and a new utility process.
Here is what I think I observe when I have a shutdown hang:
- the main process shutdown is waiting for the GPU process to shut down;
- the GPU process shutdown is waiting for the compositor thread shutdown;
- the compositor thread shutdown is waiting for all CompositorThreadHolder references to have been released;
- the original VideoBridgeParent object was never notified of the death of its VideoBridgeChild peer and still lives; ActorDestroy has not been called for this object;
- this object has been replaced by the new VideoBridgeParent within (*videoBridgeFromProcess)[VideoBridgeSource::MFMediaEngineCDMProcess], so VideoBridgeParent::ShutdownInternal() won't call Close() for it either;
- so this object still holds a CompositorThreadHolder reference that it will never release.
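The chain above can be modeled outside of Gecko with a few standard-library types. This is only a toy sketch of the reasoning, not the real code: ThreadHolder, Bridge, the one-slot table and OutstandingRefsAfterShutdown are hypothetical stand-ins for CompositorThreadHolder, VideoBridgeParent, videoBridgeFromProcess and the shutdown path, and the assumption that shutdown only closes what the table can reach is taken from the description above.

```cpp
#include <array>
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for CompositorThreadHolder: shutdown can only finish
// once every reference to it has been released.
struct ThreadHolder {};

// Hypothetical stand-in for VideoBridgeParent.
class Bridge {
 public:
  explicit Bridge(std::shared_ptr<ThreadHolder> aHolder)
      : mHolder(std::move(aHolder)) {}
  // Models Close()/ActorDestroy: drops the holder reference.
  void Close() { mHolder.reset(); }

 private:
  std::shared_ptr<ThreadHolder> mHolder;
};

// Returns how many ThreadHolder references are still outstanding after a
// simulated shutdown that only closes bridges reachable through the table.
long OutstandingRefsAfterShutdown(bool aPeerDeathNotified) {
  auto holder = std::make_shared<ThreadHolder>();
  std::weak_ptr<ThreadHolder> watcher = holder;

  // Objects kept alive by the (modeled) IPC layer.
  std::vector<std::unique_ptr<Bridge>> ipcOwned;
  // One slot per source process; only the latest bridge is reachable.
  std::array<Bridge*, 1> table{};

  ipcOwned.push_back(std::make_unique<Bridge>(holder));
  table[0] = ipcOwned.back().get();  // original bridge

  // The utility process is killed. Only if the notification arrives does
  // the original bridge release its reference.
  if (aPeerDeathNotified) {
    table[0]->Close();
  }

  ipcOwned.push_back(std::make_unique<Bridge>(holder));
  table[0] = ipcOwned.back().get();  // replacement overwrites the slot

  // Shutdown: close only what the table can reach, then drop our own ref.
  for (Bridge* bridge : table) {
    if (bridge) {
      bridge->Close();
    }
  }
  holder.reset();

  return watcher.use_count();
}

// Small self-check of the two scenarios.
void CheckLeakSketch() {
  assert(OutstandingRefsAfterShutdown(true) == 0);   // clean shutdown
  assert(OutstandingRefsAfterShutdown(false) == 1);  // leaked reference
}
```

In this model the missed notification leaves exactly one reference behind, which corresponds to the compositor thread shutdown waiting forever.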
Therefore I wonder about the following:
- What could lead the VideoBridgeParent to not get notified of the death of its peer? Can it be an issue with the way the video bridge classes are written? Or is it necessarily a bug in our IPC layer?
- Independently of a potential IPC layer bug, in VideoBridgeParent::VideoBridgeParent, if we find that (*videoBridgeFromProcess)[aSource] is already populated, would it make sense to force a call to Close() on the object we find there?
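To make the second question concrete, here is a minimal sketch of what "force a call to Close() on the object we find there" could look like in a toy model. It is not a patch against the real classes: HolderSketch, BridgeSketch, BridgeTable, RegisterBridge and RefsLeftWithForcedClose are hypothetical names, and whether closing the stale actor at that point is safe in the real IPC code is exactly the open question.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical stand-in for CompositorThreadHolder.
struct HolderSketch {};

// Hypothetical stand-in for VideoBridgeParent.
class BridgeSketch {
 public:
  explicit BridgeSketch(std::shared_ptr<HolderSketch> aHolder)
      : mHolder(std::move(aHolder)) {}
  // Models Close(): releases the holder reference.
  void Close() { mHolder.reset(); }
  bool IsClosed() const { return !mHolder; }

 private:
  std::shared_ptr<HolderSketch> mHolder;
};

// One slot per source process, as in the description above.
using BridgeTable = std::array<BridgeSketch*, 1>;

// Registers aBridge in its slot. If the slot is already populated, the
// previous occupant is force-closed before being replaced.
void RegisterBridge(BridgeTable& aTable, std::size_t aSource,
                    BridgeSketch* aBridge) {
  if (BridgeSketch* stale = aTable[aSource]) {
    stale->Close();
  }
  aTable[aSource] = aBridge;
}

// Returns the number of holder references left after a simulated shutdown
// in which the original bridge never learned that its peer died.
long RefsLeftWithForcedClose() {
  auto holder = std::make_shared<HolderSketch>();
  std::weak_ptr<HolderSketch> watcher = holder;

  BridgeSketch original(holder);
  BridgeSketch replacement(holder);
  BridgeTable table{};

  RegisterBridge(table, 0, &original);
  RegisterBridge(table, 0, &replacement);  // closes `original`
  assert(original.IsClosed());

  // Shutdown closes whatever the table can reach.
  for (BridgeSketch* bridge : table) {
    if (bridge) {
      bridge->Close();
    }
  }
  holder.reset();

  return watcher.use_count();
}
```

With the forced close in place, the model no longer depends on the peer-death notification to release the stale object's reference.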
Comment 1•1 year ago
Found it, reminds me of bug 1718210
Comment 2•1 year ago (Reporter)
This is probably a duplicate of bug 1805736. I can fix the hang on my machine by forcing a first call to PVideoBridgeParent::SendPing in VideoBridgeParent::Bind, because that forces the VideoBridgeParent to notice that the other end is dead. I can propose a patch that does that in the current bug and we'll see if that fixes the intermittent failures as well.
In the long run this looks like an IPC layer bug. I will file a new bug to track it, with more details. Once the IPC layer bug is solved it will not be necessary to call PVideoBridgeParent::SendPing anymore.
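The effect of the workaround can be illustrated with a toy model. This sketch assumes, purely for illustration, that the parent only learns about a dead peer when it attempts to send; ChannelModel, ParentModel and ParentNoticesDeadPeer are hypothetical and do not reflect the real IPC layer or the actual signature of VideoBridgeParent::Bind.

```cpp
#include <cassert>

// Hypothetical channel: the peer may already be gone when we bind to it.
struct ChannelModel {
  bool peerAlive = true;
  // Sending to a dead peer fails, which is how the sender finds out.
  bool Send() const { return peerAlive; }
};

// Hypothetical stand-in for VideoBridgeParent.
class ParentModel {
 public:
  // Models Bind(): optionally sends a first ping right after binding.
  void Bind(const ChannelModel& aChannel, bool aPingOnBind) {
    mChannel = &aChannel;
    if (aPingOnBind && !mChannel->Send()) {
      ActorDestroy();
    }
  }
  // Models ActorDestroy: the point where references would be released.
  void ActorDestroy() { mDestroyed = true; }
  bool Destroyed() const { return mDestroyed; }

 private:
  const ChannelModel* mChannel = nullptr;
  bool mDestroyed = false;
};

// Returns whether the parent notices a peer that died before Bind().
bool ParentNoticesDeadPeer(bool aPingOnBind) {
  ChannelModel channel;
  channel.peerAlive = false;  // utility process was killed
  ParentModel parent;
  parent.Bind(channel, aPingOnBind);
  return parent.Destroyed();
}

// Small self-check of the two scenarios.
void CheckPingSketch() {
  assert(ParentNoticesDeadPeer(true));    // ping surfaces the dead peer
  assert(!ParentNoticesDeadPeer(false));  // without it, nothing happens
}
```

In this model the ping is only a trigger: it carries no information of its own, which matches the expectation that it becomes unnecessary once the IPC layer delivers the notification by itself.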
Comment 3•1 year ago (Reporter)
We confirmed with :nika that the root cause for this issue is in the IPC layer.
Comment 4•1 year ago
Per comment 3, we can mark this bug as a duplicate of bug 1879375.