Closed Bug 1330212 Opened 3 years ago Closed 3 years ago

Intermittent dom/media/tests/mochitest/test_getUserMedia_mediaStreamTrackClone.html | application crashed [@ mozilla::CycleCollectedJSContext::ProcessMetastableStateQueue]

Categories

(Core :: Audio/Video: MediaStreamGraph, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla53
Tracking Status
firefox-esr45 --- wontfix
firefox51 --- wontfix
firefox52 --- fixed
firefox53 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: jesup)

References

(Blocks 2 open bugs)

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

Rank: 15
Component: Audio/Video → Audio/Video: MediaStreamGraph
Priority: -- → P1
Andreas, it says "clone" in the title, any idea of why it happens or a possible dupe ? :-)
Flags: needinfo?(pehrson)
I think the "clone" in the title is a huge coincidence, but I know that the assert here is the same as in bug 1317501. But it looks like it's from a new callsite. MSG runnables run on main thread in stable state and at that point there are certain things they're not allowed to do.

To my untrained eye just looking at the stack, it looks like we're not allowed to destroy the SystemClockDriver directly in the MSG runnable because that will process its thread's queue, and there's some event that is not allowed to run in stable state.

smaug knows more here maybe?
Flags: needinfo?(pehrson) → needinfo?(bugs)
Yeah, totally not acceptable to spin eventloop at that point (nsIThread::Shutdown may spin eventloop).
Call shutdown() using a runnable?
Flags: needinfo?(bugs)
Assignee: nobody → rjesup
Status: NEW → ASSIGNED
Comment on attachment 8826057 [details] [diff] [review]
Release GraphDrivers outside of StableState runnable to avoid spinning the event queue

Is it guaranteed same issue can't happen in other use of SystemClockDriver?
Attachment #8826057 - Flags: review?(bugs) → review+
(In reply to Olli Pettay [:smaug] from comment #6)
> Comment on attachment 8826057 [details] [diff] [review]
> Release GraphDrivers outside of StableState runnable to avoid spinning the
> event queue
> 
> Is it guaranteed same issue can't happen in other use of SystemClockDriver?

the issue only arises when we're switching drivers in RunInStableState; other switches happen in various Media threads or in non-StableState MainThread runnables.

Smaug: is there any way we can add assertions or warnings to event-loop-spinning in states we don't want it in, even if only for Debug builds, or even if it requires whitelisting a number of known-ok uses?  This is a usually subtle landmine that's tripped many an engineer over the years - and as with this case, the event-loop-spinning is often not obvious to the person who caused it (for example, it may depend on who's holding the last ref to an object such as a thread, like this one).
See previous comment
Flags: needinfo?(bugs)
So we would need to add some annotations on stack when spinning event loop isn't ok and then assert in thread code if someone is trying to spin the loop.
That might be useful. naAutoScriptBlocker could inherit from such "annotation class", so that we'd cover most common cases easily, and then add some more annotations for metastable and stable states.

I guess we could start from main thread only.

This might not cover all the cases, for that some rather heavy static analysis would be needed, but possibly would be useful anyhow.

Please file a bug and I'll try to find time to look at it if no one else will.
This might be useful for Quantum DOM too.
Flags: needinfo?(bugs)
Attachment #8826057 - Flags: review?(padenot) → review+
Blocks: 1331036
Pushed by rjesup@wgate.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c90ec782d481
Release GraphDrivers outside of StableState runnable to avoid spinning the event queue r=smaug,padenot
https://hg.mozilla.org/mozilla-central/rev/c90ec782d481
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Sounds like this affects 51/52 as well? Please nominate it for uplift.
Flags: needinfo?(rjesup)
This was caused by bug 1228226, which has been backed out of 53 already (and likely will soon be backed out of 52).  The fix in this bug is still good to have in the tree; it's a good safety backstop against someone causing this problem again (until we have better checking via bug 1331036).  If we don't back the other out of 52, and this is still needed there, we can uplift it to 52.
Blocks: 1228226
Flags: needinfo?(rjesup)
Fixed on 52 by the backout of bug 1228226.
Blocks: 1452416
You need to log in before you can comment on or make changes to this bug.