Closed Bug 1443803 Opened 2 years ago Closed 2 years ago

Intermittent application crashed [@ mozilla::Atomic<bool, (mozilla::MemoryOrdering)2u, void>::operator bool] in test_peerConnection_transceivers.html

Categories

(Core :: WebRTC: Audio/Video, defect, P1)

60 Branch
Unspecified
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla61
Tracking Status
firefox-esr52 --- unaffected
firefox59 --- unaffected
firefox60 --- fixed
firefox61 --- fixed

People

(Reporter: bwc, Assigned: pehrsons)

References

Details

(Keywords: crash, intermittent-failure, regression)

Attachments

(1 file)

No description provided.
Summary: Intermittent application crashed [@ mozilla::Atomic<bool, (mozilla::MemoryOrdering)2u, void>::operator bool] → Intermittent application crashed [@ mozilla::Atomic<bool, (mozilla::MemoryOrdering)2u, void>::operator bool] in test_peerConnection_transceivers.html
This is by far our worst intermittent, occurring somewhere between 30-50 times a day right now.
Happened for the first time when bug 1440169 landed. Happens most often on linux64 QuantumRender debug, so retriggering there on the previous push that ran the mda tests.

https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=5889cac267e270e4762fe579e8bec0b4b0920532&selectedJob=165481680
Blocks: 1440040
Keywords: regression
Assignee: nobody → docfaraday
Rank: 9
I'm stealing this given that Byron is on PTO this week. Thanks for the investigation so far Byron, it's very helpful!

From the last try push we see:
> [task 2018-03-12T12:31:56.750Z] 12:31:56     INFO - GECKO(2769) | [2827:Unnamed thread 0x7f8b4d7df280]: E/MediaStreamGraph MediaStream 0x7f8b3d2b5d20 unset graph
> [task 2018-03-12T12:31:56.758Z] 12:31:56     INFO - GECKO(2769) | [2827:Unnamed thread 0x7f8b44f05280]: E/MediaManager InsertInGraph: GraphImpl in stream 0x7f8b3d2b5d20 is null!

The first line is from MediaStream::DestroyImpl() (1) which wins in a race with MediaEngineWebRTCMicrophoneSource::InsertInGraph() (2) which needs access to MediaStream::GraphImpl().

The race occurs after a DOMMediaStream is destroyed on main thread. There are two paths we go through to clean up on the MediaStreamGraph thread, similar to the race between (1) and (2) above.
1: Main thread [1.1] -> MSG thread [1.2]
2: Main thread [2.1] -> MediaManager thread [2.2] (which sets state under a mutex, that's accessed on the MSG thread under the same mutex)

Usually the media manager thread would run its runnable first as MSG always waits for stable state before dispatching. That's not a guarantee however, and if the media manager thread is busy enough, the MSG might win, like in (1).


[1.1] https://searchfox.org/mozilla-central/rev/a70da6775d5341a9cca86bf1572a5e3534909153/dom/media/DOMMediaStream.cpp#493
[1.2] https://searchfox.org/mozilla-central/rev/a70da6775d5341a9cca86bf1572a5e3534909153/dom/media/MediaStreamGraph.cpp#2079

[2.1] https://searchfox.org/mozilla-central/rev/a70da6775d5341a9cca86bf1572a5e3534909153/dom/media/MediaStreamTrack.cpp#160
[2.2] https://searchfox.org/mozilla-central/rev/a70da6775d5341a9cca86bf1572a5e3534909153/dom/media/MediaManager.cpp#3984
Assignee: docfaraday → apehrson
Status: NEW → ASSIGNED
I've tried keeping the MediaStream from being destroyed in the MSG until MediaManager has stopped the sources properly, but that had other implications: https://treeherder.mozilla.org/#/jobs?repo=try&revision=803220aa3002495086083056c6e51b8114ae1a69

I'll see if I can get a simpler patch working instead, though it won't be providing the same guarantees as this one.
Comment on attachment 8958880 [details]
Bug 1443803 - Protect against destroyed MediaStream in MediaEngineWebRTCMicrophoneSource.

https://reviewboard.mozilla.org/r/227748/#review233848

Sad.
Attachment #8958880 - Flags: review?(padenot) → review+
Pushed by pehrsons@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/908d9d7f5c3e
Protect against destroyed MediaStream in MediaEngineWebRTCMicrophoneSource. r=padenot
https://hg.mozilla.org/mozilla-central/rev/908d9d7f5c3e
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
Please request Beta approval on this when you get a chance.
Flags: needinfo?(apehrson)
Keywords: crash
Comment on attachment 8958880 [details]
Bug 1443803 - Protect against destroyed MediaStream in MediaEngineWebRTCMicrophoneSource.

Approval Request Comment
[Feature/Bug causing the regression]: bug 1426718
[User impact if declined]: Crashes in debug builds on try and for users (if any, but jesup was at least hit)
[Is this code covered by automated tests?]: Yes, see bug 1432025 for an example
[Has the fix been verified in Nightly?]: No, the test in question in bug 1432025 was backed out
[Needs manual test from QE? If yes, steps to reproduce]: No
[List of other uplifts needed for the feature/fix]: None
[Is the change risky?]: No
[Why is the change risky/not risky?]: All it does is add early exit guards that will get triggered only in shutdown
[String changes made/needed]: None
Flags: needinfo?(apehrson)
Attachment #8958880 - Flags: approval-mozilla-beta?
(In reply to Andreas Pehrson [:pehrsons] from comment #16)
> [Has the fix been verified in Nightly?]: No, the test in question in bug
> 1432025 was backed out
s/backed out/disabled/
Comment on attachment 8958880 [details]
Bug 1443803 - Protect against destroyed MediaStream in MediaEngineWebRTCMicrophoneSource.

webrtc crash fix in debug builds, beta60+
Attachment #8958880 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Blocks: 1447015
You need to log in before you can comment on or make changes to this bug.