Closed Bug 1424342 Opened 7 years ago Closed 6 years ago

WebRTC crashes in random places on Win

Categories

(Core :: WebRTC, defect, P1)

Tracking

RESOLVED FIXED
mozilla59
Tracking Status
firefox-esr52 --- unaffected
firefox57 --- unaffected
firefox58 --- unaffected
firefox59 --- fixed

People

(Reporter: drno, Assigned: drno)

References

Details

(Keywords: regression, sec-high)

Fx 59 crashes on Windows in all kinds of random places.

STR:
- Open https://webrtc.github.io/samples/src/content/peerconnection/audio/
- Click the "call" button
- Permit access to your mic
- Wait 5-10 sec
- Click the "hangup" button

Repeat the above steps 5 times.

It crashes 4 out of 5 times, most often right after clicking "call", sometimes after clicking "hangup".

Some sample crash stats URLs:
https://crash-stats.mozilla.com/report/index/e6e0306c-9d70-426a-99a8-f27a20171207
https://crash-stats.mozilla.com/report/index/1a7dff38-2ddc-4a45-950e-79f8f0171207
https://crash-stats.mozilla.com/report/index/336cf7ca-dd5b-42e5-8c5f-70cc60171207

This has been verified on several Mac OS X machines not to cause crashes, but multiple Windows machines can reproduce it.
Rank: 5
An initial mozregression run points at bug 1423228, but that appears highly implausible as it's a test-only change.
It also crashes the same way on https://webrtc.github.io/samples/src/content/peerconnection/pc1/ so it's not the stats that are causing the crashes.
Initial group investigation in #media suggests that crashes happen with any mic which has 2 channels, but it does not crash if the mic has only a single channel.
Paul, at this point we are pretty sure that the patches in bug 1397793 are causing these problems, which might also explain the crash bugs linked back to bug 1397793.
Depends on: 1397793
NI for awareness
Flags: needinfo?(padenot)
From IRC [1]: on my Windows 10 box, the STR from comment 0 crash with one camera I have but not the other:

 - Logitech Pro 9000 with 1 channel,  16 bit, 48000 Hz: works fine
 - Logitech C920     with 2 channels, 16 bit, 16000 Hz: crashes almost immediately when I hit the Call button

Fyi. Maybe it has to do with the number of channels?

The STR seem to require a peer connection. I noticed bug 1397793 changed the sample format from 16-bit int to 32-bit float on all platforms but Android. Does this perhaps break any assumptions in PeerConnection?

[1] https://mozilla.logbot.info/media/20171208#c13992982
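
To make the question above concrete, here is a minimal sketch of how a buffer sized for one sample format and channel count can be too small for another. It is an illustration only, not code from Gecko or from bug 1397793: the helper names `BufferBytes` and `Fits` are hypothetical, and the 10 ms / 16000 Hz figures are assumptions chosen to match the C920 numbers.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed for one interleaved audio buffer.
// frames: samples per channel, channels: channel count,
// bytesPerSample: 2 for 16-bit int, 4 for 32-bit float.
// (Hypothetical helper, not a Gecko API.)
std::size_t BufferBytes(std::size_t frames, std::size_t channels,
                        std::size_t bytesPerSample) {
  assert(channels > 0 && bytesPerSample > 0);
  return frames * channels * bytesPerSample;
}

// True if a buffer of capacityBytes can hold neededBytes.
// (Hypothetical helper, not a Gecko API.)
bool Fits(std::size_t capacityBytes, std::size_t neededBytes) {
  return neededBytes <= capacityBytes;
}
```

With 160 frames (10 ms at 16000 Hz), `BufferBytes(160, 1, 2)` is 320 bytes for mono 16-bit int, while `BufferBytes(160, 2, 4)` is 1280 bytes for stereo 32-bit float. If any consumer still sized its buffer with the smaller assumption, `Fits(320, 1280)` would be false and the extra bytes would land in neighbouring memory. Whether that is what actually happens here is unconfirmed.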
Crashes are all over the place, suggesting memory trashing. Should probably hide this.

bp-011b02fb-44f2-4067-bb46-4f2710171208
bp-dad5084f-1f1a-43d1-89be-9bb280171208
bp-d1a9cd87-7ead-4c32-9828-a40c50171208 mozilla::dom::RTCRtpSenderJSImpl::GetStreams
Group: core-security
Two of the three crashes linked in comment 0 die on "MOZ_DIAGNOSTIC_ASSERT(run->mMagic == ARENA_RUN_MAGIC);" in different alloc functions. Something has definitely stomped on memory.
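
For readers unfamiliar with this kind of assertion, the sketch below shows the general idea of a magic-value check on allocator metadata and why an oversized write into a neighbouring buffer trips it. Everything here is an assumption for illustration: `kRunMagic`, `RunHeader`, `Block`, `WriteAudio`, and `HeaderIntact` are made-up names, and the layout is not mozjemalloc's real one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Arbitrary magic value for the demo (not the real ARENA_RUN_MAGIC).
constexpr std::uint32_t kRunMagic = 0xC0FFEE01;

// Hypothetical allocator metadata guarded by a magic field.
struct RunHeader {
  std::uint32_t magic;
  std::uint32_t freeSlots;
};

// Hypothetical layout: an audio buffer followed directly by metadata.
struct Block {
  std::int16_t samples[4];
  RunHeader header;
};

// The check the allocator would perform before trusting the header.
bool HeaderIntact(const RunHeader& h) { return h.magic == kRunMagic; }

// Copies `bytes` bytes over the start of the block. Anything beyond
// sizeof(Block::samples) spills into the header, simulating an overflow
// while staying inside the demo object.
void WriteAudio(Block& b, const unsigned char* src, std::size_t bytes) {
  assert(bytes <= sizeof(Block));
  std::memcpy(&b, src, bytes);
}
```

Writing exactly `sizeof(Block::samples)` bytes leaves `HeaderIntact` true; writing `sizeof(Block)` bytes of zeros overwrites the magic field and the check fails. Because the failure is only noticed on a later allocator call, this kind of corruption tends to surface in unrelated places.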
Blocks: 1397793
No longer depends on: 1397793
Keywords: regression, sec-high
Some more breadcrumbs:

When I try to repro with a local build in Visual Studio, I hit problems a lot less often. Maybe the whole problem is timing dependent?

Another pointer: it is very hard to repro this with the built-in mic when running in Visual Studio, but when I attach the C920 via USB the crashes become more frequent again.

When running in Visual Studio I keep hitting an assertion where a buffer in the speex resampling code is not filled to the expected length (it's always short). Unfortunately I can't look up the exact assertion any more, because I rendered my Windows laptop useless in the attempt to locate the issue.
I manually built the revision just before the first patch of bug 1397793, and with that build I was no longer able to produce any crash. So I'm pretty sure bug 1397793 causes these crashes.

I think it's time to back out bug 1397793.
:Aryx, cosmin_sabou told me to ask you for help with backing out bug 1397793.
Flags: needinfo?(aryx.bugmail)
Backouts pushed.
Flags: needinfo?(aryx.bugmail)
(In reply to Sebastian Hengst [:aryx][:archaeopteryx] (needinfo on intermittent or backout) from comment #12)
> Backouts pushed.

Thank you.
Fixed in https://reviewboard.mozilla.org/r/206292/diff/1-2/, which has landed on central.
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(padenot)
Resolution: --- → FIXED
Group: core-security → core-security-release
Target Milestone: --- → mozilla59
Group: core-security-release