Hit MOZ_CRASH() at PeerConnectionMedia.cpp:490

RESOLVED FIXED in Firefox 59

Status

RESOLVED FIXED

Type: defect
Priority: P2
Severity: normal
Rank: 15
People

(Reporter: jose.recio, Assigned: bwc)

Tracking

({regression, regressionwindow-wanted})

Version: Trunk
Target Milestone: mozilla59
Points:
---

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox58 unaffected, firefox59 fixed)

Details

Attachments

(4 attachments)

(Reporter)

Description

a year ago
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:58.0) Gecko/20100101 Firefox/58.0
Build ID: 20180111124119

Steps to reproduce:

KITE-AppRTC-Test (https://github.com/webrtc/KITE): opens https://appr.tc on Firefox Nightly and Safari, creates and joins a common WebRTC session, and executes a script every second to check the value of a WebRTC-related object (the PeerConnection status).




Actual results:

Testing Firefox Nightly on macOS and Linux against Safari 11 and Safari Technology Preview results in a MOZ_CRASH; the stack trace is included below.

This is consistently reproducible. The same test with other Firefox versions doesn't show this problem: the crash is not seen on Firefox 57.0.4 or Firefox 58.0b13.

Initially reported as geckodriver bug in https://github.com/mozilla/geckodriver/issues/1125

The following trace and the attached logs were taken on Linux using https://index.taskcluster.net/v1/task/gecko.v2.mozilla-central.latest.firefox.linux64-debug/artifacts/public/build/target.tar.bz2

Hit MOZ_CRASH() at /builds/worker/workspace/build/src/media/webrtc/signaling/src/peerconnection/PeerConnectionMedia.cpp:490
#01: ???[/tmp/firefox-trunk-debug/libxul.so +0x1509d30]
#02: ???[/tmp/firefox-trunk-debug/libxul.so +0x150bc35]
#03: ???[/tmp/firefox-trunk-debug/libxul.so +0x1bdbf91]
#04: ???[/tmp/firefox-trunk-debug/libxul.so +0x2100f4c]
#05: ???[/tmp/firefox-trunk-debug/libxul.so +0x39ff3fe]
#06: ???[/tmp/firefox-trunk-debug/libxul.so +0x3a149df]
#07: ???[/tmp/firefox-trunk-debug/libxul.so +0x3a14d9e]
#08: ???[/tmp/firefox-trunk-debug/libxul.so +0x3a0862d]
#09: ???[/tmp/firefox-trunk-debug/libxul.so +0x3a145b6]
#10: ???[/tmp/firefox-trunk-debug/libxul.so +0x3a14a97]
#11: ???[/tmp/firefox-trunk-debug/libxul.so +0x3a14d9e]
#12: ???[/tmp/firefox-trunk-debug/libxul.so +0x3a14ef0]
#13: ???[/tmp/firefox-trunk-debug/libxul.so +0x3ab9a03]
#14: ???[/tmp/firefox-trunk-debug/libxul.so +0x3ab9ef1]
#15: ???[/tmp/firefox-trunk-debug/libxul.so +0x39ff3fe]
#16: ???[/tmp/firefox-trunk-debug/libxul.so +0x39ff57c]
#17: ???[/tmp/firefox-trunk-debug/libxul.so +0x3a15ec4]
#18: ???[/tmp/firefox-trunk-debug/libxul.so +0x3a0fced]
#19: ???[/tmp/firefox-trunk-debug/libxul.so +0x3a145b6]
#20: ???[/tmp/firefox-trunk-debug/libxul.so +0x3a14a97]
#21: ???[/tmp/firefox-trunk-debug/libxul.so +0x3a14d9e]
#22: ???[/tmp/firefox-trunk-debug/libxul.so +0x3a14ef0]
#23: ???[/tmp/firefox-trunk-debug/libxul.so +0x3cf8ab9]
#24: ??? (???:???)
[Parent 17515, Gecko_IOThread] WARNING: pipe error (58): Connection reset by peer: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 353

###!!! [Parent][MessageChannel] Error: (msgtype=0x150082,name=PBrowser::Msg_Destroy) Channel error: cannot send/recv

++DOCSHELL 0x7f0d90ebe000 == 6 [pid = 17515] [id = {741409a7-d60e-48ab-b994-b97efab99194}]
++DOMWINDOW == 17 (0x7f0dc1cf56d0) [pid = 17515] [serial = 17] [outer = (nil)]
++DOMWINDOW == 18 (0x7f0daef5a800) [pid = 17515] [serial = 18] [outer = 0x7f0dc1cf56d0]
1515999067295	Marionette	DEBUG	Register listener.js for window 17
A content process crashed and MOZ_CRASHREPORTER_SHUTDOWN is set, shutting down




Expected results:

The script executed through WebDriver should return the value of the PeerConnection status.
Ted, what would be the best way to get the symbols from this crash stack? I assume we would need to know the exact build? The TC link above just points to the latest build, and we don't know what has been downloaded.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(ted)
(Reporter)

Comment 2

a year ago
I still have the binary used for the tests. I think I can readily reproduce with another binary if needed.
This could help:

$ strings firefox/firefox | grep -i version
[...]
https://crash-reports.mozilla.com/submit?id={ec8030f7-c20a-464f-9b0e-13a3a9e97384}&version=59.0a1&buildid=20180114221611
[...]
$
Jose, do you have a minidump file laying around in the `minidumps` subfolder of the used Firefox user profile? If yes, it would help if you could upload it as attachment.
The crashreporter-symbols-full.zip next to the build should have full debug symbols in it:
https://index.taskcluster.net/v1/task/gecko.v2.mozilla-central.latest.firefox.linux64-debug/artifacts/public/build/target.crashreporter-symbols-full.zip

You'll need to manually unzip it, then `gzip -d` each .dbg.gz file and place them next to their respective binaries (sorry, there's no script for this). If you do that, gdb should be able to get a usable stack out.
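The decompression step can be sketched as follows; the `symbols/` layout here is a synthetic stand-in, just to show the loop in isolation:

```shell
# Synthetic stand-in for the unzipped crashreporter-symbols-full.zip;
# a real tree would contain the actual .dbg.gz files from the build.
mkdir -p symbols
printf 'fake debug info' | gzip > symbols/libxul.so.dbg.gz

# Decompress every .dbg.gz in place, leaving the .dbg files where their
# respective binaries would sit, so gdb can find them.
find symbols -name '*.dbg.gz' -exec gzip -d {} +

ls symbols   # → libxul.so.dbg
```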

Alternately, someone can grab the crashreporter-symbols.zip matching this build:
https://index.taskcluster.net/v1/task/gecko.v2.mozilla-central.latest.firefox.linux64-debug/artifacts/public/build/target.crashreporter-symbols.zip

unzip it somewhere, and pipe the stack from comment 0 through fix_stack_using_bpsyms.py, pointing it at the directory where you unzipped the symbols.
Flags: needinfo?(ted)
(Assignee)

Comment 6

a year ago
I doubt the stack is going to help very much in diagnosing this. What we need first are logs. Can you re-run this test with the following environment variable set?

MOZ_LOG=transceiverimpl:5
Flags: needinfo?(jose.recio)
(In reply to Henrik Skupin (:whimboo) from comment #1)
> Ted, what would be the best way to get the symbols from this crash stack? I
> assume we would know the exact build? The TC link above is just from latest,
> and we don't know what it has been downloaded.

Oh, sorry, I glossed over that. You can get the build id out of application.ini and look up the specific build by that in the Taskcluster index.
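For reference, a minimal sketch of pulling the build id out of application.ini (the file contents below are synthetic, copied from the build id mentioned earlier in this bug):

```shell
# Synthetic application.ini; the real file ships next to the firefox
# binary and carries the exact BuildID of the tested build.
cat > application.ini <<'EOF'
[App]
Vendor=Mozilla
Name=Firefox
Version=59.0a1
BuildID=20180114221611
EOF

# Extract the build id used to look the build up in the Taskcluster index.
sed -n 's/^BuildID=//p' application.ini
```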
(Reporter)

Comment 8

a year ago
Logs taken with MOZ_LOG=transceiverimpl:5 are attached.

BuildID in application.ini:
BuildID=20180114221611
Flags: needinfo?(jose.recio)
(Reporter)

Comment 9

a year ago
(In reply to Henrik Skupin (:whimboo) from comment #3)
> Jose, do you have a minidump file laying around in the `minidumps` subfolder
> of the used Firefox user profile? If yes, it would help if you could upload
> it as attachment.

A minidump is not generated, even after waiting for a while with WebDriver stopped (so the profile doesn't get removed right away after the failure).
(In reply to Jose M Recio from comment #9)
> > Jose, do you have a minidump file laying around in the `minidumps` subfolder
> > of the used Firefox user profile? If yes, it would help if you could upload
> > it as attachment.
> 
> minidump is not generated even after waiting for a while with WebDriver
> stopped (so the profile doesn't get removed right away after failure)

As I learned two days ago, the profile gets removed when geckodriver disconnects, so you really cannot wait for it; in such a case a custom profile has to be used. Sorry for that. I think it would be overkill for now, though. Let's wait for what Byron says about the log.

Also, given comment 0, this is a regression, so I'm marking it as such for Firefox 59.
Rank: 15
Priority: -- → P2
(Assignee)

Comment 11

a year ago
Can I see the SDP for this case? It looks like there's some error-checking we need to add earlier in the codepath.
Flags: needinfo?(jose.recio)
(Reporter)

Comment 12

a year ago
Flags: needinfo?(jose.recio)
(Assignee)

Comment 13

a year ago
OK, it looks like debug builds are MOZ_CRASHing when the remote description has garbage codec(s) in the m-line (probably due to a negotiation failure?) and the direction attribute isn't a=inactive. It looks like we need some checking earlier on for conditions like this.
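A rough illustration of the combination described in the diagnosis above, using a synthetic remote description (not the actual SDP from this bug):

```shell
# Synthetic remote description; not the actual SDP from this bug.
cat > answer.sdp <<'EOF'
m=video 9 UDP/TLS/RTP/SAVPF 120
a=rtpmap:120 VP8/90000
a=sendrecv
EOF

# The crashing combination per the diagnosis: an unusable m-line whose
# direction attribute is something other than a=inactive.
if grep -q '^a=inactive$' answer.sdp; then
    echo "media inactive"
else
    echo "media still active"
fi
```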
Assignee: nobody → docfaraday
(Reporter)

Comment 14

a year ago
(In reply to Byron Campen [:bwc] from comment #13)
> Ok, looks like debug builds are MOZ_CRASHing 

This was first detected on the Nightly trunk builds; I think those are non-debug builds? (Not sure, just so it's clear.)

> (probably due to negotiation failure?)

This is a test involving Firefox against Safari and Safari Technology Preview. The same test with other Firefox versions also ends in failure, so there's indeed some compatibility problem (we are looking into that separately).

With other Firefox versions the test fails, but there's no crash. I can provide logs for that if it would help.
Byron, if it would help I could also teach Jose how to run an automated regression test with mozregression, so that we could figure out which commit(s) actually caused this crash. Please let us know.
(Assignee)

Comment 16

a year ago
(In reply to Jose M Recio from comment #14)
> (In reply to Byron Campen [:bwc] from comment #13)
> > Ok, looks like debug builds are MOZ_CRASHing 
> 
> This was first detected on the nightly trunk builds, I think these are
> non-debug builds? (not sure, just so it's clear).

Ah, I should probably make this a debug-only thing.
(Assignee)

Comment 17

a year ago
(In reply to Henrik Skupin (:whimboo) from comment #15)
> Byron, also if it would help I could teach Jose how to do an automated
> regression test by using mozregression, so that we could figure out which
> commit(s) actually caused this crash. Please let us know.

I am pretty sure bug 1290948 is where this regressed.
(Assignee)

Comment 18

a year ago
(In reply to Jose M Recio from comment #14)
> (In reply to Byron Campen [:bwc] from comment #13)
> > (probably due to negotiation failure?)
> 
> This is a test involving Firefox and Safari and Safari Tech Preview. Same
> test using with other Firefox versions ends in failure so there's indeed
> some compatibility problem (we are looking into that separately).
> 
> Using other Firefox versions, test fail but there's no crash. I can provide
> logs for that if it would help.

I think Safari only supports H264, and we only offered VP8/VP9 in that log you attached, so a negotiation failure is expected there. In order for us to offer H264, the plugin would need to be installed (the H264 plugin auto-installs, but it takes a little bit to do so, so if you're testing against a brand new profile it might not have it yet).
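A quick way to check whether an offer contains H264 at all; the SDP fragment below is synthetic, mirroring what the attached log reportedly shows (VP8/VP9 only):

```shell
# Synthetic offer fragment: only VP8/VP9, no H264, which an H264-only
# endpoint like Safari cannot negotiate.
cat > offer.sdp <<'EOF'
m=video 9 UDP/TLS/RTP/SAVPF 120 121
a=rtpmap:120 VP8/90000
a=rtpmap:121 VP9/90000
EOF

if grep -qi 'H264' offer.sdp; then
    echo "H264 offered"
else
    echo "no H264 in offer"
fi
```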
Comment hidden (mozreview-request)
(In reply to Byron Campen [:bwc] from comment #17)
> I am pretty sure bug 1290948 is where this regressed.

Let's see. Jose, can you please test this build? It should not cause this problem:
https://archive.mozilla.org/pub/firefox/nightly/2017/11/2017-11-28-10-04-40-mozilla-central/

Comment 21

a year ago
mozreview-review
Comment on attachment 8943957 [details]
Bug 1430707: Don't MOZ_CRASH when conduit operations fail.

https://reviewboard.mozilla.org/r/214296/#review220052

LGTM
Attachment #8943957 - Flags: review?(drno) → review+

Comment 22

a year ago
Pushed by bcampen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a36c57e9279d
Don't MOZ_CRASH when conduit operations fail. r=drno

Comment 23

a year ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/a36c57e9279d
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
(Reporter)

Comment 24

a year ago
Tested, crash not seen anymore, fixed.
Thanks!