Open Bug 1502992 Opened 6 years ago Updated 1 year ago

Intermittent TEST-UNEXPECTED-CRASH | /webrtc/RTCRtpTransceiver.https.html | expected OK

Categories

(Core :: WebRTC, defect, P5)

defect

Tracking

()

REOPENED

People

(Reporter: intermittent-bug-filer, Assigned: aryx)

References

Details

(Keywords: intermittent-failure, leave-open)

Attachments

(2 files)

Filed by: rgurzau [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=208390208&repo=mozilla-beta

https://queue.taskcluster.net/v1/task/fToOad4DRwatelUFSOVPJw/runs/0/artifacts/public/logs/live_backing.log

6:00:55     INFO - mozcrash Copy/paste: Z:\task_1540825970\build\win32-minidump_stackwalk.exe c:\users\task_1540825970\appdata\local\temp\tmpbecytn.mozrunner\minidumps\a8a7c9c6-3ba6-490a-a5bf-75222e27c6eb.dmp c:\users\task_1540825970\appdata\local\temp\tmpvlh9ql
16:01:05     INFO - mozcrash Saved minidump as Z:\task_1540825970\build\blobber_upload_dir\a8a7c9c6-3ba6-490a-a5bf-75222e27c6eb.dmp
16:01:05     INFO - mozcrash Saved app info as Z:\task_1540825970\build\blobber_upload_dir\a8a7c9c6-3ba6-490a-a5bf-75222e27c6eb.extra
16:01:05     INFO - TEST-UNEXPECTED-CRASH | /webrtc/RTCRtpTransceiver.https.html | expected OK
16:01:05     INFO - TEST-INFO took 19425ms
16:01:05     INFO - Browser exited with return code 0
16:01:05     INFO - Closing logging queue
16:01:05     INFO - queue closed
16:01:05     INFO - Setting up ssl
16:01:05     INFO - certutil | 
16:01:05     INFO - certutil | 
16:01:05     INFO - certutil | 
16:01:05     INFO - Certificate Nickname                                         Trust Attributes
16:01:05     INFO -                                                              SSL,S/MIME,JAR/XPI
16:01:05     INFO - 
16:01:05     INFO - web-platform-tests                                           CT,, 
16:01:05     INFO - 
16:01:06     INFO - Application command: Z:\task_1540825970\build\application\firefox\firefox.exe -marionette about:blank --wait-for-browser -profile c:\users\task_1540825970\appdata\local\temp\tmpy6v7pv.mozrunner
Looks like a transceiver assertion:

16:00:49     INFO - PID 5432 | Asser[Child 4052: Socket Thread]: D/mtransport Couldn't get default ICE candidate for '1540828848948000 (id=6442451202 url=https://web-platform.test:8443/webrtc/RTCRtpTransceiver.https.html) transport-id=transport_0', no candidates.
16:00:49     INFO - PID 5432 | tion failure: transceiver->IsAssociated() (ICE candidate was gathered before the transceiver was associated! This should never happen.), at z:/build/build/src/media/webrtc/signaling/src/jsep/JsepSessionImpl.cpp:2271

Would have matched bug 1487278, but it looks like STS logging stomped on main's log string.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → DUPLICATE
Pushed by archaeopteryx@coole-files.de:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b9db1780bb9c
Disable frequently failing webrtc/RTCRtpTransceiver.https.html on Windows 7
Reopening this one to track the frequent failure after the WebRTC upgrade in bug 1376873.
Blocks: 1376873
Status: RESOLVED → REOPENED
Keywords: leave-open
Resolution: DUPLICATE → ---
A bunch of the failures in comment 5 are bug 1487278.

Sure would be nice if we had a stack here...

(In reply to Byron Campen [:bwc] from comment #16)

Sure would be nice if we had a stack here...

This is all we have:

[task 2020-08-09T12:31:16.661Z] 12:31:16 INFO - TEST-START | /webrtc/RTCRtpTransceiver.https.html
[task 2020-08-09T12:31:16.662Z] 12:31:16 INFO - Clearing pref media.navigator.permission.disabled
[task 2020-08-09T12:31:16.678Z] 12:31:16 INFO - Clearing pref media.navigator.streams.fake
[task 2020-08-09T12:31:16.686Z] 12:31:16 INFO - Clearing pref privacy.resistFingerprinting.reduceTimerPrecision.jitter
[task 2020-08-09T12:31:16.702Z] 12:31:16 INFO - Clearing pref privacy.reduceTimerPrecision
[task 2020-08-09T12:31:16.710Z] 12:31:16 INFO - Clearing pref media.peerconnection.ice.trickle_grace_period
[task 2020-08-09T12:31:16.730Z] 12:31:16 INFO - Setting pref media.navigator.permission.disabled (true)
[task 2020-08-09T12:31:16.746Z] 12:31:16 INFO - Setting pref media.navigator.streams.fake (true)
[task 2020-08-09T12:31:16.762Z] 12:31:16 INFO - Setting pref privacy.resistFingerprinting.reduceTimerPrecision.jitter (false)
[task 2020-08-09T12:31:16.778Z] 12:31:16 INFO - Setting pref privacy.reduceTimerPrecision (false)
[task 2020-08-09T12:31:16.799Z] 12:31:16 INFO - Setting pref media.peerconnection.ice.trickle_grace_period (10000)
[task 2020-08-09T12:31:16.815Z] 12:31:16 INFO - Closing window 38
[task 2020-08-09T12:31:17.774Z] 12:31:17 INFO - Browser not responding, setting status to CRASH
[task 2020-08-09T12:31:18.243Z] 12:31:18 INFO - TEST-UNEXPECTED-CRASH | /webrtc/RTCRtpTransceiver.https.html | expected OK
[task 2020-08-09T12:31:18.243Z] 12:31:18 INFO - TEST-INFO took 1581ms
[task 2020-08-09T12:31:19.168Z] 12:31:19 INFO - Closing logging queue
[task 2020-08-09T12:31:19.168Z] 12:31:19 INFO - queue closed
[task 2020-08-09T12:31:19.191Z] 12:31:19 INFO - Setting up ssl
[task 2020-08-09T12:31:19.215Z] 12:31:19 INFO - certutil |
[task 2020-08-09T12:31:19.239Z] 12:31:19 INFO - certutil |
[task 2020-08-09T12:31:19.260Z] 12:31:19 INFO - certutil |
[task 2020-08-09T12:31:19.260Z] 12:31:19 INFO - Certificate Nickname Trust Attributes
[task 2020-08-09T12:31:19.260Z] 12:31:19 INFO - SSL,S/MIME,JAR/XPI
[task 2020-08-09T12:31:19.260Z] 12:31:19 INFO -
[task 2020-08-09T12:31:19.261Z] 12:31:19 INFO - web-platform-tests CT,,
[task 2020-08-09T12:31:19.261Z] 12:31:19 INFO -
[task 2020-08-09T12:31:22.749Z] 12:31:22 INFO - adb Granting important runtime permissions to org.mozilla.geckoview.test
[task 2020-08-09T12:31:24.145Z] 12:31:24 INFO - adb launch_application: am start -W -n org.mozilla.geckoview.test/org.mozilla.geckoview.test.TestRunnerActivity -a android.intent.action.MAIN --es env9 MOZ_DISABLE_NONLOCAL_CONNECTIONS=1 --es env8 R_LOG_DESTINATION=stderr --es args "-no-remote -profile /data/local/tmp/test_root/profile --marionette about:blank" --es env3 MOZ_HIDE_RESULTS_TABLE=1 --es env2 R_LOG_VERBOSE=1 --es env1 MOZ_WEBRENDER=0 --es env0 MOZ_CRASHREPORTER=1 --es env7 MOZ_CRASHREPORTER_SHUTDOWN=1 --es env6 MOZ_IN_AUTOMATION=1 --es env5 MOZ_LOG=signaling:3,mtransport:4,DataChannel:4,jsep:4 --es env4 STYLO_THREADS=1 --ez use_multiprocess True --es env12 R_LOG_LEVEL=6 --es env11 MOZ_PROCESS_LOG=/tmp/tmpMcDDUKpidlog --es env10 MOZ_CRASHREPORTER_NO_REPORT=1
[task 2020-08-09T12:31:26.015Z] 12:31:26 INFO - Starting runner

See Also: → 1663271
Flags: needinfo?(docfaraday)

There's almost nothing in the logs to go on here. The only thing I can find is this:

09-03 19:39:49.380 22301 22320 I Gecko   : [Parent 22301, Unnamed thread 7cb4c0f3e460] WARNING: waitpid failed pid:22325 errno:10: file /builds/worker/checkouts/gecko/ipc/chromium/src/base/process_util_posix.cc, line 237

But that seems to happen after the tab has died.

I should note that this seems to be happening after this test is done running:

[task 2020-09-03T18:39:47.944Z] 18:39:47     INFO - Closing window 35
[task 2020-09-03T18:39:48.785Z] 18:39:48     INFO - TEST-OK | /webrtc/RTCRtpTransceiver-stop.html | took 845ms
[task 2020-09-03T18:39:48.785Z] 18:39:48     INFO - TEST-START | /webrtc/RTCRtpTransceiver.https.html
[task 2020-09-03T18:39:48.789Z] 18:39:48     INFO - Clearing pref media.navigator.permission.disabled
[task 2020-09-03T18:39:48.804Z] 18:39:48     INFO - Clearing pref media.navigator.streams.fake
[task 2020-09-03T18:39:48.808Z] 18:39:48     INFO - Clearing pref privacy.resistFingerprinting.reduceTimerPrecision.jitter
[task 2020-09-03T18:39:48.824Z] 18:39:48     INFO - Clearing pref privacy.reduceTimerPrecision
[task 2020-09-03T18:39:48.828Z] 18:39:48     INFO - Clearing pref media.peerconnection.ice.trickle_grace_period
[task 2020-09-03T18:39:48.849Z] 18:39:48     INFO - Setting pref media.navigator.permission.disabled (true)
[task 2020-09-03T18:39:48.865Z] 18:39:48     INFO - Setting pref media.navigator.streams.fake (true)
[task 2020-09-03T18:39:48.885Z] 18:39:48     INFO - Setting pref privacy.resistFingerprinting.reduceTimerPrecision.jitter (false)
[task 2020-09-03T18:39:48.906Z] 18:39:48     INFO - Setting pref privacy.reduceTimerPrecision (false)
[task 2020-09-03T18:39:48.921Z] 18:39:48     INFO - Setting pref media.peerconnection.ice.trickle_grace_period (10000)
[task 2020-09-03T18:39:48.929Z] 18:39:48     INFO - Closing window 38
[task 2020-09-03T18:39:50.255Z] 18:39:50     INFO - Browser not responding, setting status to CRASH
[task 2020-09-03T18:39:50.732Z] 18:39:50     INFO - TEST-UNEXPECTED-CRASH | /webrtc/RTCRtpTransceiver.https.html | expected OK
[task 2020-09-03T18:39:50.732Z] 18:39:50     INFO - TEST-INFO took 1944ms

And I should also note that we restart the browser after this test is done on android (to work around bug 1641237):

https://searchfox.org/mozilla-central/rev/76a83d0a218837ba6937d6a0fac51cb0008c2334/testing/web-platform/meta/webrtc/RTCRtpTransceiver.https.html.ini

There are other tests where we restart the browser because of bug 1641237:

https://searchfox.org/mozilla-central/search?q=1641237&case=true&path=

But all of these these other tests run just fine, before this test runs, so I doubt the restart-after is the cause of this problem.

Interesting. We aren't just having a single tab crash on this test. We are having around 50 or so tab crashes, starting pretty much at the very beginning of the wpt suite (css tests), and happening up until the end of the suite (content-security-policy tests). This seems pretty broken to me...

09-03 23:15:03.144  1616  1866 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 2900) has died
09-03 23:15:12.524  1616  1627 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 3233) has died
09-03 23:15:37.714  1616  1982 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 3530) has died
09-03 23:15:46.921  1616  1980 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 3942) has died
09-03 23:15:59.898  1616  2100 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 4244) has died
09-03 23:16:24.774  1616  1646 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 4573) has died
09-03 23:17:03.194  1616  1799 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 4972) has died
09-03 23:17:44.078  1616  1982 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 5429) has died
09-03 23:17:59.424  1616  2101 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 6138) has died
09-03 23:18:49.494  1616  1628 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 6507) has died
09-03 23:19:04.724  1616  2100 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 6823) has died
09-03 23:19:20.944  1616  2100 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 7206) has died
09-03 23:19:30.364  1616  1982 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 7545) has died
09-03 23:19:43.104  1616  1982 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 7849) has died
09-03 23:20:46.324  1616  2100 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 8184) has died
09-03 23:21:11.814  1616  1799 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 8643) has died
09-03 23:24:25.714  1616  2007 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 9156) has died
09-03 23:24:35.674  1616  2007 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 10425) has died
09-03 23:24:57.064  1616  1628 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 10740) has died
09-03 23:25:11.290  1616  1866 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 11187) has died
09-03 23:25:32.294  1616  1980 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 11558) has died
09-03 23:27:33.674  1616  1985 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 12010) has died
09-03 23:27:45.464  1616  2101 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 13118) has died
09-03 23:27:58.274  1616  2100 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 13457) has died
09-03 23:28:14.424  1616  2100 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 13789) has died
09-03 23:29:14.154  1616  1646 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 14180) has died
09-03 23:30:17.804  1616  1980 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 15113) has died
09-03 23:31:38.074  1616  1982 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 15718) has died
09-03 23:32:00.664  1616  2100 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 20283) has died
09-03 23:32:12.078  1616  1866 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 21603) has died
09-03 23:34:13.644  1616  2101 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 22347) has died
09-03 23:34:35.434  1616  1627 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 23502) has died
09-03 23:34:48.616  1616  1866 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 24156) has died
09-03 23:34:58.924  1616  1980 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 24578) has died
09-03 23:35:09.944  1616  2007 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 24880) has died
09-03 23:35:20.184  1616  1866 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 25208) has died
09-03 23:35:29.774  1616  1799 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 25516) has died
09-03 23:35:46.154  1616  2100 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 25823) has died
09-03 23:36:20.904  1616  1985 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 26214) has died
09-03 23:36:31.854  1616  1628 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 26796) has died
09-03 23:37:04.964  1616  1627 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 27108) has died
09-03 23:38:08.754  1616  1982 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 27677) has died
09-03 23:38:27.994  1616  1985 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 28620) has died
09-03 23:38:39.784  1616  1985 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 29053) has died
09-03 23:39:00.444  1616  1627 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 29389) has died
09-03 23:39:11.434  1616  2100 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 29847) has died
09-03 23:39:33.684  1616  1985 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 30168) has died
09-03 23:39:46.084  1616  1980 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 30644) has died
09-03 23:40:08.314  1616  1646 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 30981) has died
09-03 23:40:20.304  1616  1980 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 31432) has died
09-03 23:40:41.164  1616  1866 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 31766) has died
09-03 23:40:50.754  1616  1982 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 32182) has died
09-03 23:41:29.674  1616  1627 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 32490) has died
09-03 23:41:48.494  1616  2100 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 735) has died
09-03 23:41:58.064  1616  2007 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 1056) has died
09-03 23:42:13.834  1616  1628 I ActivityManager: Process org.mozilla.geckoview.test:tab0 (pid 1429) has died

https://firefoxci.taskcluster-artifacts.net/BiSk-8AcTKSjoemFY9-Zew/0/public/test_info//logcat-emulator-5554.log
https://firefoxci.taskcluster-artifacts.net/BiSk-8AcTKSjoemFY9-Zew/0/public/logs/live_backing.log

I am also unsure of why this would only happen on beta. Maybe there is some issue with running without the socket process? Let's see what happens if we enable socket process on beta...

Hmm. It seems that we disable socket process for android even on release, so that's not going to cause a difference in beta. Maybe there's something else that is EARLY_BETA_OR_EARLIER?

The recent failure spike started with the expectation update in bug 1666427 and bug 1661136 - the latter also modified the test.

Flags: needinfo?(james)

I think it's more likely to be the test modification that's the problem here.

We could add if os == "android": [OK, CRASH] to the expectations, but bwc should likely look at the actual issue here.

snorp, any idea about what is going on in comment 25? Any idea on how we could get some kind of actionable information on what is happening here? Without any stacks or debug output, I'm afraid there's nothing I can do to diagnose this bug...

Flags: needinfo?(docfaraday) → needinfo?(snorp)
Assignee: nobody → aryx.bugmail
Whiteboard: [stockwell disable-recommended] → [stockwell needswork:owner]
Pushed by archaeopteryx@coole-files.de:
https://hg.mozilla.org/integration/autoland/rev/08ae3ebf13e9
set RTCRtpTransceiver.https.html as intermittent on Android opt r=intermittent-reviewers,jmaher DONTBUILD
Whiteboard: [stockwell disable-recommended]
Whiteboard: [stockwell disable-recommended]
Pushed by csabou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/76e6c016214a
set RTCRtpTransceiver.https.html as intermittent on Android opt. a=testonly DONTBUILD
Whiteboard: [stockwell disable-recommended]

(In reply to Byron Campen [:bwc] from comment #36)

snorp, any idea about what is going on in comment 25? Any idea on how we could get some kind of actionable information on what is happening here? Without any stacks or debug output, I'm afraid there's nothing I can do to diagnose this bug...

Yeah I don't see anything obvious in the log. The stuff about test:tab0 process is probably normal if we are frequently opening and closing a single tab.

Flags: needinfo?(snorp)
Pushed by btara@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c9c0c90ecfe9
set RTCRtpTransceiver.https.html as intermittent on Android r=intermittent-reviewers,jmaher DONTBUILD
Whiteboard: [stockwell disable-recommended]
Whiteboard: [stockwell disable-recommended]

Latest failures are on mozilla-release.

Aryx there are 10 total failures on release, should this be uplifted there?

Flags: needinfo?(james) → needinfo?(aryx.bugmail)
Whiteboard: [stockwell disable-recommended]

Should get checked in if more lands on release.

Flags: needinfo?(aryx.bugmail)
Whiteboard: [checkin-needed-release]
Whiteboard: [checkin-needed-release][stockwell disable-recommended] → [checkin-needed-release]

release branch has been upgraded to 83 for which the expectation changes landed.

Whiteboard: [checkin-needed-release]
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: