Closed Bug 1059240 Opened 10 years ago Closed 6 years ago

Intermittent test_peerConnection_replaceTrack.html | Test timed out.

Categories

(Core :: WebRTC, defect, P4)

x86
Linux
defect

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: cbook, Unassigned)

References

(Depends on 1 open bug, )

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell unknown][stockwell needswork:owner])

Ubuntu VM 12.04 mozilla-inbound opt test mochitest-e10s-3 on 2014-08-27 03:00:23 PDT for push 00db70d58a1a slave: tst-linux32-spot-1059 https://tbpl.mozilla.org/php/getParsedLog.php?id=46846046&tree=Mozilla-Inbound 11397 INFO TEST-UNEXPECTED-FAIL | /tests/dom/media/tests/mochitest/test_peerConnection_replaceTrack.html | Test timed out.
Looks like a problem with the OS level fake loopback device drivers, as gUM never returned. (On which machines do we actually use the OS level device drivers vs. FF build in fake device?)
backlog: --- → webRTC+
Rank: 35
Priority: -- → P3
Mass change P3->P4 to align with new Mozilla triage process.
Priority: P3 → P4
Over the last 7 days there are 31 failures present on this bug. These happen on Linux, Linux x64, linux32-stylo-disabled and linux64-stylo-disabled. Here is the most recent log example: https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=162867993&lineNumber=42889 Here is a relevant part from that log: [task 2018-02-18T10:50:58.391Z] 10:50:58 INFO - Run step 43: PC_LOCAL_AUDIOVIDEO_REPLACE_VIDEOTRACK_1 [task 2018-02-18T10:50:58.393Z] 10:50:58 INFO - TEST-PASS | dom/media/tests/mochitest/test_peerConnection_replaceTrack.html | We have a sender for video [task 2018-02-18T10:50:58.397Z] 10:50:58 INFO - TEST-PASS | dom/media/tests/mochitest/test_peerConnection_replaceTrack.html | Shouldn't have any local streams without a corresponding sender [task 2018-02-18T10:50:58.400Z] 10:50:58 INFO - TEST-PASS | dom/media/tests/mochitest/test_peerConnection_replaceTrack.html | Shouldn't have any remote streams without a corresponding receiver [task 2018-02-18T10:50:58.402Z] 10:50:58 INFO - Call getUserMedia for {"video":true,"audio":{"autoGainControl":false,"echoCancellation":false,"noiseSuppression":false}} [task 2018-02-18T10:50:58.403Z] 10:50:58 INFO - Buffered messages finished [task 2018-02-18T10:50:58.416Z] 10:50:58 INFO - TEST-UNEXPECTED-FAIL | dom/media/tests/mochitest/test_peerConnection_replaceTrack.html | Test timed out. [task 2018-02-18T10:50:58.416Z] 10:50:58 INFO - reportError@SimpleTest/TestRunner.js:121:7 [task 2018-02-18T10:50:58.417Z] 10:50:58 INFO - TestRunner._checkForHangs@SimpleTest/TestRunner.js:142:7
Flags: needinfo?(drno)
Whiteboard: [stockwell needswork]
See Also: → 1431056
Depends on: gUM_timeout
There are 40 failures in the past 7 days. Platforms: most of them on Linux debug, linux32-stylo-disabled debug, once on Linux x64 asan, linux64-ccov opt and Windows 7 pgo Most recent failure log: https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=164796045&lineNumber=9069 Reftest analyzer: https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://queue.taskcluster.net/v1/task/CgdwHZ1bTEynh-9S5aJJEg/runs/0/artifacts/public/logs/live_backing.log&only_show_unexpected=1 Waiting for update on ticket 1441563
Yes lets try to find an owner for bug 1441563. That seems more useful then trying to solve all of these intermittents individually.
Flags: needinfo?(drno)
There have been 38 failures in the last 7 days: - 3 failures on Linux x64 asan+debug - 35 failures on Linux x32 debug Recent failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=190796727&repo=mozilla-inbound&lineNumber=175439 [task 2018-07-29T23:03:09.098Z] 23:03:09 INFO - TEST-UNEXPECTED-FAIL | dom/media/tests/mochitest/test_peerConnection_replaceTrack.html | Test timed out. [task 2018-07-29T23:03:09.099Z] 23:03:09 INFO - reportError@SimpleTest/TestRunner.js:121:7 [task 2018-07-29T23:03:09.101Z] 23:03:09 INFO - TestRunner._checkForHangs@SimpleTest/TestRunner.js:142:7 [task 2018-07-29T23:03:09.103Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.104Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.105Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.106Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.107Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.108Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.109Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.110Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.111Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.112Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.115Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.115Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.116Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.116Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.117Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.121Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.123Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.127Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.128Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.129Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.134Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.135Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.139Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.143Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.143Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.144Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.155Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.157Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.159Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.166Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.166Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.166Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.166Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.179Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.181Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.186Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.187Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.191Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.192Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.194Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.212Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.216Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.224Z] 23:03:09 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2018-07-29T23:03:09.226Z] 23:03:09 INFO - TestRunner.runTests/<@SimpleTest/TestRunner.js:381:9 [task 2018-07-29T23:03:09.233Z] 23:03:09 INFO - promise callback*TestRunner.runTests@SimpleTest/TestRunner.js:368:5 [task 2018-07-29T23:03:09.234Z] 23:03:09 INFO - RunSet.runtests@SimpleTest/setup.js:194:3 [task 2018-07-29T23:03:09.239Z] 23:03:09 INFO - RunSet.runall@SimpleTest/setup.js:173:5 [task 2018-07-29T23:03:09.240Z] 23:03:09 INFO - hookupTests@SimpleTest/setup.js:266:5 [task 2018-07-29T23:03:09.241Z] 23:03:09 INFO - parseTestManifest@http://mochi.test:8888/manifestLibrary.js:36:5 [task 2018-07-29T23:03:09.242Z] 23:03:09 INFO - getTestManifest/req.onload@http://mochi.test:8888/manifestLibrary.js:49:11 [task 2018-07-29T23:03:09.247Z] 23:03:09 INFO - EventHandlerNonNull*getTestManifest@http://mochi.test:8888/manifestLibrary.js:45:3 [task 2018-07-29T23:03:09.249Z] 23:03:09 INFO - hookup@SimpleTest/setup.js:246:5 [task 2018-07-29T23:03:09.250Z] 23:03:09 INFO - EventHandlerNonNull*@http://mochi.test:8888/tests?autorun=1&closeWhenDone=1&consoleLevel=INFO&manifestFile=tests.json&dumpOutputDirectory=%2Ftmp&cleanupCrashes=true:11:1 [task 2018-07-29T23:03:09.255Z] 23:03:09 INFO - GECKO(2958) | [Child 3010: Socket Thread]: D/mtransport Attempting to protect RTP... [task 2018-07-29T23:03:09.260Z] 23:03:09 INFO - GECKO(2958) | [Child 3010: Socket Thread]: D/mtransport Attempting to unprotect RTP... [task 2018-07-29T23:03:09.261Z] 23:03:09 INFO - GECKO(2958) | [Child 3010: Socket Thread]: D/mtransport Attempting to unprotect RTP... [task 2018-07-29T23:03:09.263Z] 23:03:09 INFO - GECKO(2958) | [Child 3010: Socket Thread]: D/mtransport Attempting to protect RTP... [task 2018-07-29T23:03:09.267Z] 23:03:09 INFO - GECKO(2958) | [Child 3010: Socket Thread]: D/mtransport Attempting to unprotect RTP... [task 2018-07-29T23:03:09.270Z] 23:03:09 INFO - GECKO(2958) | [Child 3010: Socket Thread]: D/mtransport Attempting to protect RTP...
Flags: needinfo?(drno)
Whiteboard: [stockwell unknown] → [stockwell unknown][stockwell needswork:owner]
Trying to determine what caused the failure rate to jump...
Seems to have happened somewhere in this range. Will continue investigating next week.
Flags: needinfo?(docfaraday)
Flags: needinfo?(docfaraday)
Thanks Byron for taking a look.
Flags: needinfo?(drno)
I'm having trouble isolating when this really started. When running lots of retriggers, it seems to come and go. I do not know whether this is because there is some other factor influencing its frequency, or whether several changesets are altering its probability; some making it significantly more common, others making it significantly less so.
Flags: needinfo?(docfaraday)
After running retriggers on autoland for the commits in the merge that caused the failure rate to increase on inbound, I don't see a culprit. I see a couple of possibilities here: 1. The increase is caused by an interaction between something that landed on autoland around the 19th, and something else that landed on inbound on the 20th. 2. The increase has something to do with infra or some other factor external to the code.
Because the merge on inbound (let's call this "merge A") was the first time we saw the increase (that I can tell), I have looked through the changesets merged into autoland on the 20th (let's call this "merge B") to see which of them were originally pushed to inbound before merge A happened. Here's that list, with backouts removed: 7afb8c53da3a Marco Castelluccio — Bug 1476574 - Enable reset/dump for wpt and cleanup harness code for supporting reset/dump now that it is supported by all test suites. r=jmaher 5e961986f536 Marco Castelluccio — Bug 1476574 - Support resetting/dumping code coverage counters before/after web-platform-tests. r=jgraham 76308197fa05 Ehsan Akhgari — Bug 1476796 - Enable AntiTrackingCommon::AddFirstPartyStorageAccessGrantedFor() to notify consumers about completion of asynchronous results; r=baku d10de6036b74 Jed Davis — Bug 1467889 - Adjust some uses of XPCOM strings. r=mrbkap r=mstange c611225375ec Eric Rahm — Bug 1442765 - Part 2: Switch nsTraceRefcnt's hashtables to use xpcom hashtables. r=mccr8 b60b9d07842e Eric Rahm — Bug 1442765 - Part 1: Add intptr_t hashkey type. r=froydnj b9193e72be19 Sebastian Hengst — Bug 1354232 - Annotate html/infrastructure/urls/resolving-urls/query-encoding/navigation.sub.html as passing on Linux asan. r=RyanVM on IRC It seems as though one (or more) of the changesets in this list, when combined with the changesets in merge A, causes the increase. I'm going to push some backouts to try.
Hang on a minute. The rate of this seems to have dropped to 0 on the 5th. I'm not sure whether this is because the orange-tracking for this bug has been stopped (perhaps due to my retriggers), or whether the orange has simply stopped happening. Checking that now.
Welp. This seems to have stopped happening. It might be getting masked by another intermittent, or it might actually be fixed, it is hard to say.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WORKSFORME
This appears to have been fixed by bug 1404977.

The failures in comment 70 were all due to gUM_timeout, all within a two-hour timespan. Infra hiccup maybe?

You need to log in before you can comment on or make changes to this bug.