Open Bug 1180388 Opened 9 years ago Updated 2 years ago

test_peerConnection_iceFailure.html is disabled on all platforms because of test timeouts

Categories

(Core :: WebRTC, defect, P4)

42 Branch
defect

Tracking

()

People

(Reporter: kaustabh93, Unassigned)

References

Details

Attachments

(1 file)

We would like to have --run-by-dir become more reliable, i.e. fewer oranges and more greens, and test_peerConnection_iceFailure.html is producing oranges on all platforms. Here are the try results: https://treeherder.mozilla.org/#/jobs?repo=try&revision=1711e6339ee8
An excerpt from the raw log:
11:41:04 INFO - 4613 INFO TEST-UNEXPECTED-FAIL | dom/media/tests/mochitest/test_peerConnection_iceFailure.html | Test timed out. - expected PASS
11:41:04 INFO - -1929893056[9c5e68c0]: [|WebrtcAudioSessionConduit] AudioConduit.cpp:666: A/V sync: GetAVStats failed
11:41:04 INFO - -1929893056[9c5e68c0]: [|WebrtcAudioSessionConduit] AudioConduit.cpp:666: A/V sync: GetAVStats failed
11:41:04 INFO - MEMORY STAT | vsize 1147MB | residentFast 163MB | heapAllocated 92MB
11:41:04 INFO - 4614 INFO TEST-OK | dom/media/tests/mochitest/test_peerConnection_iceFailure.html | took 308454ms
11:41:04 INFO - -1220557056[b715f180]: [main|PeerConnectionImpl] PeerConnectionImpl.cpp:2353: CloseInt: Closing PeerConnectionImpl b23f42fb5681bf3e; ending call
11:41:04 INFO - -1220557056[b715f180]: [1435948556927325 (id=232 url=http://mochi.test:8888/tests/dom/media/tests/mochitest/test_peerConnection_iceFailure.html)]: stable -> closed
Blocks: 1162003
Flags: needinfo?(mshal)
:juanb, I see you wrote this test not too long ago. Can you help us figure out why it times out when run as a standalone directory instead of as part of a full run or a larger chunk of tests? Most likely it depends on something which isn't there, isn't initialized, or a pref that isn't set.
Flags: needinfo?(mshal) → needinfo?(jbecerra)
I'm looking into this. I will chat with Nils and Byron to see how to fix this.
Flags: needinfo?(jbecerra)
Not sure what is going on here, but I think it would help if we added this line: SimpleTest.requestCompleteLog(); to the test_peerConnection_iceFailure.html test case and re-ran. This should give us full mochitest logs, and the mochitest log lines should get more or less aligned with the ICE log messages.
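For illustration, the change would be a single extra line near the top of the test's script, roughly like this (a sketch only, not the literal contents of the file):

    // Ask the mochitest harness to keep the complete log for this run
    // instead of the usual truncated output.
    SimpleTest.requestCompleteLog();
    // ... existing test body unchanged ...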
Status: UNCONFIRMED → NEW
Ever confirmed: true
Juan, I'm able to reproduce the problem by executing:
./mach mochitest --run-by-dir dom/media/
But it works fine if I execute:
./mach mochitest dom/media/tests/mochitest/test_peerConnection_iceFailure.html
Glad to hear this reproduces - it sounds more like an odd case of a previous test doing something which causes this one to fail.
(In reply to juan becerra [:juanb] from comment #2)
> I'm looking into this. I will chat with Nils and Byron to see how to fix
> this.

Thanks for looking into this. I am checking in to see if we're close to fixing this. Please let me know if I can help in any way.
Hi, any updates on this?
Flags: needinfo?(jbecerra)
Flags: needinfo?(drno)
This had fallen off my radar, but I'll look at it today.
Flags: needinfo?(jbecerra)
(In reply to juan becerra [:juanb] from comment #8)
> This had fallen off my radar, but I'll look at it today.

Thank you. It will be really great to have this fixed.
Hi Juan, any news on this? Here's the recent state of try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=aeae0c94a595 I hope this helps.
Flags: needinfo?(jbecerra)
Unfortunately Juan no longer has the capacity to investigate this. I'll try to look into this more deeply sometime this week.
Flags: needinfo?(jbecerra)
Hi Nils, any updates on this? Please let me know if I can help out in any way.
Thanks for the reminder, Kaustabh. Interestingly, I'm no longer able to reproduce this problem locally on my Mac with the latest mozilla-central code. Could it be that the problem solved itself? Do you still get failures when running it on treeherder? I think I remember that when I was able to reproduce the problem locally back in July, a new Firefox would start (and close) for each subdirectory mochitest executed tests for. This no longer seems to be the case. The WebRTC tests are executed in the same browser instance as the preceding /dom/media/test tests. Is that behavior (new, or old from July) intended? Maybe that is/was what is/was causing the problem?
Flags: needinfo?(drno)
:drno, the change is to get --run-by-dir going for all mochitests; in this case it fails with --run-by-dir (a fresh browser session for each directory), but passes when all the directories are run in the same browser session. To fix this just run the test:
./mach mochitest dom/media/tests/mochitest
It could be that a few other things have changed.
(In reply to Joel Maher (:jmaher) from comment #14)
> :drno, the change is to get --run-by-dir going for all mochitests; in this
> case it fails with --run-by-dir (a fresh browser session for each
> directory), but passes when all the directories are run in the same browser
> session.

Yes, I understand what this ticket is about. I just took the latest code from mozilla-central, compiled it after a clobber and executed this command:
./mach mochitest --run-by-dir dom/media/
And I was surprised that I did NOT see the browser closing and re-opening. I did see that happening back in July when I reported in comment #4 that I was able to reproduce this problem. I don't know what has changed, but for some reason that command no longer opens and closes Firefox for each directory for me.

> To fix this just run the test:
> ./mach mochitest dom/media/tests/mochitest

I run this command pretty regularly and it is not causing problems. For me the problem was only reproducible with '--run-by-dir'.
I would have really appreciated it if you had let us know that you apparently disabled our test completely: https://hg.mozilla.org/try/rev/1306c43f4996#l2.13 From reading this bug, I was under the impression that you were still trying to get --run-by-dir working and that this was a blocker for getting it working.
Summary: test_peerConnection_iceFailure.html fails on Windows, Linux, Mac on opt as well as debug with --run-by-dir enabled → test_peerConnection_iceFailure.html is disabled on all platforms because of test timeouts
I still cannot reproduce the problem locally. Could someone please verify locally on their machine that running:
./mach mochitest --run-by-dir dom/media/tests
actually re-opens Firefox three times: one Fx for the identity subdir, one Fx for the ipc subdir, and one for the mochitest directory itself? Because for me, on my Mac and my Linux machine, this is not happening. I only see Fx closing and re-starting when it switches between chrome tests and mochitest-plain for dom/media. Although I agree that the log files from try look like it is actually opening and closing browser instances for each sub-directory. I sincerely hope this is not yet another instance where our test framework behaves differently locally vs. on the try server. I compared the log files from your last try run with log files from a working local run, but that did not really tell me what the problem could be. Therefore I created another try run with an increased log level: https://treeherder.mozilla.org/#/jobs?repo=try&revision=85ad3e88b37b
Assignee: nobody → drno
Apparently my first attempt at increasing the ICE log levels was not successful. Next try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=52433b75d6d6
So after lots of digging and quite a few try runs with more and more debug log messages, I finally found the problem:

The first WebRTC test case which gets executed sets the nICEr registry via the user pref settings in head.js, specifically the values 'media.peerconnection.ice.stun_client_maximum_transmits' and 'media.peerconnection.ice.trickle_grace_period'. Unfortunately that means any subsequent attempts to modify any of these values via SpecialPowers.pushPrefEnv() are useless, as the initial value in the nICEr registry does not get overwritten. And that is exactly what the iceFailure test case here tries to do.

I guess I see three possible solutions right now:
A) ditch this test case altogether
B) move the test case into its own subdirectory, so that a new Fx instance gets started with a fresh nICEr registry (and find a way to have only one call to pushPrefEnv() with the right value for the test)
C) find a way to reset the nICEr registry like we do in the ICE unit test and expose/access that via SpecialPowers

Byron, any thoughts on this?
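To sketch the sequence (the pref names are the real ones; the 14/30000 pair is what head.js pushes per comment 26 below, while the shorter values and the callback names here are only illustrative placeholders):

    // First WebRTC test in the directory: head.js pushes the relaxed ICE
    // prefs, and nICEr copies them into its registry when it starts up.
    SpecialPowers.pushPrefEnv({set: [
      ["media.peerconnection.ice.stun_client_maximum_transmits", 14],
      ["media.peerconnection.ice.trickle_grace_period", 30000],
    ]}, runFirstTest);

    // Later, test_peerConnection_iceFailure.html tries to shorten the timers:
    SpecialPowers.pushPrefEnv({set: [
      ["media.peerconnection.ice.stun_client_maximum_transmits", 3],
      ["media.peerconnection.ice.trickle_grace_period", 3000],
    ]}, runIceFailureTest);
    // The pref values do change, but the nICEr registry in this Firefox
    // instance was already seeded with the first set and is never re-read,
    // so the shorter timers are silently ignored.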
Flags: needinfo?(docfaraday)
It looks like we do SpecialPowers.pushPrefEnv in head.js: https://dxr.mozilla.org/mozilla-central/source/dom/media/tests/mochitest/head.js#235

I don't understand why this fails to allow us to set the pref. Is it because we always run setup_environment prior to the test starting, so the test case can then not set the prefs it needs?

My confusion is that future test cases shouldn't be having issues with prefs from previous test cases unless we are not pushing via pushPrefEnv, or there is a bug in it or in Firefox. Maybe this is a special circumstance. Either way, good find!
FYI, locally this behaves differently because, at least for me, the identity test cases from the identity subdir get executed first and set stun_client_maximum_transmits to 7, which probably keeps the ICE timeout just short enough for this test case to pass, and then all tests run with 7 as the value.
(In reply to Joel Maher (:jmaher) from comment #21)
> It looks like we do SpecialPowers.pushPrefEnv in head.js:
> https://dxr.mozilla.org/mozilla-central/source/dom/media/tests/mochitest/head.js#235
>
> I don't understand why this fails to allow us to set the pref. Is it
> because we always run setup_environment prior to the test starting, so the
> test case can then not set the prefs it needs?
>
> My confusion is that future test cases shouldn't be having issues with
> prefs from previous test cases unless we are not pushing via pushPrefEnv,
> or there is a bug in it or in Firefox. Maybe this is a special
> circumstance.

This is kind of a special case: nICEr is a protocol stack, like your HTTP stack. But nICEr was not written to allow changing parameters in the protocol stack while it is running. So basically, user prefs which modify settings for nICEr require a restart of Firefox to be sure that they get applied. Only the two user prefs mentioned above are affected by this. All the other user prefs set in head.js should be fine to modify whenever you want/need.
Thanks for the explanation on this!
(In reply to Nils Ohlmeier [:drno] from comment #20)
> So after lots of digging and quite a few try runs with more and more debug
> log messages, I finally found the problem:
>
> The first WebRTC test case which gets executed sets the nICEr registry via
> the user pref settings in head.js, specifically the values
> 'media.peerconnection.ice.stun_client_maximum_transmits' and
> 'media.peerconnection.ice.trickle_grace_period'. Unfortunately that means
> any subsequent attempts to modify any of these values via
> SpecialPowers.pushPrefEnv() are useless, as the initial value in the nICEr
> registry does not get overwritten. And that is exactly what the iceFailure
> test case here tries to do.
>
> I guess I see three possible solutions right now:
> A) ditch this test case altogether
> B) move the test case into its own subdirectory, so that a new Fx instance
> gets started with a fresh nICEr registry (and find a way to have only one
> call to pushPrefEnv() with the right value for the test)
> C) find a way to reset the nICEr registry like we do in the ICE unit test
> and expose/access that via SpecialPowers
>
> Byron, any thoughts on this?

It kinda sucks that we have a global config store for this, but making the nr_registry local to the ICE ctx would be kinda involved, since we'd need to clean up bunches of one-time-only leaks. I'm also surprised that the test completely times out if the pref isn't set; a normal ICE timeout takes a while, but not the hundreds of seconds I'm seeing here...
Flags: needinfo?(drno)
(In reply to Byron Campen [:bwc] from comment #25)
> I'm also surprised that the test completely times out if the pref isn't
> set; a normal ICE timeout takes a while, but not the hundreds of seconds
> I'm seeing here...

To clarify: in the new (current) world on TBPL with --run-by-dir set, the identity tests run by themselves, then another instance of Fx runs the ipc test, and then another Fx starts up and executes just the tests in our mochitest directory. So the first datachannel test calls pushPrefEnv() from head.js and thus sets stun_client_maximum_transmits to 14 and trickle_grace_period to 30000. And these two values are then used for all of our mochitests (except identity and ipc).

Now 100ms * 2^14 ~= 27min, which is way beyond the 5min timeout limit of the mochitest framework.

If I recall correctly, the insanely high values for nICEr are meant to make our tests more likely to succeed on B2G, right? Maybe option D) is to make the nICEr values platform dependent? I usually dislike platform-dependent code, but losing test coverage because of one lousy test platform stinks even more to me.
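Spelling out that arithmetic (a back-of-the-envelope sketch that assumes a retransmit timer doubling from 100ms, which this bug implies but does not spell out):

    // Worst-case wait before nICEr gives up vs. the mochitest harness limit.
    const initialMs = 100;
    const maxTransmits = 14;      // stun_client_maximum_transmits
    const worstCaseMs = initialMs * Math.pow(2, maxTransmits); // 1,638,400 ms
    console.log(worstCaseMs / 60000); // ~27.3 minutes
    console.log(300000 / 60000);      // mochitest timeout: ~5 minutes
                                      // (the log excerpt above shows the test
                                      // being killed after ~308454 ms)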
Flags: needinfo?(drno)
(In reply to Nils Ohlmeier [:drno] from comment #26)
> (In reply to Byron Campen [:bwc] from comment #25)
> > I'm also surprised that the test completely times out if the pref isn't
> > set; a normal ICE timeout takes a while, but not the hundreds of seconds
> > I'm seeing here...
>
> To clarify: in the new (current) world on TBPL with --run-by-dir set, the
> identity tests run by themselves, then another instance of Fx runs the ipc
> test, and then another Fx starts up and executes just the tests in our
> mochitest directory. So the first datachannel test calls pushPrefEnv() from
> head.js and thus sets stun_client_maximum_transmits to 14 and
> trickle_grace_period to 30000. And these two values are then used for all
> of our mochitests (except identity and ipc).
>
> Now 100ms * 2^14 ~= 27min, which is way beyond the 5min timeout limit of
> the mochitest framework.
>
> If I recall correctly, the insanely high values for nICEr are meant to make
> our tests more likely to succeed on B2G, right? Maybe option D) is to make
> the nICEr values platform dependent? I usually dislike platform-dependent
> code, but losing test coverage because of one lousy test platform stinks
> even more to me.

Ah, right. Ugh.

So, my first choice, if I could just wave a magic wand, would be to have the registry be scoped to a single ICE ctx, but we don't have time to do that right now.

Another option might be to always update the registry for values that come out of the pref system, but that would require moving this registry setup stuff over to STS (not the end of the world). We would also need to be careful that changing the number of retransmits or the grace period length during ICE would not cause terrible things to happen (I think it would be safe, though).

Another option might be to do the above, but only when a SpecialPowers API is hit (that's option C).

I don't really like the idea of moving the test into its own directory, even though it has some elegance to it, because it leaves our directory structure in more of a mess. And of course, disabling the test is a non-starter for me.
Flags: needinfo?(docfaraday)
(In reply to Byron Campen [:bwc] from comment #27)
> So, my first choice, if I could just wave a magic wand, would be to have
> the registry be scoped to a single ICE ctx, but we don't have time to do
> that right now.
>
> Another option might be to always update the registry for values that come
> out of the pref system, but that would require moving this registry setup
> stuff over to STS (not the end of the world). We would also need to be
> careful that changing the number of retransmits or the grace period length
> during ICE would not cause terrible things to happen (I think it would be
> safe, though).

Would it help if we scoped this to happen only for new PeerConnections? It would certainly be nice to be able to configure the ICE stack at the time of instantiating a new PeerConnection, e.g. via constraints. Although that sounds like each PC would then carry its own copy of the registry, and in that case the memory leaks of the registry would hurt a lot.

> Another option might be to do the above, but only when a SpecialPowers API
> is hit (that's option C).

With option C I was more thinking of a secret "reset" call to the nICEr registry. I'm not sure if we can detect whether something got set through SpecialPowers vs. the user changing something via about:config.
It seems we lost some momentum on this bug. As a reminder, we still have test_peerConnection_iceFailure.html disabled: https://dxr.mozilla.org/mozilla-central/source/dom/media/tests/mochitest/mochitest.ini#113
I am unclear on the next steps here.
(In reply to Joel Maher (:jmaher) from comment #29)
> I am unclear on the next steps here.

What Byron and I discussed in the previous comments are major changes to our underlying code. At some point these changes are probably due, but I don't think we have cycles for that right now. That means the test probably needs to stay disabled.
Rank: 35
Component: DOM → WebRTC
Priority: -- → P3
backlog: --- → webrtc/webaudio+
Blocks: 1321547
Mass change P3->P4 to align with new Mozilla triage process.
Priority: P3 → P4
I no longer have the capacity to work on this any time soon.
Assignee: drno → nobody

Seeing some failures on the --full push, but they do not seem to be in any of the re-enabled tests. Maybe changing the ICE timing is affecting other tests in weird ways, but I don't see any signs of ICE failure. Here's a baseline push for those jobs:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=41dda7f27b0595418be98fba9a554b0aa803a347

Even though tightening the ICE timing looks pretty reliable, we do not have sufficient Windows aarch64 testers to actually run try pushes. There is a possibility that tightening the ICE timing would break tests on aarch64, but there is no way to determine this without landing patches. So maybe a viable approach here is to gradually tighten the ICE timing, instead of jumping straight to the defaults.
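A sketch of what one intermediate step could look like (assumptions: the 14/30000 starting values come from comment 26 above, the halfway numbers are invented, runTests is a placeholder callback, and whether the overrides still live in head.js via pushPrefEnv today is not confirmed by this bug):

    // Step 1: land a partial tightening and let it bake across platforms.
    SpecialPowers.pushPrefEnv({set: [
      ["media.peerconnection.ice.stun_client_maximum_transmits", 10],
      ["media.peerconnection.ice.trickle_grace_period", 15000],
    ]}, runTests);
    // Step 2 (later): drop the overrides entirely so the built-in defaults
    // apply, once aarch64 results confirm nothing regresses.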

Try looks fine, or at least no worse than baseline.

Maybe we cannot totally remove the prefs that relax the ICE timeouts, but we might be able to partially tighten them.

Yeah, I'm seeing some rare tier 2 intermittents when I try to tighten this timing. I'm not seeing ICE timing out in the first failure in the bunch, which is very odd. It's going to take significant time to debug this, I think, and there are other higher-priority things that need attention.

Severity: normal → S3