Closed Bug 1195540 Opened 10 years ago Closed 10 years ago

firefox-media-tests hang (output timeout) in automation on Windows builders only, e10s only, Jenkins only

Categories

(Testing Graveyard :: external-media-tests, defect, P1)

Tracking

(e10s+)

RESOLVED FIXED
Tracking Status
e10s + ---

People

(Reporter: impossibus, Assigned: sydpolk)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

We have jobs running on mac 10.10, win 7, win 8.1. All the windows jobs consistently hit an output timeout mid-test when e10s is on; mac jobs do not. I haven't been able to reproduce this outside of automation.

2015-08-16 15:17:53 INFO - Copy/paste: firefox-media-tests --binary c:\jenkins\workspace\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\build\application\firefox\firefox.exe --symbols-path c:\jenkins\workspace\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\build\symbols --urls firefox_media_tests/urls/default.ini ./firefox_media_tests/playback/youtube/manifest.ini --e10s --gecko-log c:\jenkins\workspace\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\logs\gecko.log --log-tbpl - --log-html c:\jenkins\workspace\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\logs\media_tests.html --log-mach c:\jenkins\workspace\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\logs\media_tests_mach.log
2015-08-16 15:17:53 INFO - Calling ['firefox-media-tests', '--binary', 'c:\\jenkins\\workspace\\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\\build\\application\\firefox\\firefox.exe', '--symbols-path', 'c:\\jenkins\\workspace\\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\\build\\symbols', '--urls', 'firefox_media_tests/urls/default.ini', './firefox_media_tests/playback/youtube/manifest.ini', '--e10s', '--gecko-log', 'c:\\jenkins\\workspace\\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\\logs\\gecko.log', '--log-tbpl', '-', '--log-html', 'c:\\jenkins\\workspace\\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\\logs\\media_tests.html', '--log-mach', 'c:\\jenkins\\workspace\\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\\logs\\media_tests_mach.log'] with output_timeout 400
2015-08-16 15:18:09 INFO - starting httpd
2015-08-16 15:18:09 INFO - running httpd on http://127.0.0.1:58768/
2015-08-16 15:18:09 INFO - mozversion application_buildid: 20150816091433
2015-08-16 15:18:09 INFO - mozversion application_changeset: 0876695d1abdeb363a780bda8b6cc84f20ba51c9
2015-08-16 15:18:09 INFO - mozversion application_display_name: Nightly
2015-08-16 15:18:09 INFO - mozversion application_id: {ec8030f7-c20a-464f-9b0e-13a3a9e97384}
2015-08-16 15:18:09 INFO - mozversion application_name: Firefox
2015-08-16 15:18:09 INFO - mozversion application_remotingname: firefox
2015-08-16 15:18:09 INFO - mozversion application_repository: https://hg.mozilla.org/mozilla-central
2015-08-16 15:18:09 INFO - mozversion application_vendor: Mozilla
2015-08-16 15:18:09 INFO - mozversion application_version: 43.0a1
2015-08-16 15:18:09 INFO - mozversion platform_buildid: 20150816091433
2015-08-16 15:18:09 INFO - mozversion platform_changeset: 0876695d1abdeb363a780bda8b6cc84f20ba51c9
2015-08-16 15:18:09 INFO - mozversion platform_repository: https://hg.mozilla.org/mozilla-central
2015-08-16 15:18:09 INFO - mozversion platform_version: 43.0a1
2015-08-16 15:18:09 INFO - SUITE-START | Running 1 tests
2015-08-16 15:18:09 INFO - TEST-START | test_basic_playback.py TestBasicYouTubePlayback.test_mse_is_enabled_by_default
2015-08-16 15:24:49 INFO - Automation Error: mozprocess timed out after 400 seconds running ['firefox-media-tests', '--binary', 'c:\\jenkins\\workspace\\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\\build\\application\\firefox\\firefox.exe', '--symbols-path', 'c:\\jenkins\\workspace\\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\\build\\symbols', '--urls', 'firefox_media_tests/urls/default.ini', './firefox_media_tests/playback/youtube/manifest.ini', '--e10s', '--gecko-log', 'c:\\jenkins\\workspace\\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\\logs\\gecko.log', '--log-tbpl', '-', '--log-html', 'c:\\jenkins\\workspace\\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\\logs\\media_tests.html', '--log-mach', 'c:\\jenkins\\workspace\\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\\logs\\media_tests_mach.log']
2015-08-16 15:24:49 ERROR - timed out after 400 seconds of no output
2015-08-16 15:24:49 ERROR - Return code: 572
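For anyone unfamiliar with what "output timeout" means in the log above: the harness kills the child process when it has printed nothing for a set number of seconds (400 here), regardless of total run time. The following is only a rough sketch of that idea using the Python standard library. It is not mozprocess's or mozharness's actual implementation, and `run_with_output_timeout` is a made-up name for illustration.

```python
import queue
import subprocess
import sys
import threading


def run_with_output_timeout(cmd, output_timeout):
    """Run cmd and kill it if it prints nothing for output_timeout seconds.

    Hypothetical helper for illustration; not the mozprocess API.
    Returns (lines, timed_out).
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    q = queue.Queue()

    def reader():
        # Forward each output line to the main thread as it arrives.
        for line in proc.stdout:
            q.put(line)
        q.put(None)  # sentinel: the child closed its output

    threading.Thread(target=reader, daemon=True).start()

    lines = []
    timed_out = False
    while True:
        try:
            # The timer restarts on every line, so only silence triggers it.
            line = q.get(timeout=output_timeout)
        except queue.Empty:
            timed_out = True
            proc.kill()
            break
        if line is None:
            break
        lines.append(line.rstrip("\n"))
    proc.wait()
    return lines, timed_out


if __name__ == "__main__":
    out, hung = run_with_output_timeout(
        [sys.executable, "-c", "print('hi')"], output_timeout=5)
    print(out, hung)
```

In the failing job, the last line before the silence is TEST-START, so the timer expired while the first test was still running.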
Blocks: e10s-tests
tracking-e10s: --- → +
Summary: firefox-media-tests hang (output timeout) in automation on Windows builders only, e10s only → firefox-media-tests hang (output timeout) in automation on Windows builders only, e10s only, Jenkins only
I've only been able to reproduce the hang locally by running the tests in my local Jenkins instance, which uses a Win 7 VM as a builder. Could the issue be due to an interaction between Jenkins and mozilla-build? When I run the tests manually (without Jenkins), with or without mozharness, the hang does not occur.

The Jenkins build step ('Execute Windows batch command') is something like this:
> c:\mozilla-build\msys\bin\bash -xe -c "export PATH=$MOZBUILDPATH; python ./run_media_tests.py ..."
which, in turn, calls:
> firefox-media-tests --binary ... [and other options]
which essentially extends |BaseMarionetteTestRunner| with an extra command-line arg. [1]

[1] https://github.com/mjzffr/firefox-media-tests/blob/master/media_test_harness/runtests.py#L141
I repeatedly get this crash when I run a very simple test via my local Jenkins instance, as described in my previous comment.

The test:
> def test_nothing(self):
>     self.logger.info('hi')

Of note: when this crash happens, the FirefoxTestCase teardown method reports leaked window handles.
> ERROR - TEST-UNEXPECTED-ERROR | test_basic_playback.py TestBasicYouTubePlayback.test_nothing | AssertionError: A test must not leak window handles. This test started the browser with 1 open top level browsing contexts, but ended with 2.
I should add that when the crash occurs, the browser is displaying the 'new tab' page.
Flags: needinfo?(dburns)
Moreover, if I run the Jenkins job on a Windows builder where the Jenkins slave is launched as a JNLP agent, all tests pass and the job completes successfully. In other words, I've only been able to reproduce the hang (the output timeout) in Jenkins jobs running in 'headless' mode. Notes: Normally there's a 'Jenkins Slave' service running on our Windows builders, which means the jobs run in headless mode (no 'desktop interaction', using Jenkins terminology).
The big advantage of running the Jenkins slave as a service is that it will survive a reboot. However, mozmill-ci does not do this, and I am not wedded to it. When you run it as a service, there is a fake Windows session somewhere that Jenkins jobs get dumped into. We could look through the Jenkins bug database to see if anybody else has experienced problems. Or we could just not run it as a service.
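One way to check from inside a job which Windows session it landed in: on Windows Vista and later, services run in session 0, which is isolated from the interactive desktop. The sketch below uses the Win32 `ProcessIdToSessionId` call via ctypes. The function names are made up for illustration, and whether session 0 is what actually triggers the hang is an assumption that this bug does not confirm.

```python
import ctypes
import os
import sys


def current_session_id():
    """Return the Windows session ID of this process, or None off Windows."""
    if sys.platform != "win32":
        return None
    session_id = ctypes.c_ulong()  # DWORD out-parameter
    ok = ctypes.windll.kernel32.ProcessIdToSessionId(
        os.getpid(), ctypes.byref(session_id))
    if not ok:
        raise ctypes.WinError()
    return session_id.value


def looks_like_service_session(session_id):
    """Session 0 hosts services (Vista and later), with no interactive desktop."""
    return session_id == 0


if __name__ == "__main__":
    sid = current_session_id()
    if sid is None:
        print("not running on Windows")
    else:
        print("session id:", sid,
              "(service session)" if looks_like_service_session(sid) else "")
```

Printing this at the start of the build step, once under the service and once under a JNLP-launched slave, would show whether the two setups really run in different sessions.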
Priority: -- → P1
Assignee: nobody → spolk
None of the Windows slaves run the Jenkins slave software as a service any more; they are all launched via JNLP. This should no longer be a problem.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Flags: needinfo?(dburns)
Product: Testing → Testing Graveyard