firefox-media-tests hang (output timeout) in automation on Windows builders only, e10s only, Jenkins only

RESOLVED FIXED

Status

P1
normal
3 years ago
9 months ago

People

(Reporter: maja_zf, Assigned: sydpolk)

Tracking

(Blocks: 1 bug)

Details

Attachments

(2 attachments)

We have jobs running on Mac OS X 10.10, Windows 7, and Windows 8.1. All of the Windows jobs consistently hit an output timeout mid-test when e10s is on; the Mac jobs do not. I haven't been able to reproduce this outside of automation.

2015-08-16 15:17:53     INFO - Copy/paste: firefox-media-tests --binary c:\jenkins\workspace\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\build\application\firefox\firefox.exe --symbols-path c:\jenkins\workspace\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\build\symbols --urls firefox_media_tests/urls/default.ini ./firefox_media_tests/playback/youtube/manifest.ini --e10s --gecko-log c:\jenkins\workspace\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\logs\gecko.log --log-tbpl - --log-html c:\jenkins\workspace\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\logs\media_tests.html --log-mach c:\jenkins\workspace\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\logs\media_tests_mach.log
2015-08-16 15:17:53     INFO - Calling ['firefox-media-tests', '--binary', 'c:\\jenkins\\workspace\\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\\build\\application\\firefox\\firefox.exe', '--symbols-path', 'c:\\jenkins\\workspace\\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\\build\\symbols', '--urls', 'firefox_media_tests/urls/default.ini', './firefox_media_tests/playback/youtube/manifest.ini', '--e10s', '--gecko-log', 'c:\\jenkins\\workspace\\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\\logs\\gecko.log', '--log-tbpl', '-', '--log-html', 'c:\\jenkins\\workspace\\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\\logs\\media_tests.html', '--log-mach', 'c:\\jenkins\\workspace\\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\\logs\\media_tests_mach.log'] with output_timeout 400
2015-08-16 15:18:09     INFO -  starting httpd
2015-08-16 15:18:09     INFO -  running httpd on http://127.0.0.1:58768/
2015-08-16 15:18:09     INFO -  mozversion application_buildid: 20150816091433
2015-08-16 15:18:09     INFO -  mozversion application_changeset: 0876695d1abdeb363a780bda8b6cc84f20ba51c9
2015-08-16 15:18:09     INFO -  mozversion application_display_name: Nightly
2015-08-16 15:18:09     INFO -  mozversion application_id: {ec8030f7-c20a-464f-9b0e-13a3a9e97384}
2015-08-16 15:18:09     INFO -  mozversion application_name: Firefox
2015-08-16 15:18:09     INFO -  mozversion application_remotingname: firefox
2015-08-16 15:18:09     INFO -  mozversion application_repository: https://hg.mozilla.org/mozilla-central
2015-08-16 15:18:09     INFO -  mozversion application_vendor: Mozilla
2015-08-16 15:18:09     INFO -  mozversion application_version: 43.0a1
2015-08-16 15:18:09     INFO -  mozversion platform_buildid: 20150816091433
2015-08-16 15:18:09     INFO -  mozversion platform_changeset: 0876695d1abdeb363a780bda8b6cc84f20ba51c9
2015-08-16 15:18:09     INFO -  mozversion platform_repository: https://hg.mozilla.org/mozilla-central
2015-08-16 15:18:09     INFO -  mozversion platform_version: 43.0a1
2015-08-16 15:18:09     INFO -  SUITE-START | Running 1 tests
2015-08-16 15:18:09     INFO -  TEST-START | test_basic_playback.py TestBasicYouTubePlayback.test_mse_is_enabled_by_default
2015-08-16 15:24:49     INFO - Automation Error: mozprocess timed out after 400 seconds running ['firefox-media-tests', '--binary', 'c:\\jenkins\\workspace\\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\\build\\application\\firefox\\firefox.exe', '--symbols-path', 'c:\\jenkins\\workspace\\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\\build\\symbols', '--urls', 'firefox_media_tests/urls/default.ini', './firefox_media_tests/playback/youtube/manifest.ini', '--e10s', '--gecko-log', 'c:\\jenkins\\workspace\\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\\logs\\gecko.log', '--log-tbpl', '-', '--log-html', 'c:\\jenkins\\workspace\\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\\logs\\media_tests.html', '--log-mach', 'c:\\jenkins\\workspace\\mn-mse-youtube-basic-y-e10s-nightly-win7_32_64\\logs\\media_tests_mach.log']
2015-08-16 15:24:49    ERROR - timed out after 400 seconds of no output
2015-08-16 15:24:49    ERROR - Return code: 572
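
For readers unfamiliar with the "output_timeout 400" in the log above: the harness kills the child process when it prints nothing for that many seconds, regardless of total runtime. The following is a minimal sketch of that idea using only the Python standard library. It is not mozprocess's or mozharness's actual implementation, and the function name `run_with_output_timeout` is hypothetical.

```python
import subprocess
import sys
import threading
import time


def run_with_output_timeout(cmd, output_timeout):
    """Run cmd and kill it if it prints nothing for output_timeout seconds.

    Returns (return_code, timed_out, lines). Illustrative only; this is
    an assumption about the general technique, not mozprocess code.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    last_output = [time.monotonic()]  # list so the reader thread can update it
    lines = []

    def reader():
        # Every line of output resets the "no output" clock.
        for line in proc.stdout:
            last_output[0] = time.monotonic()
            lines.append(line.rstrip("\n"))

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    timed_out = False
    while proc.poll() is None:
        if time.monotonic() - last_output[0] > output_timeout:
            timed_out = True
            proc.kill()
            break
        time.sleep(0.05)

    proc.wait()
    thread.join(timeout=5)
    return proc.returncode, timed_out, lines


if __name__ == "__main__":
    # A child that prints once and exits well inside the timeout.
    print(run_with_output_timeout([sys.executable, "-c", "print('hi')"], 5))
```

With this model, a test that starts (TEST-START is logged) and then blocks without logging anything would trip the timeout exactly as seen at 15:24:49, 400 seconds after the last line of output.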

Updated

3 years ago
Blocks: 984139
tracking-e10s: --- → +
(Reporter)

Updated

3 years ago
Summary: firefox-media-tests hang (output timeout) in automation on Windows builders only, e10s only → firefox-media-tests hang (output timeout) in automation on Windows builders only, e10s only, Jenkins only
(Reporter)

Comment 1

3 years ago
I've only been able to reproduce the hang locally by running the tests in my local Jenkins instance, which uses a Win 7 VM as a builder. The issue might be due to an interaction between Jenkins and mozilla-build? When I run the tests manually (without Jenkins), with or without mozharness, the hang does not occur.

The Jenkins build step ('Execute Windows batch command') is something like this: 
> c:\mozilla-build\msys\bin\bash -xe -c "export PATH=$MOZBUILDPATH; python ./run_media_tests.py ..."

which, in turn, calls:
> firefox-media-tests --binary ... [and other options]

which essentially extends |BaseMarionetteTestRunner| with an extra command-line arg. [1]

[1] https://github.com/mjzffr/firefox-media-tests/blob/master/media_test_harness/runtests.py#L141
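
As a rough illustration of "extends the base runner with an extra command-line arg", here is a sketch using plain argparse. `BaseRunnerArguments` is a hypothetical stand-in and not the Marionette API; only the option names (`--binary`, `--e10s`, `--urls`) are taken from the log in the bug description.

```python
import argparse


class BaseRunnerArguments(argparse.ArgumentParser):
    """Hypothetical stand-in for a base harness parser (not Marionette's)."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.add_argument("tests", nargs="*", help="test files or manifests")
        self.add_argument("--binary", help="path to the Firefox binary")
        self.add_argument("--e10s", action="store_true", help="enable e10s")


class MediaTestArguments(BaseRunnerArguments):
    """Same options as the base parser, plus one media-specific option."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.add_argument("--urls", help="ini file listing the video URLs")


if __name__ == "__main__":
    args = MediaTestArguments().parse_args(
        ["--binary", "firefox.exe", "--urls", "default.ini", "--e10s", "manifest.ini"]
    )
    print(args)
```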
(Reporter)

Comment 2

3 years ago
Created attachment 8661386 [details]
PROCESS-CRASH | runner.py | application crashed [@ InitLayersAccelerationPrefs()]

I repeatedly get this crash when I run a very simple test via my local Jenkins instance, as described in my previous comment.

The test:
>    def test_nothing(self):
>        self.logger.info('hi')

Of note: when this crash happens, the FirefoxTestCase teardown method reports leaked window handles.

> ERROR -  TEST-UNEXPECTED-ERROR | test_basic_playback.py TestBasicYouTubePlayback.test_nothing | AssertionError: A test must not leak
> window handles. This test started the browser with 1 open top level browsing contexts, but ended with 2.
(Reporter)

Comment 3

3 years ago
I should add that when the crash occurs, the browser is displaying the 'new tab' page.
(Reporter)

Comment 4

3 years ago
Created attachment 8661392 [details]
Screenshot associated with 'extra window handles' failure
(Reporter)

Updated

3 years ago
Flags: needinfo?(dburns)
(Reporter)

Comment 5

3 years ago
Moreover, if I run the Jenkins job on a Windows builder where the Jenkins slave is launched as a JNLP agent, all tests pass and the job completes successfully. In other words, I've only been able to reproduce the hang (the output timeout) in Jenkins jobs running in 'headless' mode.

Note: Normally there's a 'Jenkins Slave' service running on our Windows builders, which means the jobs run in headless mode (no 'desktop interaction', in Jenkins terminology).
(Assignee)

Comment 6

3 years ago
The big advantage of running the Jenkins slave as a service is that it will survive a reboot. However, mozmill-ci does not do this, and I am not wedded to it.

When you run it as a service, there is a fake Windows session somewhere that Jenkins jobs get dumped into. We could look through the Jenkins bug database to see if anybody else has experienced problems. Or we could just not run it as a service.
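
One way to check which kind of session a job landed in is to ask Windows for the session ID of the current process. The sketch below assumes that the "fake Windows session" is the non-interactive session 0 that Windows Vista and later use for services; this bug does not confirm that. `ProcessIdToSessionId` is a real kernel32 function, while the helper names are hypothetical.

```python
import ctypes
import os
import sys


def current_session_id():
    """Return this process's Windows session ID, or None when not on Windows."""
    if sys.platform != "win32":
        return None
    session_id = ctypes.c_ulong(0)  # DWORD out-parameter
    ok = ctypes.windll.kernel32.ProcessIdToSessionId(
        os.getpid(), ctypes.byref(session_id)
    )
    if not ok:
        raise ctypes.WinError()
    return session_id.value


def is_service_session(session_id):
    """Assumption: session 0 is the isolated, non-interactive service session."""
    return session_id == 0


if __name__ == "__main__":
    sid = current_session_id()
    print("session id:", sid, "| service session:", is_service_session(sid))
```

Logging this at the start of a job would show whether a hanging run and a passing JNLP-launched run differ in session, which is cheaper than searching the Jenkins bug database first.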
(Reporter)

Updated

3 years ago
Priority: -- → P1
(Reporter)

Updated

3 years ago
Assignee: nobody → spolk
(Assignee)

Comment 7

3 years ago
None of the Windows slaves run the Jenkins slave software as a service anymore; they are all launched via JNLP. This should no longer be a problem.
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
(Reporter)

Updated

3 years ago
Flags: needinfo?(dburns)

Updated

9 months ago
Product: Testing → Testing Graveyard