Closed Bug 1391545 Opened 7 years ago Closed 2 years ago

[meta] Intermittent Mn/MnH Automation Error: mozprocess timed out after 1000 seconds running - DON'T USE FOR CLASSIFICATION

Categories

(Remote Protocol :: Marionette, defect, P2)

defect

Tracking

(firefox58 wontfix, firefox59 wontfix, firefox60 wontfix, firefox61 wontfix, firefox63 wontfix, firefox64 wontfix, firefox65 affected)

RESOLVED WORKSFORME
mozilla65
Tracking Status
firefox58 --- wontfix
firefox59 --- wontfix
firefox60 --- wontfix
firefox61 --- wontfix
firefox63 --- wontfix
firefox64 --- wontfix
firefox65 --- affected

People

(Reporter: intermittent-bug-filer, Unassigned)

References

(Depends on 3 open bugs)

Details

(Keywords: meta, Whiteboard: [stockwell unknown])

Attachments

(2 obsolete files)

> 21:45:56     INFO - Application command: /Users/cltbld/tasks/task_1503031473/build/application/NightlyDebug.app/Contents/MacOS/firefox -no-remote -marionette -foreground -profile /var/folders/0h/x57jhf4s5fz98zd_969qlqy000000w/T/tmpn1sJKn.mozrunner
[..]
> 21:46:01     INFO -  1503031561842	Marionette	DEBUG	Received observer notification "sessionstore-windows-restored"
[..]
> 21:58:30     INFO -  TEST-START: test_quit_restart.py TestQuitRestart.test_in_app_restart_safe_mode
[..]
> 21:58:35     INFO -  1503032315505	Marionette	INFO	Enabled via MOZ_MARIONETTE
> 21:58:35     INFO -  1503032315505	Marionette	DEBUG	Received observer notification "profile-after-change"
> 21:58:35     INFO -  1503032315672	Marionette	DEBUG	Received observer notification "command-line-startup"
> 21:58:35     INFO -  1503032315789	Marionette	DEBUG	Received observer notification "domwindowopened"
> 21:58:35     INFO -  1503032315965	Marionette	DEBUG	Safe Mode detected. Going to suspress the dialog now.
> 21:58:37     INFO -  1503032317824	Marionette	DEBUG	Received observer notification "sessionstore-windows-restored"
> 21:58:42     INFO -  1503032322853	Marionette	INFO	Listening on port 2828
> 22:15:39     INFO - Automation Error: mozprocess timed out after 1000 seconds running ['/Users/cltbld/tasks/task_1503031473/build/venv/bin/python', '-u', '/Users

Firefox correctly restarts and Marionette initializes itself. Then the client doesn't connect anymore. It looks like the underlying issue here is also bug 1362293.

But what I wonder is why we are getting killed given that we should not wait longer than 360s for the connection.
Depends on: 1362293
we have 30 failures in the last week:
https://brasstacks.mozilla.com/orangefactor/index.html?display=Bug&bugid=1391545

these are all on osx-debug, :whimboo is this something you are looking at?
Flags: needinfo?(hskupin)
Whiteboard: [stockwell needswork]
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #3)
> these are all on osx-debug, :whimboo is this something you are looking at?

It looks like that this has the same underlying issue as bug 1362293. Not sure why it raised that much lately for OS X.

At least this specific issue started on Aug 18th. I landed the fix for bug 1254136 on Aug 17th. Not sure if that has an influence here. The marionette session correctly gets started in Firefox.

I never actually had such a hang locally, and I'm always running debug builds.
Flags: needinfo?(hskupin)
Whiteboard: [stockwell needswork] → [needs bug 1362293 fixed][stockwell needswork]
This looks to be caused by the safe browsing mode. Maybe related to bug 1397612.
Depends on: 1397612
disabled until the failure rate is under control:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ab66407266ff9e3b58be4f66d03ecb1f81976459
Keywords: leave-open
Whiteboard: [needs bug 1362293 fixed][stockwell disable-recommended] → [needs bug 1362293 fixed][stockwell disabled]
Attached patch Reenable Mn jobs (obsolete) — Splinter Review
Attached patch Disabling test (obsolete) — Splinter Review
Comment on attachment 8909465 [details] [diff] [review]
Disabling test

Talked to David on IRC and this patch should have been added to bug 1397612.
Attachment #8909465 - Attachment is obsolete: true
Comment on attachment 8909464 [details] [diff] [review]
Reenable Mn jobs

We will just revert ab66407266ff9e3b58be4f66d03ecb1f81976459 once the skip patch got landed.
Attachment #8909464 - Attachment is obsolete: true
Whiteboard: [needs bug 1362293 fixed][stockwell disabled] → [needs bug 1362293 fixed][stockwell needswork]
(In reply to OrangeFactor Robot from comment #19)
> ** This test has failed more than 200 times in the last 30 days. It should
> be disabled until it can be fixed. ** 

Right, but that happened before we disabled the affected test over on bug 1397612. Until now no other test failure has been seen. So lets see how the Windows failures are behaving today.
Whiteboard: [needs bug 1362293 fixed][stockwell disable-recommended] → [needs bug 1362293 fixed]
Whiteboard: [needs bug 1362293 fixed] → [needs bug 1397612 fixed]
First, there is nothing we can disable, and second the number of failures dropped significantly. Not sure why OF proposed a disable here, maybe it doesn't take the failures from yesterday into account and only checked the weekly ones.
Whiteboard: [needs bug 1397612 fixed][stockwell disable-recommended] → [needs bug 1397612 fixed]
this is a bot that looks at 30 days of history and if there are >200 failures it will recommend disabling.
20 failures in the last 7 days:
https://brasstacks.mozilla.com/orangefactor/index.html?display=Bug&bugid=1391545

hopefully the disabled bot doesn't flag this for recommended disabling again.
Whiteboard: [needs bug 1397612 fixed] → [needs bug 1397612 fixed][stockwell unknown]
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #23)
> this is a bot that looks at 30 days of history and if there are >200
> failures it will recommend disabling.

I know but as we have seen it could be smarter and not only look at the weekly stats but also for eg the last day to see if there was a massive drop of failures like in this case. I feel we have to keep the `disable-recommended` entry for now.
a chance this could still come up on the radar for disable-recommended, but we are under the 200 threshold and I suspect will remain there.
Whiteboard: [needs bug 1397612 fixed][stockwell disable-recommended] → [needs bug 1397612 fixed][stockwell needswork]
This failure seems to have the same underlying issue as what I'm trying to fix on bug 1410366.
Depends on: 1410366
This failure is still present with the patch on bug 1410366 landed.
No longer depends on: 1410366
This failure occurred 47 times in the last 7 days on Windows7 and OS X 10.10 on opt, pgo, debug build types.
Failing tests: opt-marionette-headless-e10, opt-marionette-e10s.

Link to a recent log: https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=143996083&lineNumber=35456

Part of the log:
05:55:00     INFO -  Unable to read VR Path Registry from /Users/cltbld/Library/Application Support/OpenVR/.openvr/openvrpaths.vrpath
35454
05:55:00     INFO -  2017-11-11 05:55:00.313 plugin-container[1796:13786] *** CFMessagePort: bootstrap_register(): failed 1100 (0x44c) 'Permission denied', port = 0x994b, name = 'com.apple.tsm.portname'
35455
05:55:00     INFO -  See /usr/include/servers/bootstrap_defs.h for the error codes.
35456
06:11:40     INFO - Automation Error: mozprocess timed out after 1000 seconds running ['/Users/cltbld/tasks/task_1510407443/build/venv/bin/python', '-u', '/Users/cltbld/tasks/task_1510407443/build/tests/marionette/harness/marionette_harness/runtests.py', '--gecko-log=-', '--log-raw=-', '-vv', '--log-raw=/Users/cltbld/tasks/task_1510407443/build/blobber_upload_dir/marionette_raw.log', '--log-errorsummary=/Users/cltbld/tasks/task_1510407443/build/blobber_upload_dir/marionette_errorsummary.log', '--log-html=/Users/cltbld/tasks/task_1510407443/build/blobber_upload_dir/report.html', '--binary=/Users/cltbld/tasks/task_1510407443/build/application/Nightly.app/Contents/MacOS/firefox', '--address=localhost:2828', '--symbols-path=https://queue.taskcluster.net/v1/task/FZl6_UhnSAeSg9dF1ztL9w/artifacts/public/build/target.crashreporter-symbols.zip', '/Users/cltbld/tasks/task_1510407443/build/tests/marionette/tests/testing/marionette/harness/marionette_harness/tests/unit-tests.ini']
35457
06:11:40    ERROR - timed out after 1000 seconds of no output
35458
06:11:40    ERROR - Return code: -15
35459
06:11:40    ERROR - No suite end message was emitted by this harness.
35460
06:11:40     INFO - TinderboxPrint: marionette<br/>697/0/1
35461
06:11:40     INFO - gecko.log not found
35462
06:11:40     INFO - TinderboxPrint: marionette<br/>697/0/1
35463
06:11:40     INFO - Marionette exited with return code -15: FAILURE
35464
06:11:40    ERROR - # TBPL FAILURE #
35465
06:11:40  WARNING - setting return code to 2
35466
06:11:40     INFO - Running post-action listener: _package_coverage_data
35467
06:11:40     INFO - Running post-action listener: _resource_record_post_action
35468
06:11:40     INFO - [mozharness: 2017-11-11 14:11:40.392585Z] Finished run-tests step (success)
35469
06:11:40     INFO - Running post-run listener: _resource_record_post_run
35470
06:11:40     INFO - Total resource usage - Wall time: 1349s; CPU: 41.0%; Read bytes: 38518272; Write bytes: 802604032; Read time: 269; Write time: 3094
35471
06:11:40     INFO - TinderboxPrint: CPU usage<br/>40.5%
35472
06:11:40     INFO - TinderboxPrint: I/O read bytes / time<br/>38,518,272 / 269
35473
06:11:40     INFO - TinderboxPrint: I/O write bytes / time<br/>802,604,032 / 3,094
35474
06:11:40     INFO - TinderboxPrint: CPU idle<br/>3,207.0 (59.5%)
35475
06:11:40     INFO - TinderboxPrint: CPU system<br/>1,240.2 (23.0%)
35476
06:11:40     INFO - TinderboxPrint: CPU user<br/>946.6 (17.6%)
35477
06:11:40     INFO - TinderboxPrint: Swap in / out<br/>281,784,320 / 0
35478
06:11:40     INFO - install - Wall time: 23s; CPU: 53.0%; Read bytes: 229271552; Write bytes: 225699328; Read time: 16559; Write time: 459
35479
06:11:40     INFO - run-tests - Wall time: 1326s; CPU: 40.0%; Read bytes: 36765184; Write bytes: 558607872; Read time: 250; Write time: 2596
35480
06:11:40     INFO - Running post-run listener: _upload_blobber_files
35481
06:11:40  WARNING - Blob upload gear skipped. Missing cmdline options.
35482
06:11:40     INFO - Running post-run listener: copy_logs_to_upload_dir
35483
06:11:40     INFO - Copying logs to upload dir...
35484
06:11:40     INFO - mkdir: /Users/cltbld/tasks/task_1510407443/build/upload/logs
35485
06:11:40     INFO - Copying logs to upload dir...
35486
06:11:40  WARNING - returning nonzero exit status 2

:whimboo Bug 1362293 that was blocking this one got fixed. Do you have any insights why this bug is continuing to fail?
Flags: needinfo?(hskupin)
The problem here is related to test_quit_restart.py TestQuitRestart.test_in_app_restart_safe_mode. It could be a side-effect of bug 1397612, but I'm not sure why we timeout after 1000s.

Maybe it's related to the racy initial Marionette connection, as observed on bug 1400819, which I haven't had the time to look into yet.

I will request a backout of the unskip patch on bug 1397612.
Flags: needinfo?(hskupin)
This problem is no longer present since the backout from bug 1397612 has been done. Marking as fixed.
Assignee: nobody → hskupin
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
Whiteboard: [needs bug 1397612 fixed][stockwell unknown] → [needs bug 1397612 fixed][stockwell fixed:backout]
Status: RESOLVED → REOPENED
Depends on: 1421289
Resolution: FIXED → ---
This failure occurred 32 times in the last 7 days on Windows7 and OS X 10.10 on opt, pgo, debug build types, Linux debug, Asan, Linux x64 pgo, linux64-nightly opt and linux64-jsdcov opt.
Failing tests: opt-marionette-headless-e10, opt-marionette-e10s, opt-web-platform-tests-wdspec-e10s, opt-mochitest-devtools-chrome-e10s-3.

Link to a recent log: https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=149209586&lineNumber=1238

23:34:14     INFO -  TEST-START | test_cli_arguments.py TestCommandLineArguments.test_startup_timeout
23:34:14     INFO -  1512171254534	Marionette	TRACE	3 <- [1,3,null,{}]
23:34:14     INFO -  1512171254536	Marionette	DEBUG	Closed connection 3
23:34:14     INFO -  1512171254538	Marionette	DEBUG	Accepted connection 4 from 127.0.0.1:49308
23:34:14     INFO -  1512171254539	Marionette	TRACE	4 -> [0,1,"newSession",{}]
23:34:14     INFO -  1512171254541	Marionette	DEBUG	Register listener.js for window 4294967297
23:34:14     INFO -  1512171254544	Marionette	TRACE	4 <- [1,1,null,{"sessionId":"13723b39-41a8-41b5-a744-0a451b23c0a4","capabilities":{"browserName":"firefox","browserVersion":"59.0a ... ocessID":756,"moz:profile":"C:\\Users\\GenericWorker\\AppData\\Local\\Temp\\tmpclotkm.mozrunner","moz:webdriverClick":true}}]
23:34:14     INFO -  1512171254545	Marionette	TRACE	4 -> [0,2,"setTimeouts",{"script":30000}]
23:34:14     INFO -  1512171254546	Marionette	TRACE	4 <- [1,2,null,{}]
23:34:14     INFO -  1512171254550	Marionette	TRACE	4 -> [0,3,"setTimeouts",{"pageLoad":300000}]
23:34:14     INFO -  1512171254550	Marionette	TRACE	4 <- [1,3,null,{}]
23:34:14     INFO -  1512171254552	Marionette	TRACE	4 -> [0,4,"setTimeouts",{"implicit":0}]
23:34:14     INFO -  1512171254552	Marionette	TRACE	4 <- [1,4,null,{}]
23:34:14     INFO -  1512171254555	Marionette	TRACE	4 -> [0,5,"executeScript",{"scriptTimeout":null,"newSandbox":true,"args":[],"filename":"testcases.py","script":"dump('TEST-START: test_cli_arguments.py TestCommandLineArguments.test_startup_timeout\\n')","sandbox":"simpletest","line":316}]
23:34:14     INFO -  TEST-START: test_cli_arguments.py TestCommandLineArguments.test_startup_timeout
23:34:14     INFO -  1512171254558	Marionette	TRACE	4 <- [1,5,null,{"value":null}]
23:34:14     INFO -  1512171254560	Marionette	TRACE	4 -> [0,6,"deleteSession",{}]
23:34:14     INFO -  1512171254560	Marionette	TRACE	4 <- [1,6,null,{}]
23:34:14     INFO -  Application command: Z:\task_1512170584\build\application\firefox\firefox.exe -no-remote -marionette -profile c:\users\genericworker\appdata\local\temp\tmpclotkm.mozrunner
23:34:14     INFO -  1512171254887	Marionette	DEBUG	Received observer notification "profile-after-change"
23:34:14     INFO -  1512171254926	Marionette	DEBUG	Received observer notification "command-line-startup"
23:34:14     INFO -  1512171254926	Marionette	INFO	Enabled via --marionette
23:50:54     INFO - Automation Error: mozprocess timed out after 1000 seconds running ['Z:\\task_1512170584\\build\\venv\\Scripts\\python', '-u', 'Z:\\task_1512170584\\build\\tests\\marionette\\harness\\marionette_harness\\runtests.py', '--gecko-log=-', '-vv', '--log-raw=Z:\\task_1512170584\\build\\blobber_upload_dir\\marionette_raw.log', '--log-errorsummary=Z:\\task_1512170584\\build\\blobber_upload_dir\\marionette_errorsummary.log', '--log-html=Z:\\task_1512170584\\build\\blobber_upload_dir\\report.html', '--binary=Z:\\task_1512170584\\build\\application\\firefox\\firefox.exe', '--address=localhost:2828', '--symbols-path=https://queue.taskcluster.net/v1/task/RUEweznASSmCBIo52bwfGA/artifacts/public/build/target.crashreporter-symbols.zip', 'Z:\\task_1512170584\\build\\tests\\marionette\\tests\\testing\\marionette\\harness\\marionette_harness\\tests\\unit-tests.ini']
23:50:54    ERROR - timed out after 1000 seconds of no output
23:50:54    ERROR - Return code: 572
23:50:54     INFO - TinderboxPrint: marionette<br/><em class="testfail">T-FAIL</em>
23:50:54     INFO - gecko.log not found
23:50:54     INFO - TinderboxPrint: marionette<br/><em class="testfail">T-FAIL</em>
23:50:54     INFO - Marionette exited with return code 572: FAILURE
23:50:54    ERROR - # TBPL FAILURE #
23:50:54  WARNING - setting return code to 2
23:50:54     INFO - Running post-action listener: _package_coverage_data
23:50:54     INFO - Running post-action listener: _resource_record_post_action
23:50:54     INFO - [mozharness: 2017-12-01 23:50:54.970000Z] Finished run-tests step (success)
23:50:54     INFO - Running post-run listener: _resource_record_post_run
23:50:55     INFO - Total resource usage - Wall time: 1016s; CPU: 2.0%; Read bytes: 99255296; Write bytes: 205289472; Read time: 235710; Write time: 452520
23:50:55     INFO - TinderboxPrint: CPU usage<br/>1.8%
23:50:55     INFO - TinderboxPrint: I/O read bytes / time<br/>99,255,296 / 235,710
23:50:55     INFO - TinderboxPrint: I/O write bytes / time<br/>205,289,472 / 452,520
23:50:55     INFO - TinderboxPrint: CPU idle<br/>7,981.8 (98.2%)
23:50:55     INFO - TinderboxPrint: CPU user<br/>82.9 (1.0%)
23:50:55     INFO - install - Wall time: 3s; CPU: 9.0%; Read bytes: 0; Write bytes: 82336256; Read time: 0; Write time: 68780
23:50:55     INFO - run-tests - Wall time: 1014s; CPU: 2.0%; Read bytes: 99212800; Write bytes: 53468672; Read time: 235470; Write time: 344410
23:50:55     INFO - Running post-run listener: _upload_blobber_files
23:50:55  WARNING - Blob upload gear skipped. Missing cmdline options.
23:50:55     INFO - Running post-run listener: copy_logs_to_upload_dir
23:50:55     INFO - Copying logs to upload dir...
23:50:55     INFO - mkdir: Z:\task_1512170584\build\upload\logs
23:50:55     INFO - Copying logs to upload dir...
23:50:55     INFO - Using _rmtree_windows ...
23:50:55     INFO - Running command: del /F /Q "Z:\task_1512170584\build\upload\logs\localconfig.json"
23:50:55     INFO - Return code: 0
23:50:55     INFO - Using _rmtree_windows ...
23:50:55     INFO - Running command: del /F /Q "Z:\task_1512170584\build\upload\logs\log_info.log"
23:50:55     INFO - Return code: 0
23:50:55     INFO - Using _rmtree_windows ...
23:50:55     INFO - Running command: del /F /Q "Z:\task_1512170584\build\upload\logs\log_raw.log"
23:50:55     INFO - Return code: 0
23:50:55     INFO - Using _rmtree_windows ...
23:50:55     INFO - Running command: del /F /Q "Z:\task_1512170584\build\upload\logs\log_warning.log"
23:50:55     INFO - Return code: 0
23:50:55     INFO - Using _rmtree_windows ...
23:50:55     INFO - Running command: del /F /Q "Z:\task_1512170584\build\upload\logs\log_critical.log"
23:50:55     INFO - Return code: 0
23:50:55     INFO - Using _rmtree_windows ...
23:50:55     INFO - Running command: del /F /Q "Z:\task_1512170584\build\upload\logs\log_error.log"
23:50:55     INFO - Return code: 0
23:50:55     INFO - Using _rmtree_windows ...
23:50:55     INFO - Running command: del /F /Q "Z:\task_1512170584\build\upload\logs\log_fatal.log"
23:50:55     INFO - Return code: 0
23:50:55  WARNING - returning nonzero exit status 2
[taskcluster 2017-12-01T23:50:55.492Z]   Exit Code: 2
[taskcluster 2017-12-01T23:50:55.493Z]   User Time: 0s
[taskcluster 2017-12-01T23:50:55.493Z] Kernel Time: 15.6001ms
[taskcluster 2017-12-01T23:50:55.493Z]   Wall Time: 19m33.733s
[taskcluster 2017-12-01T23:50:55.493Z] Peak Memory: 2277376
[taskcluster 2017-12-01T23:50:55.493Z]      Result: FAILED
[taskcluster 2017-12-01T23:50:55.493Z] === Task Finished ===
This bug has 35 failures in the last 7 days on Windows7 and OS X 10.10 on opt, pgo, debug build types, Linux debug, Asan, Linux x64 pgo, linux64-nightly opt and linux64-jsdcov opt.
Failing tests: opt-marionette-headless-e10, opt-marionette-e10s, opt-web-platform-tests-wdspec-e10s, opt-mochitest-devtools-chrome-e10s-3.

Recent failure log: https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=150989000&lineNumber=1215


[task 2017-12-09T09:59:38.398Z] 09:59:38     INFO - TEST-START | /webdriver/tests/actions/key_shortcuts.py
[task 2017-12-09T10:16:18.461Z] 10:16:18     INFO - Automation Error: mozprocess timed out after 1000 seconds running ['/builds/worker/workspace/build/venv/bin/python', '-u', '/builds/worker/workspace/build/tests/web-platform/runtests.py', '--log-raw=-', '--log-raw=/builds/worker/workspace/build/blobber_upload_dir/wpt_raw.log', '--log-wptreport=/builds/worker/workspace/build/blobber_upload_dir/wptreport.json', '--log-errorsummary=/builds/worker/workspace/build/blobber_upload_dir/wpt_errorsummary.log', '--binary=/builds/worker/workspace/build/application/firefox/firefox', '--symbols-path=https://queue.taskcluster.net/v1/task/cIWuCQYUSPOYx3D8rumtKg/artifacts/public/build/target.crashreporter-symbols.zip', '--stackwalk-binary=/usr/local/bin/linux64-minidump_stackwalk', '--stackfix-dir=/builds/worker/workspace/build/tests/bin', '--run-by-dir=3', '--test-type=wdspec', '--stylo-threads=4', '--webdriver-binary=/builds/worker/workspace/build/tests/bin/geckodriver', '--prefs-root=/builds/worker/workspace/build/tests/web-platform/prefs', '--processes=1', '--config=/builds/worker/workspace/build/tests/web-platform/wptrunner.ini', '--ca-cert-path=/builds/worker/workspace/build/tests/web-platform/certs/cacert.pem', '--host-key-path=/builds/worker/workspace/build/tests/web-platform/certs/web-platform.test.key', '--host-cert-path=/builds/worker/workspace/build/tests/web-platform/certs/web-platform.test.pem', '--certutil-binary=/builds/worker/workspace/build/tests/bin/certutil']
[task 2017-12-09T10:16:18.508Z] 10:16:18    ERROR - timed out after 1000 seconds of no output
[task 2017-12-09T10:16:18.508Z] 10:16:18    ERROR - Return code: -15
[task 2017-12-09T10:16:18.509Z] 10:16:18    ERROR - No suite end message was emitted by this harness.
[task 2017-12-09T10:16:18.509Z] 10:16:18    ERROR - # TBPL FAILURE #
[task 2017-12-09T10:16:18.510Z] 10:16:18  WARNING - setting return code to 2
[task 2017-12-09T10:16:18.510Z] 10:16:18     INFO - Running post-action listener: _package_coverage_data
[task 2017-12-09T10:16:18.511Z] 10:16:18     INFO - Running post-action listener: _resource_record_post_action
[task 2017-12-09T10:16:18.511Z] 10:16:18     INFO - [mozharness: 2017-12-09 10:16:18.511410Z] Finished run-tests step (success)
[task 2017-12-09T10:16:18.512Z] 10:16:18     INFO - Running post-run listener: _resource_record_post_run
[task 2017-12-09T10:16:18.620Z] 10:16:18     INFO - Total resource usage - Wall time: 1031s; CPU: 1.0%; Read bytes: 0; Write bytes: 491880448; Read time: 0; Write time: 21344
[task 2017-12-09T10:16:18.621Z] 10:16:18     INFO - TinderboxPrint: CPU usage<br/>0.8%
[task 2017-12-09T10:16:18.621Z] 10:16:18     INFO - TinderboxPrint: I/O read bytes / time<br/>0 / 0
[task 2017-12-09T10:16:18.621Z] 10:16:18     INFO - TinderboxPrint: I/O write bytes / time<br/>491,880,448 / 21,344
[task 2017-12-09T10:16:18.622Z] 10:16:18     INFO - TinderboxPrint: CPU idle<br/>4,074.3 (99.2%)
[task 2017-12-09T10:16:18.622Z] 10:16:18     INFO - TinderboxPrint: Swap in / out<br/>0 / 0
[task 2017-12-09T10:16:18.624Z] 10:16:18     INFO - pull - Wall time: 0s; CPU: Can't collect data; Read bytes: 0; Write bytes: 0; Read time: 0; Write time: 0
[task 2017-12-09T10:16:18.626Z] 10:16:18     INFO - install - Wall time: 25s; CPU: 25.0%; Read bytes: 0; Write bytes: 19259392; Read time: 0; Write time: 2056
[task 2017-12-09T10:16:18.643Z] 10:16:18     INFO - run-tests - Wall time: 1006s; CPU: 0.0%; Read bytes: 0; Write bytes: 472621056; Read time: 0; Write time: 19288
[task 2017-12-09T10:16:19.112Z] 10:16:19     INFO - Running post-run listener: _upload_blobber_files
[task 2017-12-09T10:16:19.112Z] 10:16:19  WARNING - Blob upload gear skipped. Missing cmdline options.
[task 2017-12-09T10:16:19.113Z] 10:16:19     INFO - Running post-run listener: copy_logs_to_upload_dir
[task 2017-12-09T10:16:19.113Z] 10:16:19     INFO - Copying logs to upload dir...
[task 2017-12-09T10:16:19.114Z] 10:16:19     INFO - mkdir: /builds/worker/workspace/build/upload/logs
[task 2017-12-09T10:16:19.116Z] 10:16:19     INFO - Copying logs to upload dir...
[task 2017-12-09T10:16:19.119Z] 10:16:19  WARNING - returning nonzero exit status 2
[task 2017-12-09T10:16:19.154Z] cleanup
[task 2017-12-09T10:16:19.154Z] + cleanup
[task 2017-12-09T10:16:19.155Z] + local rv=2
[task 2017-12-09T10:16:19.155Z] + [[ -s /builds/worker/.xsession-errors ]]
[task 2017-12-09T10:16:19.155Z] + cp /builds/worker/.xsession-errors /builds/worker/artifacts/public/xsession-errors.log
[task 2017-12-09T10:16:19.157Z] + true
[task 2017-12-09T10:16:19.157Z] + cleanup_xvfb
[task 2017-12-09T10:16:19.157Z] pidof Xvfb
[task 2017-12-09T10:16:19.157Z] ++ pidof Xvfb
[task 2017-12-09T10:16:19.161Z] + local xvfb_pid=24
[task 2017-12-09T10:16:19.161Z] + local vnc=false
[task 2017-12-09T10:16:19.161Z] + local interactive=false
[task 2017-12-09T10:16:19.161Z] + '[' -n 24 ']'
[task 2017-12-09T10:16:19.161Z] + [[ false == false ]]
[task 2017-12-09T10:16:19.162Z] + [[ false == false ]]
[task 2017-12-09T10:16:19.162Z] + kill 24
[task 2017-12-09T10:16:19.162Z] + screen -XS xvfb quit
[task 2017-12-09T10:16:19.164Z] XIO:  fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
[task 2017-12-09T10:16:19.164Z]       after 784 requests (784 known processed) with 1 events remaining.
[task 2017-12-09T10:16:19.164Z] compizconfig - Info: Backend     : ini
[task 2017-12-09T10:16:19.164Z] compizconfig - Info: Integration : true
[task 2017-12-09T10:16:19.164Z] compizconfig - Info: Profile     : default
[task 2017-12-09T10:16:19.224Z] No screen session found.
[task 2017-12-09T10:16:19.225Z] + true
[task 2017-12-09T10:16:19.225Z] + exit 2
[taskcluster 2017-12-09 10:16:19.450Z] === Task Finished ===
[taskcluster 2017-12-09 10:16:24.588Z] Unsuccessful task run with exit code: 2 completed in 1132.231 seconds
Whiteboard: [needs bug 1397612 fixed][stockwell fixed:backout] → [stockwell needswork]
Flags: needinfo?(hskupin)
(In reply to Andreea Pavel [:apavel] from comment #46)
> Failing tests: opt-marionette-headless-e10, opt-marionette-e10s,
> opt-web-platform-tests-wdspec-e10s, opt-mochitest-devtools-chrome-e10s-3.

Can you please explain why everything especially outside Marionette jobs is getting starred for this bug? This is wrong, given that individual bugs exist for that. This bug should only be about Marionette but not Mochitest or web-platform-tests.
Flags: needinfo?(hskupin) → needinfo?(apavel)
Hi :whimboo I just took the data from orange factor, latest log, then followed the triaging specifications and added the tag. 
Indeed there are different bugs filled for each one, we will take a closer look when staring.
Flags: needinfo?(apavel)
There are 56 failure in the last 7 days.
They occur on linux32, Linux x64, linux64-jsdcov, linux64-nightly, macosx64-nightly, OS X 10.10, wINDOWS 7 AND windows7-32-nightly.
The affected build types are debug, opt, pgo.
Here is a recent log example: https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-central&job_id=153464023&lineNumber=1195

and a relevant part of the log:
23:30:05     INFO -  TEST-START | testing\marionette\harness\marionette_harness\tests\unit\test_cli_arguments.py TestCommandLineArguments.test_startup_timeout
1178
23:30:05     INFO -  1514417405294	Marionette	DEBUG	Closed connection 2
1179
23:30:05     INFO -  1514417405297	Marionette	DEBUG	Accepted connection 3 from 127.0.0.1:49329
1180
23:30:05     INFO -  1514417405299	Marionette	TRACE	3 -> [0,1,"newSession",{}]
1181
23:30:05     INFO -  1514417405301	Marionette	DEBUG	Register listener.js for window 4294967297
1182
23:30:05     INFO -  1514417405305	Marionette	TRACE	3 <- [1,1,null,{"sessionId":"c22c206b-5f58-45de-8a08-4448616a247b","capabilities":{"browserName":"firefox","browserVersion":"59.0a ... cessID":3980,"moz:profile":"C:\\Users\\GenericWorker\\AppData\\Local\\Temp\\tmpo9fhl9.mozrunner","moz:webdriverClick":true}}]
1183
23:30:05     INFO -  1514417405307	Marionette	TRACE	3 -> [0,2,"setTimeouts",{"script":30000}]
1184
23:30:05     INFO -  1514417405308	Marionette	TRACE	3 <- [1,2,null,{}]
1185
23:30:05     INFO -  1514417405309	Marionette	TRACE	3 -> [0,3,"setTimeouts",{"pageLoad":300000}]
1186
23:30:05     INFO -  1514417405309	Marionette	TRACE	3 <- [1,3,null,{}]
1187
23:30:05     INFO -  1514417405316	Marionette	TRACE	3 -> [0,4,"setTimeouts",{"implicit":0}]
1188
23:30:05     INFO -  1514417405317	Marionette	TRACE	3 <- [1,4,null,{}]
1189
23:30:05     INFO -  1514417405319	Marionette	TRACE	3 -> [0,5,"deleteSession",{}]
1190
23:30:05     INFO -  1514417405320	Marionette	TRACE	3 <- [1,5,null,{}]
1191
23:30:05     INFO -  Application command: Z:\task_1514415800\build\application\firefox\firefox.exe -no-remote -marionette -profile c:\users\genericworker\appdata\local\temp\tmpo9fhl9.mozrunner
1192
23:30:05     INFO -  1514417405639	Marionette	DEBUG	Received observer notification "profile-after-change"
1193
23:30:05     INFO -  1514417405676	Marionette	DEBUG	Received observer notification "command-line-startup"
1194
23:30:05     INFO -  1514417405676	Marionette	INFO	Enabled via --marionette
1195
23:46:45     INFO - Automation Error: mozprocess timed out after 1000 seconds running ['Z:\\task_1514415800\\build\\venv\\Scripts\\python', '-u', 'Z:\\task_1514415800\\build\\tests\\marionette\\harness\\marionette_harness\\runtests.py', '--gecko-log=-', '-vv', '--log-raw=Z:\\task_1514415800\\build\\blobber_upload_dir\\marionette_raw.log', '--log-errorsummary=Z:\\task_1514415800\\build\\blobber_upload_dir\\marionette_errorsummary.log', '--log-html=Z:\\task_1514415800\\build\\blobber_upload_dir\\report.html', '--binary=Z:\\task_1514415800\\build\\application\\firefox\\firefox.exe', '--address=localhost:2828', '--symbols-path=https://queue.taskcluster.net/v1/task/NJ0SIQseRCe7rSmtzNn4zw/artifacts/public/build/target.crashreporter-symbols.zip', 'Z:\\task_1514415800\\build\\tests\\marionette\\tests\\testing\\marionette\\harness\\marionette_harness\\tests\\unit-tests.ini']
1196
23:46:45    ERROR - timed out after 1000 seconds of no output
1197
23:46:45    ERROR - Return code: 572
1198
23:46:45     INFO - TinderboxPrint: marionette<br/><em class="testfail">T-FAIL</em>
1199
23:46:45     INFO - gecko.log not found
1200
23:46:45     INFO - TinderboxPrint: marionette<br/><em class="testfail">T-FAIL</em>
1201
23:46:45     INFO - Marionette exited with return code 572: FAILURE
1202
23:46:45    ERROR - # TBPL FAILURE #
Flags: needinfo?(hskupin)
(In reply to OrangeFactor Robot from comment #59)
> https://brasstacks.mozilla.com/orangefactor/
> ?display=Bug&bugid=1391545&startday=2018-01-01&endday=2018-01-07&tree=all

Most of those starred failures are not related to Marionette itself but web-platform tests, and failures similar to bug 1407383.

Chris, can we please make sure to only star mozprocess related failures for this bug if those come from any of the Mn jobs? Thanks.
Flags: needinfo?(hskupin) → needinfo?(coop)
(In reply to Henrik Skupin (:whimboo) from comment #60) 
> Most of those starred failures are not related to Marionette itself but
> web-platform tests, and failures similar to bug 1407383.
> 
> Chris, can we please make sure to only star mozprocess related failures for
> this bug if those come from any of the Mn jobs? Thanks.

When you say "most" that implies that some actually are legitimate Marionette failures. Is that the case?

I'll talk to sheriffs about root cause determination, but the sheriffs are making their decisions based on log output. Is there anything we can add to the log output for Marionette failures in the timeout case that makes it clear that a root cause is elsewhere?
Flags: needinfo?(coop)
(In reply to Chris Cooper [:coop] from comment #61)
> When you say "most" that implies that some actually are legitimate
> Marionette failures. Is that the case?

Yes, but only on the platforms Windows 7 and 10. As OF indicates everything else is web-platform-tests, or mochitest.

> I'll talk to sheriffs about root cause determination, but the sheriffs are
> making their decisions based on log output. Is there anything we can add to
> the log output for Marionette failures in the timeout case that makes it
> clear that a root cause is elsewhere?

The timeout doesn't come from Marionette but mozprocess. You can say that Marionette uses mozprocess, which is true, but basically it could be totally different situations.
This bug failed 44 times in the last 7 days on Windows7, OSX10, Linux platforms, affecting debug, pt, pgo build types.
Failing test: opt-marionette-e10, opt-browser-screenshots-e10, opt-marionette-headless, opt-mochitest.

Link to a recent log:https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-central&job_id=156416690&lineNumber=2684

Part of the log:

[task 2018-01-15T18:35:15.918Z] 18:35:15     INFO - TEST-OK | dom/events/test/pointerevents/test_bug1403055.html | took 722ms
2682
[task 2018-01-15T18:35:15.926Z] 18:35:15     INFO - TEST-START | dom/events/test/pointerevents/test_bug1414336.html
2683
[task 2018-01-15T18:35:16.022Z] 18:35:16     INFO - GECKO(2095) | Flushing APZ repaints was a no-op, triggering callback directly...
2684
[task 2018-01-15T18:51:56.052Z] 18:51:56     INFO - Automation Error: mozprocess timed out after 1000 seconds running ['/builds/worker/workspace/build/venv/bin/python', '-u', '/builds/worker/workspace/build/tests/mochitest/runtests.py', '--disable-e10s', '--total-chunks', '10', '--this-chunk', '3', '--jscov-dir-prefix=/builds/worker/workspace/build/blobber_upload_dir', '--appname=/builds/worker/workspace/build/application/firefox/firefox', '--utility-path=tests/bin', '--extra-profile-file=tests/bin/plugins', '--symbols-path=https://queue.taskcluster.net/v1/task/Uaqdta8aRnm_xM7KAZ2qHQ/artifacts/public/build/target.crashreporter-symbols.zip', '--certificate-path=tests/certs', '--setpref=webgl.force-enabled=true', '--quiet', '--log-raw=/builds/worker/workspace/build/blobber_upload_dir/plain-chunked-coverage_raw.log', '--log-errorsummary=/builds/worker/workspace/build/blobber_upload_dir/plain-chunked-coverage_errorsummary.log', '--use-test-media-devices', '--screenshot-on-fail', '--cleanup-crashes', '--marionette-startup-timeout=180', '--sandbox-read-whitelist=/builds/worker/workspace/build', '--log-raw=-', '--chunk-by-dir=4', '--timeout=1200']
2685
[task 2018-01-15T18:51:56.070Z] 18:51:56    ERROR - timed out after 1000 seconds of no output
2686
[task 2018-01-15T18:51:56.070Z] 18:51:56    ERROR - Return code: -15
2687
[task 2018-01-15T18:51:56.070Z] 18:51:56    ERROR - No suite end message was emitted by this harness.
2688
[task 2018-01-15T18:51:56.071Z] 18:51:56     INFO - TinderboxPrint: mochitest-plain-chunked-coverage<br/>105/0/1
2689
[task 2018-01-15T18:51:56.071Z] 18:51:56    ERROR - # TBPL FAILURE #
2690
[task 2018-01-15T18:51:56.072Z] 18:51:56  WARNING - setting return code to 2
2691
[task 2018-01-15T18:51:56.072Z] 18:51:56    ERROR - The mochitest suite: plain-chunked-coverage ran with return status: FAILUR

This bug depends on: - https://bugzilla.mozilla.org/show_bug.cgi?id=1397612 ni :whimboo
                     - https://bugzilla.mozilla.org/show_bug.cgi?id=1421289 ni :whimboo & :gbrown

Could be similar to: - https://bugzilla.mozilla.org/show_bug.cgi?id=1407383
Maybe logging of Marionette client could further help here. I started to work on bug 1394381 but was not able yet to finish it due to other important work.

(In reply to Arthur Iakab [arthur_iakab] from comment #65)
> This bug failed 44 times in the last 7 days on Windows7, OSX10, Linux
> platforms, affecting debug, pt, pgo build types.
> Failing test: opt-marionette-e10, opt-browser-screenshots-e10,
> opt-marionette-headless, opt-mochitest.

As said above, other test types like web-platform-tests and mochitests including browser-screenshots might not directly be related and really should be starred against a different bug until we are clear that this really has the same underlying issue.
Depends on: 1394381
For the Marionette unit tests it is interesting that those only fail on Windows and then always for the following test:

> TEST-START | testing\marionette\harness\marionette_harness\tests\unit\test_cli_arguments.py TestCommandLineArguments.test_startup_timeout

I will file a new bug and temporarily skip this test for Windows only.
There are 36 failures present over the last 7 days. Most of them happen on Windows 7, Linux, Linux x64, linux64-jsdcov, OS X 10.10 and windows7-32-nightly.

Here is the most recent log example: https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=158218289&lineNumber=1208

Here is a snippet from the aforementioned log: 

[task 2018-01-24T15:48:17.977Z] 15:48:17     INFO - Automation Error: mozprocess timed out after 1000 seconds running ['/builds/worker/workspace/build/venv/bin/python', '-u', '/builds/worker/workspace/build/tests/web-platform/runtests.py', '--log-raw=-', '--log-raw=/builds/worker/workspace/build/blobber_upload_dir/wpt_raw.log', '--log-wptreport=/builds/worker/workspace/build/blobber_upload_dir/wptreport.json', '--log-errorsummary=/builds/worker/workspace/build/blobber_upload_dir/wpt_errorsummary.log', '--binary=/builds/worker/workspace/build/application/firefox/firefox', '--symbols-path=https://queue.taskcluster.net/v1/task/Y3HNzpk0Qp-Ialvh9_Bakw/artifacts/public/build/target.crashreporter-symbols.zip', '--stackwalk-binary=/usr/local/bin/linux64-minidump_stackwalk', '--stackfix-dir=/builds/worker/workspace/build/tests/bin', '--run-by-dir=3', '--no-pause-after-test', '--test-type=wdspec', '--stylo-threads=4', '--webdriver-binary=/builds/worker/workspace/build/tests/bin/geckodriver', '--prefs-root=/builds/worker/workspace/build/tests/web-platform/prefs', '--processes=1', '--config=/builds/worker/workspace/build/tests/web-platform/wptrunner.ini', '--ca-cert-path=/builds/worker/workspace/build/tests/web-platform/certs/cacert.pem', '--host-key-path=/builds/worker/workspace/build/tests/web-platform/certs/web-platform.test.key', '--host-cert-path=/builds/worker/workspace/build/tests/web-platform/certs/web-platform.test.pem', '--certutil-binary=/builds/worker/workspace/build/tests/bin/certutil']
[task 2018-01-24T15:48:18.012Z] 15:48:18    ERROR - timed out after 1000 seconds of no output
[task 2018-01-24T15:48:18.012Z] 15:48:18    ERROR - Return code: -15
[task 2018-01-24T15:48:18.013Z] 15:48:18    ERROR - No suite end message was emitted by this harness.
[task 2018-01-24T15:48:18.013Z] 15:48:18    ERROR - # TBPL FAILURE #
Similar to other bugs I'm going to update the summary to make it clear this is for the Mn(H) test jobs only.
Summary: Intermittent Marionette timed out after 1000 seconds of no output → Intermittent Mn/MnH Automation Error: mozprocess timed out after 1000 seconds running ['Z:\\task_1517059027\\build\\venv\\Scripts\\python', '-u', 'Z:\\task_1517059027\\build\\tests\\marionette\\harness\\marionette_harness\\runtests.py']
Summary: Intermittent Mn/MnH Automation Error: mozprocess timed out after 1000 seconds running ['Z:\\task_1517059027\\build\\venv\\Scripts\\python', '-u', 'Z:\\task_1517059027\\build\\tests\\marionette\\harness\\marionette_harness\\runtests.py'] → Intermittent marionette timed out after 1000 seconds of no output
In the last 7 days we have 30 failures.
Thy occur on Windows 7 and the affected builds type are opt and pgo.
Recent log example: https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=168723810&lineNumber=1389
All the failure here seem to happen for `test_cli_arguments.py TestCommandLineArguments.test_startup_timeout`. I will have a look maybe later today to see if I can reproduce it locally.
Priority: P5 → P2
Oh, we actually got this logged as bug 1430717.
I disabled the unit test which caused this hang. Lets observe if that removes any remaining failures for the Mn jobs.
Status: REOPENED → NEW
Assignee: hskupin → nobody
Keywords: meta
no failures since March 21st, :whimboo are these failures being annotated against a different bug or set of bugs now?
Flags: needinfo?(hskupin)
The error message of those formerly starred failures is: 

https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=1aeb1fc7be3371280a5d6f93394d6df0d8db0ad2&selectedJob=169339731

> Automation Error: mozprocess timed out after 1000 seconds running

It doesn't match what we have here for this bug. Also for other test jobs we include the job name. As such we should update the bug summary, so that we can be sure failures are getting starred correctly again.
Flags: needinfo?(hskupin)
Summary: Intermittent marionette timed out after 1000 seconds of no output → Intermittent Mn Automation Error: mozprocess timed out after 1000 seconds running
changing the title didn't help here- no failures in the last 30 days
Whiteboard: [stockwell disable-recommended] → [stockwell unknown]
Because the safe mode test is still disabled and we wait for bug 1433873 being fixed for dependent bugs.
Summary: Intermittent Mn Automation Error: mozprocess timed out after 1000 seconds running → Intermittent Mn/MnH Automation Error: mozprocess timed out after 1000 seconds running
(In reply to Intermittent Failures Robot from comment #97)
> This is the #41 most frequent failure this week.
[..]
> For more details, see:
> https://treeherder.mozilla.org/intermittent-failures.html#/
> bugdetails?bug=1391545&startday=2018-10-22&endday=2018-10-28&tree=all

There are a couple of wrongly classified test failures, but majorly this is Marionette test jobs on Windows. All failures seem to be related to the unskipped test on bug 1397612. I will check if I can reproduce it later today.
Flags: needinfo?(hskupin)
All the latest failures happen due to TestCommandLineArguments.test_startup_timeout. I will push a patch on bug 1416747 which should make this test more stable. Lets see how this changes the occurrence of this problem.
Depends on: 1416747
Flags: needinfo?(hskupin)
Flags: needinfo?(hskupin)
With the fix for bug 1416747 the failures seem to have stopped. I will re-check with next weeks OF comments.
Flags: needinfo?(hskupin)
There was only one failure tracked by OF which actually was wrongly classified. As such we don't currently see any other failure. I will re-check in a week if we can close this bug.
Flags: needinfo?(hskupin)
It looks still all fine. As such we can close this bug.
Assignee: nobody → hskupin
Status: NEW → RESOLVED
Closed: 7 years ago6 years ago
Flags: needinfo?(hskupin)
Keywords: leave-open
Resolution: --- → FIXED
Target Milestone: mozilla59 → mozilla65
Summary: Intermittent Mn/MnH Automation Error: mozprocess timed out after 1000 seconds running → [meta] Intermittent Mn/MnH Automation Error: mozprocess timed out after 1000 seconds running
Version: Version 3 → unspecified
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Why did you reopen this bug? I don't see a new bug filed which would explain it.
Flags: needinfo?(aryx.bugmail)
(In reply to Sebastian Hengst [:aryx] (needinfo on intermittent or backout) from comment #105)
> So do you want a new bug for that and if yes, shall that block any other one?

I see. Maybe we should just leave this meta bug open forever then. Please always classify against this bug.

The current failures are most likely happening because of bug 1508726. When Marionette runs tests a Firefox update is downloaded because the updater currently doesn't care of prefs Marionette previously set to disallow that. Then a restart upgrades Firefox and mozprocess loses the connection to the new and updated instance. This should only happen on Windows actually due to bug 1493796.
Depends on: 1508726, 1493796
Assignee: hskupin → nobody
Status: REOPENED → NEW
Note that all the OS X related classified failures are not for Marionette but other test jobs like web-platform-tests, and mochitests. Can we please make sure to get those classified correctly in the next week?
Flags: needinfo?(aryx.bugmail)
All the other test failures are actually on Windows and seem to be related to the shutdown of Firefox. Whether the process doesn't stop when calling `restart` or `quit`, or mozprocess just hangs. Several test jobs also show a MOZ_CRASH sadly without any further information:

https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=219075099&repo=mozilla-inbound&lineNumber=47165

> 11:12:18     INFO -  Hit MOZ_CRASH(Aborting on channel error.) at z:/build/build/src/ipc/glue/MessageChannel.cpp:2527

Kan-Ru, did you hear about any recent regression or a change in that area which could have caused this shutdown problem on Windows since early December?
Flags: needinfo?(kanru)
Changed to bug 1517206 for the Wdspec failure and bug 1516519 for the functional UI one.
Flags: needinfo?(aryx.bugmail)

The recent failures are all on Win7 and show a channel error. Lets get this investigated and fixed on bug 1521447.

No longer depends on: 1521447
See Also: → 1516519
Flags: needinfo?(kanru)
Whiteboard: [stockwell unknown] → [stockwell needsork:owner]

This has failed 31 times in the last 7 days, all on win7 32. Here's a recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=227439568&repo=mozilla-central&lineNumber=41316

Last test start is test_profile_management.py TestQuitRestartWithoutWorkspace.test_restart_clean_creates_new_profile

22:59:37 INFO - TEST-START | testing\marionette\harness\marionette_harness\tests\unit\test_profile_management.py TestQuitRestartWithoutWorkspace.test_restart_clean_creates_new_profile
22:59:37 INFO - 1549753177737 Marionette DEBUG 2 -> [0,2,"WebDriver:DeleteSession",{}]
22:59:37 INFO - 1549753177740 Marionette DEBUG 2 <- [1,2,null,{"value":null}]
22:59:37 INFO - 1549753177743 Marionette DEBUG Closed connection 2
22:59:37 INFO - 1549753177745 Marionette DEBUG Accepted connection 3 from 127.0.0.1:51049
22:59:37 INFO - 1549753177767 Marionette DEBUG 3 -> [0,1,"WebDriver:NewSession",{"strictFileInteractability":true}]
22:59:37 INFO - 1549753177770 Marionette TRACE [2147483649] Frame script loaded
22:59:37 INFO - 1549753177790 Marionette TRACE [2147483649] Frame script registered
22:59:37 INFO - 1549753177805 Marionette DEBUG 3 <- [1,1,null,{"sessionId":"5796e7d8-f3e5-4a76-8d99-1f8f741b8a2a","capabilities":{"browserName":"firefox","browserVersion":"67.0a ... p\tmpqvkfwl.mozrunner","moz:shutdownTimeout":60000,"moz:useNonSpecCompliantPointerOrigin":false,"moz:webdriverClick":true}}]
22:59:37 INFO - 1549753177828 Marionette DEBUG 3 -> [0,2,"WebDriver:SetTimeouts",{"script":30000}]
22:59:37 INFO - 1549753177830 Marionette DEBUG 3 <- [1,2,null,{"value":null}]
22:59:37 INFO - 1549753177832 Marionette DEBUG 3 -> [0,3,"WebDriver:SetTimeouts",{"pageLoad":300000}]
22:59:37 INFO - 1549753177833 Marionette DEBUG 3 <- [1,3,null,{"value":null}]
22:59:37 INFO - 1549753177837 Marionette DEBUG 3 -> [0,4,"WebDriver:SetTimeouts",{"implicit":0}]
22:59:37 INFO - 1549753177839 Marionette DEBUG 3 <- [1,4,null,{"value":null}]
22:59:37 INFO - 1549753177841 Marionette DEBUG 3 -> [0,5,"Marionette:GetContext",{}]
22:59:37 INFO - 1549753177842 Marionette DEBUG 3 <- [1,5,null,{"value":"content"}]
22:59:37 INFO - 1549753177844 Marionette DEBUG 3 -> [0,6,"WebDriver:DeleteSession",{}]
22:59:37 INFO - 1549753177847 Marionette DEBUG 3 <- [1,6,null,{"value":null}]
22:59:37 INFO - Exiting due to channel error.
23:16:17 INFO - Automation Error: mozprocess timed out after 1000 seconds running ['Z:\task_1549750250\build\venv\Scripts\python', '-u', 'Z:\task_1549750250\build\tests\marionette\harness\marionette_harness\runtests.py', '--gecko-log=-', '-vv', '--log-raw=Z:\task_1549750250\build\blobber_upload_dir\marionette_raw.log', '--log-errorsummary=Z:\task_1549750250\build\blobber_upload_dir\marionette_errorsummary.log', '--log-html=Z:\task_1549750250\build\blobber_upload_dir\report.html', '--binary=Z:\task_1549750250\build\application\firefox\firefox.exe', '--address=localhost:2828', '--symbols-path=Z:\task_1549750250\build\symbols', 'Z:\task_1549750250\build\tests\marionette\tests\testing\marionette\harness\marionette_harness\tests\unit-tests.ini']
23:16:17 ERROR - timed out after 1000 seconds of no output
23:16:17 ERROR - Return code: 572
23:16:17 INFO - TinderboxPrint: marionette<br/><em class="testfail">T-FAIL</em>
23:16:17 INFO - gecko.log not found
23:16:17 INFO - TinderboxPrint: marionette<br/><em class="testfail">T-FAIL</em>
23:16:17 INFO - Marionette exited with return code 572: FAILURE
23:16:17 ERROR - # TBPL FAILURE #
23:16:17 WARNING - setting return code to 2

Whiteboard: [stockwell needsork:owner] → [stockwell needswork:owner]
Depends on: 1526880

(In reply to Cosmin Sabou [:CosminS] from comment #122)

This has failed 31 times in the last 7 days, all on win7 32. Here's a recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=227439568&repo=mozilla-central&lineNumber=41316

I filed this as bug 1526880 now, given that it is very specific for Windows 7 32bit.

No longer depends on: 1521447
Summary: [meta] Intermittent Mn/MnH Automation Error: mozprocess timed out after 1000 seconds running → [meta] Intermittent Mn/MnH Automation Error: mozprocess timed out after 1000 seconds running - DON'T USE FOR CLASSIFICATION

We haven't seen any failure in all the last months so I'm going to close this bug as WFM.

Status: NEW → RESOLVED
Closed: 6 years ago2 years ago
Resolution: --- → WORKSFORME
Product: Testing → Remote Protocol
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: