Closed Bug 1826982 Opened 1 year ago Closed 9 months ago

Intermittent OSError: Servers failed to start:

Categories

(Testing :: web-platform-tests, defect, P5)

defect

Tracking

(firefox119 fixed)

RESOLVED FIXED
119 Branch
Tracking Status
firefox119 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: jgraham)

References

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell disable-recommended], [wptsync upstream])

Attachments

(2 files)

Filed by: sstanca [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=411703647&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/JjRfYwi_S3GIML7rBnF3FQ/runs/0/artifacts/public/logs/live_backing.log


[task 2023-04-07T15:32:23.567Z] 15:32:23     INFO - wptserve Starting http2 server on https://127.0.0.1:9000
[task 2023-04-07T15:32:25.577Z] 15:32:25     INFO - wptserve Starting WebTransport over HTTP/3 server on 127.0.0.1:11000
[task 2023-04-07T15:32:31.688Z] 15:32:31     INFO - wptserve Stopped WebTransport over HTTP/3 server on 127.0.0.1:11000
[task 2023-04-07T15:32:32.079Z] 15:32:32     INFO - wptserve Close on: (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('127.0.0.1', 8889))
[task 2023-04-07T15:32:32.079Z] 15:32:32     INFO - wptserve Stopped http server on 127.0.0.1:8446
[task 2023-04-07T15:32:32.079Z] 15:32:32     INFO - wptserve Close on: (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('127.0.0.1', 8888))
[task 2023-04-07T15:32:32.079Z] 15:32:32     INFO - wptserve Stopped http server on 127.0.0.1:8001
[task 2023-04-07T15:32:32.079Z] 15:32:32     INFO - wptserve Stopped http server on 127.0.0.1:8443
[task 2023-04-07T15:32:32.079Z] 15:32:32     INFO - wptserve Stopped http server on 127.0.0.1:9000
[task 2023-04-07T15:32:32.079Z] 15:32:32     INFO - wptserve Stopped http server on 127.0.0.1:8445
[task 2023-04-07T15:32:32.079Z] 15:32:32     INFO - wptserve Stopped http server on 127.0.0.1:8000
[task 2023-04-07T15:32:32.079Z] 15:32:32     INFO - wptserve Stopped http server on 127.0.0.1:8444
[task 2023-04-07T15:32:32.079Z] 15:32:32     INFO - wptserve Stopped http server on 127.0.0.1:8002
[task 2023-04-07T15:32:32.095Z] 15:32:32     INFO - wptserve Stopped http server on 127.0.0.1:8003
[task 2023-04-07T15:32:32.158Z] 15:32:32     INFO - Removed font: Ahem.ttf
[task 2023-04-07T15:32:32.236Z] 15:32:32     INFO - Closing logging queue
[task 2023-04-07T15:32:32.250Z] 15:32:32     INFO - queue closed
[task 2023-04-07T15:32:32.283Z] 15:32:32     INFO - Test harness output was not a valid structured log message
[task 2023-04-07T15:32:32.288Z] 15:32:32     INFO - Traceback (most recent call last):
[task 2023-04-07T15:32:32.288Z] 15:32:32     INFO -   File "Z:\task_168087658109457\build\tests\web-platform\runtests.py", line 16, in <module>
[task 2023-04-07T15:32:32.289Z] 15:32:32     INFO -     rv = wptrunner.main()
[task 2023-04-07T15:32:32.289Z] 15:32:32     INFO -   File "Z:\task_168087658109457\build\tests\web-platform\tests\tools\wptrunner\wptrunner\wptrunner.py", line 535, in main
[task 2023-04-07T15:32:32.289Z] 15:32:32     INFO -     return start(**kwargs)
[task 2023-04-07T15:32:32.289Z] 15:32:32     INFO -   File "Z:\task_168087658109457\build\tests\web-platform\tests\tools\wptrunner\wptrunner\wptrunner.py", line 518, in start
[task 2023-04-07T15:32:32.290Z] 15:32:32     INFO -     rv = not run_tests(**kwargs)[0] or logged_critical.has_log
[task 2023-04-07T15:32:32.291Z] 15:32:32     INFO -   File "Z:\task_168087658109457\build\tests\web-platform\tests\tools\wptrunner\wptrunner\wptrunner.py", line 419, in run_tests
[task 2023-04-07T15:32:32.291Z] 15:32:32     INFO -     test_environment.ensure_started()
[task 2023-04-07T15:32:32.291Z] 15:32:32     INFO -   File "Z:\task_168087658109457\build\tests\web-platform\tests\tools\wptrunner\wptrunner\environment.py", line 278, in ensure_started
[task 2023-04-07T15:32:32.291Z] 15:32:32     INFO -     raise OSError("Servers failed to start: %s" %
[task 2023-04-07T15:32:32.292Z] 15:32:32     INFO - OSError: Servers failed to start:
[task 2023-04-07T15:32:32.830Z] 15:32:32     INFO - Return code: 1
[task 2023-04-07T15:32:32.831Z] 15:32:32    ERROR - No suite end message was emitted by this harness.
[task 2023-04-07T15:32:32.831Z] 15:32:32    ERROR - No checks run.
[task 2023-04-07T15:32:32.831Z] 15:32:32  WARNING - setting return code to 2
[task 2023-04-07T15:32:32.831Z] 15:32:32     INFO - Running post-action listener: _package_coverage_data
[task 2023-04-07T15:32:32.831Z] 15:32:32     INFO - Running post-action listener: _resource_record_post_action
[task 2023-04-07T15:32:32.831Z] 15:32:32     INFO - Running post-action listener: process_java_coverage_data
[task 2023-04-07T15:32:32.831Z] 15:32:32     INFO - Running post-action listener: stop_device
[task 2023-04-07T15:32:32.831Z] 15:32:32     INFO - [mozharness: 2023-04-07 15:32:32.831709Z] Finished run-tests step (success)
[task 2023-04-07T15:32:32.831Z] 15:32:32     INFO - Running post-run listener: _resource_record_post_run
[task 2023-04-07T15:32:32.987Z] 15:32:32     INFO - Total resource usage - Wall time: 323s; CPU: 20%; Read bytes: 24172032; Write bytes: 340779520; Read time: 10; Write time: 851
[task 2023-04-07T15:32:32.987Z] 15:32:32     INFO - TinderboxPrint: CPU usage<br/>20.3%
[task 2023-04-07T15:32:32.987Z] 15:32:32     INFO - TinderboxPrint: I/O read bytes / time<br/>24,172,032 / 10
[task 2023-04-07T15:32:32.987Z] 15:32:32     INFO - TinderboxPrint: I/O write bytes / time<br/>340,779,520 / 851
[task 2023-04-07T15:32:32.987Z] 15:32:32     INFO - TinderboxPrint: CPU idle<br/>2,056.0 (79.7%)
[task 2023-04-07T15:32:32.987Z] 15:32:32     INFO - TinderboxPrint: CPU system<br/>192.2 (7.4%)
[task 2023-04-07T15:32:32.987Z] 15:32:32     INFO - TinderboxPrint: CPU user<br/>331.8 (12.9%)
[task 2023-04-07T15:32:32.987Z] 15:32:32     INFO - pull - Wall time: 0s; CPU: Can't collect data; Read bytes: 0; Write bytes: 0; Read time: 0; Write time: 0
[task 2023-04-07T15:32:32.987Z] 15:32:32     INFO - start-emulator - Wall time: 0s; CPU: Can't collect data; Read bytes: 0; Write bytes: 0; Read time: 0; Write time: 0
[task 2023-04-07T15:32:32.988Z] 15:32:32     INFO - verify-device - Wall time: 0s; CPU: Can't collect data; Read bytes: 0; Write bytes: 0; Read time: 0; Write time: 0
[task 2023-04-07T15:32:32.988Z] 15:32:32     INFO - install - Wall time: 9s; CPU: 23%; Read bytes: 3644416; Write bytes: 25493504; Read time: 4; Write time: 20
[task 2023-04-07T15:32:32.990Z] 15:32:32     INFO - run-tests - Wall time: 314s; CPU: 20%; Read bytes: 19573760; Write bytes: 311837184; Read time: 6; Write time: 828
[task 2023-04-07T15:32:33.050Z] 15:32:33  WARNING - returning nonzero exit status 2
[taskcluster 2023-04-07T15:32:33.128Z]    Exit Code: 2
[taskcluster 2023-04-07T15:32:33.128Z]    User Time: 0s
[taskcluster 2023-04-07T15:32:33.128Z]  Kernel Time: 15.625ms
[taskcluster 2023-04-07T15:32:33.128Z]    Wall Time: 15m30.897611s
[taskcluster 2023-04-07T15:32:33.128Z]       Result: FAILED
[taskcluster 2023-04-07T15:32:33.129Z] === Task Finished ===
[taskcluster 2023-04-07T15:32:33.129Z] Task Duration: 15m30.9111905s
[taskcluster 2023-04-07T15:32:33.351Z] Uploading artifact public/logs/localconfig.json from file logs\localconfig.json with content encoding "gzip", mime type "application/json" and expiry 2024-04-06T14:52:59.139Z
[taskcluster 2023-04-07T15:32:34.201Z] Uploading artifact public/test_info/resource-usage.json from file build\blobber_upload_dir\resource-usage.json with content encoding "gzip", mime type "application/json" and expiry 2024-04-06T14:52:59.139Z
[taskcluster 2023-04-07T15:32:35.486Z] Uploading artifact public/test_info/wpt_errorsummary.log from file build\blobber_upload_dir\wpt_errorsummary.log with content encoding "gzip", mime type "text/plain" and expiry 2024-04-06T14:52:59.139Z
[taskcluster 2023-04-07T15:32:35.838Z] Uploading artifact public/test_info/wpt_instruments.txt from file build\blobber_upload_dir\wpt_instruments.txt with content encoding "gzip", mime type "text/plain; charset=utf-8" and expiry 2024-04-06T14:52:59.139Z
[taskcluster 2023-04-07T15:32:36.189Z] Uploading artifact public/test_info/wpt_raw.log from file build\blobber_upload_dir\wpt_raw.log with content encoding "gzip", mime type "text/plain" and expiry 2024-04-06T14:52:59.139Z
[taskcluster 2023-04-07T15:32:36.535Z] Uploading artifact public/test_info/wptreport.json from file build\blobber_upload_dir\wptreport.json with content encoding "gzip", mime type "application/json" and expiry 2024-04-06T14:52:59.139Z
[taskcluster 2023-04-07T15:32:36.873Z] Uploading link artifact public/logs/live.log to artifact public/logs/live_backing.log with expiry 2024-04-06T14:52:59.139Z
[taskcluster:error] exit status 2
Summary: Intermittent [tier 2] OSError: Servers failed to start: → Intermittent OSError: Servers failed to start:

This is a frequent failure (100+ classifications last week): What can be done about this?

Flags: needinfo?(james)
Assignee: nobody → james
Status: NEW → ASSIGNED

Moving to Windows 11 seems to have increased the rate of intermittent failures here,
experiment with a longer timeout to see if it helps.

It looks like maybe Windows 11 is just really slow? I've added some patches that might help.

Flags: needinfo?(james)
Pushed by james@hoppipolla.co.uk:
https://hg.mozilla.org/integration/autoland/rev/dfb7ce61fa8a
Provide a better error message when servers fail to start, r=Sasha
https://hg.mozilla.org/integration/autoland/rev/a2a9e360d5cb
Increase server startup timeout, r=Sasha
Status: ASSIGNED → RESOLVED
Closed: 9 months ago
Resolution: --- → FIXED
Target Milestone: --- → 119 Branch
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/42153 for changes under testing/web-platform/tests
Whiteboard: [stockwell disable-recommended] → [stockwell disable-recommended], [wptsync upstream]
Upstream PR merged by moz-wptsync-bot
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: