Closed Bug 1435082 Opened 6 years ago Closed 5 years ago

Intermittent wpt Max restarts exceeded

Categories

(Testing :: web-platform-tests, defect, P5)

Version 3
defect

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: intermittent-bug-filer, Unassigned)

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell unknown])

Filed by: archaeopteryx [at] coole-files.de

https://treeherder.mozilla.org/logviewer.html#?job_id=159863705&repo=autoland

https://queue.taskcluster.net/v1/task/EIHqJFGkQ36HOmZYrho9Cw/runs/0/artifacts/public/logs/live_backing.log

[task 2018-02-01T15:30:40.526Z] 15:30:40     INFO - TEST-START | /mixed-content/link-prefetch-tag/no-opt-in/same-host-http/top-level/no-redirect/blockable/no-opt-in-blocks.https.html
[task 2018-02-01T15:30:40.526Z] 15:30:40     INFO - TEST-SKIP | /mixed-content/link-prefetch-tag/no-opt-in/same-host-http/top-level/no-redirect/blockable/no-opt-in-blocks.https.html | took 2ms
[task 2018-02-01T15:30:40.526Z] 15:30:40     INFO - Starting http server on web-platform.test:8001
[task 2018-02-01T15:30:40.527Z] 15:30:40     INFO - Starting http server on web-platform.test:8443
[task 2018-02-01T15:31:32.676Z] 15:31:32     INFO - Setting up ssl
[task 2018-02-01T15:32:43.656Z] 15:32:43     INFO - certutil | 
[task 2018-02-01T15:33:46.481Z] 15:33:46     INFO - certutil | 
[task 2018-02-01T15:33:46.497Z] 15:33:46     INFO - certutil | 
[task 2018-02-01T15:33:46.498Z] 15:33:46     INFO - Certificate Nickname                                         Trust Attributes
[task 2018-02-01T15:33:46.498Z] 15:33:46     INFO -                                                              SSL,S/MIME,JAR/XPI
[task 2018-02-01T15:33:46.501Z] 15:33:46     INFO - 
[task 2018-02-01T15:33:46.501Z] 15:33:46     INFO - web-platform-tests                                           CT,, 
[task 2018-02-01T15:33:46.501Z] 15:33:46     INFO - 
[task 2018-02-01T15:33:46.501Z] 15:33:46     INFO - Application command: /builds/worker/workspace/build/application/firefox/firefox --marionette about:blank -profile /tmp/tmpYZWEOh.mozrunner
[task 2018-02-01T15:33:46.507Z] 15:33:46     INFO - Starting runner
[task 2018-02-01T15:33:47.454Z] 15:33:47     INFO - Browser exited with return code -15
[task 2018-02-01T15:33:57.455Z] 15:33:57  WARNING - Forcibly terminating runner process
[task 2018-02-01T15:33:57.463Z] 15:33:57  WARNING - u'stop': ()
[task 2018-02-01T15:34:08.749Z] 15:34:08     INFO - Setting up ssl
[task 2018-02-01T15:34:19.782Z] 15:34:19     INFO - certutil | 
[task 2018-02-01T15:34:31.312Z] 15:34:31     INFO - certutil | 
[task 2018-02-01T15:34:31.327Z] 15:34:31     INFO - certutil | 
[task 2018-02-01T15:34:31.327Z] 15:34:31     INFO - Certificate Nickname                                         Trust Attributes
[task 2018-02-01T15:34:31.328Z] 15:34:31     INFO -                                                              SSL,S/MIME,JAR/XPI
[task 2018-02-01T15:34:31.328Z] 15:34:31     INFO - 
[task 2018-02-01T15:34:31.328Z] 15:34:31     INFO - web-platform-tests                                           CT,, 
[task 2018-02-01T15:34:31.328Z] 15:34:31     INFO - 
[task 2018-02-01T15:34:41.915Z] 15:34:41     INFO - Application command: /builds/worker/workspace/build/application/firefox/firefox --marionette about:blank -profile /tmp/tmpJQtvIL.mozrunner
[task 2018-02-01T15:34:41.931Z] 15:34:41     INFO - Starting runner
[task 2018-02-01T15:34:57.500Z] 15:34:57     INFO - Browser exited with return code -15
[task 2018-02-01T15:35:07.502Z] 15:35:07  WARNING - Forcibly terminating runner process
[task 2018-02-01T15:35:07.505Z] 15:35:07  WARNING - u'stop': ()
[task 2018-02-01T15:35:22.433Z] 15:35:22     INFO - Setting up ssl
[task 2018-02-01T15:35:32.400Z] 15:35:32     INFO - certutil | 
[task 2018-02-01T15:35:38.621Z] 15:35:38     INFO - certutil | 
[task 2018-02-01T15:35:38.637Z] 15:35:38     INFO - certutil | 
[task 2018-02-01T15:35:38.637Z] 15:35:38     INFO - Certificate Nickname                                         Trust Attributes
[task 2018-02-01T15:35:38.637Z] 15:35:38     INFO -                                                              SSL,S/MIME,JAR/XPI
[task 2018-02-01T15:35:38.638Z] 15:35:38     INFO - 
[task 2018-02-01T15:35:38.638Z] 15:35:38     INFO - web-platform-tests                                           CT,, 
[task 2018-02-01T15:35:38.641Z] 15:35:38     INFO - 
[task 2018-02-01T15:35:46.023Z] 15:35:46     INFO - Application command: /builds/worker/workspace/build/application/firefox/firefox --marionette about:blank -profile /tmp/tmpIjn88N.mozrunner
[task 2018-02-01T15:35:46.039Z] 15:35:46     INFO - Starting runner
[task 2018-02-01T15:36:07.521Z] 15:36:07     INFO - Browser exited with return code -15
[task 2018-02-01T15:36:17.521Z] 15:36:17  WARNING - Forcibly terminating runner process
[task 2018-02-01T15:36:17.529Z] 15:36:17  WARNING - u'stop': ()
[task 2018-02-01T15:36:31.455Z] 15:36:31     INFO - Setting up ssl
[task 2018-02-01T15:36:34.293Z] 15:36:34     INFO - certutil | 
[task 2018-02-01T15:36:36.060Z] 15:36:36     INFO - certutil | 
[task 2018-02-01T15:36:36.068Z] 15:36:36     INFO - certutil | 
[task 2018-02-01T15:36:36.068Z] 15:36:36     INFO - Certificate Nickname                                         Trust Attributes
[task 2018-02-01T15:36:36.069Z] 15:36:36     INFO -                                                              SSL,S/MIME,JAR/XPI
[task 2018-02-01T15:36:36.069Z] 15:36:36     INFO - 
[task 2018-02-01T15:36:36.069Z] 15:36:36     INFO - web-platform-tests                                           CT,, 
[task 2018-02-01T15:36:36.069Z] 15:36:36     INFO - 
[task 2018-02-01T15:36:37.192Z] 15:36:37     INFO - Application command: /builds/worker/workspace/build/application/firefox/firefox --marionette about:blank -profile /tmp/tmpF41VZR.mozrunner
[task 2018-02-01T15:36:37.201Z] 15:36:37     INFO - Starting runner
[task 2018-02-01T15:37:17.535Z] 15:37:17     INFO - Browser exited with return code -15
[task 2018-02-01T15:37:27.537Z] 15:37:27  WARNING - Forcibly terminating runner process
[task 2018-02-01T15:37:27.545Z] 15:37:27  WARNING - u'stop': ()
[task 2018-02-01T15:37:32.814Z] 15:37:32     INFO - Setting up ssl
[task 2018-02-01T15:37:36.526Z] 15:37:36     INFO - certutil | 
[task 2018-02-01T15:37:40.504Z] 15:37:40     INFO - certutil | 
[task 2018-02-01T15:37:40.520Z] 15:37:40     INFO - certutil | 
[task 2018-02-01T15:37:40.520Z] 15:37:40     INFO - Certificate Nickname                                         Trust Attributes
[task 2018-02-01T15:37:40.520Z] 15:37:40     INFO -                                                              SSL,S/MIME,JAR/XPI
[task 2018-02-01T15:37:40.520Z] 15:37:40     INFO - 
[task 2018-02-01T15:37:40.521Z] 15:37:40     INFO - web-platform-tests                                           CT,, 
[task 2018-02-01T15:37:40.521Z] 15:37:40     INFO - 
[task 2018-02-01T15:37:44.388Z] 15:37:44     INFO - Application command: /builds/worker/workspace/build/application/firefox/firefox --marionette about:blank -profile /tmp/tmpqnQIvu.mozrunner
[task 2018-02-01T15:37:44.405Z] 15:37:44     INFO - Starting runner
[task 2018-02-01T15:38:28.188Z] 15:38:28     INFO - Browser exited with return code -15
[task 2018-02-01T15:38:38.190Z] 15:38:38  WARNING - Forcibly terminating runner process
[task 2018-02-01T15:38:38.198Z] 15:38:38  WARNING - u'stop': ()
[task 2018-02-01T15:38:49.812Z] 15:38:49     INFO - Setting up ssl
[task 2018-02-01T15:38:58.386Z] 15:38:58     INFO - certutil | 
[task 2018-02-01T15:39:04.956Z] 15:39:04     INFO - certutil | 
[task 2018-02-01T15:39:04.977Z] 15:39:04     INFO - certutil | 
[task 2018-02-01T15:39:04.977Z] 15:39:04     INFO - Certificate Nickname                                         Trust Attributes
[task 2018-02-01T15:39:04.977Z] 15:39:04     INFO -                                                              SSL,S/MIME,JAR/XPI
[task 2018-02-01T15:39:04.978Z] 15:39:04     INFO - 
[task 2018-02-01T15:39:04.978Z] 15:39:04     INFO - web-platform-tests                                           CT,, 
[task 2018-02-01T15:39:04.978Z] 15:39:04     INFO - 
[task 2018-02-01T15:39:07.008Z] 15:39:07     INFO - Application command: /builds/worker/workspace/build/application/firefox/firefox --marionette about:blank -profile /tmp/tmppXjAbX.mozrunner
[task 2018-02-01T15:39:07.011Z] 15:39:07     INFO - Starting runner
[task 2018-02-01T15:39:38.234Z] 15:39:38     INFO - Browser exited with return code -15
[task 2018-02-01T15:39:48.234Z] 15:39:48  WARNING - Forcibly terminating runner process
[task 2018-02-01T15:39:48.244Z] 15:39:48  WARNING - u'stop': ()
[task 2018-02-01T15:39:48.245Z] 15:39:48    ERROR - Max restarts exceeded
[task 2018-02-01T15:39:48.246Z] 15:39:48     INFO - Got 0 unexpected results
[task 2018-02-01T15:39:48.247Z] 15:39:48     INFO - SUITE-END | took 547s
https://wiki.mozilla.org/Bug_Triage#Intermittent_Test_Failure_Cleanup
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → INCOMPLETE
Hi,

This bug had a huge spike in occurrence in the last 24 hours.

Most failures are on windows 7-32 and a few of them on windows 7-64, build type is opt and pgo.

Here is a recent log of the failure:
https://treeherder.mozilla.org/logviewer.html#?job_id=180539052&repo=mozilla-inbound

jgraham: Can you please take a look at this bug?
Flags: needinfo?(james)
Looks like this was a mislabled failure that got fixed.
Flags: needinfo?(james)
https://wiki.mozilla.org/Bug_Triage#Intermittent_Test_Failure_Cleanup
Status: REOPENED → RESOLVED
Closed: 6 years ago6 years ago
Resolution: --- → INCOMPLETE
Recent failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=211394281&repo=autoland&lineNumber=2081
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---
Status: REOPENED → RESOLVED
Closed: 6 years ago5 years ago
Resolution: --- → INCOMPLETE
Status: REOPENED → RESOLVED
Closed: 5 years ago5 years ago
Resolution: --- → INCOMPLETE

This has failed 34 times in the last 7 days on: android-em-7-0-x86_64 debug & opt, windows10-64 & windows10-64-ccov debug, windows10-64-qr opt & debug, windows10-64-shippable opt, windows7-32 opt & debug and windows7-32-shippable opt
https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?startday=2019-06-21&endday=2019-06-28&tree=trunk&bug=1435082

Recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=253858823&repo=autoland&lineNumber=100818

05:21:25 INFO - TEST-PASS | leakcheck | tab no leaks detected!
05:21:25 INFO - Closing logging queue
05:21:25 INFO - queue closed
05:21:25 ERROR - Max restarts exceeded
05:21:25 INFO - PROCESS LEAKS c:\users\task_1561688441\appdata\local\temp\tmptr4hyi.mozrunner\runtests_leaks_3832.log
05:21:25 INFO - leakcheck | Processing log file c:\users\task_1561688441\appdata\local\temp\tmptr4hyi.mozrunner\runtests_leaks_3832.log for scope /mixed-content/module-data-worker-import/no-opt-in

Flags: needinfo?(james)

This has failed 34 times in the last 7 days on: android-em-7-0-x86_64 debug & opt, windows10-64 & windows10-64-ccov debug, windows10-64-qr opt & debug, windows10-64-shippable opt, windows7-32 opt & debug and windows7-32-shippable opt

Recent log:
https://treeherder.mozilla.org/logviewer.html#?job_id=255157521&repo=mozilla-inbound

It seems like sometimes we're just not getting a marionette connection within the timeout. It could be that starting up forefox really does take longer than 360s on occasion, or we're somehow hitting a startup crash (although the builds seem to be started correctly at least some times).

Flags: needinfo?(james)

There are 23 total failures in the last 7 days on windows10-64 debug and opt, windows10-64-ccov debug, windows10-64-qr opt and windows7-32 debug.

Recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=258466756&repo=autoland&lineNumber=113222

[task 2019-07-26T10:59:40.684Z] 10:59:40 INFO - TEST-START | /referrer-policy/unset-referrer-policy/attr-referrer/same-origin/http-https/script-tag/swap-origin-redirect/upgrade-protocol.http.html
[task 2019-07-26T10:59:40.689Z] 10:59:40 INFO - Closing window 10737418241
[task 2019-07-26T10:59:40.699Z] 10:59:40 INFO - PID 6808 | [Parent 3548, Main Thread] WARNING: Constructing RangeBoundary with invalid value: 'mRef || aOffset == 0', file z:/build/build/src/obj-firefox/dist/include\mozilla/RangeBoundary.h, line 79
[task 2019-07-26T10:59:40.699Z] 10:59:40 INFO - PID 6808 | [Parent 3548, Main Thread] WARNING: Constructing RangeBoundary with invalid value: 'mRef || aOffset == 0', file z:/build/build/src/obj-firefox/dist/include\mozilla/RangeBoundary.h, line 79
[task 2019-07-26T10:59:40.699Z] 10:59:40 INFO - PID 6808 | [Child 8392, Main Thread] WARNING: No active window: file z:/build/build/src/js/xpconnect/src/XPCJSContext.cpp, line 664
[task 2019-07-26T10:59:40.745Z] 10:59:40 INFO - PID 6808 | [Parent 3548, Gecko_IOThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 341

[task 2019-07-26T11:33:27.601Z] 11:33:27 INFO - nsTraceRefcnt::DumpStatistics: 817 entries
[task 2019-07-26T11:33:27.601Z] 11:33:27 INFO - TEST-PASS | leakcheck | tab no leaks detected!
[task 2019-07-26T11:33:27.602Z] 11:33:27 INFO - leakcheck | Processing leak log file c:\users\task_1564132537\appdata\local\temp\tmpaybz4t.mozrunner\runtests_leaks_10444_tab_pid4472.log
[task 2019-07-26T11:33:27.602Z] 11:33:27 INFO -
[task 2019-07-26T11:33:27.602Z] 11:33:27 INFO - == BloatView: ALL (cumulative) LEAK AND BLOAT STATISTICS, tab process 4472
[task 2019-07-26T11:33:27.602Z] 11:33:27 INFO -
[task 2019-07-26T11:33:27.602Z] 11:33:27 INFO - |<----------------Class--------------->|<-----Bytes------>|<----Objects---->|
[task 2019-07-26T11:33:27.602Z] 11:33:27 INFO - | | Per-Inst Leaked| Total Rem|
[task 2019-07-26T11:33:27.602Z] 11:33:27 INFO - 0 |TOTAL | 45 0| 37702 0|
[task 2019-07-26T11:33:27.607Z] 11:33:27 INFO -
[task 2019-07-26T11:33:27.607Z] 11:33:27 INFO - nsTraceRefcnt::DumpStatistics: 991 entries
[task 2019-07-26T11:33:27.608Z] 11:33:27 INFO - TEST-PASS | leakcheck | tab no leaks detected!
[task 2019-07-26T11:33:27.609Z] 11:33:27 INFO - Closing logging queue
[task 2019-07-26T11:33:27.609Z] 11:33:27 INFO - queue closed
[task 2019-07-26T11:33:27.609Z] 11:33:27 ERROR - Max restarts exceeded
[task 2019-07-26T11:33:27.609Z] 11:33:27 INFO - PROCESS LEAKS c:\users\task_1564132537\appdata\local\temp\tmpaybz4t.mozrunner\runtests_leaks_10444.log
[task 2019-07-26T11:33:27.609Z] 11:33:27 INFO - leakcheck | Processing log file c:\users\task_1564132537\appdata\local\temp\tmpaybz4t.mozrunner\runtests_leaks_10444.log for scope /referrer-policy/unset-referrer-policy/http-rp
[task 2019-07-26T11:33:27.610Z] 11:33:27 INFO - TEST-INFO | leakcheck | default process: leak threshold set at 51200 bytes
[task 2019-07-26T11:33:27.610Z] 11:33:27 INFO - TEST-INFO | leakcheck | plugin process: leak threshold set at 0 bytes
[task 2019-07-26T11:33:27.610Z] 11:33:27 INFO - TEST-INFO | leakcheck | tab process: leak threshold set at 10000 bytes
[task 2019-07-26T11:33:27.610Z] 11:33:27 INFO - TEST-INFO | leakcheck | gmplugin process: leak threshold set at 20000 bytes
[task 2019-07-26T11:33:27.610Z] 11:33:27 INFO - TEST-INFO | leakcheck | gpu process: leak threshold set at 0 bytes
[task 2019-07-26T11:33:27.610Z] 11:33:27 INFO - TEST-INFO | leakcheck | rdd process: leak threshold set at 400 bytes
[task 2019-07-26T11:33:27.610Z] 11:33:27 INFO - TEST-INFO | leakcheck | vr process: leak threshold set at 0 bytes
[task 2019-07-26T11:33:27.610Z] 11:33:27 INFO - TEST-INFO | leakcheck | socket process: leak threshold set at 0 bytes
[task 2019-07-26T11:33:27.610Z] 11:33:27 INFO - leakcheck | Processing leak log file c:\users\task_1564132537\appdata\local\temp\tmpaybz4t.mozrunner\runtests_leaks_10444.log

James is there something that can be done here?

Flags: needinfo?(james)
Whiteboard: [stockwell needswork:owner]

It's unclear if the problem here is just that we're even slower to start the browser than we expect, or if there's something more fundamental going on. I've tried a try push with the browser startup timeout increased [1], although come to think of it there may also be a timeout inside marionette. But if there's something more happening then just increasing the timeout won't help.

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=70644ca043ebfea35f988be59ad681629a409460

Flags: needinfo?(james)

It looks like the marionette startup timeout is already 120s, so I think it makes sense to increase the wptrunner value to just above that e.g. 130s (although the shutdown timeout seems to be 70s, so I think we want to keep the wptrunner value close to that).

That try push didn't fix the problem, so I'm not sure what's going on here. It looks like we try to restart the browser, wait 4 minutes or so, and then run the shutdown sequence. It claims marionette isn't connected so it's possible there's an error in that layer. Maybe a push with the marionette log level turned up would help?

Status: REOPENED → RESOLVED
Closed: 5 years ago5 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.