Closed
Bug 1414144
Opened 7 years ago
Closed 7 years ago
Intermittent TEST-UNEXPECTED-TIMEOUT | /webdriver/tests/cookies/delete_cookie.py | expected OK
Categories
(Remote Protocol :: Marionette, defect, P3)
Tracking
(firefox59 wontfix, firefox60 wontfix, firefox61 fixed)
RESOLVED
FIXED
mozilla61
People
(Reporter: intermittent-bug-filer, Unassigned)
References
(Depends on 1 open bug)
Details
(Keywords: intermittent-failure, Whiteboard: [stockwell unknown])
Comment 1•7 years ago
|
||
Here an excerpt from the failure:
[task 2017-11-03T03:25:42.976Z] 03:25:42 INFO - PID 1453 | 1509679542967 Marionette DEBUG Received DOM event "DOMContentLoaded" for "about:neterror?e=dnsNotFound&u=http%3A//web-platform.test%3A8000/webdriver/tests/support/inline.py%3Fdoc%3D%253Ctitle%253EWD+doc+title%253C%252Ftitle%253E%26content-type%3Dtext%252Fhtml%253Bcharset%253Dutf-8&c=UTF-8&f=regular&d=Firefox%20can%E2%80%99t%20find%20the%20server%20at%20web-platform.test."
[task 2017-11-03T03:25:43.919Z] 03:25:43 INFO - STDOUT: FAILED
[task 2017-11-03T03:25:44.061Z] 03:25:44 INFO - PID 1453 | 1509679544051 Marionette INFO New connections will no longer be accepted
[task 2017-11-03T03:25:44.279Z] 03:25:44 INFO - PID 1453 | [Child 1673] WARNING: pipe error (17): Connection reset by peer: file /build/firefox-l1g98U/firefox-56.0+build6/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 353
[task 2017-11-03T03:25:44.280Z] 03:25:44 INFO - PID 1453 | [Child 1673] WARNING: pipe error (3): Connection reset by peer: file /build/firefox-l1g98U/firefox-56.0+build6/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 353
[task 2017-11-03T03:25:44.337Z] 03:25:44 INFO - STDOUT: tests/web-platform/tests/webdriver/tests/cookies/delete_cookie.py::test_handle_prompt_missing_value
As it looks like we hit a network issue here, and as result the requested page could not be loaded. I don't know how DNS is handled by wptrunner and wptserve, so I would like to defer to James here.
Otherwise why do the pipe errors contain "/build/firefox-l1g98U/firefox-56.0+build6/" for a build on autoland which is Firefox 58?
For me the above sounds like a operating system failure, maybe a complete network stack shutdown?
Flags: needinfo?(ted)
Flags: needinfo?(james)
Comment 2•7 years ago
|
||
I suspect that's a symptom rather than the cause. I would guess that the test is failing in some way, we close the connection and then try to communicate over it for some reason. I think you'd need to find a way to reproduce the failure to see exactly what's happening though.
Flags: needinfo?(james)
Comment hidden (Intermittent Failures Robot) |
Comment 4•7 years ago
|
||
I don't have any specific knowledge here, and I don't have time to dig into this. Sorry.
Flags: needinfo?(ted)
Comment 5•7 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #1)
> [task 2017-11-03T03:25:44.279Z] 03:25:44 INFO - PID 1453 | [Child 1673]
> WARNING: pipe error (17): Connection reset by peer: file
> /build/firefox-l1g98U/firefox-56.0+build6/ipc/chromium/src/chrome/common/
> ipc_channel_posix.cc, line 353
>
> Otherwise why do the pipe errors contain
> "/build/firefox-l1g98U/firefox-56.0+build6/" for a build on autoland which
> is Firefox 58?
Kan-Ru, do you have an explanation for the above question? Why do we show 56.0 for a current build from autoland?
Flags: needinfo?(kchen)
Comment hidden (Intermittent Failures Robot) |
Comment 7•7 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #5)
> (In reply to Henrik Skupin (:whimboo) from comment #1)
> > [task 2017-11-03T03:25:44.279Z] 03:25:44 INFO - PID 1453 | [Child 1673]
> > WARNING: pipe error (17): Connection reset by peer: file
> > /build/firefox-l1g98U/firefox-56.0+build6/ipc/chromium/src/chrome/common/
> > ipc_channel_posix.cc, line 353
> >
> > Otherwise why do the pipe errors contain
> > "/build/firefox-l1g98U/firefox-56.0+build6/" for a build on autoland which
> > is Firefox 58?
>
> Kan-Ru, do you have an explanation for the above question? Why do we show
> 56.0 for a current build from autoland?
I have no idea tbh. The only explanation I can think of is that maybe the test runner is using an older version of firefox to run something.
Flags: needinfo?(kchen)
Comment 8•7 years ago
|
||
Hm, the build has been downloaded from:
https://tools.taskcluster.net/groups/Lk2knvElS4KwOENmItI0Tw/tasks/J704BOBjRJ-Rho7TJmICcQ/runs/0/artifacts
Which is:
20171103021641
https://hg.mozilla.org/integration/autoland/rev/e6d86b7284bae701700b9d300ee1476ebe5f3eed
So it's nothing related to 56 beta at all.
AFAIK we always create a new docker instance for each job on Linux, so there shouldn't be any remaining Firefox process running.
Comment hidden (Intermittent Failures Robot) |
Comment 10•7 years ago
|
||
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INCOMPLETE
Comment 11•7 years ago
|
||
This still happens:
https://treeherder.mozilla.org/logviewer.html#?job_id=155191260&repo=autoland&lineNumber=1983
The test exceeds the 25s default timeout because there seem to be a hang in `test_handle_prompt_accept` or during shutdown. Due to less logging output this is hard to say:
[task 2018-01-10T01:00:13.888Z] 01:00:13 INFO - STDOUT: tests/web-platform/tests/webdriver/tests/cookies/delete_cookie.py::test_handle_prompt_accept
[task 2018-01-10T01:00:43.571Z] 01:00:43 INFO - TEST-UNEXPECTED-TIMEOUT | /webdriver/tests/cookies/delete_cookie.py | expected OK
[task 2018-01-10T01:00:43.571Z] 01:00:43 INFO - TEST-INFO took 30002ms
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---
Comment hidden (Intermittent Failures Robot) |
Comment 13•7 years ago
|
||
Debug builds show the following here:
[task 2018-01-10T03:21:31.525Z] 03:21:31 INFO - PID 2965 | [Parent 2977, IPDL Background] WARNING: Some storage operations are taking longer than expected during shutdown and will be aborted!: file /builds/worker/workspace/build/src/dom/quota/ActorsParent.cpp, line 5824
[task 2018-01-10T03:21:46.397Z] 03:21:46 INFO - PID 2965 | *** UTM:SVC TimerManager:registerTimer called after profile-before-change notification. Ignoring timer registration for id: telemetry_modules_ping
[task 2018-01-10T03:21:50.327Z] 03:21:50 INFO - PID 2965 | [Parent 2977, Main Thread] WARNING: 'NS_FAILED(rv)', file /builds/worker/workspace/build/src/dom/xhr/XMLHttpRequestMainThread.cpp, line 2772
[task 2018-01-10T03:21:50.327Z] 03:21:50 INFO - PID 2965 | [Parent 2977, Main Thread] WARNING: 'NS_FAILED(rv)', file /builds/worker/workspace/build/src/dom/xhr/XMLHttpRequestMainThread.cpp, line 2772
[task 2018-01-10T03:21:50.328Z] 03:21:50 INFO - PID 2965 | [Parent 2977, Main Thread] WARNING: 'NS_FAILED(rv)', file /builds/worker/workspace/build/src/dom/xhr/XMLHttpRequestMainThread.cpp, line 2772
[task 2018-01-10T03:21:50.329Z] 03:21:50 INFO - PID 2965 | [Parent 2977, Main Thread] WARNING: 'NS_FAILED(rv)', file /builds/worker/workspace/build/src/dom/xhr/XMLHttpRequestMainThread.cpp, line 2772
[task 2018-01-10T03:21:50.330Z] 03:21:50 INFO - PID 2965 | [Parent 2977, Main Thread] WARNING: 'NS_FAILED(rv)', file /builds/worker/workspace/build/src/dom/xhr/XMLHttpRequestMainThread.cpp, line 2772
[task 2018-01-10T03:21:50.330Z] 03:21:50 INFO - PID 2965 | [Parent 2977, Main Thread] WARNING: 'NS_FAILED(rv)', file /builds/worker/workspace/build/src/dom/xhr/XMLHttpRequestMainThread.cpp, line 2772
[task 2018-01-10T03:22:03.712Z] 03:22:03 INFO - PID 2965 | Hit MOZ_CRASH(Workers Hanging - A:3|S:0|Q:0-BC:0|WorkerHolderToken-BC:0|WorkerHolderToken-BC:0|WorkerHolderToken) at /builds/worker/workspace/build/src/dom/workers/RuntimeService.cpp:2187
[task 2018-01-10T03:22:03.712Z] 03:22:03 INFO - PID 2965 | ExceptionHandler::GenerateDump cloned child 3127
[task 2018-01-10T03:22:03.713Z] 03:22:03 INFO - PID 2965 | ExceptionHandler::SendContinueSignalToChild sent continue signal to child
[task 2018-01-10T03:22:03.714Z] 03:22:03 INFO - PID 2965 | ExceptionHandler::WaitForContinueSignal waiting for continue signal...
So it also looks like that we are having a hang during shutdown caused by workers. Andrea, I assume that this might also be related to bug 1425559?
Flags: needinfo?(amarchesini)
Updated•7 years ago
|
status-firefox59:
--- → affected
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 16•7 years ago
|
||
This bug has failed 34 times in the last 7 days on Linux affecting pgo build types.
Failing test: opt-web-platform-tests-wdspec-e10
Link to a recent log: https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=160000339&lineNumber=1868
Part of that log:
[task 2018-02-02T01:44:12.084Z] 01:44:12 INFO - STDOUT: tests/web-platform/tests/webdriver/tests/cookies/delete_cookie.py::test_handle_prompt_accept
[task 2018-02-02T01:44:12.269Z] 01:44:12 INFO - PID 2836 | [Parent 2848, Gecko_IOThread] WARNING: pipe error: Broken pipe: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 709
[task 2018-02-02T01:44:41.768Z] 01:44:41 INFO - TEST-UNEXPECTED-TIMEOUT | /webdriver/tests/cookies/delete_cookie.py | expected OK
[task 2018-02-02T01:44:41.769Z] 01:44:41 INFO - TEST-INFO took 30001ms
[task 2018-02-02T01:44:41.799Z] 01:44:41 WARNING - u'runner_teardown': ()
[task 2018-02-02T01:44:41.800Z] 01:44:41 INFO - Starting runner
[task 2018-02-02T01:44:41.823Z] 01:44:41 INFO - PID 2992 | 1517535881815 geckodriver INFO geckodriver 0.19.1 ( '0690bf69410a' '2018-02-01 12:29 -0800')
[task 2018-02-02T01:44:41.823Z] 01:44:41 INFO - PID 2992 | 1517535881818 geckodriver INFO Listening on 127.0.0.1:4447
[task 2018-02-02T01:44:42.322Z] 01:44:42 INFO - WebDriver HTTP server listening at http://127.0.0.1:4447/
[task 2018-02-02T01:44:42.323Z] 01:44:42 INFO - TEST-START | /webdriver/tests/cookies/get_named_cookie.py
[task 2018-02-02T01:44:42.500Z] 01:44:42 INFO - STDOUT: ============================= test session starts ==============================
[task 2018-02-02T01:44:42.501Z] 01:44:42 INFO - STDOUT: platform linux2 -- Python 2.7.12, pytest-unknown, py-1.5.2, pluggy-0.5.3.dev -- /builds/worker/workspace/build/venv/bin/python
[task 2018-02-02T01:44:42.502Z] 01:44:42 INFO - STDOUT: rootdir: /builds/worker/workspace/build, inifile:
[task 2018-02-02T01:44:42.502Z] 01:44:42 INFO - STDOUT: collecting ...
[task 2018-02-02T01:44:42.538Z] 01:44:42 INFO - STDOUT: collected 3 items
:whimboo do you have any update on this?
Flags: needinfo?(hskupin)
Whiteboard: [stockwell needswork]
Comment hidden (Intermittent Failures Robot) |
Comment 18•7 years ago
|
||
Can you check if this is fixed after bug 1422036?
Flags: needinfo?(amarchesini)
Comment 19•7 years ago
|
||
The intermittent failure is still present also with the backout of the patch from bug 1434618 which caused way more issues.
https://treeherder.mozilla.org/logviewer.html#?job_id=160525168&repo=mozilla-inbound&lineNumber=2005
[task 2018-02-06T01:44:05.900Z] 01:44:05 INFO - TEST-START | /webdriver/tests/cookies/delete_cookie.py
[task 2018-02-06T01:44:05.907Z] 01:44:05 INFO - STDOUT: ============================= test session starts ==============================
[task 2018-02-06T01:44:05.909Z] 01:44:05 INFO - STDOUT: platform linux2 -- Python 2.7.12, pytest-unknown, py-1.5.2, pluggy-0.5.3.dev -- /builds/worker/workspace/build/venv/bin/python
[task 2018-02-06T01:44:05.909Z] 01:44:05 INFO - STDOUT: rootdir: /builds/worker/workspace/build, inifile:
[task 2018-02-06T01:44:05.910Z] 01:44:05 INFO - STDOUT: collecting ...
[task 2018-02-06T01:44:05.926Z] 01:44:05 INFO - STDOUT: collected 7 items
[task 2018-02-06T01:44:05.927Z] 01:44:05 INFO - STDOUT: tests/web-platform/tests/webdriver/tests/cookies/delete_cookie.py::test_no_browsing_context
[task 2018-02-06T01:44:06.048Z] 01:44:06 INFO - STDOUT: PASSED
[task 2018-02-06T01:44:06.191Z] 01:44:06 INFO - STDOUT: tests/web-platform/tests/webdriver/tests/cookies/delete_cookie.py::test_handle_prompt_dismiss_and_notify
[task 2018-02-06T01:44:06.193Z] 01:44:06 INFO - STDOUT: PASSED
[task 2018-02-06T01:44:06.193Z] 01:44:06 INFO - STDOUT: tests/web-platform/tests/webdriver/tests/cookies/delete_cookie.py::test_handle_prompt_accept_and_notify
[task 2018-02-06T01:44:06.194Z] 01:44:06 INFO - STDOUT: PASSED
[task 2018-02-06T01:44:06.198Z] 01:44:06 INFO - STDOUT: tests/web-platform/tests/webdriver/tests/cookies/delete_cookie.py::test_handle_prompt_ignore
[task 2018-02-06T01:44:06.199Z] 01:44:06 INFO - STDOUT: PASSED
[task 2018-02-06T01:44:06.199Z] 01:44:06 INFO - STDOUT: tests/web-platform/tests/webdriver/tests/cookies/delete_cookie.py::test_handle_prompt_accept
[task 2018-02-06T01:44:35.862Z] 01:44:35 INFO - TEST-UNEXPECTED-TIMEOUT | /webdriver/tests/cookies/delete_cookie.py | expected OK
All the tests work fine, but then we hang during shutdown of Firefox in test_handle_prompt_accept. Maybe it's somewhat related to bug 1425559? Sadly those are all opt builds where it happens so not that much logging is available.
Flags: needinfo?(hskupin) → needinfo?(amarchesini)
Updated•7 years ago
|
Priority: P5 → P3
Comment hidden (Intermittent Failures Robot) |
Updated•7 years ago
|
Priority: P3 → P5
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 23•7 years ago
|
||
In the last 7 days, there have been 31 failures.
The failures are on Linux x64 and Linux platform and it affects opt and pgo build types.
Recent log file:
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=165410711&lineNumber=2000
:whimboo, :baku Do you have any updates here?
Flags: needinfo?(hskupin)
Whiteboard: [stockwell unknown] → [stockwell needswork]
Comment hidden (Intermittent Failures Robot) |
Comment 25•7 years ago
|
||
This is clearly a shutdown hang:
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=166463179&lineNumber=17056
Andrea, do you think it could improve when we move to `will-xpcom-shutdown`? Sadly this hang only happens for opt/pgo builds, so no further information available.
Flags: needinfo?(hskupin)
Comment 26•7 years ago
|
||
FWIW I moved Marionette to use xpcom-will-shutdown in
https://bugzil.la/1428795, so if we think that will have an impact
here we should see a decline in the OF score pretty soon.
Comment 27•7 years ago
|
||
Oh, Andreas filed bug 1443524 which this should be dependent on.
status-firefox60:
--- → affected
Depends on: 1443524
Comment 28•7 years ago
|
||
(In reply to Andreas Tolfsen ‹:ato› from comment #26)
> FWIW I moved Marionette to use xpcom-will-shutdown in
> https://bugzil.la/1428795, so if we think that will have an impact
> here we should see a decline in the OF score pretty soon.
No, I was talking about bug 1434618.
Updated•7 years ago
|
Priority: P5 → P3
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 32•7 years ago
|
||
In most cases when this failures is happening the hang occurs in `test_handle_prompt_accept`, which is using a new session. So this failure should depend on a solution for bug 1449538.
But there are also a low number of cases when Firefox restarts quickly but then no "sessionstore-windows-restored" notification is received by Marionette:
https://treeherder.mozilla.org/logviewer.html#?job_id=170140630&repo=mozilla-inbound&lineNumber=20448-20464
This should basically be bug 1382162.
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 39•7 years ago
|
||
Fixed by bug 1449538.
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
Flags: needinfo?(amarchesini)
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
Updated•7 years ago
|
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 47•2 years ago
|
||
Moving bug to Remote Protocol::Marionette component per bug 1815831.
Component: geckodriver → Marionette
Product: Testing → Remote Protocol
You need to log in
before you can comment on or make changes to this bug.
Description
•