Closed Bug 1389103 Opened 5 years ago Closed 4 years ago

Unskip temporarily skipped quit/restart tests due to bug 1054740

Categories

(Testing :: Marionette, defect, P3)

defect

Tracking

(firefox55 wontfix, firefox56 disabled, firefox57 disabled)

RESOLVED DUPLICATE of bug 1467583
Tracking Status
firefox55 --- wontfix
firefox56 --- disabled
firefox57 --- disabled

People

(Reporter: whimboo, Unassigned)

References

Details

Attachments

(1 file)

All of our in_app quit/restart tests got disabled because they were causing a perma failure with the changes for bug 1363368. Given that parts of those got reverted, we can unskip our tests, except for TestQuitRestart.test_in_app_restart_with_callback_no_shutdown which is still perma failing. I filed bug 1389095 for the latter.
The problem on Windows still persists. So I assume it's somewhat related to the former test run, which is causing some IPC errors during restart of Firefox:

15:56:44     INFO -  1502380604807	Marionette	TRACE	3 <- [1,19,null,{"cause":"restart"}]
15:56:44     INFO -  Unable to read VR Path Registry from C:\Users\GenericWorker\AppData\Local\openvr\openvrpaths.vrpath
15:56:44     INFO -  JavaScript error: resource://shield-recipe-client/lib/RecipeRunner.jsm, line 153: TypeError: prefs is undefined
15:56:44     INFO -  Unable to read VR Path Registry from C:\Users\GenericWorker\AppData\Local\openvr\openvrpaths.vrpath
15:56:44     INFO -  [Child 8420] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
15:56:44     INFO -  [Child 8420] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
15:56:44     INFO -  1502380604824	Marionette	DEBUG	Closed connection 3
15:56:44     INFO -  1502380604824	Marionette	WARN	New connections are currently not accepted
15:56:44     INFO -  [Parent 2216] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
15:56:44     INFO -  Unable to read VR Path Registry from C:\Users\GenericWorker\AppData\Local\openvr\openvrpaths.vrpath
15:56:44     INFO -  [Child 3736] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
15:56:44     INFO -  [Child 3736] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
15:56:44     INFO -  1502380604966	addons.productaddons	WARN	Failed downloading via XHR, status: 0, reason: error
15:56:45     INFO -  *** UTM:SVC TimerManager:registerTimer called after profile-before-change notification. Ignoring timer registration for id: telemetry_modules_ping
15:56:45     INFO -  1502380605238	Marionette	DEBUG	Received observer notification "xpcom-shutdown"
15:56:45     INFO -  1502380605238	Marionette	DEBUG	Received observer notification "xpcom-shutdown"
15:56:45     INFO -  Unable to read VR Path Registry from C:\Users\GenericWorker\AppData\Local\openvr\openvrpaths.vrpath
15:56:45     INFO -  [GPU 13188] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346

I will check the next days with a Windows VM if I can reproduce it.
Assignee: nobody → hskupin
Status: NEW → ASSIGNED
ni? as requested
Flags: needinfo?(hskupin)
Priority: -- → P3
With bug 1397675 it will be clearer when something goes wrong when we try to connect to Marionette. It might not help here, but I would like to wait for it until I push another try build.
Depends on: 1397675
Flags: needinfo?(hskupin)
So Andreas' changes on bug 1388424 actually modified the behavior for `restart()` of the Marionette driver. Before we kept the session id, and now it is reset. This means `start_session()` will be called, which is not allowed in case the shutdown was not successful. As such we hang until the socket timeout.

Because of that `test_in_app_restart_with_callback_no_shutdown` causes a perma failure, and would require the logic in Marionette to be changed. Maybe we should only set `acceptConnections` to false if the shutdown was really triggered. We are getting notified about that via the observer notification.

Andreas, does that sound fine with you? This would definitely help us to not lock out clients due to a non-working shutdown.
Depends on: 1388424
Flags: needinfo?(ato)
Looks like that the problem with `test_in_app_restart_with_callback_no_shutdown` is only intermittent. After changing bookmarks locally I cannot reproduce it anymore. So it might be worth moving the topic out to a different bug.
Ok, so the try build also shows the problem with this test but on Windows. On that platform it seems to be a permanent failure. Maybe I have to run the tests way more locally to see it again. Maybe we really need the above change of logic to make it working again correctly.
Here the try build:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=59d6bb21a5fea682d78e3494bbacd890312061c5

Failure:
14:24:32     INFO -  WARNING | IO Completion Port failed to signal process shutdown
14:24:32     INFO -  Parent process 3092 exited with children alive:
14:24:32     INFO -  PIDS: 3140, 5276
14:24:32     INFO -  Attempting to kill them, but no guarantee of success
(In reply to Henrik Skupin (:whimboo) from comment #7)

> So Andreas' changes on bug 1388424 actually modified the behavior
> for `restart()` of the Marionette driver. Before we kept the
> session id, and now it is reset. This means `start_session()`
> will be called, which is not allowed in case the shutdown was not
> successful. As such we hang until the socket timeout.

I’m not too familiar with how the Python client works, but we
don’t accept the session ID as input when creating a new session
anymore because it gives the false impression that the new session
is the same, or a continuation, of the old.

When Firefox restarts itself the next Marionette session will be a
fresh session, so I think the fact that the client relied on the
session ID being the same was to put it into the right code path.
You say that it means start_session() will be called, but I think
the client needs to set some internal state to not trigger this
path in the event of a restart.  It shouldn’t need to expect a
different response from the server.

> Because of that `test_in_app_restart_with_callback_no_shutdown`
> causes a perma failure, and would require the logic in Marionette
> to be changed.  Maybe we should only set `acceptConnections`
> to false if the shutdown was really triggered. We are getting
> notified about that via the observer notification.
> 
> Andreas, does that sound fine with you? This would definitely help
> us to not lock out clients due to a non-working shutdown.

You only want to set Marionette:AcceptConenctions when you’re
expecting Firefox to shutdown so that the existing Marionette server
doesn’t accept any more connections until the Firefox process has
had time to restart.

It is an attempt to work around the fact that Marionette still
accepts new connections as it is shutting down.
Flags: needinfo?(ato)
We have to wait until my patch on bug 1410366 got landed.
Depends on: 1410366
The failures are still happening after the rebase. I will use my Windows 10 loaner to further investigate those.
Blocks: 1404223
No longer blocks: 1407390
Depends on: 1433873
Not an actionable bug for me right now until bug 1433873 has been fixed.
Assignee: hskupin → nobody
Status: ASSIGNED → NEW
I missed that bug. Actually all the quit/restart tests have been enabled again via bug 1467583 about two weeks ago.
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1467583
You need to log in before you can comment on or make changes to this bug.