Closed
Bug 1137388
Opened 10 years ago
Closed 10 years ago
Marionette hard-kills the browser for restarts which causes lots of trouble
Categories
(Remote Protocol :: Marionette, defect)
Remote Protocol
Marionette
Tracking
(firefox38 fixed, firefox39 fixed)
RESOLVED
FIXED
mozilla39
People
(Reporter: whimboo, Assigned: chmanchester)
References
Details
(Keywords: dataloss)
Attachments
(2 files, 2 obsolete files)
As implemented on bug 940954 the restart support for Marionette has a massive disruptive behavior! As Chris and I figured out today it some-what hard-kills the browser, and doesn't let it sanely shutdown. All this is done via runner.stop(). The problems you can face by doing this are massive in terms that lots of things which have to happen on shutdown are NOT executed. We have seen this because preferences we have set in a test were lost after a restart, which never should happen. Instead of stopping the browser from mozrunner, the shutdown or better restart has to be initiated from within the application itself. Whereby it has to allow Firefox to safely shutdown itself. Only when that is not possible we should issue a hard stop for the process. Here some links for how Mozmill is handling that at the moment: Controller's restartApplication() wrapper: https://github.com/mozilla/mozmill/blob/master/mozmill/mozmill/extension/resource/driver/controller.js#L438 Controller's stopApplication() wrapper: https://github.com/mozilla/mozmill/blob/master/mozmill/mozmill/extension/resource/driver/controller.js#L466 Frame's shutdownApplication() method which safely closes the browser and issues a restart by Firefox itself: https://github.com/mozilla/mozmill/blob/master/mozmill/mozmill/extension/resource/modules/frame.js#L48 Python __init__.py restart logic for mozrunner: https://github.com/mozilla/mozmill/blob/master/mozmill/mozmill/__init__.py#L549 Allowing Firefox to restart itself should be the way to go here because it will pass over environment variables which are important for the next process. We should not call mozrunner to start the application again. This bug is kinda blocking us from having working restart tests, including our update tests. We may somewhat be able to workaround that with sleeps and a custom request for shutdown in our tests. But I cannot tell, if this all would work.
Assignee | ||
Comment 1•10 years ago
|
||
Isn't this a duplicate of bug 1099127? I think we're ending up solving the same problem here, except that "inside application triggered" restart means any restart triggered by mozmill because mozmill runs inside the application.
Flags: needinfo?(hskupin)
Reporter | ||
Comment 2•10 years ago
|
||
No, this is not a duplicate. This is part of the problem to solve for user triggered restarts. For that bug we still have to call marionette.restart(). It's just that we need a clean way to restart/shutdown the browser. This is not done at the moment.
Flags: needinfo?(hskupin)
Assignee | ||
Comment 3•10 years ago
|
||
This looks ok locally, let's see how this looks on other platforms: https://treeherder.mozilla.org/#/jobs?repo=try&revision=b2aa51e49c91
Assignee | ||
Comment 4•10 years ago
|
||
/r/4413 - Bug 1137388 - Add a facility to restart firefox from marionette from within the browser for update tests. Pull down this commit: hg pull review -r b004519010946edc4c1375cfb5ba42eccfbfb9e3
Assignee | ||
Comment 5•10 years ago
|
||
(In reply to Chris Manchester [:chmanchester] from comment #3) > This looks ok locally, let's see how this looks on other platforms: > > https://treeherder.mozilla.org/#/jobs?repo=try&revision=b2aa51e49c91 Problems on windows. I'll dust off my windows build tomorrow.
Reporter | ||
Comment 6•10 years ago
|
||
I cannot login to review board, so some comments here... * It's great to see that you were able to get that far with it yesterday! * In case of a not forced restart keep in mind that Firefox might disallow a shutdown of itself. In such a case we have to ensure to not wait forever but also kill the process after a given amount of time. So not sure if adding force as parameter is necessary. * For the restart method you missed to document the new parameters. Maybe for the flags it would be good to add a see link to the MDN documentation? Or at least list the possible values, or create constants in Marionette itself. * The same logic as you have here we may also need for a normal close of the application. Is Marionette also using mozrunner to hardstop it? There are tests which would also need a clean shutdown of the application. This is most likely a follow-up bug, but wanted to bring this up.
Assignee: nobody → cmanchester
Status: NEW → ASSIGNED
Assignee | ||
Comment 7•10 years ago
|
||
(In reply to Chris Manchester [:chmanchester] from comment #5) > (In reply to Chris Manchester [:chmanchester] from comment #3) > > This looks ok locally, let's see how this looks on other platforms: > > > > https://treeherder.mozilla.org/#/jobs?repo=try&revision=b2aa51e49c91 > > Problems on windows. I'll dust off my windows build tomorrow. This looks ok except for win64 (I tried because my local build work fine): https://treeherder.mozilla.org/#/jobs?repo=try&revision=a77e2eb87993 Maybe the browser process isn't re-used after a restart on win64 like it is for other platforms.
Assignee | ||
Comment 8•10 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #6) > I cannot login to review board, so some comments here... > > * It's great to see that you were able to get that far with it yesterday! > > * In case of a not forced restart keep in mind that Firefox might disallow a > shutdown of itself. In such a case we have to ensure to not wait forever but > also kill the process after a given amount of time. So not sure if adding > force as parameter is necessary. The implementation of quit seems pretty definitive as long as eForceQuit is set (which should be included here for parity with mozmill but I missed). When is this an issue? > > * For the restart method you missed to document the new parameters. Maybe > for the flags it would be good to add a see link to the MDN documentation? > Or at least list the possible values, or create constants in Marionette > itself. > > * The same logic as you have here we may also need for a normal close of the > application. Is Marionette also using mozrunner to hardstop it? There are > tests which would also need a clean shutdown of the application. This is > most likely a follow-up bug, but wanted to bring this up.
Assignee | ||
Comment 9•10 years ago
|
||
Comment on attachment 8570254 [details] MozReview Request: bz://1137388/chmanchester /r/4413 - Bug 1137388 - Add a facility to restart firefox from marionette from within the browser for update tests. Pull down this commit: hg pull review -r 5e4deeea5a198134a0e49a7a926badcdb333913f
Assignee | ||
Comment 10•10 years ago
|
||
(In reply to Chris Manchester [:chmanchester] from comment #8) > (In reply to Henrik Skupin (:whimboo) from comment #6) > > I cannot login to review board, so some comments here... > > > > * It's great to see that you were able to get that far with it yesterday! > > > > * In case of a not forced restart keep in mind that Firefox might disallow a > > shutdown of itself. In such a case we have to ensure to not wait forever but > > also kill the process after a given amount of time. So not sure if adding > > force as parameter is necessary. > The implementation of quit seems pretty definitive as long as eForceQuit is > set (which should be included here for parity with mozmill but I missed). > When is this an issue? > > > > * For the restart method you missed to document the new parameters. Maybe > > for the flags it would be good to add a see link to the MDN documentation? > > Or at least list the possible values, or create constants in Marionette > > itself. I noticed we're already using setting enough flags by default that this isn't useful, so I removed the parameter. > > > > * The same logic as you have here we may also need for a normal close of the > > application. Is Marionette also using mozrunner to hardstop it? There are > > tests which would also need a clean shutdown of the application. This is > > most likely a follow-up bug, but wanted to bring this up. mozrunner probably stops the browser at the end of the run. My next question is whether getting this working on win64 should block other work on update tests.
Assignee | ||
Updated•10 years ago
|
Attachment #8570254 -
Attachment is obsolete: true
Reporter | ||
Comment 11•10 years ago
|
||
(In reply to Chris Manchester [:chmanchester] from comment #8) > > * In case of a not forced restart keep in mind that Firefox might disallow a > > shutdown of itself. In such a case we have to ensure to not wait forever but > > also kill the process after a given amount of time. So not sure if adding > > force as parameter is necessary. > The implementation of quit seems pretty definitive as long as eForceQuit is > set (which should be included here for parity with mozmill but I missed). > When is this an issue? Usually code in Firefox has to request a shutdown first via the "quit-application-requested" observer notification. That allows listeners for this topic to run prepare for shutdown. Whereby each listener can reject the request via the return value. (In reply to Chris Manchester [:chmanchester] from comment #10) > > > * For the restart method you missed to document the new parameters. Maybe > > > for the flags it would be good to add a see link to the MDN documentation? > > > Or at least list the possible values, or create constants in Marionette > > > itself. > I noticed we're already using setting enough flags by default that this > isn't useful, so I removed the parameter. IMHO it would be better to call this method shutdownApplication similar to the implementation in Mozmill and allow to set a flag if the application needs to be restarted. This would help to have only a single shutdown method compared to the case below. > > > * The same logic as you have here we may also need for a normal close of the > > > application. Is Marionette also using mozrunner to hardstop it? There are > > > tests which would also need a clean shutdown of the application. This is > > > most likely a follow-up bug, but wanted to bring this up. > mozrunner probably stops the browser at the end of the run. That would be another problem we should fix. Mozrunner should only stop an application if it cannot quit itself. > My next question is whether getting this working on win64 should block other > work on update tests. Our existing Mozmill tests are working perfect with Firefox 32/64bit on Win64. So I think it's just a missing piece in the implementation here.
Assignee | ||
Comment 12•10 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #11) > (In reply to Chris Manchester [:chmanchester] from comment #8) > > > * In case of a not forced restart keep in mind that Firefox might disallow a > > > shutdown of itself. In such a case we have to ensure to not wait forever but > > > also kill the process after a given amount of time. So not sure if adding > > > force as parameter is necessary. > > The implementation of quit seems pretty definitive as long as eForceQuit is > > set (which should be included here for parity with mozmill but I missed). > > When is this an issue? > > Usually code in Firefox has to request a shutdown first via the > "quit-application-requested" observer notification. That allows listeners > for this topic to run prepare for shutdown. Whereby each listener can reject > the request via the return value. The quit is always granted by the time we call nsIAppStartup.quit. There's a case the quit is aborted when new windows are opened while quitting, but that's not honored when eForceQuit is provided. > > (In reply to Chris Manchester [:chmanchester] from comment #10) > > > > * For the restart method you missed to document the new parameters. Maybe > > > > for the flags it would be good to add a see link to the MDN documentation? > > > > Or at least list the possible values, or create constants in Marionette > > > > itself. > > I noticed we're already using setting enough flags by default that this > > isn't useful, so I removed the parameter. > > IMHO it would be better to call this method shutdownApplication similar to > the implementation in Mozmill and allow to set a flag if the application > needs to be restarted. This would help to have only a single shutdown method > compared to the case below. restarts are the only case I'm targeting here. > > > > > * The same logic as you have here we may also need for a normal close of the > > > > application. Is Marionette also using mozrunner to hardstop it? There are > > > > tests which would also need a clean shutdown of the application. This is > > > > most likely a follow-up bug, but wanted to bring this up. > > mozrunner probably stops the browser at the end of the run. > > That would be another problem we should fix. Mozrunner should only stop an > application if it cannot quit itself. Doesn't sound like a blocker, but file a bug. > > > My next question is whether getting this working on win64 should block other > > work on update tests. > > Our existing Mozmill tests are working perfect with Firefox 32/64bit on > Win64. So I think it's just a missing piece in the implementation here. If it's a requirement, it's a requirement. It's significantly more challenging because it looks like we need to thread thread everything through to a new process. I'll need to figure out how to get a local build to work with.
Assignee | ||
Comment 13•10 years ago
|
||
19:06:18 INFO - ProcessManager UNABLE to use job objects to manage child processes 19:06:18 INFO - ProcessManager NOT managing child processes ... from the broken win64 run looks relevant. Now I'm thinking this is a mozprocess problem or something else altogether.
Assignee | ||
Comment 14•10 years ago
|
||
(In reply to Chris Manchester [:chmanchester] from comment #13) > 19:06:18 INFO - ProcessManager UNABLE to use job objects to manage > child processes > 19:06:18 INFO - ProcessManager NOT managing child processes > > ... from the broken win64 run looks relevant. Now I'm thinking this is a > mozprocess problem or something else altogether. I did a win64 build locally and everything works as expected (the warning isn't printed, either). I pushed to try with mozprocess debug output turned on to see it that yields anything interesting. I guess the next step is to request a loaner and try to debug from there. I don't know how far I'll get with that, but I'll give it a shot.
Assignee | ||
Comment 15•10 years ago
|
||
After some more try debugging and reading through the mozprocess code and windows APIs, the issue appears to be that JOB_OBJECT_LIMIT_BREAKAWAY_OK isn't set on the job object that creates the browser process on win64 (also our only win 8 jobs), meaning we don't use job objects to manage child processes, so when we restart, the child process that becomes the new main firefox process can't be killed by mozprocess.
Reporter | ||
Comment 16•10 years ago
|
||
(In reply to Chris Manchester [:chmanchester] from comment #12) > > Usually code in Firefox has to request a shutdown first via the > > "quit-application-requested" observer notification. That allows listeners > > for this topic to run prepare for shutdown. Whereby each listener can reject > > the request via the return value. > > The quit is always granted by the time we call nsIAppStartup.quit. There's a > case the quit is aborted when new windows are opened while quitting, but > that's not honored when eForceQuit is provided. The problem you are hitting when directly requesting a shutdown with the aForceQuit flag set is, that there can be unsafed data for windows which is not getting saved and it results in dataloss. Good example here should be sessionrestore, which has a small delay before it saves new data. Here a forced restart would cause that e.g. a tab you opened right before quit will not exist anymore after a restart. So eAttemptQuit should be the correct flag for us, and we should only fallback to eForceQuit, if the first try to shutdown fails. https://developer.mozilla.org/en-US/docs/How_to_Quit_a_XUL_Application > > IMHO it would be better to call this method shutdownApplication similar to > > the implementation in Mozmill and allow to set a flag if the application > > needs to be restarted. This would help to have only a single shutdown method > > compared to the case below. > > restarts are the only case I'm targeting here. Well, for shutdown it's exactly the same method just without the aRestart flag set. But I can file a bug for it for sure. > If it's a requirement, it's a requirement. It's significantly more We have to run Firefox on Windows 32 and 64 systems. That's a requirement from QA. The same applies to the upcoming Firefox64 builds.
Assignee | ||
Comment 17•10 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #16) > (In reply to Chris Manchester [:chmanchester] from comment #12) > > > Usually code in Firefox has to request a shutdown first via the > > > "quit-application-requested" observer notification. That allows listeners > > > for this topic to run prepare for shutdown. Whereby each listener can reject > > > the request via the return value. > > > > The quit is always granted by the time we call nsIAppStartup.quit. There's a > > case the quit is aborted when new windows are opened while quitting, but > > that's not honored when eForceQuit is provided. > > The problem you are hitting when directly requesting a shutdown with the > aForceQuit flag set is, that there can be unsafed data for windows which is > not getting saved and it results in dataloss. Good example here should be > sessionrestore, which has a small delay before it saves new data. Here a > forced restart would cause that e.g. a tab you opened right before quit will > not exist anymore after a restart. So eAttemptQuit should be the correct > flag for us, and we should only fallback to eForceQuit, if the first try to > shutdown fails. > > https://developer.mozilla.org/en-US/docs/How_to_Quit_a_XUL_Application The mozmill update tests only use controller.js' restartApplication, which doesn't specify a user shutdown, and therefore goes through frame.js' shutdownApplication, which always specifies eForceQuit. This seems like the realm of bug 1099127. > > > > IMHO it would be better to call this method shutdownApplication similar to > > > the implementation in Mozmill and allow to set a flag if the application > > > needs to be restarted. This would help to have only a single shutdown method > > > compared to the case below. > > > > restarts are the only case I'm targeting here. > > Well, for shutdown it's exactly the same method just without the aRestart > flag set. But I can file a bug for it for sure. The client would need to handle things differently, too, but I guess we can expose the parameter so the server won't need to be updated later. > > > If it's a requirement, it's a requirement. It's significantly more > > We have to run Firefox on Windows 32 and 64 systems. That's a requirement > from QA. The same applies to the upcoming Firefox64 builds. As you can see from this bug I'm working on it.
Assignee | ||
Comment 18•10 years ago
|
||
(In reply to Chris Manchester [:chmanchester] from comment #15) > After some more try debugging and reading through the mozprocess code and > windows APIs, the issue appears to be that JOB_OBJECT_LIMIT_BREAKAWAY_OK > isn't set on the job object that creates the browser process on win64 (also > our only win 8 jobs), meaning we don't use job objects to manage child > processes, so when we restart, the child process that becomes the new main > firefox process can't be killed by mozprocess. According to msdn, windows 8 we can create nested job objects, so breaking away from a job isn't required to manage a process and its children in a sub-job (mozprocess assumes we need to breakaway to achieve this, which doesn't appear possible on our Windows 8 slaves). I replicated the failure and tried a mozharness patch locally that works for this. I'll run it through try as soon as it opens.
Assignee | ||
Comment 19•10 years ago
|
||
Results from try with mozprocess patch looking good: https://treeherder.mozilla.org/#/jobs?repo=try&revision=967dcdfea267 Will land in bug 1139722.
Assignee | ||
Comment 20•10 years ago
|
||
/r/4933 - Bug 1137388 - Add a facility to restart firefox from marionette from within the browser for update tests. Pull down this commit: hg pull review -r 76d5a3582f390569746acbe1a8449a850e24033f
Assignee | ||
Comment 21•10 years ago
|
||
Comment on attachment 8573666 [details] MozReview Request: bz://1137388/chmanchester /r/4933 - Bug 1137388 - Add a facility to restart firefox from marionette from within the browser for update tests. Pull down this commit: hg pull review -r 76d5a3582f390569746acbe1a8449a850e24033f
Attachment #8573666 -
Flags: review?(jgriffin)
Attachment #8573666 -
Flags: review?(dburns)
Comment 22•10 years ago
|
||
Comment on attachment 8573666 [details] MozReview Request: bz://1137388/chmanchester https://reviewboard.mozilla.org/r/4931/#review3935 Ship It!
Attachment #8573666 -
Flags: review?(dburns) → review+
Assignee | ||
Comment 23•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/56083b5a4473
Assignee | ||
Updated•10 years ago
|
Attachment #8573666 -
Flags: review?(jgriffin)
Comment 24•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/56083b5a4473
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
status-firefox39:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
Reporter | ||
Comment 25•10 years ago
|
||
We kinda would like to have this support for Firefox 38 to be able to run our Firefox UI tests. Would it be possible to backport this patch?
Assignee | ||
Comment 26•10 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #25) > We kinda would like to have this support for Firefox 38 to be able to run > our Firefox UI tests. Would it be possible to backport this patch? If we do, we definitely need to pick up its dependents (transitively).
Assignee | ||
Updated•10 years ago
|
Flags: needinfo?(cmanchester)
Reporter | ||
Comment 27•10 years ago
|
||
The only other patch we have to backport is for bug 1139722. The general restart logic already landed for Firefox 37.
Comment 28•10 years ago
|
||
https://hg.mozilla.org/releases/mozilla-aurora/rev/81306a6804e8
Flags: needinfo?(dburns)
Reporter | ||
Updated•10 years ago
|
Assignee | ||
Comment 29•9 years ago
|
||
Attachment #8573666 -
Attachment is obsolete: true
Attachment #8619605 -
Flags: review+
Attachment #8619606 -
Flags: review+
Assignee | ||
Comment 30•9 years ago
|
||
Assignee | ||
Comment 31•9 years ago
|
||
Updated•2 years ago
|
Product: Testing → Remote Protocol
You need to log in
before you can comment on or make changes to this bug.
Description
•