Closed Bug 1366310 Opened 7 years ago Closed 7 years ago

Marionette is causing talos tart/cart to fail when I use marionette to connect to the browser and close it while initializing the testing profile

Categories

(Remote Protocol :: Marionette, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: jmaher, Unassigned)

References

Details

In order to install web extensions for talos, we are using marionette to install them.  How talos works is we:
* create a fresh profile
* start up the browser to warm the profile up (avoid measuring cost of profile initialization in talos tests)
* shut down the browser
* start up the browser and start testing

Historically talos has done this and in the warmup of the profile we would load a webpage that would terminate the browser.  For all the addons we needed, we would copy the addons to the profile prior to warmup.

Now with web extensions, we are installing the addons dynamically via marionette.  After installing the addons, we would close the browser with marionette, and then do the last step as we have always done (not using marionette).

Once that worked, I ran all of the tests and found out that two tests TART/CART failed to run.  In fact, it appeared they failed without doing any work (as we started the test but nothing happened).  In pushing to try, this is the case for linux/windows, but not osx.  All other talos tests work just fine.

I reduced the code a bit and only connected with marionette and closed the browser:
        client = Marionette(host='localhost', port=2828)
        client.start_session()
        client.close()


so doing this causes a future browser session to fail when running a test defined from the TART/CART addon.  The addon is not too different from other talos addons:
https://dxr.mozilla.org/mozilla-central/source/testing/talos/talos/tests/tart/addon

I have tried to add debug prints to see where in the flow we are getting stuck.  It looks like we are stuck inside of tart after we init and when we get to the first test case:
https://dxr.mozilla.org/mozilla-central/source/testing/talos/talos/tests/tart/addon/content/tart.js#493

when we init, we set the url for newtab as |about:blank|, then we add a tab and expect to catch an observer for "newtab-url-changed".  The first test hits our timeout where we do not observe the event.  I think there is more debugging to do here, I am still working on that.

Looking at the differences, it seems to be that using marionette to connect/close a previous browser session is putting the browser into some kind of state which it is not in normally.  This seems to affect this one addon.

I compared profiles of with/without marionette warming it up. I found:
* a bunch of preferences that marionette set (https://dxr.mozilla.org/mozilla-central/source/testing/marionette/client/marionette_driver/geckoinstance.py#19 )
* xulstore.json has browser.xul as twice the dimensions of without marionette ("main-window":{"width":"2048","height":"1536"...)
* crashes/store.json.mozlz4 is unique (possibly a crash when shutting down with marionette.?)
* a few other files, but it might be temporary.

I would like to figure out what marionette does to a profile or persistent storage between browser sessions.  I know we use marionette for mochitest/reftest, but that is not restarting the browser.

If there are tips for debugging this or ideas to try, I am eager to hear.

To reproduce this, I add this patch to my local m-c copy:
* https://hg.mozilla.org/try/rev/587bcc7d3d9d52b792d71d8768c9fbb7f681d968

then I run |./mach talos-test -a tart| and see the failure (locally on win10).
(In reply to Joel Maher ( :jmaher) from comment #0)
> I reduced the code a bit and only connected with marionette and closed the
> browser:
>         client = Marionette(host='localhost', port=2828)
>         client.start_session()
>         client.close()

What in your code is actually closing Firefox? `client.close()` is not shutting down Firefox. It only closes the current tab, or window if it's the last tab. So on MacOS Firefox will still be running, because it allows no windows at all, and by just having the hidden window available. 

So you still need a proper method of allowing your harness to restart Firefox. How exactly have you done it before? Cannot you continue doing it the same way?

Another option would be to allow the Marionette driver to handle the instance of Firefox. So it starts Firefox, and finally quits it. With that you will have the `client.quit()` and `client.restart()` methods available. Also Marionette would do the crash checks. Not sure if this is something you could imagine, or if possible at all.

> when we init, we set the url for newtab as |about:blank|, then we add a tab
> and expect to catch an observer for "newtab-url-changed".  The first test
> hits our timeout where we do not observe the event.  I think there is more
> debugging to do here, I am still working on that.

When is this exactly done? Before or after sessionstore finished loading? If before it could cause trouble after a restart of Firefox.
 
> Looking at the differences, it seems to be that using marionette to
> connect/close a previous browser session is putting the browser into some
> kind of state which it is not in normally.  This seems to affect this one
> addon.

As mentioned before you are not quitting Firefox via Marionette. So this might be the reason or a part of it.

> I compared profiles of with/without marionette warming it up. I found:
> * a bunch of preferences that marionette set
> (https://dxr.mozilla.org/mozilla-central/source/testing/marionette/client/
> marionette_driver/geckoinstance.py#19 )

Are there any conflicts to preference values you are using?

> * xulstore.json has browser.xul as twice the dimensions of without
> marionette ("main-window":{"width":"2048","height":"1536"...)

I'm not aware that Marionette sets the initial browser window size. David was working on all that lately. So maybe something changed?

> * crashes/store.json.mozlz4 is unique (possibly a crash when shutting down
> with marionette.?)

I don't know this file. You would have to give me some real crash ids to be able to check it.
There are some great suggestions here, thanks :whimboo for the data.

I realized I must have been having an odd scenario to terminate the browser, so I went back to my old method of loading a web page (using client.navigate) which then terminates the browser.  This seems to work well in Marionette, but I still get failures, not only in the same TART test, but also in sessionrestore:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=74ef226b05b7c0d7166c04694ef41b2ded96a494

I am working on only using marionette when we need it and going with the original methods for all the other tests.

I do think a future experiment is to let marionette handle the session management as you suggested- this would be for profile generation, so ideally not a problem to go this different route.  Meanwhile I am using what mochitest/reftest uses and that seems to be working well.
May I ask why you are using Marionette to open a URL? I thought that you wanted to use the former methods, and Marionette only for those things which are needed like installing an addon.

Please keep in mind that the sessionrestore failure most likely is related to bug 1363368, which I'm currently working on. It's a recent regression in Marionette's window handling behavior after a restart of Firefox caused by a change in the sessionrestore code. You most likely can workaround it with a call to Wait().until() or a sleep if really needed in the meantime. Otherwise use whatever you did before to load a URL.
This issue has been fixed in mozprofile
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
Product: Testing → Remote Protocol
You need to log in before you can comment on or make changes to this bug.