Open Bug 1745430 Opened 3 years ago Updated 2 years ago

Investigate adding a test to simulate a shutdown during startup

Categories

(Testing :: Marionette Client and Harness, task, P3)

Default
task

Tracking

(Not tracked)

People

(Reporter: standard8, Unassigned)

References

(Depends on 1 open bug)

Details

In bug 1740301 we saw hangs which appear to have been caused by the Talos test starting the browser and then shutting it down just after the sessionstore-windows-restored notification - this is whilst it is still starting up.

Some of those caused hangs to occur. Since this would be a rare case and not something the Talos test should be testing, we delayed shutdown in the Talos test in bug 1744794. However, we do wonder if we should have a Marionette test or something, which would test rapid shutdown and potentially catch issues there.

Note that we also face this issue nearly permanently with some of the Puppeteer unit tests. As such they need to stay disabled.

We have a couple of tests in the profile and restart unit tests but as it looks like none of these actually trigger this issue:

https://searchfox.org/mozilla-central/source/testing/marionette/harness/marionette_harness/tests/unit/test_profile_management.py
https://searchfox.org/mozilla-central/source/testing/marionette/harness/marionette_harness/tests/unit/test_quit_restart.py

Do you already know which exact steps are causing this hang? When we have a Marionette test we would still need this platform bug to be fixed first before we can enable it in CI.

Also note that currently the initialization of Marionette is delayed after session store has completed. This might change with bug 1726465.

Flags: needinfo?(standard8)

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #1)

Note that we also face this issue nearly permanently with some of the Puppeteer unit tests. As such they need to stay disabled.

Hmm, we might want to look at those and maybe see if they are related to this, but if they aren't new, then maybe not. It might be worth revisiting once bug 986145 is fixed, as hopefully by then OS.File will be out of the startup path.

We have a couple of tests in the profile and restart unit tests but as it looks like none of these actually trigger this issue:

https://searchfox.org/mozilla-central/source/testing/marionette/harness/marionette_harness/tests/unit/test_profile_management.py
https://searchfox.org/mozilla-central/source/testing/marionette/harness/marionette_harness/tests/unit/test_quit_restart.py

Do you already know which exact steps are causing this hang? When we have a Marionette test we would still need this platform bug to be fixed first before we can enable it in CI.

The Talos tests were waiting for the sessionstore-windows-restored notification and then calling Services.startup.quit(Services.startup.eForceQuit); straight away.

https://searchfox.org/mozilla-central/rev/71befbc9bc348d33f0c7307dc48e3cb36ba78650/testing/talos/talos/startup_test/sessionrestore/addon/api.js#103,165

Also note that currently the initialization of Marionette is delayed after session store has completed. This might change with bug 1726465.

Flags: needinfo?(standard8)

Please note that I'm able to trigger such a condition with the already existent quit and restart unit tests of Marionette while working on bug 1726465. As such it might be good to wait on this bug until that work has been done.

Depends on: 1726465
Priority: -- → P3

Note that one crash that we have seen when forcing a shutdown before the sessionstore-windows-restored notification has been sent out is bug 1764420. That one is kinda easily to reproduce when running Marionette in window-less mode (moz:windowless capability) now.

Mark, should we put bug 986145 as blocker for this bug? Or is there anything that we could already do before it's fixed?

Flags: needinfo?(standard8)

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #4)

Mark, should we put bug 986145 as blocker for this bug? Or is there anything that we could already do before it's fixed?

That one is fixed now :)

Flags: needinfo?(standard8)

Note that we have a quit/restart test especially for the windowless mode in Marionette which is related to a silent restart of Firefox and doesn't rely on any open browser window.

https://searchfox.org/mozilla-central/rev/32ca4fc265150e7d3d7aa6c6abea088768cf024b/testing/marionette/harness/marionette_harness/tests/unit/test_quit_restart.py#313-324

It basically shuts down Firefox after the final-ui-startup notification has been send out. I can see some failures in the logs:
https://treeherder.mozilla.org/logviewer?job_id=379621476&repo=mozilla-central&lineNumber=67021-67212

Mark are those helpful? If yes could you investigate these? Would we need something more than that? Right now we cannot do more than that and from previous weeks I cannot see anything hanging during shutdown.

Flags: needinfo?(standard8)

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #6)

It basically shuts down Firefox after the final-ui-startup notification has been send out. I can see some failures in the logs:
https://treeherder.mozilla.org/logviewer?job_id=379621476&repo=mozilla-central&lineNumber=67021-67212

If we had time, then I think it might be interesting to play around with shutting down earlier/later than that, but I won't have time or that anytime soon.

Mark are those helpful? If yes could you investigate these? Would we need something more than that? Right now we cannot do more than that and from previous weeks I cannot see anything hanging during shutdown.

I think it would be worth getting bug filed on those warnings to check that they are either expected, or fix them if necessary. I think that's reasonable for now.

Flags: needinfo?(standard8)
Product: Testing → Remote Protocol
Component: Marionette → Marionette Client and Harness
Product: Remote Protocol → Testing
You need to log in before you can comment on or make changes to this bug.