Closed Bug 1284687 Opened 4 years ago Closed 4 years ago

Hide the windows instead of closing them during flush on quit

Categories

(Firefox :: Session Restore, defect, P1)

45 Branch
defect

Tracking

()

VERIFIED FIXED
Firefox 50
Tracking Status
e10s m9+ ---
firefox47 --- wontfix
firefox48 --- verified
firefox49 --- verified
firefox50 --- verified

People

(Reporter: billm, Assigned: mconley)

References

(Depends on 1 open bug)

Details

(Keywords: dataloss, regression)

Attachments

(3 files, 1 obsolete file)

Attached file ui-pages.xpi
STR:
1. Load the attached extension in about:debugging.
2. Click the browser action icon, which should open a new tab.
3. Quit Firefox.

ER:
It quits.

AR:
It hangs and eventually prints a message about session restore hanging or something.

This might be a session store bug, but I filed it in WebExtensions since it might be that too.
Yeah, it looks like a session store bug. The tab flusher code sends an async message to each browser in a window, and waits for a response. Immediately after it sends the message, we remove the tab.

I guess there's an argument that we shouldn't remove extension pages at app shutdown, though, but only at extension shutdown. But either way, we should fix the session store bug.
Mike, any chance you could look at this?
Flags: needinfo?(mconley)
Yeah, I could take a peek. Also cc'ing mikedeboer who's recently become the owner of Session Restore.
Assignee: nobody → mconley
Flags: needinfo?(mconley)
This is a minor tragedy, and I feel pretty responsible.

In bug 1177310, I wrote patches that got rid of CPOW usage on application shutdown. Instead of causing flushes using CPOWs, I use the AsyncShutdown JSM to spin the event loop until each windows browsers had flushed their message queues.

As a perceived shutdown performance optimization, I added a line that closed the window after the flush messages go down to each browser.

This... this isn't perfect, and this WebExtension is exposing that.

The problem is that the onClose handler - which fires in response to the domwindowclosed notification - will likely fire before all of the updates have come up from each browser. onClose, when we're shutting down, will call cleanUpWindow, which resolves all of the in-flight flush Promises for each of the browsers still in the window and then removing all of the message listeners.

This is problematic for two reasons:

1) It's very likely that we haven't heard all of the browser updates by the time that onClose fires, so we're probably missing out on the last ~2 seconds of session state from the unflushed browsers
2) If something (like the WebExtension in this bug) removes a browser from a window once SessionStore has kicked off its async flush, but _before_ domwindowclosed has fired, then when cleanUpWindow is called, it'll resolve all browsers except the one that had been removed, and the async window flush will never finish because we'll never have resolved the one that was removed.

To me, the simplest solution is to not cause onClose to happen until the window flushes have completed. Instead of calling win.close after each flush message is sent, I set its visibility to false.

This solves the WebExtension case because when the WebExtension removes the tab, we'll hear the final update come up from it, which will resolve the async window flush Promise (whereas before we never had a chance to hear that final update message because cleanUpWindow removed the message listeners).
Summary: WebExtension page causes Firefox shutdown to hang → Hide the windows instead of closing them during flush on quit
Component: WebExtensions → Session Restore
Keywords: dataloss
Product: Toolkit → Firefox
We were closing the windows before to improve perceived shutdown
performance, but we end up in a state where we're likely to miss
out on the last ~2 seconds of session activity for most tabs per
window. This is because we were removing the session update
message listeners and resolving the flush Promises once the
domwindowclosed notification fired for the window.

Hiding the window allows us to wait for the messages properly.

Review commit: https://reviewboard.mozilla.org/r/62914/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/62914/
Attachment #8768888 - Flags: review?(wmccloskey)
So, from comment 4, it seems like this is something that might have ugly data loss side-effects, and probably isn't upliftable.

Should we change the WebExtension code so that it doesn't close tab pages on browser shutdown, and try to uplift that to 48, to try to avoid this hitting users?

Also, is there any way we can reasonably add tests for this? It seems tricky...
Flags: needinfo?(mconley)
(In reply to Kris Maglione [:kmag] from comment #6)
> So, from comment 4, it seems like this is something that might have ugly
> data loss side-effects, and probably isn't upliftable.

I'm actually reasonably confident that this patch does the right thing. It might be upliftable as far as Beta. At least, I'd be prepared to argue that case, what with the ~2s window for dataloss.

> Should we change the WebExtension code so that it doesn't close tab pages on
> browser shutdown, and try to uplift that to 48, to try to avoid this hitting
> users?

Let's see what billm thinks of the patch. I'm of the opinion that it's a simple fix with glorious side-effects. If we can make a case for 48, maybe we should go for it. It might be prudent to prepare plans for such a patch just in case we get shot down.

> 
> Also, is there any way we can reasonably add tests for this? It seems
> tricky...

Indeed. I'm currently (re) looking at the Firefox UI tests that got added to the tree a few months back. They allow us to do browser restart tests. See bug 1228120.
Flags: needinfo?(mconley)
Attachment #8768888 - Flags: review?(wmccloskey)
Comment on attachment 8768888 [details]
Bug 1284687 - Hide windows on shutdown while persisting session instead of closing them.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/62914/diff/1-2/
Attachment #8768888 - Flags: review?(wmccloskey)
Ugh, it gets even worse. We weren't even _collecting the window data after flushing_. So even if the flushes succeeded, we'd never end up writing them to disk. So we've been missing the last ~2ish seconds of history per window at shutdown always. :(

Found that after working on tests in bug 1228120. I can't get it to work with e10s yet (there's a Marionette bug to deal with), but what I've got shows the restoration working properly now (and is what pointed me towards this new problem).
Comment on attachment 8768888 [details]
Bug 1284687 - Hide windows on shutdown while persisting session instead of closing them.

https://reviewboard.mozilla.org/r/62914/#review59906

Thanks for figuring this out. I'm sorry I didn't catch it.

I would feel better if we did something like not running cleanUpWindow until the window promise has been resolved. It still seems like we might end up in a situation where a window gets closed in between quit-application-granted and the promises resolving (maybe an add-on or a runnable or something).

::: browser/components/sessionstore/SessionStore.jsm:1513
(Diff revision 2)
>  
> -    progress.total = windowPromises.length;
> +    progress.total = windowPromises.size;
>  
>      // We'll iterate through the Promise array, yielding each one, so as to
>      // provide useful progress information to AsyncShutdown.
> -    for (let i = 0; i < windowPromises.length; ++i) {
> +    // progress.current starts at -1, and we increment it every time we

This seems really weird. Why not start it at 0? Also, the default value for |progress| is {}, where it won't start at -1.
Attachment #8768888 - Flags: review?(wmccloskey) → review+
Priority: -- → P1
https://reviewboard.mozilla.org/r/62914/#review59906

Good idea - I'll file that as a follow-up.

> This seems really weird. Why not start it at 0? Also, the default value for |progress| is {}, where it won't start at -1.

Alright, I've started at 0 instead. I agree this makes more sense.
Comment on attachment 8768888 [details]
Bug 1284687 - Hide windows on shutdown while persisting session instead of closing them.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/62914/diff/2-3/
Pushed by mconley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/51efc2643b80
Hide windows on shutdown while persisting session instead of closing them. r=billm
https://hg.mozilla.org/mozilla-central/rev/51efc2643b80
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 50
See Also: → 1286077
ni'ing myself to request uplift for this patch once this has had a few more days to bake.
Flags: needinfo?(mconley)
Comment on attachment 8768888 [details]
Bug 1284687 - Hide windows on shutdown while persisting session instead of closing them.

Approval Request Comment
[Feature/regressing bug #]:

Bug 1177310

[User impact if declined]:

Users that use WebExtensions might experience shutdown hangs depending on the things that the WebExtension does.

Users may also lose some of the last few (~2) seconds of session state at shutdown. Note that this last bug has existed since Firefox 45.


[Describe test coverage new/current, TreeHerder]:

A regression test was written for the session state dataloss in bug 1228120. This has not yet been reviewed or landed in tree, but it passes for me locally.

The WebExtension case was tested manually locally.

[Risks and why]: 

We're changing how shutdown works in the session store manager. In particular, we're _hiding_ windows instead of closing them when shutting down the browser while we wait for the state to save. This might be unexpected for add-ons.

[String/UUID change made/needed]:

None.
Flags: needinfo?(mconley)
Attachment #8768888 - Flags: approval-mozilla-beta?
Attachment #8768888 - Flags: approval-mozilla-aurora?
Keywords: regression
Version: unspecified → 45 Branch
(In reply to Mike Conley (:mconley) - (Needinfo me!) from comment #16)
> 
> We're changing how shutdown works in the session store manager. In
> particular, we're _hiding_ windows instead of closing them when shutting
> down the browser while we wait for the state to save. This might be
> unexpected for add-ons.
> 

This last bit might be unfounded. I just looked over the previous state of SessionStore.jsm and how it'd work in the normal case.

The browser would start the async flush, and it'd close each window.

After each window finished closing, the onClose event handler would fire in SessionStoreInternal, and because the RunState is quitting, we'd skip right to cleaning up the window. We'd never, as far as I can tell, collect the window data from the closed window anyway. That's bad because it means we've been busted like this for a while (45), but it's good because it makes me feel better about uplift risks.
Comment on attachment 8768888 [details]
Bug 1284687 - Hide windows on shutdown while persisting session instead of closing them.

Let's uplift this to beta. I discussed the risks with mconley and if we aren't about to do something unexpected for add-ons I think we have a good chance of improving the number of shutdown hangs and fixing data loss issue. 
Andrei if your team can be alert for session store problems or do some extra testing in late beta after this lands, that would be helpful.
Flags: needinfo?(andrei.vaida)
Attachment #8768888 - Flags: approval-mozilla-beta?
Attachment #8768888 - Flags: approval-mozilla-beta+
Attachment #8768888 - Flags: approval-mozilla-aurora?
Attachment #8768888 - Flags: approval-mozilla-aurora+
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #18) 
> Andrei if your team can be alert for session store problems or do some extra
> testing in late beta after this lands, that would be helpful.

We'll make sure to test further around this, leaving the ni? in place as a reminder.
Flags: qe-verify+
I tried to reproduce this bug on 50.0a1 (2016-07-05) with e10s on/off and using the steps from comment 0 -> Firefox crashes (https://crash-stats.mozilla.com/report/index/02fd74ea-da2b-44d0-b8fa-debd82160715) (after step 3, Firefox is closed, but it is not possible to open it back for several minutes; after this time, Crash Reporter appears).
Also, I verified the issue on 
- latest Nightly 50.0a1 (2016-07-17)
- latest Aurora 49.0a2 (2016-07-17)
using 
- Windows 10 x64
- Ubuntu 14.04 x86
- Mac OS X 10.11.1
These above mentioned builds are verified fixed. The fix seems to work as expected. There are no more shutdown hangs or crashes related to WebExtensions and Session Restore. 
I will set the corresponding flags.
I will continue the verification process once the 48.0b9 build will appear.
I managed to verify this issue on 48.0b9 build1 (20160718142219), too, using 
- Windows 10 x64 
- Ubuntu 14.04 x86 
- Mac OS X 10.11 
This build is verified fixed. 
Based on the last comments, I will set the bug status accordingly.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
Flags: needinfo?(andrei.vaida)
Attachment #8774195 - Attachment is obsolete: true
Duplicate of this bug: 1269859
Depends on: 1296922
Depends on: 1293324
You need to log in before you can comment on or make changes to this bug.