Some windows are skipped when async flushing

RESOLVED FIXED in Firefox 62

Status

()

enhancement
P1
critical
RESOLVED FIXED
Last year
Last year

People

(Reporter: agashlin, Assigned: agashlin)

Tracking

(Blocks 1 bug)

61 Branch
Firefox 62
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox62 fixed)

Details

Attachments

(3 attachments, 7 obsolete attachments)

1.24 KB, patch
mikedeboer
: review+
Details | Diff | Splinter Review
3.81 KB, patch
agashlin
: review+
Details | Diff | Splinter Review
1.80 KB, patch
agashlin
: review+
Details | Diff | Splinter Review
In flushAllWindowsAsync [1] we step through the windows returned from the iterator this._browserWindows, which returns the windows in z order from BrowserWindowTracker.orderedWindows. However the z order of windows will change during this loop when we set `baseWin.visibility = false`; since we are disabling the foremost window, another window becomes active, which brings it to the front of the list, where it will be skipped. One result of this is `TabStateFlusher.flushWindow()` isn't run for that window.

[1] https://searchfox.org/mozilla-central/rev/38bcf897f1fa19c1eba441a611cf309482e0d6e5/browser/components/sessionstore/SessionStore.jsm#1680
Ouch, good catch! So it might be a good idea to run the iterator of a static clone/ copy of the ordered list, right? Are you interested in making that change, or shall I?
Flags: needinfo?(agashlin)
Severity: normal → critical
Priority: -- → P1
If you could make the change I'd appreciate it, heading to bed after finally getting to the bottom of this one. Fortunately it hasn't been in for long.
Flags: needinfo?(agashlin)
Sure thing. Thanks again for finding this!
Assignee: nobody → mdeboer
Status: NEW → ASSIGNED
Blocks: 1458119
Here's an additional firefox-ui-functional test that catches this, seems to be fixed if I disable hiding the windows.
If you haven't already gotten to this I'll pick it up, it's messing with my try runs in bug 1458119 and I'd like to get that sorted.
Submitting this as just a patch since it needs to apply over bug 1458119 when that is able to land (due to my messing with the tests there).
Assignee: mdeboer → agashlin
Attachment #8982558 - Attachment is obsolete: true
Attachment #8982745 - Flags: review?(mdeboer)
Attachment #8982745 - Flags: review?(hskupin)
Comment on attachment 8982745 [details] [diff] [review]
Fix use of _browserWindows while it is changing, add test

Running on try at:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=642935289ea761f53776b9a6bb48bc805e9c3218

Evidently there are still some issues, I'll keep looking into it.
Attachment #8982745 - Flags: review?(mdeboer)
Attachment #8982745 - Flags: review?(hskupin)
Attachment #8982745 - Attachment is obsolete: true
Attachment #8982982 - Flags: review?(mdeboer)
I think the other issue was introduced by bug 1255977 [1]. This resolves one of the racing promises if it ever sees ipc:content-shutdown. It looks like this was added to avoid waiting for the flush if (any?) content process has crashed (see bug 1255977 comment 21), but the referenced code in ContentCrashHandlers.jsm checks if the shutdown was abnormal before deciding that the process crashed [2]. The code added in bug 1255977 does not do this check.

So I updated this patch to add the check, seems to be running ok:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=6c6c76d99e82b0a93911e5ee84d0282a2fd8cd5f&selectedJob=181664820

Technically it is a separate bug but I think it's easier to fix it here.

[1] https://searchfox.org/mozilla-central/rev/292d295d6b084b43b70de26a42e68513bb7b36a3/browser/components/sessionstore/SessionStore.jsm#1639
[2] https://searchfox.org/mozilla-central/rev/292d295d6b084b43b70de26a42e68513bb7b36a3/browser/modules/ContentCrashHandlers.jsm#108
Attachment #8982983 - Flags: review?(mdeboer)
This is just a variation on TestSessionStoreEnabled that doesn't open the private windows after opening the first three normal windows, this exposes more missed flushes (such as those caused by Parts 1 & 2).

Needs to be applied on top of Part 2 from Bug 1458119.
Attachment #8982984 - Flags: review?(hskupin)
Comment on attachment 8982983 [details] [diff] [review]
Part 2: Don't abort flushing on normal content process shutdown.

Review of attachment 8982983 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for this change, Adam! I fear that this may be the cause of quite a few other dataloss issues that might not even have been reported, but impossible to debug if so.

::: browser/components/sessionstore/SessionStore.jsm
@@ +1625,5 @@
>  
>            const observeTopic = topic => {
>              let deferred = PromiseUtils.defer();
>              const cleanup = () => Services.obs.removeObserver(deferred.resolve, topic);
> +            Services.obs.addObserver((aSubject, aTopic, aData) => {

We've deprecated the use of `aArgument` in JS, so can you make this `subject => {`? (You're not using, thus don't need, the other two arguments.)
Attachment #8982983 - Flags: review?(mdeboer) → review+
Comment on attachment 8982982 [details] [diff] [review]
Part 1: Avoid using _browserWindows while it is changing.

Review of attachment 8982982 [details] [diff] [review]:
-----------------------------------------------------------------

What I'd like you to do instead is:

1. Make the change in BrowserWindowTracker.jsm instead, so it'll be fixed for all consumers,
2. In the `orderedWindows` iterator, change the following from
   `for (let window of _trackedWindows)` to
   `for (let window of [..._trackedWindows])`
3. Brownie points for a descriptive comment before it that explains why we bother to clone the list.

Does that make sense?

Thanks, Adam - these changes are solid. I apologize for the delay in reviewing your patches - I strive for 24h turn-arounds - but oh well.
Attachment #8982982 - Flags: review?(mdeboer) → review-
(In reply to Mike de Boer [:mikedeboer] from comment #12)
> Comment on attachment 8982982 [details] [diff] [review]
> Part 1: Avoid using _browserWindows while it is changing.
> 
> Review of attachment 8982982 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> What I'd like you to do instead is:
> ...

Sounds good, I agree that no one wants that to change while iterating.

> Thanks, Adam - these changes are solid. I apologize for the delay in
> reviewing your patches - I strive for 24h turn-arounds - but oh well.

No problem, thanks for the good suggestions!
It looks like all other consumers of orderedWindows were already dumping into an array before iterating, but this will protect future consumers. And it's less code than my earlier approach.
Attachment #8982982 - Attachment is obsolete: true
Attachment #8983563 - Flags: review?(mdeboer)
Cleanup, carrying r+ forward.
Attachment #8982983 - Attachment is obsolete: true
Attachment #8983564 - Flags: review+
Comment on attachment 8983563 [details] [diff] [review]
Part 1: Prevent orderedWindows from changing while iterating.

Review of attachment 8983563 [details] [diff] [review]:
-----------------------------------------------------------------

Cool, thanks!
Attachment #8983563 - Flags: review?(mdeboer) → review+
Comment on attachment 8982984 [details] [diff] [review]
Part 3: Test exit soon after navigation

Review of attachment 8982984 [details] [diff] [review]:
-----------------------------------------------------------------

Actually mozreview would be better, but here we go. Given that the patch on the other bug is close to land I can review it now. Please see my inline comments.

::: testing/firefox-ui/tests/functional/sessionstore/test_restore_windows_after_restart_and_quit.py
@@ +37,5 @@
>                           been restored. Expected {}, got {}.
>                           """.format(self.test_windows, current_windows_set))
>  
>  
> +class TestSessionStoreEnabledNoPrivate(TestBaseRestoreWindows):

The whole code gets duplicated except one option in `setUp`. Mind creating two more classes for enabled and disabled, which your new test and the existing one could base on?

> class SessionStoreEnabledTestCase(SessionStoreTestCase)
> class SessionStoreDisabledTestCase(SessionStoreTestCase)

Existing tests could be kept in those classes.

Then add the following classes:

> class TestSessionStoreEnabledAllWindows(SessionStoreEnabledTestCase)
> class TestSessionStoreEnabledNoPrivateWindows(SessionStoreEnabledTestCase)

Those would vary only in `setUp` for the `include_private` argument, but then wouldn't need their own test methods because they inherit them from the base class, and we save us from this code duplication.

@@ +38,5 @@
>                           """.format(self.test_windows, current_windows_set))
>  
>  
> +class TestSessionStoreEnabledNoPrivate(TestBaseRestoreWindows):
> +    def setUp(self):

nit: please add an empty line before.
Attachment #8982984 - Flags: review?(hskupin) → review-
(In reply to Henrik Skupin (:whimboo) from comment #17)
> Comment on attachment 8982984 [details] [diff] [review]
> Part 3: Test exit soon after navigation
> 
> Review of attachment 8982984 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Actually mozreview would be better, but here we go. Given that the patch on
> the other bug is close to land I can review it now. Please see my inline
> comments.

Yeah sorry about the mess here, if it's possible to push stuff to mozreview on top of unlanded changes I don't know how.

> The whole code gets duplicated except one option in `setUp`. Mind creating
> two more classes for enabled and disabled, which your new test and the
> existing one could base on?

As a compromise see the attached, I have the one test class inherit from another, only overriding setUp to provide include_private=False. I ran into trouble making a common base class for the two of them because the presence of a test* method on the base class would cause that test to be run an extra time for the base class.
Attachment #8982984 - Attachment is obsolete: true
Attachment #8983925 - Flags: review?(hskupin)
(In reply to Adam Gashlin [:agashlin] from comment #18)
> Yeah sorry about the mess here, if it's possible to push stuff to mozreview
> on top of unlanded changes I don't know how.

Just include the unlanded changes in your new bookmark too for the time being. You can remove them later. That's how I do it.

> As a compromise see the attached, I have the one test class inherit from
> another, only overriding setUp to provide include_private=False. I ran into
> trouble making a common base class for the two of them because the presence
> of a test* method on the base class would cause that test to be run an extra
> time for the base class.

Oh, right. I totally forgot that. pytest fwiw.... I hope that at some time we can get rid of the unittest dependency.
Comment on attachment 8983925 [details] [diff] [review]
Part 3: Test exit soon after navigation

Review of attachment 8983925 [details] [diff] [review]:
-----------------------------------------------------------------

r=wc

::: testing/firefox-ui/tests/functional/sessionstore/test_restore_windows_after_restart_and_quit.py
@@ +23,5 @@
>          some number of tabs in them. Once the tabs have loaded, restarts
>          the browser, and then ensures that the standard tabs have been
>          restored, and that the private ones have not.
> +
> +        Pass include_private=False to setUp() to not open any private windows.

This comment doesn't match what this test is covering. I assume you wanted to add this to the `setUp` method instead?

@@ +44,5 @@
>  
> +class TestSessionStoreEnabledNoPrivateWindows(TestSessionStoreEnabledAllWindows):
> +
> +    def setUp(self):
> +        super(TestSessionStoreEnabledNoPrivateWindows, self).setUp(include_private=False)

This looks fine to me, and we can clearly get rid of the duplicated test. Thanks.
Attachment #8983925 - Flags: review?(hskupin) → review+
(In reply to Henrik Skupin (:whimboo) from comment #19)
> Just include the unlanded changes in your new bookmark too for the time
> being. You can remove them later. That's how I do it.

Makes sense, I'll note that for next time.

> >          some number of tabs in them. Once the tabs have loaded, restarts
> >          the browser, and then ensures that the standard tabs have been
> >          restored, and that the private ones have not.
> > +
> > +        Pass include_private=False to setUp() to not open any private windows.
> 
> This comment doesn't match what this test is covering. I assume you wanted
> to add this to the `setUp` method instead?

Moved to setUp.

> >  
> > +class TestSessionStoreEnabledNoPrivateWindows(TestSessionStoreEnabledAllWindows):
> > +
> > +    def setUp(self):
> > +        super(TestSessionStoreEnabledNoPrivateWindows, self).setUp(include_private=False)
> 
> This looks fine to me, and we can clearly get rid of the duplicated test.
> Thanks.

I'm a little unclear what you mean by "duplicated test", I'm assuming that you aren't proposing any changes here but rather recognizing that this avoids duplicating a test.

Carrying forward r+.
Attachment #8983925 - Attachment is obsolete: true
Attachment #8984501 - Flags: review+
Correct, no duplication anymore. Looks way better in regards of maintainability.
I missed that I had to do a QueryInterface when I copied the check from ContentCrashHandler.
Attachment #8983564 - Attachment is obsolete: true
Attachment #8984610 - Flags: review+

Comment 24

Last year
Pushed by rgurzau@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/2985d2ce41ad
Part 1: Prevent orderedWindows from changing while iterating. r=mikedeboer
https://hg.mozilla.org/integration/mozilla-inbound/rev/56381f46d1d9
Part 2: Fix aborting flush on normal content shutdown. r=mikedeboer
https://hg.mozilla.org/integration/mozilla-inbound/rev/cf89498e0dec
Part 3: Test session save when quitting quickly. r=whimboo
Keywords: checkin-needed

Comment 25

Last year
bugherder
https://hg.mozilla.org/mozilla-central/rev/2985d2ce41ad
https://hg.mozilla.org/mozilla-central/rev/56381f46d1d9
https://hg.mozilla.org/mozilla-central/rev/cf89498e0dec
Status: ASSIGNED → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Target Milestone: --- → Firefox 62

Comment 26

Last year
Can you fix bug 1235231 now?
Flags: needinfo?(agashlin)
(In reply to Robert Ab from comment #26)
> Can you fix bug 1235231 now?

I've only been working on session store to fix some data loss issues, I don't plan on tackling that bug, irritating as it is.
Flags: needinfo?(agashlin)
You need to log in before you can comment on or make changes to this bug.