Closed Bug 1437382 Opened 6 years ago Closed 6 years ago

Private tab updates shortly before backgrounding aren't guaranteed to make it into storage

Categories

(Firefox for Android Graveyard :: Session Restore, enhancement, P1)

All
Android
enhancement

Tracking

(firefox60 fixed)

RESOLVED FIXED
Firefox 60
Tracking Status
firefox60 --- fixed

People

(Reporter: JanH, Assigned: JanH)

References

Details

Attachments

(12 files)

59 bytes, text/x-review-board-request
jchen
: review+
Details
59 bytes, text/x-review-board-request
jchen
: review+
Details
59 bytes, text/x-review-board-request
jchen
: review+
Details
59 bytes, text/x-review-board-request
jchen
: review+
Details
59 bytes, text/x-review-board-request
jchen
: review+
Details
59 bytes, text/x-review-board-request
jchen
: review+
Details
59 bytes, text/x-review-board-request
esawin
: review+
Details
59 bytes, text/x-review-board-request
esawin
: review+
Details
59 bytes, text/x-review-board-request
esawin
: review+
Details
59 bytes, text/x-review-board-request
esawin
: review+
Details
59 bytes, text/x-review-board-request
esawin
: review+
Details
59 bytes, text/x-review-board-request
esawin
: review+
Details
When we get an application-background event, the session store attempts to flush its data into storage, however while I guess that at some point this might have worked as designed, after a number of refactorings in that area (possibly especially bug 1260243 and bug 1346413) this is certainly no longer guaranteed to happen before we return control from the relevant Android lifecycle events.

For normal tabs this isn't quite as tragic as it seems, because while we could be killed at any moment after returning from onPause(), in practice most of the time we probably still manage to finish writing the new session store file anyway, even while already backgrounded.

For private browsing tabs this is different, though: As we don't want to write them to disk, we store them as part of our savedInstanceState. The current situation means that the data flush triggered by our application-background handling happens too late to update mPrivateBrowsingSession in GeckoApp in time for the onSavedInstanceState callback.
This means that the private browsing data held as part of our savedInstanceState in case of OOM-kills might be up to ~10 seconds out of date, which is enough time for some supposedly closed private tabs to resurface after an OOM background kill.

STR:
1. Turn on "browser.sessionstore.debug_logging" in about:config, restart Firefox and watch logcat for messages from "GeckoSessionStore".
2. Open some private tabs and wait until session store activity has settled down.
3. Trigger a session store write by scrolling the current tab slightly/selecting a different tab and wait until saveState/writeFile handling for this change has completed.
4. Now close some of those private tabs and then immediately hit the home button or switch to some other app.
5. Run "adb shell am kill <Firefox package name>" to simulate an OOM kill.
6. Open Firefox again.
7. The private tabs you closed in step 4.) unexpectedly reappear.
Having looked some more, I think that bug 1260243 actually isn't to blame here and the straw that broke the camel's back was probably bug 1346413 - now we're triggering application-background handling only after/during onStop, which is definitiviely too late for onSaveInstanceState, whereas previously this was kicked off (when things worked) in onPause.

We still had a race condition, though, I think, because our Android onPause-handling *didn't* actually block for Gecko completing its "application-background" work, so depending on how much time there was between onPause and onSaveInstanceState and how soon the session store managed to do its work in response to application-background, this might or might not have worked even previously.

There are two possible ways to solve this: One is to block during onSaveInstanceState until the session store has sent over fresh private tabs data. By making sure that we *only* wait for the private data to be sent over, and *not* waiting for everything else that currently happens as part of application-background, including the session store writing the non-private tabs to disk, this can be made to happen quite quickly, but unfortunately it still introduces the possibility of our UI hanging if Gecko is very busy and takes longer than normally to respond.

The other way is to speed up processing of tab events relating to private tabs within the session store, so they're not subject to the regular write throttling interval of by default 10 s.

Since I think that the most important issue is persisting TabClose events as reliably as possible, I propose the following combined solution:
1. onSaveInstanceState will wait for the session store to send over fresh private tabs data, but only for a limited amount of time, say on the order of 500 ms (and add some telemetry to get some better data on how long we typically have to wait here than just my phone). onSaveInstanceState means we're going into the background, so we can afford to wait a little, but not too much in case the user returns to Firefox immediately or we're only switching to a different activity within Firefox, e.g. our settings. If we have to abort the wait, it means we'll miss private tabs that have navigated or been opened within the last ~ 10 seconds, but...
2. ... to make sure that GeckoApp's mPrivateBrowsingSession is less likely to be out of date in that regard, speed up the processing of TabClose events, so they don't have to wait up to 10 seconds before being transmitted to GeckoApp.
(In reply to Jan Henning [:JanH] from comment #1)
> speed up the processing of TabClose events, so they don't have to wait
> up to 10 seconds before being transmitted to GeckoApp.

That is TabClose events relating to private tabs.
Comment on attachment 8953737 [details]
Bug 1437382 - Part 6 - Switch to using PrivateBrowsing helper method in session store.

https://reviewboard.mozilla.org/r/222856/#review229162
Attachment #8953737 - Flags: review?(esawin) → review+
Comment on attachment 8953738 [details]
Bug 1437382 - Part 7 - Change session store time callback behaviour.

https://reviewboard.mozilla.org/r/222858/#review229166
Attachment #8953738 - Flags: review?(esawin) → review+
Comment on attachment 8953739 [details]
Bug 1437382 - Part 8 - Extract functions for creating and cancelling the delayed write timer.

https://reviewboard.mozilla.org/r/222860/#review229248
Attachment #8953739 - Flags: review?(esawin) → review+
Comment on attachment 8953739 [details]
Bug 1437382 - Part 8 - Extract functions for creating and cancelling the delayed write timer.

https://reviewboard.mozilla.org/r/222860/#review229254
Comment on attachment 8953740 [details]
Bug 1437382 - Part 9 - Track number of outstanding "private tabs only" saveState calls.

https://reviewboard.mozilla.org/r/222862/#review229262
Attachment #8953740 - Flags: review?(esawin) → review+
Comment on attachment 8953741 [details]
Bug 1437382 - Part 10 - Use a reduced save delay when saving private tabs.

https://reviewboard.mozilla.org/r/222864/#review229276

::: mobile/android/components/SessionStore.js:991
(Diff revision 1)
>        // Interval until the next disk operation is allowed
>        let currentDelay = this._lastSaveTime + this._interval - Date.now();
>  
>        // If we have to wait, set a timer, otherwise saveState directly
> -      let delay = Math.max(currentDelay, MINIMUM_SAVE_DELAY);
> +      let delay = aPrivateTabsOnly ?
> +                    SAVE_INTERVAL_PRIVATE_TABS : Math.max(currentDelay, MINIMUM_SAVE_DELAY);

We (should) a use custom indentation rule for the ternary operator:
<condition>
  ? <yes>
  : <no>

::: mobile/android/components/SessionStore.js:991
(Diff revision 1)
>        // Interval until the next disk operation is allowed
>        let currentDelay = this._lastSaveTime + this._interval - Date.now();
>  
>        // If we have to wait, set a timer, otherwise saveState directly
> -      let delay = Math.max(currentDelay, MINIMUM_SAVE_DELAY);
> +      let delay = aPrivateTabsOnly ?
> +                    SAVE_INTERVAL_PRIVATE_TABS : Math.max(currentDelay, MINIMUM_SAVE_DELAY);

We (should) a use custom indentation rule for the ternary operator:
<condition>
  ? <yes>
  : <no>

::: mobile/android/components/SessionStore.js:991
(Diff revision 1)
>        // Interval until the next disk operation is allowed
>        let currentDelay = this._lastSaveTime + this._interval - Date.now();
>  
>        // If we have to wait, set a timer, otherwise saveState directly
> -      let delay = Math.max(currentDelay, MINIMUM_SAVE_DELAY);
> +      let delay = aPrivateTabsOnly ?
> +                    SAVE_INTERVAL_PRIVATE_TABS : Math.max(currentDelay, MINIMUM_SAVE_DELAY);

We (should) a use custom indentation rule for the ternary operator:
<condition>
  ? <yes>
  : <no>
Attachment #8953741 - Flags: review?(esawin) → review+
Comment on attachment 8953742 [details]
Bug 1437382 - Part 11 - Shorten save delay when private tabs are closed.

https://reviewboard.mozilla.org/r/222866/#review229280
Attachment #8953742 - Flags: review?(esawin) → review+
Comment on attachment 8953731 [details]
Bug 1437382 - Part 0 - Cleanup imports (and unused variables).

https://reviewboard.mozilla.org/r/222842/#review229318
Attachment #8953731 - Flags: review?(nchen) → review+
Comment on attachment 8953732 [details]
Bug 1437382 - Part 1 - Cleanup GeckoActivityMonitor.

https://reviewboard.mozilla.org/r/222844/#review229320
Attachment #8953732 - Flags: review?(nchen) → review+
Comment on attachment 8953733 [details]
Bug 1437382 - Part 2 - Allow triggering application-background earlier when possible.

https://reviewboard.mozilla.org/r/222846/#review229324
Attachment #8953733 - Flags: review?(nchen) → review+
Comment on attachment 8953734 [details]
Bug 1437382 - Part 3 - Flush session store tab data separately from application-background.

https://reviewboard.mozilla.org/r/222848/#review229326
Attachment #8953734 - Flags: review?(nchen) → review+
Comment on attachment 8953735 [details]
Bug 1437382 - Part 4 - Make sure that flushing private tabs data reaches GeckoApp in time.

https://reviewboard.mozilla.org/r/222850/#review229330

::: mobile/android/base/java/org/mozilla/gecko/GeckoApp.java:614
(Diff revision 1)
>      }
>  
>      @Override
>      protected void onSaveInstanceState(Bundle outState) {
> +        synchronized (this) {
> +            mPrivateBrowsingSessionOutdated = true;

If GeckoApplication triggered and _completed_ a flush before this, I think this line here will make us wait unnecessarily?

::: mobile/android/components/SessionStore.js:1088
(Diff revision 1)
> +      log("_saveState() writing normal data, " +
> +           normalData.windows[0].tabs.length + " tabs in window[0]");
> +    } else {
> +      log("_saveState() writing empty normal data");
> +    }
> +    this._writeFile(this._sessionFile, this._sessionFileTemp, normalData, aAsync);

Any reason for moving this to after sending private session?
Attachment #8953735 - Flags: review?(nchen)
Comment on attachment 8953736 [details]
Bug 1437382 - Part 5 - Test that flushing pending session store data sends the expected messages to Java.

https://reviewboard.mozilla.org/r/222852/#review229344
Attachment #8953736 - Flags: review?(nchen) → review+
Comment on attachment 8953735 [details]
Bug 1437382 - Part 4 - Make sure that flushing private tabs data reaches GeckoApp in time.

https://reviewboard.mozilla.org/r/222850/#review229330

> If GeckoApplication triggered and _completed_ a flush before this, I think this line here will make us wait unnecessarily?

GeckoApplication's background handling is triggered by the GeckoActivityMonitor, which in turn is called only *after* we've called super.onSaveInstanceState, as the ActivityLifecycleCallbacks are handled by Android's base Activity class. (I suppose I should add a comment to that effect) So we'll always have triggered a fresh flush for which we're attempting to wait below.

> Any reason for moving this to after sending private session?

When flushing tabs, \_writeFile writes synchronously, which takes time, plus due to the way \_write is structured to support writing both synchronously or asynchronously, unless I'm mistaken we always have to take a detour via the event loop before being able to continue with the part following the .then{... at https://dxr.mozilla.org/mozilla-central/rev/7208b6a7b11c3ed8c87a7f17c9c30a8f9583e791/mobile/android/components/SessionStore.js#1173.

All that time spent writing non-private tabs
- means we're blocking the UI thread longer than necessary
- makes it more likely that the wait time expires and we fail to get the most current private session store data into the savedInstanceState

Hence I've swapped the order of those two operations around.
Comment on attachment 8953735 [details]
Bug 1437382 - Part 4 - Make sure that flushing private tabs data reaches GeckoApp in time.

https://reviewboard.mozilla.org/r/222850/#review229330

> GeckoApplication's background handling is triggered by the GeckoActivityMonitor, which in turn is called only *after* we've called super.onSaveInstanceState, as the ActivityLifecycleCallbacks are handled by Android's base Activity class. (I suppose I should add a comment to that effect) So we'll always have triggered a fresh flush for which we're attempting to wait below.

In that case would it make sense to always flush from GeckoApp (and not flush from GeckoApplication?). During a backgrounding event where GeckoApp is not involved, presumably we don't need to deal with session store either?

> When flushing tabs, \_writeFile writes synchronously, which takes time, plus due to the way \_write is structured to support writing both synchronously or asynchronously, unless I'm mistaken we always have to take a detour via the event loop before being able to continue with the part following the .then{... at https://dxr.mozilla.org/mozilla-central/rev/7208b6a7b11c3ed8c87a7f17c9c30a8f9583e791/mobile/android/components/SessionStore.js#1173.
> 
> All that time spent writing non-private tabs
> - means we're blocking the UI thread longer than necessary
> - makes it more likely that the wait time expires and we fail to get the most current private session store data into the savedInstanceState
> 
> Hence I've swapped the order of those two operations around.

Ok that makes sense. Thanks!
Comment on attachment 8953735 [details]
Bug 1437382 - Part 4 - Make sure that flushing private tabs data reaches GeckoApp in time.

https://reviewboard.mozilla.org/r/222850/#review229330

> In that case would it make sense to always flush from GeckoApp (and not flush from GeckoApplication?). During a backgrounding event where GeckoApp is not involved, presumably we don't need to deal with session store either?

In a way yes, you're right - currently the session store only handles full Firefox tabs, which are more or less linked to GeckoApp/BrowserApp. Even if/when some day somebody gets the session store to work with GeckoView as well I suppose the API for that will look differently, with the session store only responsible for collecting and providing the serialised tab data and everything else (including actually saving the data) being left to the GeckoView consumer.

However:
1. When an activity is finishing, onSaveInstanceState won't be called. The GeckoActivityMonitor can simply fall back to dispatching onApplicationBackground() during onStop() instead, but within GeckoApp I'd have to add additional code to trigger a tabs flush during onStop() as well, but only if the activity is finishing.
This scenario is relevant when we're using GeckoApp's finishAndShutdown() method as opposed to messaging browser.js with an "Browser:Quit" message - in the latter case shutdown is initiated by Gecko and the session store will flush pending changes autonomously, whereas in the former case we just finish GeckoApp, rely on our application-background handling to do some minimal cleanup and then just kill our process.
2. Once GeckoApp has been started, the full Firefox tabs opened through it will remain open as long the process running Gecko stays alive, even if the GeckoApp activity is temporarily destroyed while it's in background, either because we're completely in background, or just having a different foreground activity like our settings or anything GeckoView-based. While most probably there won't be anything interesting happening to those tabs while the main browser UI is in background, because of scripts running in background it's not totally excluded. Because the likelihood of us getting killed only depends on whether the whole app is backgrounded or not, I'd still like to flush the current session store state to disk when that happens.
Comment on attachment 8953735 [details]
Bug 1437382 - Part 4 - Make sure that flushing private tabs data reaches GeckoApp in time.

https://reviewboard.mozilla.org/r/222850/#review229940
Attachment #8953735 - Flags: review?(nchen) → review+
Pushed by mozilla@buttercookie.de:
https://hg.mozilla.org/integration/autoland/rev/04b001dda76a
Part 0 - Cleanup imports (and unused variables). r=jchen
https://hg.mozilla.org/integration/autoland/rev/7eb4043fdaca
Part 1 - Cleanup GeckoActivityMonitor. r=jchen
https://hg.mozilla.org/integration/autoland/rev/9f1acf3452e1
Part 2 - Allow triggering application-background earlier when possible. r=jchen
https://hg.mozilla.org/integration/autoland/rev/50522bf54000
Part 3 - Flush session store tab data separately from application-background. r=jchen
https://hg.mozilla.org/integration/autoland/rev/22531e0df456
Part 4 - Make sure that flushing private tabs data reaches GeckoApp in time. r=jchen
https://hg.mozilla.org/integration/autoland/rev/4b7d24b7ec8b
Part 5 - Test that flushing pending session store data sends the expected messages to Java. r=jchen
https://hg.mozilla.org/integration/autoland/rev/e06211b51624
Part 6 - Switch to using PrivateBrowsing helper method in session store. r=esawin
https://hg.mozilla.org/integration/autoland/rev/6c92b6fac3e3
Part 7 - Change session store time callback behaviour. r=esawin
https://hg.mozilla.org/integration/autoland/rev/41f601758661
Part 8 - Extract functions for creating and cancelling the delayed write timer. r=esawin
https://hg.mozilla.org/integration/autoland/rev/6db1cb600f52
Part 9 - Track number of outstanding "private tabs only" saveState calls. r=esawin
https://hg.mozilla.org/integration/autoland/rev/2f104556b212
Part 10 - Use a reduced save delay when saving private tabs. r=esawin
https://hg.mozilla.org/integration/autoland/rev/6bcb8ed863ab
Part 11 - Shorten save delay when private tabs are closed. r=esawin
Depends on: 1459679
Depends on: 1489594
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: