Closed Bug 1351677 Opened 7 years ago Closed 7 years ago

Async tab switcher tries to show a spinner if the selected tab in a restored window isn't the initial one

Categories

(Core :: DOM: Content Processes, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: mconley, Assigned: mconley)

References

Details

Attachments

(4 files, 1 obsolete file)

The attached graph shows[1] the percentage of clients on Nightly seeing spinners increasing starting on the March 22nd Nightly build, and peaking around the March 25th Nightly build at over half of the population (~56%). So far, we seem to be holding there.

Leading up to March 22nd, we were pretty steady at around ~40% of the population - so we've increased by something like ~16%. That's bad, and we should figure out what's going on.

The other graphs indicate that of that 16% increase, we can break down the severity of the increase as follows:

~12% increase on users seeing spinners of length 0 - 999ms
~3% increase on users seeing spinners of length 1000 - 2296ms

The rest is kinda distributed all over the place.

Other interesting facts:

On March 22nd, I noticed an increase in the proportion of spinners that are of type "unseenNew". This went from around ~2% of spinners to around ~9% of spinners, and it's pretty steady there: https://mzl.la/2oaDn0i

March 21st Nightly build rev:
5fe5dcf1c10a4523ba3f0a20295551462c2dae11 

March 22nd Nightly build rev:
201231223cd4354a450c3e5d80959f35b8e4cf0c 

Changelog: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=5fe5dcf1c10a4523ba3f0a20295551462c2dae11&tochange=201231223cd4354a450c3e5d80959f35b8e4cf0c

I'm kinda suspicious of bug 1337056.

Hey mystor, do you know if your patches in bug 1337056 would be somehow bogging down the main thread in the content process on start up?

[1]: Apologies for the mess of markers and the date in the top right of the graph. Trust me that that says March 22nd.
mystor is innocent. This is fallout from bug 1256472.
Blocks: 1256472
The "good" news is that I believe this spinner is mostly invisible, and happens at the point of window restoration.
Here's the tab switcher log during window restoration:

START
Initial tab is loaded?: false
requestTab 15(about:blank) 0:VR(+?) 1:(-) 2:(-) 3:(-) 4:(-) 5:(-) 6:(-) 7:(-) 8:(-) 9:(-) 10:(-) 11:(-) 12:(-) 13:(-) 14:(-) 15:(-) 16:(-) 17:(-) 18:(-) 19:(-) 20:(-) 21:(-)
Loading tab 15(about:blank)
SPINNER SHOWN
done 0:V(+?) 1:(-) 2:(-) 3:(-) 4:(-) 5:(-) 6:(-) 7:(-) 8:(-) 9:(-) 10:(-) 11:(-) 12:(-) 13:(-) 14:(-) 15:LR(+?) 16:(-) 17:(-) 18:(-) 19:(-) 20:(-) 21:(-)
done 0:V(+?) 1:(-) 2:(-) 3:(-) 4:(-) 5:(-) 6:(-) 7:(-) 8:(-) 9:(-) 10:(-) 11:(-) 12:(-) 13:(-) 14:(-) 15:LR(+?) 16:(-) 17:(-) 18:(-) 19:(-) 20:(-) 21:(-)
done 0:V(+?) 1:(-) 2:(-) 3:(-) 4:(-) 5:(-) 6:(-) 7:(-) 8:(-) 9:(-) 10:(-) 11:(-) 12:(-) 13:(-) 14:(-) 15:LR(+?) 16:(-) 17:(-) 18:(-) 19:(-) 20:(-) 21:(-)
onTabChildNotReady(0) 0:V(+?) 1:(-) 2:(-) 3:(-) 4:(-) 5:(-) 6:(-) 7:(-) 8:(-) 9:(-) 10:(-) 11:(-) 12:(-) 13:(-) 14:(-) 15:LR(+?) 16:(-) 17:(-) 18:(-) 19:(-) 20:(-) 21:(-)
done 0:V(+?) 1:(-) 2:(-) 3:(-) 4:(-) 5:(-) 6:(-) 7:(-) 8:(-) 9:(-) 10:(-) 11:(-) 12:(-) 13:(-) 14:(-) 15:LR(+?) 16:(-) 17:(-) 18:(-) 19:(-) 20:(-) 21:(-)
onTabChildNotReady(15) 0:V(+?) 1:(-) 2:(-) 3:(-) 4:(-) 5:(-) 6:(-) 7:(-) 8:(-) 9:(-) 10:(-) 11:(-) 12:(-) 13:(-) 14:(-) 15:LR(+?) 16:(-) 17:(-) 18:(-) 19:(-) 20:(-) 21:(-)
DEBUG: spinner time = 146
DEBUG: tab switch time = 147
SPINNER HIDDEN
I'm getting more responsive tab usage and better performance, and i was stuck on firefox 44 beta 9 since the session restore regression changes.

I don't see any increase in spinners personally.
So here's what I believe is going on in this bug:

Firefox starts, and we're restoring a session. We're restoring a window for that session. The tab that we will ultimately select in that window is not the initially selected browser tab, so after making sure that the right number of tabs are in that window, we select the tab for that restored window.

This causes the async tab switcher to kick in, which notices that the tab we're switching from (the initial tab) hasn't yet had its layers arrive at the compositor, so it's in the unloaded state. The tab that we've requested also doesn't have its layers loaded either, and in the postActions, we call updateDisplay. updateDisplay computes that we don't need to show a blank tab, and computes that we do need to show a spinner, and so a spinner is shown.

Eventually, the TabChildNotReady events reach the switcher, and (I think) blank is shown instead.

Trying to come up with solutions now...
Okay, so I think I know why bug 1256472 caused this to go up.

Like I said in comment 8, when a window is restored, the selected tab for that window is changed to match the restored state. Before bug 1256472, new tabs that were added during the restoration were non-remote by default, meaning that we could switch to them synchronously.

After bug 1256472 landed, those background tabs are remote, which means we need to wait for their layers to show up at the compositor before we can complete the switch. I suspect that when we do that switch, the content process has completed starting up, but the content process isn't in a position for us to interrupt its JS right away (not 100% sure why just yet). When the HangMonitorChild finally gets an opportunity to ask the TabChild to force paint, we discover that the TabChild doesn't exist, and do the work to fire the TabChildNotReady event that was added in bug 1342927 to complete the switch.

There are two things I think that will help with this:

1) Lazy browser instantiation during restoration from bug 1345090. This will mean far less traffic to the content process, so it has more time to deal with the tab switch

2) We can avoid doing the switch entirely if we just _move_ the currently selected tab to the right index before doing the restore.

(1) is underway. I'm going to try (2) since that'll allow us to avoid unnecessary work anyways.
Attachment #8852708 - Attachment is obsolete: true
Hey mikedeboer, I know that there's already significant work going on in bug 1345090 for SessionStore.jsm, but I don't think these patches contradict it. In fact, I believe these patches compliment the work nicely (though they can land separately), since once we use lazy browsers, it'll allow us to avoid inserting a browser for the selected tab if it's not the initial tab.

Also note that I think browser_394759_purge.js has been race-y for a while, and I guess my change caused the race to change somewhat, as the browser:purge-domain-data notification started to appear _before_ we had a chance to add the observer - so now, I make sure we add the observer first.
See Also: → 1345090
Assignee: nobody → mconley
Summary: Find out why more Nightly users are seeing tab switch spinners since March 21st → Async tab switcher tries to show a spinner if the selected tab in a restored window isn't the initial one
Comment on attachment 8854577 [details]
Bug 1351677 - When restoring a window, swap the initially selected tab with the desired selected tab instead of tab switching.

https://reviewboard.mozilla.org/r/126534/#review129488

LGTM! r=me with comments addressed.

::: browser/components/sessionstore/SessionStore.jsm:3209
(Diff revision 1)
>      let overwriteTabs = aOptions && aOptions.overwriteTabs;
>      let isFollowUp = aOptions && aOptions.isFollowUp;
>      let firstWindow = aOptions && aOptions.firstWindow;
> +    // See SessionStoreInternal.restoreTabs for a description of what
> +    // selectTab represents.
> +    let selectTab = (overwriteTabs ? (parseInt(winData.selected || "1")) : 0);

I see you moved this code, but here goes: always use a base-value with `parseInt` -> `parseInt(winData.selected || 1, 10)`. Also, the double parentheses are unnecessary.

::: browser/components/sessionstore/SessionStore.jsm:3300
(Diff revision 1)
>        }
>      }
>  
> +    if (selectTab > 0) {
> +      // The state we're restoring wants to select a particular tab. This
> +      // imples that we're overwriting tabs.

nit: implies.

::: browser/components/sessionstore/SessionStore.jsm:3319
(Diff revision 1)
> +        // We'll try to do (1), and then fallback to (2).
> +
> +        let selectedTab = tabbrowser.selectedTab;
> +        let tabAtTargetIndex = tabs[targetIndex];
> +        let userContextsMatch =
> +          selectedTab.userContextId == tabAtTargetIndex.userContextId;

nit: I'm OK with this staying on a single line.

::: browser/components/sessionstore/SessionStore.jsm:3336
(Diff revision 1)
> +          tabbrowser.selectedTab = tabs[targetIndex];
> +        }
> +      }
> +    }
> +
> +    for (let i = 0; i < winData.tabs.length; ++i) {

You can re-use `newTabCount` for `winData.tabs.length`.

::: browser/components/sessionstore/SessionStore.jsm:3339
(Diff revision 1)
> +    }
> +
> +    for (let i = 0; i < winData.tabs.length; ++i) {
> +      if (winData.tabs[i].pinned) {
> +        let tab = tabs[i]
> +        tabbrowser.pinTab(tab);

You can fold these two lines together; they're quite easy to follow.

::: browser/components/sessionstore/SessionStore.jsm:3344
(Diff revision 1)
> +        tabbrowser.pinTab(tab);
> +      } else {
> +        // Pinned tabs are clustered at the start of the tab strip. As
> +        // soon as we reach a tab that isn't pinned, we know there aren't
> +        // any more for this window.
> +        break;

This made me less worried about introducing a second loop here. Thanks!
Attachment #8854577 - Flags: review?(mdeboer) → review+
Comment on attachment 8854578 [details]
Bug 1351677 - Update browser_remoteness_flip_on_restore test to account for initial tab swapping.

https://reviewboard.mozilla.org/r/126536/#review129494

Sounds good to me, thanks for the little explainers near the tests.
Attachment #8854578 - Flags: review?(mdeboer) → review+
Comment on attachment 8854579 [details]
Bug 1351677 - Fix a race in browser_394759_purge.js.

https://reviewboard.mozilla.org/r/126538/#review129498
Attachment #8854579 - Flags: review?(mdeboer) → review+
I'm not sure what to make of the M-bc results, because they seem to fail for reasons unrelated to these patches. I mean, why would 'browser/components/resistfingerprinting/test/browser/browser_roundedWindow_newWindow.js' permafail on OSX 10.10 suddenly? And what about 'browser/base/content/test/static/browser_all_files_referenced.js' on Win7?
(In reply to Mike de Boer [:mikedeboer] from comment #20)
> I'm not sure what to make of the M-bc results, because they seem to fail for
> reasons unrelated to these patches. I mean, why would
> 'browser/components/resistfingerprinting/test/browser/
> browser_roundedWindow_newWindow.js' permafail on OSX 10.10 suddenly? And
> what about
> 'browser/base/content/test/static/browser_all_files_referenced.js' on Win7?

Unsure. I can't reproduce locally...

However, my try push was for an artifact build. I'll do a non-artifact try push now with the latest changes to see if they go away. If so, I'll file a new set of bugs for the artifact failures.
Yeah, the oranges went away with a normal build (as opposed to an artifact build). I'm going to push central to try for an artifact build without these patches to see if those existed prior to these patches. I'd be very surprised to learn if these patches somehow introduced these oranges only for artifact builds, but let's find out. Setting needinfo on self to sort that out.

Going to land the current set of patches on autoland.
Flags: needinfo?(mconley)
(In reply to Mike Conley (:mconley) from comment #21)
> However, my try push was for an artifact build.

Ugh. Yeah, that must be it. I didn't notice it was an artifact build.
Pushed by mconley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/cb2a92a86132
When restoring a window, swap the initially selected tab with the desired selected tab instead of tab switching. r=mikedeboer
https://hg.mozilla.org/integration/autoland/rev/f9049de6a1f4
Update browser_remoteness_flip_on_restore test to account for initial tab swapping. r=mikedeboer
https://hg.mozilla.org/integration/autoland/rev/77a91768f109
Fix a race in browser_394759_purge.js. r=mikedeboer
Turns out those artifact oranges are bug 1353894 and bug 1351657. Unrelated, and we're done here.
Flags: needinfo?(mconley)
Depends on: 1357098
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: