Slow tab closes
Categories
(Firefox :: Sync, defect)
Tracking
()
People
(Reporter: yoasif, Assigned: skhamis)
References
Details
(Whiteboard: [fxsync-])
Attachments
(2 files)
### Basic information
Been noticing this for a little while now, and it is hard to track down what is going on. Disabled a bunch of extensions to see if I could resolve it, but seeing slow tab closes.
Steps to Reproduce:
- Open large restored session.
- Close tab via ctrl-w
Expected Results:
Next tab opens quickly and loads.
Actual Results:
This seems to stutter and jank.
Performance recording (profile)
Profile URL: https://share.firefox.dev/3ldqnWz
System configuration:
OS version: Fedora 37
GPU model: Mesa Intel® UHD Graphics (TGL GT2)
Number of cores: 11th Gen Intel® Core™ i3-1125G4 × 8
Amount of memory (RAM): 64gb
More information
Please consider attaching the following information after filing this bug, if relevant:
- Screenshot / screen recording
- Anonymized about:memory dump, for issues with memory usage
- Troubleshooting information: Go to about:support, click "Copy text to clipboard", paste it to a file, save it, and attach the file here.
Thanks so much for your help.
Reporter | ||
Comment 1•1 year ago
|
||
Comment 2•1 year ago
|
||
The Performance Impact Calculator has determined this bug's performance impact to be medium. If you'd like to request re-triage, you can reset the Performance Impact flag to "?" or needinfo the triage sheriff.
Platforms: Windows
Impact on browser: Causes noticeable jank
This appears to be spending most of its time in something called 'TryFitItems', Mike might have an idea of what to do here.
Comment 3•1 year ago
|
||
So there are quite a few things going on in this profile. The main thread in the parent process is janking several times, but if we want to focus in on the tab close / tab switch, we can do that.
This is a slice of the original profile that spans from the beginning of the cmd_close command to close the tab to the actual tab switch of that tab close. From start to finish, this takes 3.1 seconds, which is pretty brutal.
The first chunk of that time, the parent process is spinning a nested event loop waiting to hear back from the content process. That wait is 1.1s. Unfortunately, I can't see any samples from the Reddit page that's taking so long to respond, but presumably it's main thread is blocked and can't service the PermitUnload request fast enough.
A significant chunk of time (about 826ms) is spent preparing to close the tab. There's nothing super expensive going on here, so it looks more like the thread just isn't getting serviced - I see a bunch of dropped samples in that region.
Then there are some long style and layout flushes for updating the tab strip before the new tab is foregrounded, ending the tab close operation.
So my conclusion is that this machine was highly encumbered when this profile was gathered.
I wonder if there's more we can do to prioritize operations in the foreground window? From what I can tell, there were 6 windows open at the time that this profile was gathered. Asif, do you recall if all windows were visible when you dumped the profile? Or were some overlapping / occluding each other?
Reporter | ||
Comment 4•1 year ago
|
||
I wonder if there's more we can do to prioritize operations in the foreground window? From what I can tell, there were 6 windows open at the time that this profile was gathered. Asif, do you recall if all windows were visible when you dumped the profile? Or were some overlapping / occluding each other?
Mike, I would think that all of at most, all but one or two windows would be occluded - I have one display on this machine and windows are either maximized or there is at most one window in the foreground that is not maximized, with another window maximized (occluding the rest).
Comment 5•1 year ago
|
||
:mconley, we're doing firefox::general triage and I'm not clear what the best next steps should be here. Can you suggest?
Comment 6•1 year ago
|
||
I think the biggest thing we've identified (rather, Bas identified) is that some of our Sync client code is taking a long time to do things on the main thread: https://share.firefox.dev/407LKYA
So let's move this to Sync for now and see what the gang over there thinks.
Comment 7•1 year ago
•
|
||
Passing by, looks like the code that's taking a while is in tryFitItems
in https://searchfox.org/mozilla-central/source/services/sync/modules/util.sys.mjs#422
Which tries to figure out the largest number of tabs we can sync without hitting the limit of the Sync server. Looks like we're taking a long time doing the calculation because it computes the size (by serializing the tabs to JSON) after removing each individual tab if it exceeds the maximum (which would take a long time if a user has a very large number of tabs)
It might an opportunity to improve this code a bit!
:skhamis you implemented these bits in the Rust client quite recently, but the code that's taking a while is in JS, do you know if that path still makes sense now that we use the Rust client?
(and maybe any improvements we make here we can make back in the Rust engine)
Comment 8•1 year ago
|
||
FWIW I'm sure we can do some smart binary search to figure out the right spot to cut the payload, but I'm not sure if we'd be over-optimizing and we don't need it :)
Assignee | ||
Comment 9•1 year ago
|
||
Yes we also implemented tryFitItems
in Rust, so it's possible we can simply remove that call all-together. However something is a bit off as tryFitItems
has been used for quite some time (even way before we added the Rust side of it). I'll dive in a little to ensure this is the actual problem area and removing tryFitItems
is the sole culprit (from the sync perspective).
Assignee | ||
Comment 10•1 year ago
|
||
Still researching this but wanted to do a little bit of an update:
My perf link: https://share.firefox.dev/3lzqVGJ
Steps I took:
- Open 4,000 tabs (tested using
data:
schema to not worry about network hits) - Lowered the "max payload size" to 0.5mb so
- Force closed my browser
- Restored session to open 4k tabs
- Start closing a bunch of tabs
While I'm not quite seeing the same performance hit, I'll be trying to get my browser to hit 8k tabs (which is about around the server limit, using my "test" urls). The culprit though seems to be https://searchfox.org/mozilla-central/source/services/sync/modules/util.sys.mjs#422-423 will take really long with a large amount of tabs.
Asif, do you happen to know how many tabs this profile has? If it's around the ballpark of what I did above then I believe I'm on the right track, if not there there might another culprit.
Assignee | ||
Comment 11•1 year ago
|
||
(In reply to Tarik Eshaq from comment #8)
FWIW I'm sure we can do some smart binary search to figure out the right spot to cut the payload, but I'm not sure if we'd be over-optimizing and we don't need it :)
We already try to do a "best guess" on where we need to cut off the records: https://searchfox.org/mozilla-central/source/services/sync/modules/util.sys.mjs#438. In my testing, it's quite close (give or take a couple of tabs) to where we need to be. So worst case scenarios is in the single-digits! However, trying to get this async will probably help in those scenarios.
Assignee | ||
Comment 12•1 year ago
|
||
Updated•1 year ago
|
Reporter | ||
Comment 13•1 year ago
|
||
Sammy, while I don't have a backup of my profile (sorry!) from this time period, 8,000 tabs would not be unexpected - it is entirely possible that it is above 10k tabs.
Assignee | ||
Updated•1 year ago
|
Updated•1 year ago
|
Comment 14•1 year ago
|
||
Pushed by skhamis@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/6b87ed25b070 Optimize tryFitItems to better handle larger payloads r=markh,mconley
Comment 15•1 year ago
|
||
bugherder |
Comment 16•1 year ago
|
||
Since nightly and release are affected, beta will likely be affected too.
For more information, please visit auto_nag documentation.
Comment 17•1 year ago
|
||
The patch landed in nightly and beta is affected.
:skhamis, is this bug important enough to require an uplift?
- If yes, please nominate the patch for beta approval.
- If no, please set
status-firefox112
towontfix
.
For more information, please visit auto_nag documentation.
Assignee | ||
Updated•1 year ago
|
Updated•1 year ago
|
Description
•