Jank and high CPU usage for tens of seconds, after opening a bunch of new tabs (with browser.tabs.remote.separatePrivilegedContentProcess = true)
Categories
(Firefox :: New Tab Page, defect, P2)
Tracking
()
People
(Reporter: dholbert, Assigned: mconley)
References
(Regression)
Details
(Keywords: perf, perf:responsiveness, regression, Whiteboard: [fxperf:p3])
Attachments
(1 file)
STR:
- Install profiler from https://profiler.firefox.com/
- Visit about:blank
- Ctrl+Shift+1 to start profiling.
- Hold down Ctrl+T for 5 seconds, to open lots of new tabs.
- After you've stopped pressing Ctrl+T, wait for the foreground new-tab's content to paint.
- Ctrl+Shift+2 to stop profiling.
ACTUAL RESULTS:
After my 5 seconds of holding Ctrl+T, the foreground new tab is blank for at least 10-15 seconds. After that, the foreground new-tab paints, but during that time and for tens of seconds afterwards, the privileged content process continues to use 80%-150% CPU as reported by the top
command-line tool.
Here's a profile: https://perfht.ml/33frzLG
EXPECTED RESULTS:
Specifics:
- The final foreground new tab should render quickly.
- Backgrounded new tabs should not peg the CPU so aggressively/permanently. They should render more lazily. In particular, if the user is still holding Ctrl+T after a new tab is opened, we shouldn't spend any resources rendering that new tab, because we know more are coming.
Reporter | ||
Updated•5 years ago
|
Reporter | ||
Comment 1•5 years ago
|
||
Note: for me, the STR produced 162 tabs. I think the first second of holding Ctrl+T is special ("do you really want to do this") and only opens one tab, and then the rest are rapidly opened in subsequent seconds -- so I got around 40 new tabs per "rapid-opening" second from seconds 2, 3, 4, and 5.
Updated•5 years ago
|
Reporter | ||
Comment 2•5 years ago
|
||
mozregression says this broke during this range:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=af5a791e7b7e7fc42646f202db59cfbf4c48cd6e&tochange=18e46df44cd16a2bd94e8dfd99abb461ce24fe2f
(Before that range, my STR's final new-tab page will render almost instantly, i.e. within a second. After that range it takes ~10 seconds or more to render.)
So it looks like this is a regression from bug 1472212 and specifically from the pref browser.tabs.remote.separatePrivilegedContentProcess
. I verified that I can make current Nightly give me "expected results" (at least, a quickly loading final new-tab page) if I toggle that pref off.
Reporter | ||
Comment 3•5 years ago
|
||
It looks like that pref (separatePrivilegedContentProcess
) is nightly-only right now, and bug 1513045 is the metabug for letting it ride the trains.
This issue should probably block the release of this feature, so let's mark this as blocking bug 1513045.
Updated•5 years ago
|
Comment 4•5 years ago
|
||
I wonder to what degree fixing bug 1540832 would help. There must be some other issue as well, because otherwise the pref flip wouldn't be enough to fix this, but even so...
Does turning off browser.newtab.preload
change anything here?
Updated•5 years ago
|
Comment 5•5 years ago
|
||
A lot of time is spent in Activity Stream's javascript here. https://perfht.ml/2MbsO9v
Andrei, does this profile seem actionable? Feel free to redirect the needinfo to someone else who can help.
Comment 6•5 years ago
|
||
Scott can you please take a look at this?
Comment 7•5 years ago
|
||
What happens if we go to a commit before the broken regression changes, but set separatePrivilegedContentProcess
to true and do the steps? Because setting separatePrivilegedContentProcess
to true doesn't necessarily show when and if this regressed.
But doing a deeper investigation might take a bit of time until we have fixes. What's the timeline like for this?
I can certainly do work to get this running better after 71+.
Comment 8•5 years ago
|
||
(In reply to Scott [:thecount] Downe from comment #7)
What happens if we go to a commit before the broken regression changes, but set
separatePrivilegedContentProcess
to true and do the steps? Because settingseparatePrivilegedContentProcess
to true doesn't necessarily show when and if this regressed.
The pref was added more or less immediately when the privileged content process was introduced. So it's just that the feature interacts badly with about:newtab, not like the feature was developed for years and the preffing on just tripped this. Something about the different process is making things slow(er) here.
But doing a deeper investigation might take a bit of time until we have fixes. What's the timeline like for this?
While about:newtab doesn't load in this process except on nightly, this doesn't track any releases, but I imagine it'll block shipping the pref flip outside of nightly. I don't think there are immediate plans to do that. Mike?
Reporter | ||
Comment 9•5 years ago
|
||
(In reply to :Gijs (he/him) from comment #4)
Does turning off
browser.newtab.preload
change anything here?
That pref-toggle does not help, as far as I can tell. (I turned that pref off, restarted Firefox, held down Ctrl+T for 5 seconds, and still got ~10 seconds of blank content.)
Assignee | ||
Comment 10•5 years ago
|
||
(In reply to :Gijs (he/him) from comment #8)
While about:newtab doesn't load in this process except on nightly, this doesn't track any releases, but I imagine it'll block shipping the pref flip outside of nightly. I don't think there are immediate plans to do that. Mike?
We're blocked on letting this ride the trains past Nightly. See bug 1513045 for the list of blockers.
I'm actually inclined now to just disable it on Nightly until DOM has finished the work to move the process switching logic out of JS into the networking layer. I've filed bug 1574169 to do that.
Comment 11•5 years ago
|
||
I'm removing the iteration field, and setting the priority to P2 to clean up New Tab work. Let me know if you need it to be set to something else to keep up with it on your side. Thanks!
Assignee | ||
Comment 12•5 years ago
|
||
bigiri and I will be making this better once we make Redux state computation easier / cheaper. Until then, however, I think we can mostly address this stress test by only requesting redux state eagerly from the Activity Stream document if the document is visible. Otherwise, we can use an idle callback to request the state. I believe this will prioritize any about:newtab's that get opened and are made visible, whereas any backgrounded ones (including the preloaded one) will wait until the event loop frees up a bit before requesting the Redux state.
Testing this approach with the separate privileged about content process enabled, I see a substantial improvement. Patch coming up.
Assignee | ||
Comment 13•5 years ago
|
||
Updated•5 years ago
|
Comment 14•5 years ago
|
||
Comment 15•5 years ago
|
||
bugherder |
Comment 16•5 years ago
|
||
Hey @Mike,
I’ve tested the performance by using the latest Firefox Nightly 77.0a1 (Build ID: 20200408214238) and the STR from comment 0 on Windows 10 x64, macOS 10.15, and Ubuntu Linux. The results of my testing can be found here:
- Windows + browser.newtab.preload - true : https://perfht.ml/3e6HIbv
- Windows + browser.newtab.preload - false : https://perfht.ml/3c19Pak
- macOS + browser.newtab.preload - true : https://perfht.ml/2XljSTz
- macOS + browser.newtab.preload - false : https://perfht.ml/2UTByE3
- Linux Ubuntu + browser.newtab.preload - true : https://perfht.ml/2Xt711w
- Linux Ubuntu + browser.newtab.preload - false : https://perfht.ml/2VfURGB
I can see a substantial improvement in the performance everywhere, however I’ve still noticed jank in the profiler. Do you think that the existent jank is insignificant enough to consider the bug VERIFIED?
Assignee | ||
Comment 17•5 years ago
|
||
Instead of using browser.newtab.preload
, a more meaningful comparison would be to compare the same behaviour in the build before this fix landed, and the build after. The browser.newtab.preload
preference isn't actually meaningful as a way of comparing in this particular case.
Would you be willing to do the measurements again, but this time using a Nightly from before this patch landed (April 6th would work), and one after (April 8th, I guess)?
Updated•5 years ago
|
Comment 18•5 years ago
|
||
Hey @Mike, of course I’m willing to do the measurements again.
I’ve retested the performance using Nightly 77.0a1(Build ID: 20200406214750) and Nightly 77.0a1 (Build ID: 20200409131623) on Windows 10 x64, macOS 10.15, and Ubuntu x32. Each OS was tested on a different machine. Here are the results:
- Windows before the fix: https://perfht.ml/2XBst4T
- Windows after the fix : https://perfht.ml/34pECLq
- macOS before the fix : https://perfht.ml/3c4Z4nt
- macOS after the fix : https://perfht.ml/2y7f4GF
- Linux Ubuntu before the fix : https://perfht.ml/3c8Nbga
- Linux Ubuntu after the fix :https://perfht.ml/3b0m9rp
Do you think that this is enough to mark the issue as VERIFIED?
Assignee | ||
Comment 19•5 years ago
|
||
Thanks, romartin.
The Linux and Windows profiles show improvement in the time spent running RemotePageManager stuff, since the messages requesting state from the parent are throttled. The macOS profiles show less improvement. So I suspect the benefits here are best experienced on weaker hardware.
To be clear, however, this is very much a stress test (opening ~100 new tabs in a row is... unusual behaviour), and likely not reflective of a real-world user action.
I think I'm happy marking this as VERIFIED. Thanks!
Comment 20•5 years ago
|
||
Marking this issue as VERIFIED based on the verifications done in comment 18 and the confirmation in comment 19.
Updated•3 years ago
|
Updated•3 years ago
|
Description
•