Closed Bug 1867161 Opened 1 year ago Closed 1 year ago

Session restore causes noticeable jank and memory spike every 15 seconds, if you've got hundreds of tabs open

Categories

(Firefox :: Session Restore, defect)

RESOLVED DUPLICATE of bug 1849393

People

(Reporter: dholbert, Unassigned)

Details

(Keywords: perf)

Attachments

(2 files)

STR:

  1. Open a single Firefox window with 500 Bugzilla tabs. (See [1] for a quick way to do that.) Some of these may result in 429 Too Many Requests; that's fine and doesn't impact the reproducibility of this bug. (Hopefully this doesn't annoy BMO admins too much. :) )
  2. Start profiling with the Firefox profiler (https://profiler.firefox.com/)
  3. Perform some interactive/smooth task for 30+ seconds, e.g. typing continuously or watching a video.
  4. Watch for periods where Firefox seems to briefly freeze. Capture the profile after a minute or so.

ACTUAL RESULTS:
Every 15 seconds, Firefox freezes for a half-second or so.
The profile shows that these are ~500ms calls to write in resource:///modules/sessionstore/SessionWriter.sys.mjs.
The profile's memory track also shows memory usage persistently increasing by 225MB with each call; some of the memory is freed, but not all. So e.g. after 1 minute, the memory usage has gone up by 900MB.
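
The arithmetic behind that estimate can be sketched as follows. This is only a back-of-the-envelope model that assumes one write every 15 seconds and a constant ~225MB retained per write (the figures observed in the profile); the function name and parameters are made up for illustration.

```python
def projected_growth_mb(elapsed_s, interval_s=15, growth_per_write_mb=225):
    """Estimate retained memory growth (in MB) after elapsed_s seconds.

    Assumes one session write per interval_s seconds and a constant
    amount of memory retained per write (both taken from the profile).
    """
    writes = elapsed_s // interval_s  # number of completed writes
    return writes * growth_per_write_mb


print(projected_growth_mb(60))  # 900
```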

EXPECTED RESULTS:
No such jank at such a frequent interval, especially not while I'm actively using the browser (typing or watching a video). Particularly when I'm watching a video and haven't done any sessionstore-impacting operations (no new form state, tabs, etc), it's not obvious why sessionstore is jumping in every 15s.
When sessionstore does jump in, it ideally shouldn't have such a persistent memory-usage increase. These lead to enormous GCs after a while, I think (I'm pretty sure this is what bit me in bug 1866145, for example).

[1] Here's one way to get a window with 500 Bugzilla tabs:
a) In a fresh Firefox profile, visit about:preferences, click "Home" on the left, and then click the dropdown for "Homepage and new windows" and choose "Custom URLs"
b) Paste the contents of the attached textfile into the textbox that appears.
c) Ctrl+N to open a new window.
d) Back in your about:preferences tab, return from "Custom URLs" to "Firefox Home (Default)" (so that you don't inadvertently create any more hundred-tab windows)
e) (optional) close your initial about:preferences window, and now just use the window that you opened in (c), which has 500 Bugzilla tabs.
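
If you don't have the attached textfile handy, a string of the same shape could be generated with something like the sketch below. It relies on Firefox accepting multiple home pages separated by "|"; the bug ID range is arbitrary and hypothetical, and the attachment's actual contents may differ.

```python
def homepage_value(bug_ids):
    """Join one Bugzilla URL per bug ID with '|', the separator
    Firefox uses for multiple home pages."""
    base = "https://bugzilla.mozilla.org/show_bug.cgi?id="
    return "|".join(f"{base}{bug_id}" for bug_id in bug_ids)


# Hypothetical ID range; any 500 valid bug IDs would do.
value = homepage_value(range(1000000, 1000500))
print(value.count("|") + 1)  # 500
```

Paste the resulting string into the "Custom URLs" textbox in step (b).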

(In reply to Daniel Holbert [:dholbert] from comment #0)

  • or just visit https://paste.mozilla.org/ and hold down the a character on your keyboard to watch the character repeat at a predictable rate

Here's a profile of me doing this^ (simulating fast typing) for about a minute: https://share.firefox.dev/41fFP5D

Every ~15 seconds, there's a 450ms jank bar in the parent process, which pauses the activity in the compositor and renderer thread. And each of these causes a ~225MB persistent increase in the parent process's memory usage. So by the end of this profile, the memory usage is up 900MB vs. the start of the profile, just from that background operation that Firefox is doing every 15 seconds.

See Also: → 1866145
See Also: → 1867137

Ah, I guess this is a dupe of bug 1849393 which sfoster linked over on my related bug 1867137 (which I just noticed).

Status: NEW → RESOLVED
Closed: 1 year ago
Duplicate of bug: 1849393
Resolution: --- → DUPLICATE
See Also: 1867137

I'm posting some revised STR here that don't involve 500 queries to bugzilla.mozilla.org at once, both to reduce variability and to avoid causing trouble for Bugzilla admins. :)

I'm attaching a tarball which has a directory testenv/ with two subdirectories:

  • server (for a local http server): it has a simple html file with a 1.4MB favicon.ico file (Bugzilla's favicon, scaled up 10x). I've also included the actual Bugzilla favicon, which is smaller.
  • profile (for use as a Firefox profile - it just has a single file, prefs.js, which tells Firefox to start up with a paste.mozilla.org tab and 500 copies of the locally-served testcase).
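
For anyone who wants to recreate a similar profile without the tarball, here is a rough sketch of a script that writes such a prefs.js. The testcase URL (index.html) and the function name are assumptions for illustration; the prefs.js in the tarball may set the homepage differently or include other prefs.

```python
from pathlib import Path


def write_profile(profile_dir, testcase_url, copies=500):
    """Create profile_dir/prefs.js so that Firefox starts with one
    pastebin tab followed by `copies` tabs of the testcase."""
    urls = ["https://paste.mozilla.org/"] + [testcase_url] * copies
    # Multiple home pages are separated by '|' in this pref.
    line = 'user_pref("browser.startup.homepage", "%s");\n' % "|".join(urls)
    profile = Path(profile_dir)
    profile.mkdir(parents=True, exist_ok=True)
    prefs = profile / "prefs.js"
    prefs.write_text(line, encoding="utf-8")
    return prefs


if __name__ == "__main__":
    # Hypothetical testcase filename served by the local http server.
    print(write_profile("/tmp/testenv/profile", "http://localhost:8999/index.html"))
```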

STR:

  1. Extract the attached tarball, let's say to /tmp

  2. Start a local http server on port 8999 in /tmp/testenv/server:

       cd /tmp/testenv/server
       python3 -m http.server 8999

  3. In another terminal, start Firefox (directly or via mozregression) with profile /tmp/testenv/profile:

       # pick one of:
       mozregression --launch 2023-11-30 -p /tmp/testenv/profile/
       firefox --profile /tmp/testenv/profile

  4. Wait for the Firefox window to stabilize (watch the terminal from step 2 until the http requests stop, wait for Firefox's CPU usage to drop from 100%, and check that your visible tabs show a favicon). This takes less than a minute on my machine.

  5. Focus the textbox in your pastebin tab (should be the foreground tab) and hold down the a key for 30 seconds or more, and watch the repeating character to see how long it janks at 15-second intervals.

In Nightly 2018-10-05 ("good"), this gives me ~1 sec of jank
In Nightly 2018-10-06 ("bad"), this instead gives me ~6 sec of jank
In current Nightly 2023-11-30, this gives me ~3 sec of jank

So: we had a regression which we've partially recovered from, but it's still worse than it was.
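
The jank durations above were judged by eye. If someone wanted to measure them instead, one option would be to log a timestamp for each repeated character and look for long gaps, roughly as sketched below. The function name, the threshold, and the sample timestamps are all made up for illustration; this is not data from the runs above.

```python
def find_jank(timestamps_s, threshold_s=0.5):
    """Return (start, duration) for every gap between consecutive
    events that is longer than threshold_s seconds."""
    gaps = []
    for prev, cur in zip(timestamps_s, timestamps_s[1:]):
        if cur - prev > threshold_s:
            gaps.append((prev, cur - prev))
    return gaps


# Synthetic timestamps (seconds): steady events, then a 3-second stall.
stamps = [0.0, 1.0, 2.0, 5.0, 6.0]
print(find_jank(stamps, threshold_s=1.5))  # [(2.0, 3.0)]
```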

Regression range:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=54cb6a2f028b033ef567f00af2f82f5fb97ab437&tochange=07c609fc8eb89140f602ef0c838b900c2964287a

Here's a profile of those STR, captured in current Nightly -- notice the 3-4second periods of jank in the parent process:
https://share.firefox.dev/47UKDz7

(Unfortunately I can't capture a "good" profile from Nightly 2018-10-05 for comparison, because https://profiler.firefox.com/ doesn't load in that build. Maybe it's possible but needs special setup; not sure.)

evilpie: could you take a look here perhaps? I think your bug 1493150 seems like the only thing in that push range that might've influenced sessionstore JSON-serialization jank-time. (It looks like bug 1493150 was meant to be an optimization, but I'm guessing that maybe there could be edge cases where it made things slower, and this was one of them.)

If that patch is indeed the regressor, I wonder if there are any obvious improvements to the approach in that patch that could make things better here; or, I wonder if there are changes we could make on the frontend to get a faster operation here (as good as or better than what we had on 2018-10-05).

([EDITING to add a note]: I'm bringing this up over here rather than on the dupe-target because I don't want to derail that bug any more than I already have. :) We probably want to keep that bug focused on "how can we redesign sessionstore-serialization overall so that things are faster"; whereas this possible JS-engine-regression is also worth investigating but is kind of a separate discussion.)

Flags: needinfo?(evilpies)