Closed Bug 1741865 Opened 3 years ago Closed 2 years ago

Firefox loads tabs slowly after starting and update to new version

Categories

(Core :: Storage: Quota Manager, defect, P3)

Desktop
Linux
defect

Tracking

()

RESOLVED DUPLICATE of bug 1899015
Performance Impact high

People

(Reporter: mte90net, Unassigned)

References

Details

(Keywords: perf:responsiveness)

Attachments

(2 files)

Attached file perf.tar.gz

Basically since an year to use Firefox I have to wait like 5 minutes to use it after starting.

  • Update firefox nightly automatically
  • Restart, the tabs are in loading with the spinner but I cannot open any tabs or extension popup until finishes

So on my tests happens at every boot of my Debian Sid machine with KDE and at every update, so I keep Firefox nightly turned on all the day just for that reason to not waste time.
Moving the same profile files to a different computer don't show that issue, on that Firefox doesn't show any issue also if it is the same distribution.

Attached there is the perf generated (original file and the one for profiler generated) as I cannot use profiler.firefox.com when is running because of the tabs unresponsiveness.
The perf is generated yesterday with 96.0a1 (2021-11-16).

OS: Unspecified → Linux
Hardware: Unspecified → Desktop

Hi Daniele,

Thanks for gathering this profile. The information we can get from perf profiles is pretty limited - what would be ideal is a profile from the Firefox Profiler itself. It's possible for you to profile startup - the instructions are here: https://profiler.firefox.com/docs/#/./guide-startup-shutdown, but make sure to have the Profiler button enabled first in the toolbar (which you can do by visiting https://profiler.firefox.com and clicking the "Enable" button on the page).

Flags: needinfo?(mte90net)

The problem is that firefox UI is not interactive until finish loading, if I press a button that open a tab is still loading.
When that time is finished everything is loaded so I cannot use the firefox profiler.

Flags: needinfo?(mte90net)

Hi Daniele,

The instructions in the documentation I sent you allow you to set an environment variable so that the profiler is recording immediately upon startup. Then, once the browser becomes responsive, you can click on the profiler to immediately dump that recording. Does that work?

Flags: needinfo?(mte90net)

(In reply to Daniele "Mte90" Scasciafratte from comment #0)

Created attachment 9251358 [details]
perf.tar.gz

Here's a link to the perf profile: https://share.firefox.dev/3oHodwn

The renderer thread seems very busy, and >60% of the CPU time spent in that thread is spent in the /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.495.44 library.

I did now [removed] but Firefox wasn't hanging as usual I guess because didn't download a new version.
I will wait for the next nightly update in this way :-)

Flags: needinfo?(mte90net)

(In reply to Daniele "Mte90" Scasciafratte from comment #5)

I did now https://profiler.firefox.com/from-browser/calltree/?globalTrackOrder=cd0wb&hiddenGlobalTracks=1w7&hiddenLocalTracksByPid=61735-1w7~62053-01~62047-01~62055-0&localTrackOrderByPid=61735-9a0w8~61863-10~61858-10~61786-10~62311-10~62153-10~62152-10~61826-10~62053-340w2~62047-340w2~62055-2301~61906-120&thread=o&timelineType=cpu-category&v=6 but Firefox wasn't hanging as usual I guess because didn't download a new version.

This link points to a profile that only exists locally on your computer. To share it, in the profiler web page you need to click the "Upload Local Profile" button in the top right corner.

I knew that I was doing something wrong... anyway:

Same machine, same profile and same firefox to the nightly 96.0a1 (2021-11-19).

(In reply to Daniele "Mte90" Scasciafratte from comment #7)

I knew that I was doing something wrong... anyway:

Thanks a lot for the profiles! The same issue is visible in both profiles. In both cases, web page loads are blocked until add-ons have finished initializing, and add-on initialization is blocked on accessing local storage, which for some reason does a LOT of disk I/O. In one case, this is blocked for 3.3s, in the other case it's blocked for 400s(!).

It might be interesting to look at the content of the /home/mte90/.mozilla/firefox/ixnzmhh8.default/storage/default/ folder on your disk. How many files does it contain? What's the total size?

Andrew, could you please have a look at these profiles? What should be the next step in fixing these extremely slow startups?

Flags: needinfo?(bugmail)

So my profile is around since years, in that folder there are 2729 folders.
I see that the disk usage is 3GB, also in that folder there are folders very old not modified since 2019. So I think that cleaning the one not updated in 2021 can be a way but before to do it I guess that is helpful for further debugging.

PS: the folder not modified between 2019 and 2020 are instead 2.5GB.

Just for guessing, firefox load all of them on browser starting instead of a lazy loading when is needed?

Attached image size_folders.png

With Filelight the most heavy folder is the one of app.Slack.com with 2GB.

Just for curiosity (as I am thinking to do a talk for the Mozilla Italia community after this experience to help them to report issues), where did you see this information in the profiles?

PS: I have a SSD disk in this machine.

(In reply to Daniele "Mte90" Scasciafratte from comment #9)

Just for guessing, firefox load all of them on browser starting instead of a lazy loading when is needed?

My guess is that at startup Firefox goes through all these files to check their sizes, and remove the oldest files if a quota has been exceeded. But that's just my guess; I've asked someone who knows this code to have a look at what's happening.

(In reply to Daniele "Mte90" Scasciafratte from comment #10)

Just for curiosity (as I am thinking to do a talk for the Mozilla Italia community after this experience to help them to report issues), where did you see this information in the profiles?

I'm afraid that will be difficult to explain with just a text comment, a video of going through the profile would probably make more sense.

PS: I have a SSD disk in this machine.

I would not have guessed that! I saw some fsync operations taking up to 2.1s in your profiles.

This bug reminds me bug 1567863.

See Also: → 1567863

I have a SSD disk that is mounted in /home/ where the profile is.
That disk also have the / but they are in 2 different partitions, so it is surprising maybe the SSD in the laptop where I migrated the profile and don't have that issue is more fast of this one.

(In reply to Florian Quèze [:florian] from comment #12)

I'm afraid that will be difficult to explain with just a text comment, a video of going through the profile would probably make more sense.

Fine too if it was complicated to me to understand maybe it is too much advanced for newcomers where the talk would be the target audience.

Flags: needinfo?(bugmail) → needinfo?(jvarga)
Whiteboard: [qf]

Talking with Jan privately, he suggested me some config changes to use a new archive feature (that I see i naction in the archive folder since septmber of this year).

  1. dom.quotaManager.unaccessedForLongTimeThresholdSec set as 15552000
    https://searchfox.org/mozilla-central/rev/70b32246fce5ca1f53af573a21c1939df58cb969/modules/libpref/init/StaticPrefList.yaml#2933

  2. dom.quotaManager.longQuotaInfoLoadTimeThresholdMs as 5000
    https://searchfox.org/mozilla-central/rev/70b32246fce5ca1f53af573a21c1939df58cb969/modules/libpref/init/StaticPrefList.yaml#2971

  3. dom.quotaManager.loadQuotaFromCache as false

Restart firefox and took like 3-4 seconds! I saw also like 700MB less in the storage/default folder instead the new archive folder is just 300MB.
Reverting the last option to true require awhile to firefox to load it, but I saw that is more fast probably removing all those files helped :-)

My Firefox now take less time compared before but a minute is required anyway.
I should be able to profile a new start with the latest nightly per today.
https://share.firefox.dev/3y8Aiz1

Thanks, Daniele.

The profile is very helpful. What I can see is 3 WebExtensions taking a really really long time to pull things from storage, and I would hazard a guess that one of these WebExtensions are using the WebRequest API for blocking network requests, so that all network activity is suspended until initialization has completed.

The WebExtensions I see taking a long time to start up are:

  • Local Sheriff
  • Medium Unlimited
  • Bitwarden

It's unclear to me whether this is a problem in our WebExtension layer, or in the storage layer that the WebExtensions are communicating with.

Product: Firefox → WebExtensions

In the parent process I see a quite long sequence of FileIO markers which seems related to the quota manager service initialization (https://share.firefox.dev/3oEDy2d), and so my guess is that:

  • some of the extension enabled are blocking network requests and awaiting for some data pulled from browser.storage.local WebExtensions (and/or localStorage and IndexedDB WebAPIs which also depends from the quota manager service being fully initialized)
  • and so all startup network requests that have been intercepted by these extensions are going to stay suspended until the quota manager service is fully initialized

I'm also linking as see also:

  • Bug 1734947, another extension-related startup issue that we were triaging a few days ago (which is also very likely triggered by the same kind of issue with waiting for the quota manager initialization to be completed)
  • Bug 1743485, an enhancement that we have been discussing with the rest of the team (related to "showing a warning box similar to the one we show when an extension content script is keeping a tab too busy", so that we can make the user aware of which are the extensions that are preventing a loading tab to complete for longer than a certain threshold time)
See Also: → 1734947, 1743485

So as suggested in the other ticket I removed the slack files and seems more faster, after all there were 2gb of files...
Now the biggest DB is one from the local sheriff extension that I am removing.

Anyway I just logged in the mozilla slack after this cleaning and I have already 72mb of database.

I will do a new profile tomorrow with the new version now that the storage folder is 525mb maybe everything will be different.

Flagging Daniele to get the updated profile :)

Flags: needinfo?(mte90net)

Sorry for the delay, I was trying to generate a new profile but I was forgetting as this happen on the first run of the day (without an update).
This is the last profile of now https://share.firefox.dev/3DSzMX6 after some files cleaning

Flags: needinfo?(mte90net)

I don't use Slack and don't have a giant profile (~250MB) and I am still seeing 30 second delays on startup. Here is a profile:

https://share.firefox.dev/3F6L8bu

Interestingly, when nightly restarts to update, there typically is no startup delay.

The same issue, I guess that there is a daily check for those files so happens once a day.
This option seems to avoid those lag:

dom.quotaManager.loadQuotaFromCache as false

Setting that preference to False does not abate the issue for me, unfortunately :(

(In reply to Daniele "Mte90" Scasciafratte from comment #25)

The same issue, I guess that there is a daily check for those files so happens once a day.
This option seems to avoid those lag:

dom.quotaManager.loadQuotaFromCache as false

Actually, this should be used only for special testing. It disables the cache so if there are many files in profile, initialization will take long time after each restart, not only after an upgrade.
Each upgrade of FF needs to invalidate the cache, that's why the delay is observed only after an upgrade which happens much more often with Nightly builds.

Flags: needinfo?(jvarga)

I have the opposite effect, there is no delay only when nightly restarts for an update (probably because cache invalidation this is not the root cause of my issue)

I commented here because https://bugzilla.mozilla.org/show_bug.cgi?id=1734947 was marked duplicate of this, but I don't think they're strictly related

So what's the next step here? Do we need more information about what's happening, or do we need to find who can fix it?

Flags: needinfo?(jvarga)

So, I was the reporter on Bug 1734947; I duped it here after deleting my Slack cache improved things. Having said that, things still aren't great.

I've recorded a startup profile: https://share.firefox.dev/3FevwCE I don't know what's important there, but there's a network request that says it spent 12 seconds "Waiting for Socket Thread". This jibes with my experience of a huge hang on the startup, and is visible in the screenshots pane.

I did look at my profile again: no more huge slack; now the largest piece of content there is 625Mb inside of settings/security-state/cert-revocations

(Very welcome to split this bug into different ones if we're just seeing overlapping symptoms for different bugs)

I also have a feeling that two or more different issues are tracked here. One issue is definitely a profile with a lot of files and directories under the storage/ directory. We introduced some functionality for clearing unused stuff, but preferences for that are currently set to quite conservative numbers. We plan to tweak them and see results in telemetry.

Flags: needinfo?(jvarga)

(In reply to Jan Varga [:janv] from comment #31)

I also have a feeling that two or more different issues are tracked here.

The main issue here, visible in all profiles, is the fact that page loads are blocked (for many seconds, sometimes minutes) during startup when an ad-blocker add-on attempts to access IndexedDB on the WebExtensions process, because IndexedDB only becomes available after touching on disk many files in the <user profile>/storage/default folder that are not related to the data the add-on is trying to access. Disk I/O is known to be very slow, so we can't block startup on iterating through a potentially big folder.

What needs to happen so that we no longer block startup on this disk I/O?

Flags: needinfo?(jvarga)

(In reply to Florian Quèze [:florian] from comment #32)

What needs to happen so that we no longer block startup on this disk I/O?

We need to resume and finish work in bug 1671932. That's a long term solution, but it's also a non-trivial task/project. However, we can investigate some other things which can mitigate the issue:

  1. As I mentioned earlier, we have a cache which is used to avoid iterating over entire storage/ directory, but the cache is currently invalidated after each upgrade because there may be changes in the way how usage is calculated and tracked. It might be possible to find a way to invalidate the cache only when it's really needed.

  2. Another short term option is to reduce number of files and directories as I mentioned previously.

It's all about resources, we didn't have time for this, things might change next year.

Flags: needinfo?(jvarga)

(In reply to Florian Quèze [:florian] from comment #32)

The main issue here, visible in all profiles, is the fact that page loads are blocked (for many seconds, sometimes minutes) during startup when an ad-blocker add-on attempts to access IndexedDB on the WebExtensions process, because IndexedDB only becomes available after touching on disk many files in the <user profile>/storage/default folder that are not related to the data the add-on is trying to access. Disk I/O is known to be very slow, so we can't block startup on iterating through a potentially big folder.

Generally speaking, this shouldn't have anything to do with WebExtensions directly. IndexedDB (or any other storage API which stores data in storage/ directory) can be accessed from many places, so the problem can be triggered by launching FF which has no special extensions installed and loading a page which uses one of the storage APIs.

Bug 1743485 was mentioned here, but I'm not sure if that really belongs here.

There's also the issue with Slack which is again a bit different problem.

(In reply to Jan Varga [:janv] from comment #34)

Bug 1743485 was mentioned here, but I'm not sure if that really belongs here.

We added a link to Bug 1743485 as an enhancement on the "WebExtensions internals" side that may at least help the user to better deal with the more general "an extension prevents a tab load to be completed" issue (first of all by making the user aware that a tab load is being blocker for longer than expected by an installed extension and not because of Firefox itself or networking issues, and possibly also by pointing out to the user which is the extension responsible for the tab load being stuck and offer to disable that extension or unblock the suspended request).

It is definitely not meant to be a solution for the the specific "extension waiting for the browser.storage.local/IndexedDB/LocalStorage data to be available" root cause, but an extension blocking webrequest may also be get a tab load to be stuck for other reasons, e.g. they may be awaiting for some other condition (e.g. a promise created internally which may never be resolved because the extension didn't handle some error condition that was not expecting), and Bug 1743485 may help the user to report to us this kind of issue as an extension issue and to know how to workaround the issue in the meantime.

(btw, Thanks Jan for taking a look to this issue and mentioning other relevant bugzilla issues).

Looks like there's little to do here as part of the WebExtensions framework, moving to storage.

(In reply to Tomislav Jovanovic :zombie from comment #36)

Looks like there's little to do here as part of the WebExtensions framework, moving to storage.

Did you mean to change the product and component of this bug?

Thanks, I thought I did! :)

Component: General → Storage: Quota Manager
Product: WebExtensions → Core

Based on the above thread:

Severity: -- → S3
Depends on: 1671932
Priority: -- → P3
See Also: 1734947

first of all by making the user aware that a tab load is being blocker for longer than expected by an installed extension and not because of Firefox itself

I disagree with the "not because of Firefox itself" part, because essentially this is where the stall is at launch when an extension (such as uBlock Origin) tries to read back data from the storage, the extension is just waiting, not using any CPU at all as would be the case with bad JS hogging the process (such as seen in this instance).

I've gone to great length to ensure speedy launch time, and I consider blaming the extension for a delayed first page load while it's merely waiting on the Firefox API to respond back to be unfair, and essentially feeding flawed information to the user who may end up making bad decision on that flawed information.

(In reply to rhill@raymondhill.net from comment #40)

first of all by making the user aware that a tab load is being blocker for longer than expected by an installed extension and not because of Firefox itself

I disagree with the "not because of Firefox itself" part, because essentially this is where the stall is at launch when an extension (such as uBlock Origin) tries to read back data from the storage, the extension is just waiting, not using any CPU at all as would be the case with bad JS hogging the process (such as seen in this instance).

I've gone to great length to ensure speedy launch time, and I consider blaming the extension for a delayed first page load while it's merely waiting on the Firefox API to respond back to be unfair, and essentially feeding flawed information to the user who may end up making bad decision on that flawed information.

You are definitely right, my sincere apologies, I didn't want to imply that when an extension is blocking a webpage loading on startup (especially if it is happening because Firefox itself is waiting for the QuotaManagerService to be ready as in this case) we should be blaming the addon by default, this bug is definitely about an issue to fix on a Firefox component side (which just happens to be made more visible to the user because of how it interacts with the extensions depending on the same component to be ready during startup).

This is clearly a good point and I'm going to highlight it in Bug 1743485 to make sure we will take it into account when we may be evaluating it in more detail (I just added Bug 1743485 comment 3 to point it out, and linked comment 40 as additional context).

Whiteboard: [qf] → [qf:p1:pageload]
Performance Impact: --- → P1
Keywords: perf:pageload
Whiteboard: [qf:p1:pageload]

Just to let you know that after 4 months from my last comment the issue is still there and removing the data of Slack.com (database files) seems make the difference (and they are not from an extension).
It isn't possible to load the data of website from storage only when the website is opened? in this way they are not loaded at every boot and to find files don't updated a solution could be get the last visit from the history or the last changed date.

I am just doing suggestions because it is very annoying that issue...

(In reply to Daniele "Mte90" Scasciafratte from comment #43)

It isn't possible to load the data of website from storage only when the website is opened? in this way they are not loaded at every boot and to find files don't updated a solution could be get the last visit from the history or the last changed date.

This is tracked already in bug 1671932. Still: Thanks for the suggestion!

See Also: → 1671932
See Also: → 1651434

There is any hope for this ticket?
Every day I have Firefox that takes some minutes before I can use it...

(In reply to Daniele "Mte90" Scasciafratte from comment #45)

There is any hope for this ticket?
Every day I have Firefox that takes some minutes before I can use it...

Yes, there's active work happening towards fixing bug 1567863, which I believe has the same root cause.

It looks like bug 1567863 is now fixed. Daniele, can you still reproduce in latest Nightly?

Flags: needinfo?(mte90net)
Summary: Firefox not load tabs after starting and update to new version → Firefox loads tabs slowly after starting and update to new version

No I waited a bit as I saw the other ticket's patch, I am using now 103.0a1 (2022-06-09) (64 bit).
I am not sure if takes less compared to before as in those moths to live better with this bug I removed a lot of files but the problem is still there.
Do you think that can be helpful another profiling?

Flags: needinfo?(mte90net)

(In reply to Daniele "Mte90" Scasciafratte from comment #48)

No I waited a bit as I saw the other ticket's patch, I am using now 103.0a1 (2022-06-09) (64 bit).
I am not sure if takes less compared to before as in those moths to live better with this bug I removed a lot of files but the problem is still there.
Do you think that can be helpful another profiling?

The patch from bug 1702600 should help to avoid that the cached origin size information cannot be read (if too big) on startup, causing then a full scan. The positive effect should be visible after two starts of the same build id (the size cache is invalidated with each change of build number, which makes it happen quite often for nightly users). If after two starts you still experience slow startup, a profile could definitely help to rule out that we still see the same problem (there are many things that can influence startup time).

Flags: needinfo?(mte90net)

So if with Firefox nightly I have that issue, I will have at every boot when a new version is downloaded (that is the issue of this ticket).
I see during the day that closing Firefox and reopening (new boot without a FF update) it takes less of 6 seconds to show the tabs, but the issue is that I have on every day as Nightly is updated twice during the day.

Maybe in Nightly cases it should do it on every new version bump and not at every new release?

Flags: needinfo?(mte90net)

(In reply to Daniele "Mte90" Scasciafratte from comment #50)

Maybe in Nightly cases it should do it on every new version bump and not at every new release?

There are pros and cons to this. Testing more frequently than once every release cycle if full origin scanning works as expected is probably desirable and binding it to the build number was kind of straight forward for the normal release channels. However, having it twice a day is probably not needed and might even paper over potential problems with using the cached information more often.

Please note that there is no strong technical reason that dictates us when we would want to invalidate the cached size information. We kind of expect that no user or other program modifies the data inside Firefox' origin specific folders inside the profile between two starts, independently from changing build numbers. We could definitely reason about changing our strategy here a bit and take the cached information always for good on startup but then have some background thread that updates them from time to time. This is also getting worse these days as modern web applications store now much more information through our APIs, so what might have been acceptable on startup a few years ago has started to become disturbing these days. We have other bugs on file to mitigate the impact, in particular bug 1671932 would help a great deal here.

I understand those issues but if every day I have to wait minutes for Firefox starting up just because I am using Firefox nightly is not something sustainable and will force me to switch to beta or stable release.

I want to use that to help on development but if for the storage of website like Slack.com that use GB of datas I have to wait every day is not something that I am expecting for my daily browsing.
Maybe in case that the files aren't updated in the previous days not update the cache for some files can be a solution?

I don't know how works Firefox internally but a solution to simplify the nightly users life should be on schedule :-D

(In reply to Jens Stutte [:jstutte] from comment #51)

There are pros and cons to this. Testing more frequently than once every release cycle if full origin scanning works as expected is probably desirable and binding it to the build number was kind of straight forward for the normal release channels. However, having it twice a day is probably not needed and might even paper over potential problems with using the cached information more often.

Even on the release channel, with release cycles now being only 4 weeks long, and us frequently having one or more dot releases in the meantime, rechecking at startup at every build id change seems like it would delay page loads frequently (and potentially at an unfortunate time, if the page load that is delayed is the "what's new?" page after an update).

We could definitely reason about changing our strategy here a bit and take the cached information always for good on startup but then have some background thread that updates them from time to time.

Sounds like a good solution. Is there already a bug on file to implement this? Or is it covered by bug 1671932?

(In reply to Florian Quèze [:florian] from comment #53)

Sounds like a good solution. Is there already a bug on file to implement this? Or is it covered by bug 1671932?

I'd say it is covered by that one but the description of that bug might merit some love and we might want to check, if we can have some intermediate steps here that give us some quick win.

There is something I can do for this?
Let me know if you need a new profiling etc.

The Performance Priority Calculator has determined this bug's performance priority to be P1. If you'd like to request re-triage, you can reset the Performance flag to "?" or needinfo the triage sheriff.

Platforms: [x] Windows [x] macOS [x] Linux
Impact on browser UI: Renders browser effectively unusable
[x] Multiple reporters

The severity field for this bug is set to S3. However, the following bug duplicate has higher severity:

:hsingh, could you consider increasing the severity of this bug to S2?

For more information, please visit auto_nag documentation.

Flags: needinfo?(hsingh)

Per comment #10, is this still problematic after bug 1784700? Maybe purge the Slack (and anything that has large cache size) data about:preferences#privacy and see if it happens again?

I removed manually the folder in /home/mte90/.mozilla/firefox/ixnzmhh8.default/storage/default/ that include the slack domain in the folder because I am very annoyed about that issue.
Anyway I still have the issue.

This issue has an extra impact on extensions using event page.

If the extension has to setup some state at startup, we would write code like this:

browser.runtime.onStartup.addListener(init);
function init() {
  const options = await browser.storage.local.get();
  if (options.foobar) {
    /* foobar setup */
  }
}

However, this would not work. After startup, the call to browser.storage would be blocked by this bug, and Firefox would unload the event page.

Therefore foobar setup never happens.

See Also: → 1800914

(In reply to eight04 from comment #60)

This issue has an extra impact on extensions using event page.

Thanks for the example. This is undesirable, and I filed a new bug to track a solution (bug 1800914).

The severity field for this bug is set to S3. However, the Performance Impact field flags this bug as having a high impact on the performance.
:aiunusov, could you consider increasing the severity of this performance-impacting bug? Alternatively, if you think the performance impact is lower than previously assessed, could you request a re-triage from the performance team by setting the Performance Impact flag to ??

For more information, please visit auto_nag documentation.

Flags: needinfo?(aiunusov)
Severity: S3 → S2
Flags: needinfo?(aiunusov)

I'll mark this as duplicate of bug 1671932, raising its severity over there. IIUC bug 1800914 might help to mitigate the specific extension case, but the performance impact of storage initialization is a generally known problem and should be solved at the right place.

Status: NEW → RESOLVED
Closed: 2 years ago
Duplicate of bug: 1671932
Resolution: --- → DUPLICATE
Flags: needinfo?(hsingh)
No longer depends on: 1671932
Duplicate of bug: 1899015
No longer duplicate of bug: 1671932
See Also: 1671932
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: