Open Bug 1713129 Opened 4 months ago Updated 2 months ago

When restarting after an upgrade Firefox hangs...

Categories

(Core :: Performance, defect, P5)

Firefox 89
x86_64
Windows 10
defect

Tracking

()

People

(Reporter: jeffisontheroad, Unassigned)

Details

(Whiteboard: QA-not-reproducible)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:89.0) Gecko/20100101 Firefox/89.0

Steps to reproduce:

Each time Firefox updates itself this will happen starting back a few weeks (maybe a month or so) ago.

Actual results:

Nothing. The firefox window will open and after quite some time it will display the menu bar and ask for a password and then.... hang there. The way out is to close the program and reopen, sometimes twice. If I look in the task manager it seems as if FF is scanning every font on my computer but it never comes back.

Expected results:

Firefox should have opened normally.

Hey Jmg,
I tried reproducing this by upgrading from older versions to the latest versions of Firefox Nightly 90.0a1 (2021-05-30), beta 89.0 and release 88.0.1 but I encounter no issues.

Could you test on a fresh new profile? You can find more about creating a new profile here : https://support.mozilla.org/en-US/kb/troubleshoot-and-diagnose-firefox-problems#w_6-create-a-new-firefox-profile .
If possible, you can test this issue on the nightly build as well. Download the build from : https://www.mozilla.org/en-US/firefox/nightly/all/ .

Flags: needinfo?(jeffisontheroad)
Whiteboard: QA-not-reproducible

The Bugbug bot thinks this bug should belong to the 'Toolkit::Application Update' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.

Component: Untriaged → Application Update
Product: Firefox → Toolkit

I have tried but cannot reproduce the problem loading the nightly release. However, once a new beta update comes around I'll be sure to pay attention!

Flags: needinfo?(jeffisontheroad)

(In reply to JmG from comment #0)

The way out is to close the program and reopen, sometimes twice. If I look in the task manager it seems as if FF is scanning every font on my computer but it never comes back.

This doesn't sound like a problem with application update, to me.

Component: Application Update → Untriaged
Product: Toolkit → Firefox

When I updated to the newest beta last night (90.0b1) the same thing happened. What it looks like it's doing is that Firefox seems to be checking every font (600+meg worth) and all it's cookies before starting properly. This only happens when installing automatic updates. I'm about to install b2 and expect the same...

Similar problem here. After upgrading to 89 firefox refuse to restore my previous sessions. The windows opens but only one of them manage to render any UI/decoration. None has content and they're completely unresponsive. Xorg use close to 100% CPU and gnome reports "the application is unsresponsive".

No useful output when starting in from a terminal.

Starting a fresh profile works, but it's unacceptable to lose history and opened tabs.

Ubuntu 20.10

Um, after letting it work for around 5 minutes it becomes responsive and normal. It happens again when restarting firefox.

(In reply to olejorgenb from comment #7)

Um, after letting it work for around 5 minutes it becomes responsive and normal. It happens again when restarting firefox.

As noted above, it seems as though FF is scanning every font in my directory before becoming responsive and I have no idea why it would do that. And that would take up the time.

When starting in "troubleshooting mode" (but same large profile) the problem is not present.

My enabled extensions:

Norsk bokmål ordliste
Amazon.com
Bing
Dormancy
DuckDuckGo
eBay
Firefox Multi-Account Containers
Google
Google™ Translator
hn-friends
Media URL Timestamper
ModHeader
OneTab
Refined Hacker News
Stylus
uBlock Origin
Video Speed Controller
Wikipedia (en)
English (CA) Language Pack
English (GB) Language Pack

Setting a component for this issue in order to get the dev team involved.
If you feel it's an incorrect one please feel free to change it to a more appropriate one.

Component: Untriaged → Performance
Product: Firefox → Core

olejorgenb, ok, then this could be some addon issue. Could you try disabling addons, possibly one by one?

JmG, do you have some addons? And how did you check that Firefox is going through all the fonts and cookies? If you have some performance profile, could you perhaps share it?

Flags: needinfo?(olejorgenb)
Flags: needinfo?(jeffisontheroad)

Even starting in safe mode the same thing happens. How do I know it's scanning my fonts folder? By using the Resource Monitor and watching what's going on. Since last night's update it's a tad faster but that's about it. I'm still confused as to why Firefox wants to scan my fonts but once it's done the application becomes responsive.

Flags: needinfo?(jeffisontheroad)

Sounds like there are two issues here.
olejorgenb, could you file a new bug for the issue you see.

jfkthame, does that font handling ring any bells?

Flags: needinfo?(jfkthame)

In the mean time, JmG, it might be helpful to capture a profile of Firefox while it's having this issue. Documentation for the profiler is here, and you can get a startup profile by setting the MOZ_PROFILER_STARTUP environment variable to 1. (And I would also set MOZ_PROFILER_STARTUP_FEATURES to js,leaf,stackwalk,threads,fileio)

Flags: needinfo?(jeffisontheroad)

At startup, Firefox collects the list of installed fonts, but it's always done that and it shouldn't take a huge amount of time. Firefox 89 enabled a new implementation of the font management, so that could have changed something - but Safe Mode should have reverted to the old implementation and so if it were an issue with the new font code, that would "fix" it.

I wonder if some kind of security (e.g. antivirus?) software is intercepting all file access calls and making what would normally be a quick scan of the font files take much longer?

Flags: needinfo?(jfkthame)

(In reply to Doug Thayer [:dthayer] (he/him) from comment #14)

In the mean time, JmG, it might be helpful to capture a profile of Firefox while it's having this issue. Documentation for the profiler is here, and you can get a startup profile by setting the MOZ_PROFILER_STARTUP environment variable to 1. (And I would also set MOZ_PROFILER_STARTUP_FEATURES to js,leaf,stackwalk,threads,fileio)

I've got the profiler but I have no idea where to set the variables you're referring to though I have watched the intro videos. I also have 224 typefaces installed....

Flags: needinfo?(jeffisontheroad)

(In reply to JmG from comment #16)

(In reply to Doug Thayer [:dthayer] (he/him) from comment #14)

In the mean time, JmG, it might be helpful to capture a profile of Firefox while it's having this issue. Documentation for the profiler is here, and you can get a startup profile by setting the MOZ_PROFILER_STARTUP environment variable to 1. (And I would also set MOZ_PROFILER_STARTUP_FEATURES to js,leaf,stackwalk,threads,fileio)

I've got the profiler but I have no idea where to set the variables you're referring to though I have watched the intro videos. I also have 224 typefaces installed....

Ah, sorry I should have been clearer. It looks like you're on Windows 10, so you can hit the windows key, and search for "Edit the system environment variables". Click on the Environment Variables button, and in the bottom section click the "New..." button. You can then enter MOZ_PROFILER_STARTUP in the "Variable name" field and 1 in the "Variable value" field. Repeat that last step again for MOZ_PROFILER_STARTUP_FEATURES with js,leaf,stackwalk,threads,fileio as the value.

Then when you launch Firefox, if you've already enabled the profiler button, it should already be blue, and you can click on it to capture a profile.

Also, regarding Jonathan's question above, do you by any chance have antivirus software installed which may be making it more expensive to scan your font files?

Flags: needinfo?(jeffisontheroad)

(In reply to Doug Thayer [:dthayer] (he/him) from comment #17)

(In reply to JmG from comment #16)

(In reply to Doug Thayer [:dthayer] (he/him) from comment #14)

In the mean time, JmG, it might be helpful to capture a profile of Firefox while it's having this issue. Documentation for the profiler is here, and you can get a startup profile by setting the MOZ_PROFILER_STARTUP environment variable to 1. (And I would also set MOZ_PROFILER_STARTUP_FEATURES to js,leaf,stackwalk,threads,fileio)

I've got the profiler but I have no idea where to set the variables you're referring to though I have watched the intro videos. I also have 224 typefaces installed....

Ah, sorry I should have been clearer. It looks like you're on Windows 10, so you can hit the windows key, and search for "Edit the system environment variables". Click on the Environment Variables button, and in the bottom section click the "New..." button. You can then enter MOZ_PROFILER_STARTUP in the "Variable name" field and 1 in the "Variable value" field. Repeat that last step again for MOZ_PROFILER_STARTUP_FEATURES with js,leaf,stackwalk,threads,fileio as the value.

Then when you launch Firefox, if you've already enabled the profiler button, it should already be blue, and you can click on it to capture a profile.

Also, regarding Jonathan's question above, do you by any chance have antivirus software installed which may be making it more expensive to scan your font files?

Done. And I'll take care of this in the morning.

As for antivirus, whatever comes with windows. Malwarebytes was slowing me down and this box is old. But, I don't see the font files (there are more than 200 installed typefaces and all their attendant files) being touched by defender, only by Firefox. Someone noted above that Firefox does check my fonts but I don't know why it would check anything other than the ones designated in settings and look for others, if it needs to, on the fly.

Flags: needinfo?(jeffisontheroad)

(In reply to JmG from comment #18)

(In reply to Doug Thayer [:dthayer] (he/him) from comment #17)

(In reply to JmG from comment #16)

(In reply to Doug Thayer [:dthayer] (he/him) from comment #14)

In the mean time, JmG, it might be helpful to capture a profile of Firefox while it's having this issue. Documentation for the profiler is here, and you can get a startup profile by setting the MOZ_PROFILER_STARTUP environment variable to 1. (And I would also set MOZ_PROFILER_STARTUP_FEATURES to js,leaf,stackwalk,threads,fileio)

I've got the profiler but I have no idea where to set the variables you're referring to though I have watched the intro videos. I also have 224 typefaces installed....

Ah, sorry I should have been clearer. It looks like you're on Windows 10, so you can hit the windows key, and search for "Edit the system environment variables". Click on the Environment Variables button, and in the bottom section click the "New..." button. You can then enter MOZ_PROFILER_STARTUP in the "Variable name" field and 1 in the "Variable value" field. Repeat that last step again for MOZ_PROFILER_STARTUP_FEATURES with js,leaf,stackwalk,threads,fileio as the value.

Then when you launch Firefox, if you've already enabled the profiler button, it should already be blue, and you can click on it to capture a profile.

Also, regarding Jonathan's question above, do you by any chance have antivirus software installed which may be making it more expensive to scan your font files?

Done. And I'll take care of this in the morning.

As for antivirus, whatever comes with windows. Malwarebytes was slowing me down and this box is old. But, I don't see the font files (there are more than 200 installed typefaces and all their attendant files) being touched by defender, only by Firefox. Someone noted above that Firefox does check my fonts but I don't know why it would check anything other than the ones designated in settings and look for others, if it needs to, on the fly.

https://share.firefox.dev/3gNYfEJ

I'm thinking that's what you're looking for.

https://share.firefox.dev/3gNYfEJ

I'm thinking that's what you're looking for.

(In reply to Doug Thayer [:dthayer] (he/him) from comment #17)

(In reply to JmG from comment #16)

(In reply to Doug Thayer [:dthayer] (he/him) from comment #14)

In the mean time, JmG, it might be helpful to capture a profile of Firefox while it's having this issue. Documentation for the profiler is here, and you can get a startup profile by setting the MOZ_PROFILER_STARTUP environment variable to 1. (And I would also set MOZ_PROFILER_STARTUP_FEATURES to js,leaf,stackwalk,threads,fileio)

I've got the profiler but I have no idea where to set the variables you're referring to though I have watched the intro videos. I also have 224 typefaces installed....

Ah, sorry I should have been clearer. It looks like you're on Windows 10, so you can hit the windows key, and search for "Edit the system environment variables". Click on the Environment Variables button, and in the bottom section click the "New..." button. You can then enter MOZ_PROFILER_STARTUP in the "Variable name" field and 1 in the "Variable value" field. Repeat that last step again for MOZ_PROFILER_STARTUP_FEATURES with js,leaf,stackwalk,threads,fileio as the value.

Then when you launch Firefox, if you've already enabled the profiler button, it should already be blue, and you can click on it to capture a profile.

Also, regarding Jonathan's question above, do you by any chance have antivirus software installed which may be making it more expensive to scan your font files?

Doug, This profile is from this morning, after a reboot and an update of FF to 90.0b12

https://share.firefox.dev/3h8o6X1

I hope this helps as I can't make head nor tales of the data collected.

Hey JmG, sorry for the delay here. These stacks are actually a bit bizarre for me. During this window where we seem to be doing a lot of font stuff, the stacks are all exceptionally flat. Gerald, it looks like during this window the profiler is only collecting the leaf nodes and XRE_main? Is that normal? I've never seen it.

Flags: needinfo?(gsquelart)

(In reply to Doug Thayer [:dthayer] (he/him) from comment #22)

Hey JmG, sorry for the delay here. These stacks are actually a bit bizarre for me. During this window where we seem to be doing a lot of font stuff, the stacks are all exceptionally flat. Gerald, it looks like during this window the profiler is only collecting the leaf nodes and XRE_main? Is that normal? I've never seen it.

No need for apologies. Things take time and, there's life beyond the internet!

Hopefully you're using the Profile log from Comment 21 as that's the one where FF installed it's update, usually when the issue is the worst. It's also just after a fresh boot of my machine.

I've tried making sense of the Profiler charts myself but I might as well be reading Tralfamadorian.

(In reply to Olli Pettay [:smaug] from comment #13)

Sounds like there are two issues here.
olejorgenb, could you file a new bug for the issue you see.

Can no longer reproduce - think I've had one minor update since last time. I'm on 89.0.1 now

Flags: needinfo?(olejorgenb)

(In reply to Doug Thayer [:dthayer] (he/him) from comment #22)

[...] These stacks are actually a bit bizarre for me. During this window where we seem to be doing a lot of font stuff, the stacks are all exceptionally flat. Gerald, it looks like during this window the profiler is only collecting the leaf nodes and XRE_main? Is that normal? I've never seen it.

They are indeed strangely flat, like only the native lead function was recorded.
Even looking at the full range, in the parent process main thead, 95% of stacks start at the "XREMain::XRE_main" label (instead of the more usual RtlUserThreadStart visible in the other 5%), which shows Firefox was not able to capture native samples most of the time.
I'm afraid I can't explain it. One future Profiler project is to record sampling failures (bug 1683040), which could have helped in this case.

Forwarding NI to Florian, who has more experience looking at profiles, he may be able to interpret what's visible.

Note: Mozilla is closed this week (28 June - 2 July), so please be patient for further replies...

Flags: needinfo?(gsquelart) → needinfo?(florian)

(In reply to JmG from comment #21)

Doug, This profile is from this morning, after a reboot and an update of FF to 90.0b12

https://share.firefox.dev/3h8o6X1

I hope this helps as I can't make head nor tales of the data collected.

This profile confirms that the hang is font-related: https://share.firefox.dev/3hNzwj6
There's one minute of font-related code running here, and responsiveness isn't great during that time.

(Just in passing: the machine we're looking at here seems to be running really, really slow for some reason. Looking earlier in the profile, the parent process apparently takes 3.4 seconds to style the gfxsanity window, with just a handful of elements. That seems terrible to me.)

The font-related code running there will presumably be the loader that runs shortly after startup to preload the data needed for font selection/fallback. That's supposedly running on a low-priority thread, but apparently even that is enough activity to really drag down the system. I think what must be happening is that DirectWrite is doing an awful lot of work to (re-)build its internal caches (most of the symbols there are internal dwrite.dll functions, not gecko code or things we'd be calling directly at this point). Unfortunately the lack of full stacks in the profile makes it hard to be sure exactly which entry-point from gecko is leading to the expensive DWrite work.

(In reply to Jonathan Kew (:jfkthame) from comment #27)

(Just in passing: the machine we're looking at here seems to be running really, really slow for some reason. Looking earlier in the profile, the parent process apparently takes 3.4 seconds to style the gfxsanity window, with just a handful of elements. That seems terrible to me.)

Maybe because it's closing in on ten years old?

The font-related code running there will presumably be the loader that runs shortly after startup to preload the data needed for font selection/fallback. That's supposedly running on a low-priority thread, but apparently even that is enough activity to really drag down the system. I think what must be happening is that DirectWrite is doing an awful lot of work to (re-)build its internal caches (most of the symbols there are internal dwrite.dll functions, not gecko code or things we'd be calling directly at this point). Unfortunately the lack of full stacks in the profile makes it hard to be sure exactly which entry-point from gecko is leading to the expensive DWrite work.

Severity: -- → S3
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Unspecified → Windows 10
Priority: -- → P5
Hardware: Unspecified → x86_64

(In reply to Markus Stange [:mstange] from comment #26)

(In reply to JmG from comment #21)

Doug, This profile is from this morning, after a reboot and an update of FF to 90.0b12

https://share.firefox.dev/3h8o6X1

I hope this helps as I can't make head nor tales of the data collected.

This profile confirms that the hang is font-related: https://share.firefox.dev/3hNzwj6
There's one minute of font-related code running here, and responsiveness isn't great during that time.

The profile also shows that we are posting lots of messages to the OS.File worker, and that worker keeps the file system busy stat'ing files that seem related telemetry: https://share.firefox.dev/3wOrhck That file system activity likely contributes to making the font code slower than we would like.

Flags: needinfo?(florian)
You need to log in before you can comment on or make changes to this bug.