Closed Bug 1565902 Opened 5 years ago Closed 5 years ago

Firefox freezes system when opening a link

Categories

(Core :: XPCOM, defect)

69 Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla72
Tracking Status
firefox72 --- fixed

People

(Reporter: gotmilk0112, Assigned: alexical, NeedInfo)

Attachments

(2 files)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0

Steps to reproduce:

Open a link to a website or image from another program (eg. Discord)

Actual results:

  1. The link does not open
  2. The program that you clicked from (Discord) completely freezes up
  3. Explorer.exe completely freezes up
  4. Firefox becomes unresponsive, pages do not load at all
  5. After 20-30 seconds, everything un-freezes and all clicks / key presses during the "freeze" time happen all at once.

Expected results:

Clicking on a link to any website from another program should not be freezing the entire system.

Should also add that this bug also happened under version 68. It just started happening roughly a week ago.

I don't see how your system's behavior has anything to do with Firefox? If clicking on a link to any website from another program than Firefox creates a problem then you need to report that to the support forum of your operating system, I'd say.
If "explorer.exe" freezes then you need to fix "explorer.exe"...?

Flags: needinfo?(gotmilk0112)

(In reply to Andre Klapper from comment #2)

I don't see how your system's behavior has anything to do with Firefox? If clicking on a link to any website from another program than Firefox creates a problem then you need to report that to the support forum of your operating system, I'd say.
If "explorer.exe" freezes then you need to fix "explorer.exe"...?

This problem ONLY happens with Firefox. Chrome and Edge do not exhibit the same issue.

Flags: needinfo?(gotmilk0112)

I tested this issue on Windows 10 x64 with FF Release 68 and FF Nightly 70.0a1(2019-07-17) and I can't reproduce it.
gotmilk0112, is Firefox set as the default browser?
Do you have add-ons on your profile? If yes please test this in safe mode, here is a link to help you with that: https://support.mozilla.org/en-US/kb/troubleshoot-firefox-issues-using-safe-mode
Also, it will be a good idea to try with FF Nightly to see if you are able to reproduce it. Please download the build from here: https://www.mozilla.org/en-US/firefox/channel/desktop/

Flags: needinfo?(gotmilk0112)

Firefox is the default browser.

I tried Safe Mode and the problem persisted.

Also tried using the beta/nightly version and the problem still persisted.

Flags: needinfo?(gotmilk0112)

Can you please capture a performance profile? You can get more info on how to install and use the Cleopatra add-on (that helps you get the performance profile) by going to:
https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Profiling_with_the_Built-in_Profiler
https://perf-html.io/
Please also note that this add-on works only on FF Nightly, so that means you need to reproduce this on Nightly.

Flags: needinfo?(gotmilk0112)

Hi,
Marking this as Resolved: Incomplete due to the lack of response from gotmilk0112.
If the issue is still reproducible with the latest Firefox version, feel free to reopen the bug with more information.

Status: UNCONFIRMED → RESOLVED
Closed: 5 years ago
Resolution: --- → INCOMPLETE

Hi,
I apologise for doing a "me too" post, but I've been monitoring this thread in hope of a resolution as I've been experiencing this same issue since the update to 68.0.1.
I have done all suggestions like resetting Profile, to no avail, & have also tried the Profiler (which works with 68.0.1 btw). I've downloaded the JSON of results so if you wish to see it, let me know where to send it.

The only other details I can add are that during the pause, which is always worse on first starting FF after reboot, there is excessive SSD access (100%) for ~ 2mins, during which the file C:\Program Files (x86)\Mozilla Firefox\xul.dll is being read from (as shown in Windows Resource Monitor), averaging over 9,000,000 B/sec. During this time all mouse/keyboard input freezes until FF eventually opens.
This only happens on one of my PCs, my work PC, details of which are:-
Dell Optiplex 7010
CPU: i7-3770S
Mem: 12GB
OS: Win 10 Pro 1809.

As per OP, everything else runs fine, & this only started happening since update to v68.

This is becoming a serious issue for me, & will soon have to switch to Chrome if can't resolve, which I'd rather not do as I prefer Firefox!

Thanks Mark for your reply, can you also try the suggestions from comment 6?

Hi Ovidiu,
Yep, I've already run the Profiler, setting the MOZ_PROFILER_STARTUP=1 env variable, rebooting my PC then starting FF.
I'm just not sure how to get the info to you? I have downloaded a json file from results page, which I can email to you. It also has an option to Publish, but I've not done that - I can do if that's correct?
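
For reference, the startup-profiling step described above amounts to having `MOZ_PROFILER_STARTUP=1` in the environment when Firefox launches. A minimal Python sketch of doing that for a single launch (rather than system-wide) follows. The helper names and the `firefox_path` argument are assumptions for illustration and do not come from this bug.

```python
import os
import subprocess


def startup_profiling_env(base_env=None):
    """Return a copy of the environment with startup profiling enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # MOZ_PROFILER_STARTUP=1 is the variable named in the comment above.
    env["MOZ_PROFILER_STARTUP"] = "1"
    return env


def launch_firefox_with_profiler(firefox_path):
    """Start Firefox with the modified environment.

    firefox_path is a hypothetical argument: pass your own install location.
    """
    return subprocess.Popen([firefox_path], env=startup_profiling_env())
```

The original environment is left untouched, so only the launched Firefox process sees the variable.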

The Cleopatra add-on works only with the FF Nightly version. You can download it from here: https://nightly.mozilla.org/

If by Cleopatra, you mean the add-on here: https://perf-html.io/, then I've installed that, and run it, and have some profiling results. This was with v68.0.1 & appears to have worked. I've gone ahead and published the results which you can see here:
https://perfht.ml/2TfHqFp

If this is not what you want, can you point me to Cleopatra add on please, as Googling it sends me to the same link as above, but called Firefox Profiler.

Yes, the add-on. The idea is that you need to install FF Nightly, then install and start the add-on (Cleopatra) and try to reproduce the issue. Thanks for your involvement in this issue.

Ok, I've done the same with the Nightly build now:-
https://perfht.ml/2TcGwJB

Interestingly, it initially started up without the pause, straight after installing FF Nightly, but after reboot, the pause returned. The heavy disk reads are still against the "xul.dll" file (in Nightly install dir).

Thanks Mark for your help.
Mike, can you please look at the performance profile from comment 14?

Flags: needinfo?(mconley)

Hi Mark,

Do you have any anti-virus or security software enabled?

Flags: needinfo?(mconley) → needinfo?(markc)

Hi Mike,
Yes, I'm using Bitdefender Endpoint Security Tools, but as this is a work PC, it's controlled remotely, so I don't have any option to disable/uninstall it.
If you need me to do this, let me know, but I'll have to go ask our support dept, so may take a few days!

Flags: needinfo?(markc)
See Also: → 1566314

Hi,
Any update on this issue?

Flags: needinfo?(mconley)

(In reply to Mark from comment #18)

Hi,
Any update on this issue?

Hi Mark, sorry - I hadn't noticed your response in this bug. I apologize for that (and thanks, Dao, for tagging me in!)

My current hypothesis is that the security software is causing hangs when we try to launch new content processes.

(In reply to Mark from comment #17)

If you need me to do this, let me know, but I'll have to go ask our support dept, so may take a few days!

If you could get it disabled and see if the problem goes away or stays, that'd certainly help support or refute my hypothesis, and either is useful for narrowing down what's happening here. So, yes please, if this is possible, that'd be very useful.

Flags: needinfo?(mconley) → needinfo?(markc)

Hi Mike, no worries, I appreciate you looking into this :)

Ok, I uninstalled BitDefender, rebooted my PC, & then tried starting Firefox, but unfortunately it did the same: it sat there for ~1m hammering the file C:\Program Files (x86)\Mozilla Firefox\xul.dll.
I also tried it on the Nightly release & that did the same.
One other thing to note, which I don't think I mentioned, is that once this has happened, if I immediately close down FF & then restart, it fires up quickly. The pause only returns after a period of time, maybe 30mins or so later (this includes opening up links from other apps).
If I run the Nightly version immediately after running Latest, I still get the pause on Nightly initially, but again, if I close & restart, it'll fire up quickly for a while.

Let me know if there's anything else you'd like me to try.

Flags: needinfo?(markc) → needinfo?(mconley)

Hey dthayer,

I'm starting to run out of ideas, short of having Mark run UIforETW and trying to use WPA to figure out what's happening here. Can you think of anything else for Mark to try before we go that route?

Flags: needinfo?(mconley) → needinfo?(dothayer)

Mark, do you think you could please paste in the text from about:support in here?

Attached file About Support Info
Sure.  I've done this on the Nightly install, but I can do normal release too if you wish:-

So I'm clear, is startup significantly slower when opening it via an external link? Or is the bug here just that Firefox is slow to start on your system?

Regarding differences in dll loading between Chrome and Firefox, I'm wondering if this boils down to PrefetchVirtualMemory. I'm curious what things look like if we remove the ReadAheadLib call. Short of that / identifying when this went wrong (could have been a Windows update), WPA would likely be the next step, yes.

Flags: needinfo?(dothayer)

Hi Doug,

It's both really. It is slow when you click the Firefox icon on Windows task bar, but also slow when opening a link from another application Eg. a link in an email, but as I mentioned above, only either after I have left Firefox running for a while after startup (30mins or more), or I haven't yet started Firefox.
And when I say slow, it actually freezes all Windows GUI input while it's "paused" - I can't click on anything or type until Firefox unfreezes. Windows Task Manager does keep refreshing during this time though so I can see CPU/disk usage graphs.

Flags: needinfo?(dothayer)

Does the Windows freeze only happen when you open a link from another application? Or does it also freeze on a regular FF startup?

The bit where it happens when you've left Firefox running for a while after startup makes sense, I think, as at that point the unused pages of xul.dll have likely been paged out. It also makes a clearer case for PrefetchVirtualMemory being the culprit as we end up prefetching all of xul.dll with it (so we have to fetch in the pages that have been paged out.)
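
To illustrate the idea only: this is not Firefox's ReadAheadLib code, which per this thread goes through PrefetchVirtualMemory on Windows. A read-ahead can be approximated by reading a file front to back so the OS brings every page into its cache. The function name `read_ahead` and the chunk size are assumptions made for this sketch.

```python
def read_ahead(path, chunk_size=1024 * 1024):
    """Read a file sequentially so its pages are pulled into the OS file cache.

    Returns the number of bytes touched.
    """
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total
```

The cost of a pass like this scales with the size of the whole file, not with the pages the process actually needs, which is why a large library whose pages have been paged out could make the prefetch expensive.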

Flags: needinfo?(dothayer)

No, it happens on regular FF startup also.

Flags: needinfo?(dothayer)

Mark, would you mind testing with this build and seeing if you can still reproduce the problem?

https://queue.taskcluster.net/v1/task/TDXwTxh_Q_W7-Gk3v6v6BQ/runs/0/artifacts/public/build/target.zip

(And thanks for your help in diagnosing this!)

Flags: needinfo?(dothayer) → needinfo?(markc)

Hi Doug,

That version seems to be working fine so far! Starts up immediately, even after a PC reboot, so looks like whatever you have changed has fixed it :)

I'll try it a few times throughout the day to be sure, but as v68 always freezes on start after a reboot, it's looking promising!

Flags: needinfo?(markc) → needinfo?(dothayer)

Tried it a number of times throughout day & worked fine, no freezes whatsoever.
Think it's fixed in this version!

Aaron - do you have any theories here? Per the above ~7 comments it appears that PrefetchVirtualMemory to load in xul.dll is for some users causing Windows to freeze up.

Flags: needinfo?(dothayer) → needinfo?(aklotz)

Interesting. Unfortunately I do not know much about how PrefetchVirtualMemory works under the hood, so I do not really have anything to offer here.

Flags: needinfo?(aklotz)

Do we have any contacts at Microsoft that we could reach out to ask about this?

We have a mailing list with them. I'm just hoping to be able to reproduce the issue before posting a problem there, trying to come up with ideas to reproduce it locally.

Hi, just wondering if you've made any progress on this? I've recently had the v69 update & if anything, the issue seems to have got worse, in that initial start of FF is taking longer. Took over 2 minutes this morning, after which, other apps, like Thunderbird, had graphical glitches like missing Restore, Maximise & "X" buttons. Had to restart them to fix.

Flags: needinfo?(dothayer)

Hey Mark, sorry for the delay on this.

I have a thread open with Microsoft on this, trying to determine if it's a bug on their end. If I can't get anywhere with them in a week or so, my plan is to experiment with just landing the patch that I gave you in Nightly, and see how it affects our startup numbers across all users.

Flags: needinfo?(dothayer)

Ok, thanks Doug for the update, appreciate it.

Hey Mark, I'm pasting here what the Microsoft contact said:

We will need a perf trace from the user when this happens. The best thing they can do is file a feedback hub bug (start-->feedback hub) and share the link.

Specifically in feedback hub, report a problem-->choose a category-->apps-->firefox (it should find it, if not, use all other apps)
under more details, choose "inability to use device", then "recreate my problem" and have the user launch firefox.

It is entirely possible there is another app or driver that is the actual culprit here.

If they can't/won't use feedback hub, then they can manually recreate the issue and get perftraces using wpr, which is inbox in win10.
https://docs.microsoft.com/en-us/windows-hardware/test/wpt/wpr-quick-start

Let me know if there are any troubles following those steps!

Flags: needinfo?(markc)

Hi Doug,
As I don't have a MS account for my work PC, which feedback hub seems to require, I did the WPR option instead as that was pretty straight forward to do.
One odd thing I noticed is that FF paused for approximately half as long when I was running WPR as when I didn't, which seems strange! I tried it a few times & it was always quicker than if I didn't run it.
Also, I ran it with the latest v71 Nightly rather than v69, which I hope is ok?

Only problem I have now is where do I send the output file WPR generated? It's over 7GB uncompressed, and 870MB zipped.

Flags: needinfo?(markc) → needinfo?(dothayer)
Status: RESOLVED → REOPENED
Ever confirmed: true
Resolution: INCOMPLETE → ---

(In reply to Mark from comment #39)

Hi Doug,
As I don't have a MS account for my work PC, which feedback hub seems to require, I did the WPR option instead as that was pretty straight forward to do.
One odd thing I noticed is that FF paused for approximately half as long when I was running WPR as when I didn't, which seems strange! I tried it a few times & it was always quicker than if I didn't run it.
Also, I ran it with the latest v71 Nightly rather than v69, which I hope is ok?

Only problem I have now is where do I send the output file WPR generated? It's over 7GB uncompressed, and 870MB zipped.

Hmm, could you use https://send.firefox.com to send me a link to the compressed version?

Flags: needinfo?(dothayer) → needinfo?(markc)

Ok have uploaded, and emailed you the link.

Flags: needinfo?(markc) → needinfo?(dothayer)

Mark, just giving an update on this so you're not in the dark: I have examined the trace and nothing jumps out at me. I asked the MS engineers for help and have heard no response.

Aaron, I'm thinking it might be worth it to experiment with disabling the prefetch of libs listed in dependentlibs.list? Superfetch might just cover us on Windows, and in any case it would be good to understand the impact with real telemetry numbers for all platforms. Thoughts?

Flags: needinfo?(dothayer) → needinfo?(aklotz)

Well, measurement is king! Until we get a response back from Microsoft, it looks like this is currently a net loss. You're closer to what's going on than I am, so I think it's probably your call!

Flags: needinfo?(aklotz)

Hi Doug,
Ok, thanks for the update!

Component: Untriaged → XPCOM
Product: Firefox → Core
Assignee: nobody → dothayer
Status: REOPENED → ASSIGNED

We haven't tested this in recent times, and it would be good to understand
what the impact is looking at telemetry measures of startup in Nightly.
This doesn't rip out everything, but we will need to do that if we
determine that the readahead has a neutral / negative effect.

Pushed by dothayer@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5f0b392beadb
Test the impact of removing startup dll readahead r=glandium
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla72

== Change summary for alert #23866 (as of Tue, 12 Nov 2019 17:33:18 GMT) ==

Improvements:

99% tp5n main_startup_fileio windows7-32-shippable opt e10s stylo 99,801,573.17 -> 663,681.00
2% ts_paint_webext windows7-32-shippable opt e10s stylo 324.83 -> 317.00
2% ts_paint_webext windows10-64-shippable opt e10s stylo 315.33 -> 308.50
2% ts_paint windows7-32-shippable opt e10s stylo 323.83 -> 317.33

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=23866
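
As a sanity check on the alert figures above, the percentage improvement is (old - new) / old. A small Python sketch using the numbers copied from the alert follows. `percent_improvement` and `alert_rows` are illustrative names, not part of any Mozilla tooling.

```python
def percent_improvement(old, new):
    """Percentage reduction from old to new (positive means new is lower)."""
    return (old - new) / old * 100.0


# (test name, platform, before, after) copied from the alert above
alert_rows = [
    ("tp5n main_startup_fileio", "windows7-32-shippable", 99_801_573.17, 663_681.00),
    ("ts_paint_webext", "windows7-32-shippable", 324.83, 317.00),
    ("ts_paint_webext", "windows10-64-shippable", 315.33, 308.50),
    ("ts_paint", "windows7-32-shippable", 323.83, 317.33),
]

for name, platform, before, after in alert_rows:
    print(f"{name} ({platform}): {percent_improvement(before, after):.1f}% lower")
```

Rounded to whole percentages, these come out at 99% for the file I/O measure and 2% for each paint measure, consistent with the alert.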

Hi,
does the above mean that the issue has been fixed & will be in next release?

(In reply to Mark from comment #49)

Hi,
does the above mean that the issue has been fixed & will be in next release?

Hey Mark,
It means the issue should be fixed and released on Firefox Nightly (if you could confirm, that would be great, since I can't reproduce the problem myself). However we need to wait for data to come in on how this change affects startup for all of our Nightly users before we decide whether we can let this "ride the trains" to release.

Hi Doug,

I've just tried it with version 72.0a1 (2019-11-17) (64-bit), and the issue seems to be fixed - fires up almost instantly.
Fingers crossed it makes it to release!

So, unfortunately I don't think this is ready to ride the trains as is. However, I think there's something interesting here.

  • Windows 7 seems to be affected rather poorly by this change, so we should not apply this change for Windows 7
  • Windows 10 seems to generally be positively affected by the change, except at the 95th percentile, which shows a sharp increase in time to blank window shown, and other measures. I am wondering if this is from users which have disabled Superfetch.
  • OSX seems to show a very slight negative effect from this change
  • Linux seems completely unaffected

So, if we can get the positive effects of this for Windows 10 while getting rid of all of the negative effects, this is definitely worthwhile. I'm going to be mulling over what might be going on here for Windows 10, and slicing up the data a bit to get clues.

Could this fix be applied based on a config setting? IE. by default work as it is now, but if user enables a config setting, your fix is enabled.

Might save you some time/head scratching etc, just a thought...

(In reply to Mark from comment #53)

Could this fix be applied based on a config setting? IE. by default work as it is now, but if user enables a config setting, your fix is enabled.

Might save you some time/head scratching etc, just a thought...

Yeah - unfortunately there's a few problems there. For one, the affected code runs before we load a user's config, so we wouldn't be able to use the typical Firefox config infrastructure. So we'd probably have to use something like a registry value. However, while that would put this issue to bed for you and a handful of other people who find this bug, I'm mostly interested in fixing it for the thousands(?) of silent users who are severely negatively affected by this, and the millions of users who are moderately negatively affected by this. I of course don't want to leave you in the lurch without a workaround for this, but in general adding complexity to the startup path of the browser for the benefit of a small number of users is generally not something that goes over too well :(

Hi,

Not quite sure of the status of this issue (it says "Resolved"), but just to let you know, my Windows updated to 1909 on Monday, and since then the issue seems to have disappeared. Firefox now starts up instantly and clicked links open immediately. This is using Firefox 72.0.2, which did have the issue prior to the Windows update.

Regards,
Mark.

Good to know! Due to the fact that this mildly improved a minor metric across some systems and regressed other metrics on other systems, I'm going to file a follow-up to just remove this change then (it was only as yet enabled on Nightly due to said regressions).

Blocks: 1613430