Closed Bug 1901603 Opened 7 months ago Closed 3 months ago

Firefox content processes intermittently freeze/become unresponsive on MacOS

Categories

(Core :: Performance, defect)

Unspecified
macOS
defect

Tracking

RESOLVED DUPLICATE of bug 1896172
Performance Impact low

People

(Reporter: denschub, Unassigned, NeedInfo)

Details

(Keywords: perf:responsiveness)

Sometimes, and I don't know exactly when or how, I run into situations where Firefox Content processes become more or less completely unresponsive (the browser UI seems to stay responsive). I have captured a profile at one instance, see https://share.firefox.dev/3XdKtB8 - profile was created on macOS 14.5 (23F79) on a 2021 16 inch MacBook Pro with the M1 Max.

You'll see in that profile that all three content processes jank, a lot. IIRC, I had the Discord and the schub.social tabs in one window, and the YouTube tab in another. The YouTube window was on a second screen, maximized window, playing a video. The YouTube window was in plain view, but didn't have focus (focus was likely on the text editor). The other window was in the background, too, I think - but I don't fully remember.

When I wanted to select a new video, I noticed that the tab's contents were pretty unresponsive (it took a solid 30 seconds for it to load a new video). I also noticed that the other tabs were unresponsive as well, but starting a profile recording worked fine. There was no elevated CPU or memory usage, and no significant storage I/O. Besides a text editor and a terminal, I didn't have any resource-intensive applications running (although, to be fair, the text editor is VSCode...). The system remained responsive, and other applications stayed fast, too.

Component: XPCOM → Performance

I filed this in Core::XPCOM because Haik asked me to do so. :) ni? him so he can move and tag this bug accordingly.

Flags: needinfo?(haftandilian)

This has not been root caused yet, but the profile shows the three content processes were all in the background priority state suggesting it may be related to our QoS changes. I've made this blocking bug 1895985 for now.

Blocks: 1895985
Flags: needinfo?(haftandilian)

This bug was moved into the Performance component.

:denschub, could you make sure the following information is on this bug?

  • ✅ For slowness or high CPU usage, capture a profile with http://profiler.firefox.com/, upload it and share the link here.
  • For memory usage issues, capture a memory dump from about:memory and attach it to this bug.
  • Troubleshooting information: Go to about:support, click "Copy raw data to clipboard", paste it into a file, save it, and attach the file here.

If the requested information is already in the bug, please confirm it is recent.

Thank you.

Flags: needinfo?(dschubert)

(In reply to Haik Aftandilian [:haik] from comment #2)

> This has not been root caused yet, but the profile shows the three content processes were all in the background priority state suggesting it may be related to our QoS changes. I've made this blocking bug 1895985 for now.

Probably an existing bug related to process priorities, that becomes a lot more painful now that we set the QoS for the process main thread.

Looking at the other tracks in the profile, there's another YouTube content process that has the foreground priority (but does nothing).

Steps to reproduce would be extremely useful. I wonder if this could be caused by an add-on. Because the bug description mentions multiple windows, I wonder if the bug could be caused by bogus window occlusion detection. On Mac we trust the OS to decide if a window is occluded, and we have seen multiple cases where the OS told us the wrong thing (but so far it has always been the other way: not telling us that a window was invisible).

I sadly don't have any STR :( This is somewhat rare, and I couldn't find clear steps to make this happen. And I've seen this with no addons at all.

Flags: needinfo?(dschubert)

This isn't just affecting YouTube. I just had this when having two Firefox windows side-by-side on the same monitor, in this case it was GitHub. Profile: https://share.firefox.dev/3xgEJf8

What I did: originally, I had all GitHub tabs in the same window. I then moved one tab out by dragging it off the tab bar. I then used Rectangle to arrange the two windows side-by-side (took a second or so), and then wanted to scroll the content in the newly opened window. The content immediately froze, and scrolling checkerboarded. Looking at the profile, the content process in the new window has background priority, which doesn't make a lot of sense to me.

The Performance Impact Calculator has determined this bug's performance impact to be low. If you'd like to request re-triage, you can reset the Performance Impact flag to "?" or needinfo the triage sheriff.

Platforms: Windows
Impact on site: Causes noticeable jank
Configuration: Specific but common

Performance Impact: --- → low

I just had this happen to a Bugzilla tab - while I was in a triage meeting, actively screensharing that specific window over Zoom, and actively interacting with its contents.

Hello, I have the same problem; it started appearing recently, in the last couple of days.
The symptom is that Firefox freezes for several seconds and then resumes as normal.
I have only 1 extension, uBlock Origin.
This behavior seems to show up most often when switching from another application to Firefox, but it sometimes happens when focus has not changed. It also seems to appear most often when closing or opening tabs, but that is not a 100% rule either.

It happens on the latest Firefox 127.0.1 and macOS Sonoma 14.5.

(In reply to Haik Aftandilian [:haik] from comment #2)

> This has not been root caused yet, but the profile shows the three content processes were all in the background priority state suggesting it may be related to our QoS changes. I've made this blocking bug 1895985 for now.

The QoS changes (requiring threads.use_low_power.enabled and threads.lower_mainthread_priority_in_background.enabled to be true) are intended to be Nightly-only at this time. Those prefs are false on Release and Beta.

However, the QoS changes caused the process priority manager (dom.ipc.processPriorityManager.enabled) to be enabled on all channels in bug 1805932, which may have been a mistake, since it was only intended to be enabled as a dependency of the QoS changes.

So, there may be a bug in process priority handling on macOS (as Florian mentioned in comment 4), meaning it would affect Release. I want to take a look at the implications of dom.ipc.processPriorityManager.enabled. It could be causing issues on Release, but it was set back on 114.

I will set dom.ipc.processPriorityManager.enabled to false in about:config and test it for some time. It's a pretty elusive behavior, so I probably won't be sure if it goes away, but I could immediately confirm it's still here if it happens at least once.

After some time the problem reappeared. The exact scenario was switching between tabs; Firefox was in focus the whole time.
Switching the tab caused the hang before the tab was switched. Meanwhile the cursor still responded on hover: for example, it changed to the text-input cursor when hovering over the URL box and back to the normal one outside it, but the whole browser UI was completely frozen. It lasted approximately 4 seconds.

(In reply to panoczek from comment #12)

> After some time the problem reappeared. The exact scenario was switching between tabs; Firefox was in focus the whole time.
> Switching the tab caused the hang before the tab was switched. Meanwhile the cursor still responded on hover: for example, it changed to the text-input cursor when hovering over the URL box and back to the normal one outside it, but the whole browser UI was completely frozen. It lasted approximately 4 seconds.

Thanks for the report, this is helpful. To confirm, you hit the problem after setting dom.ipc.processPriorityManager.enabled to false?

If you can catch the problem happening with the profiler running, that would help us debug what's going wrong. There are instructions for that here https://firefox-source-docs.mozilla.org/performance/reporting_a_performance_problem.html Once you have a profile link, you can add it to the bug report.

Yes it was after setting it to false and restarting the browser. I will try to catch it with profiling, it will be difficult though... It also feels like it happens more rarely now.

Yeah, there definitely is still something going wrong, even with dom.ipc.processPriorityManager.enabled=false. And it does feel like creating a new window with an existing tab is a good way to trigger this.

Here's what I did:

  1. Create a blank profile, set dom.ipc.processPriorityManager.enabled=false, restart.
  2. Open a mozilla.com and an en.wikipedia.org tab.
  3. With the Wikipedia tab enabled, I dragged the Tab out of the window to create a new window.
  4. That window starts focused, and I scrolled the contents. There is immediate checkerboarding in that content process.

This is a full profile from the whole story: https://share.firefox.dev/3z4vlM8

You can see how the Wikipedia CP starts with foreground priority, but as soon as I detach the tab, it gets set to background priority and remains there, even though you can see me scrolling the content in the screenshots (and you can see the checkerboarding). I'm somewhat confident I have never seen this issue occur with only a single Firefox window open, so this might be related to how we handle multiple windows.

I'm also confused about why there even is a background process priority; it feels like setting processPriorityManager.enabled to false should turn that off? But maybe I'm missing something here. I have, however, confirmed twice that the pref was set to false.

MacOS doesn't even support actually setting the process priority via the priority manager, does it? I see HAL implementations for Android, Windows and Linux, but not MacOS.

Okay, so it looks like turning off the priority manager only disables the HAL behavior now, which as I said never does anything on MacOS. So it makes sense that turning off the pref does nothing. Looking at ContentChild::RecvNotifyProcessPriorityChanged() it seems like there's a raft of other things that might cause us to stall stuff. I guess we can just assume one of those is the problem and investigate why the priority manager thinks we're in the background.

Summary: Firefox content processes intermittently freeze/become unresponsive → Firefox content processes intermittently freeze/become unresponsive on MacOS

It hasn't come back for me since the last occurrence. I'm sure it will come back at some point, but setting dom.ipc.processPriorityManager.enabled to false helped a lot.

It looks like bug 1747138 (from 2022) made the priority manager always run and just stopped the HAL part of it. Kind of feels like we should bail out of ContentChild::RecvNotifyProcessPriorityChanged() early if it is disabled. Of course, that doesn't help the underlying problem, as the pref is always enabled by default.
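To make the suggestion above concrete, here is a minimal, self-contained sketch of the two behaviors being discussed. It is not Gecko code: `FakeContentProcess`, `NotifyPriorityChanged`, `NotifyPriorityChangedWithEarlyBail`, and the log strings are all hypothetical stand-ins, and the "current" behavior is modeled only on the reading of the code given in this thread (the pref gates the HAL call, while the other side effects run regardless).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; none of these are Gecko APIs.
enum class Priority { Background, Foreground };

struct FakeContentProcess {
  // Models dom.ipc.processPriorityManager.enabled.
  bool priorityManagerEnabled = true;
  // Per the thread, there is no HAL priority implementation on macOS.
  bool halSupportsPriority = false;
  // Records which steps ran, so the difference is observable.
  std::vector<std::string> log;

  // Behavior as described in the thread: the pref only gates the HAL call,
  // and the remaining side effects are applied either way.
  void NotifyPriorityChanged(Priority p) {
    if (priorityManagerEnabled && halSupportsPriority) {
      log.push_back("hal:set-os-priority");
    }
    ApplyOtherSideEffects(p);
  }

  // Suggested variant: return before doing anything when the pref is off.
  void NotifyPriorityChangedWithEarlyBail(Priority p) {
    if (!priorityManagerEnabled) {
      return;
    }
    NotifyPriorityChanged(p);
  }

 private:
  // Placeholder for whatever else reacts to a priority change.
  void ApplyOtherSideEffects(Priority p) {
    log.push_back(p == Priority::Background ? "side-effects:background"
                                            : "side-effects:foreground");
  }
};

// Small usage example: with the pref off, only the early-bail variant
// leaves the process untouched.
inline void DemonstrateDifference() {
  FakeContentProcess current;
  current.priorityManagerEnabled = false;
  current.NotifyPriorityChanged(Priority::Background);
  assert(current.log.size() == 1);

  FakeContentProcess proposed;
  proposed.priorityManagerEnabled = false;
  proposed.NotifyPriorityChangedWithEarlyBail(Priority::Background);
  assert(proposed.log.empty());
}
```

As the comment notes, an early return like this would only make the pref behave as its name suggests; it would not explain why the priority manager considers a visible tab to be in the background.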

(In reply to panoczek from comment #18)

> It hasn't come back for me since the last occurrence. I'm sure it will come back at some point, but setting dom.ipc.processPriorityManager.enabled to false helped a lot.

I'm glad you aren't experiencing the problem any more, but according to my reading of the code, changing the pref should have no effect on the behavior on MacOS.

(In reply to Dennis Schubert [:denschub] from comment #15)

> Yeah, there definitely is still something going wrong, even with dom.ipc.processPriorityManager.enabled=false. And it does feel like creating a new window with an existing tab is a good way to trigger this.
>
> Here's what I did:
>
>   1. Create a blank profile, set dom.ipc.processPriorityManager.enabled=false, restart.
>   2. Open a mozilla.com and an en.wikipedia.org tab.
>   3. With the Wikipedia tab enabled, I dragged the Tab out of the window to create a new window.
>   4. That window starts focused, and I scrolled the contents. There is immediate checkerboarding in that content process.

Given the tab drag out of the window, that sounds like what is reported on bug 1896172. That should be Nightly-only if we're correct that it is caused by QoS changes. As another experiment, setting both threads.use_low_power.enabled and threads.lower_mainthread_priority_in_background.enabled to false should disable QoS and potentially work around this problem.

Those prefs seem to fix the perf impact. I still see the CP being in "background" priority, but yeah, no more jank. That being said, there is pretty much no load on the system, so the bug description isn't entirely accurate.

The bug came back after one day. It happens when Firefox is in focus or being brought into focus; it really doesn't look like it's related to focus.
Somehow, profiling seems to fix it, and for a long time, which is why I can't catch it during profiling. Maybe something is being GCed, or something doesn't go to sleep while profiling is in progress?
Anyway, I'm now really disappointed by this bug.

@panoczek Could you include your about:support contents as an attachment here?

Flags: needinfo?(panoczek)
See Also: → 1917501

@Dennis, assuming you are no longer experiencing this problem, I'd like to close this bug as a dupe of bug 1896172 which landed 3 months ago in 130. That addressed the issue where a tab dragged off the tab bar got put into the background priority. On macOS Nightly, QoS code is enabled to slow down background tabs and that could cause these hangs. I see evidence of that problem in all your profiles and am assuming it was bug 1896172.

@panoczek, I suspect you were hitting a different bug because you were running on Release Firefox (not Nightly). If you encounter the problem again, collecting a performance profile using http://profiler.firefox.com/ would be hugely helpful. Please file a new bug.

Flags: needinfo?(dschubert)

Yeah, I think this is resolved now.

Status: NEW → RESOLVED
Closed: 3 months ago
Duplicate of bug: 1896172
Flags: needinfo?(dschubert)
Resolution: --- → DUPLICATE