Closed Bug 1693219 Opened 3 years ago Closed 1 year ago

Firefox 78.7 ESR takes a long time to recover after being SIGSTOPped for some time

Categories

(Core :: Performance, defect, P3)

78 Branch

RESOLVED INACTIVE

People

(Reporter: fedja, Unassigned)

Details

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0

Steps to reproduce:

Update from Firefox 68.12.0 to 78.6.0 (78.7.0 has the same issue).

For power saving reasons, my WM automatically issues a SIGSTOP to Firefox once I unfocus Firefox. It issues SIGCONT once I focus back into Firefox.
(https://github.com/blueyed/awesome-www/tree/stop_unfocused)
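For readers unfamiliar with the setup, the stop/continue behaviour described above can be sketched in a few lines. This is only an illustration, not the code from the linked awesome-www branch: the function names (suspend, resume, on_focus_change, demo) are hypothetical, and the demo freezes a throwaway child process rather than a browser.

```python
import os
import signal
import subprocess
import sys


def suspend(pid: int) -> None:
    """Freeze a process. SIGSTOP cannot be caught or ignored by the target."""
    os.kill(pid, signal.SIGSTOP)


def resume(pid: int) -> None:
    """Let a previously stopped process run again."""
    os.kill(pid, signal.SIGCONT)


def on_focus_change(pid: int, focused: bool) -> None:
    # Hypothetical hook: a window manager would call this on focus events.
    if focused:
        resume(pid)
    else:
        suspend(pid)


def demo() -> bool:
    """Stop and continue a throwaway child; return True if it was seen stopped."""
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    try:
        on_focus_change(child.pid, focused=False)
        # WUNTRACED makes waitpid report a stopped (not only an exited) child.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        stopped = os.WIFSTOPPED(status) and os.WSTOPSIG(status) == signal.SIGSTOP
        on_focus_change(child.pid, focused=True)
        return stopped
    finally:
        child.kill()
        child.wait()
```

Note that os.kill with a single PID only affects that one process; any other processes belonging to the same application keep running.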

Actual results:

After SIGCONT, Firefox is partially unresponsive for a time that seems proportional to the time it spent stopped. For example, while I can enter URLs into the URL bar and get visual feedback for doing so, URL autocomplete is inoperative (no suggestions are shown) and pressing enter has no effect. Clicking on individual tabs will show them as focused in the tab bar, but the "web page view" will not change.

Expected results:

Firefox should respond with unnoticeable delay.

The Bugbug bot thinks this bug should belong to the 'Firefox::Address Bar' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.

Component: Untriaged → Address Bar
Component: Address Bar → General
Component: General → Performance
Product: Firefox → Core

Does https://profiler.firefox.com/ possibly work with ESR too? If so, could you try to capture a profile when the slowness happens?
It sounds like some slow operation happens just after SIGCONT.

Severity: -- → S3
Flags: needinfo?(fedja)
Priority: -- → P3

I have a very similar issue with 102.0.1 on NixOS. After SIGSTOPping it for half a day and then sending SIGCONT, I could resume a YouTube video in the current tab, but basically nothing else worked, such as switching to another tab or creating a new one. Meanwhile htop showed it using around 150% CPU, and after half a minute or so, it got terminated by the OOM killer (apparently, memory allocation had also gotten out of control, which I hadn't been paying attention to). The previous time this happened, it eventually recovered after a similar half a minute to a minute rather than getting OOM-killed.

Weirdly, out of two user accounts, it has only occurred with the newer one.

If it happens again I'll try to capture a profile. I'm curious whether that will work under the circumstances.

Redirect a needinfo that is pending on an inactive user to the triage owner.
:fdoty, since the bug has recent activity, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(fedja) → needinfo?(fdoty)

Hi :glaebhoerl - If you're able to collect a profile that would be a great help towards diagnosing the issue.

Flags: needinfo?(fdoty) → needinfo?(glaebhoerl)

As luck would have it, it seems to have fixed itself and hasn't occurred since. :-\

I don't think anything meaningful changed w.r.t. circumstances, so maybe it's timing dependent? Guess I'll find out in due course...

OP here. This time 91.11.0esr (64-bit) on Gentoo and with a profile: https://share.firefox.dev/3wowbze

Closed and reopened Firefox before sleep; in the morning it took about 10 minutes to become responsive. During these 10 minutes it allocated all available memory and then released it. This is with no actual websites loaded.

Output of https://github.com/pixelb/ps_mem at low memory:
188.0 MiB +  61.3 MiB = 249.3 MiB  firefox
  2.0 GiB + 624.0 KiB =   2.0 GiB  WebExtensions
  3.4 GiB + 433.8 MiB =   3.8 GiB  Web Content (3)

Output of ps_mem at time of writing this:
241.5 MiB + 132.0 MiB = 373.5 MiB  firefox
119.5 MiB + 757.6 MiB = 877.0 MiB  WebExtensions
207.0 MiB + 753.8 MiB = 960.9 MiB  Web Content (4)

Addendum: it would seem that my stop_unfocused configuration does not catch all Firefox processes, as some are still running after the SIGSTOP happens. Perhaps this results in some message-passing queue(s) filling up, which then leaves a lot of backlog to process?
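One way to test the addendum's suspicion would be to signal the main process together with all of its descendants, so that nothing is left running. The sketch below is a rough, Linux-only illustration that walks /proc. It assumes the remaining Firefox processes are descendants of the PID being signalled, which has not been verified in this bug, and all function names are hypothetical.

```python
import os
import signal
from typing import Dict, List, Set


def read_parent_map(proc_root: str = "/proc") -> Dict[int, int]:
    """Map every visible PID to its parent PID by reading /proc/<pid>/stat."""
    parents: Dict[int, int] = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat"), errors="replace") as f:
                stat = f.read()
        except OSError:
            continue  # the process exited while we were scanning
        # Layout is "pid (comm) state ppid ..."; comm may itself contain
        # spaces or ')', so split only after the last closing parenthesis.
        fields = stat[stat.rfind(")") + 1:].split()
        parents[int(entry)] = int(fields[1])
    return parents


def descendants(root: int, parents: Dict[int, int]) -> Set[int]:
    """Return all direct and indirect children of root (root itself excluded)."""
    children: Dict[int, List[int]] = {}
    for pid, ppid in parents.items():
        children.setdefault(ppid, []).append(pid)
    found: Set[int] = set()
    stack = [root]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def signal_tree(root: int, sig: int) -> None:
    """Send sig to root first, then to every descendant found in /proc."""
    targets = [root] + sorted(descendants(root, read_parent_map()))
    for pid in targets:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            pass  # already gone
```

With a hypothetical firefox_pid, signal_tree(firefox_pid, signal.SIGSTOP) on unfocus and signal_tree(firefox_pid, signal.SIGCONT) on focus would replace the single-PID calls. The scan is not atomic, so a process spawned between the scan and the signal would still be missed.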

Fedja, have you still seen this?

Flags: needinfo?(glaebhoerl) → needinfo?(fedja)

An update on my part:

My scenario is that I have two user accounts that I keep logged in and switch between (for "work" and "play"). I normally SIGSTOP all the firefox processes on one account before switching to the other. This works fine. It's only when I forget and don't do the SIGSTOP that I experience the "Firefox is partially unresponsive in specific ways and starts consuming memory at a prodigious rate" phenomenon when I log back in to that account. Under the circumstances unfortunately I'm usually more preoccupied with killing the process before it eats all the available memory than I am with trying to get a profile. (On the surface this smells like maybe an unbounded queue gets backed up somewhere, the processing of which results in the rapid memory allocation, which could explain why it's also managed to recover before OOMing a couple of times. But that's just a guess.)

(That it doesn't happen when I do the SIGSTOP is basically 100% certain (this is exercised at least twice daily); the converse I'm slightly less certain of. AFAIR the couple of times I recently forgot, I was bitten, but I can't swear there wasn't a case when it turned out fine.)
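The "unbounded queue gets backed up" guess above can be illustrated with a toy model. This is purely a sketch of the hypothesis, not of any actual Firefox queue: the simulate function and its rates are invented for illustration. A producer keeps enqueuing while the consumer is paused, and after resuming the consumer drains faster than the producer fills.

```python
from collections import deque
from typing import Optional, Tuple


def simulate(
    stopped_ticks: int,
    produce_per_tick: int = 1,
    drain_per_tick: int = 3,
    max_len: Optional[int] = None,
) -> Tuple[int, int]:
    """Return (peak_backlog, ticks_to_recover) for a paused consumer.

    max_len=None models an unbounded queue; a number models a bounded
    queue that silently drops the oldest item when full.
    """
    if drain_per_tick <= produce_per_tick:
        raise ValueError("consumer must drain faster than the producer fills")
    queue: deque = deque(maxlen=max_len)
    peak = 0
    # Phase 1: the consumer is stopped, the producer keeps going.
    for tick in range(stopped_ticks):
        for _ in range(produce_per_tick):
            queue.append(tick)
        peak = max(peak, len(queue))
    # Phase 2: the consumer is running again; the producer never paused.
    recover = 0
    while queue:
        for _ in range(produce_per_tick):
            queue.append(-1)
        peak = max(peak, len(queue))
        for _ in range(min(drain_per_tick, len(queue))):
            queue.popleft()
        recover += 1
    return peak, recover
```

In this model simulate(100) returns (101, 50) and simulate(200) returns (201, 100): both the peak backlog (a stand-in for memory use) and the recovery time grow linearly with the time spent paused, which would match the symptoms reported here. With max_len=10 the backlog stays capped at 10 regardless of the pause length, at the cost of dropped items.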

Redirect a needinfo that is pending on an inactive user to the triage owner.
:fdoty, since the bug has recent activity, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(fedja) → needinfo?(fdoty)

Is it possible to take action on this bug without feedback from the reporter? If we're unable to move forward, this bug can be closed until we hear back.

NI open on reporter to ping once more for information in Comment 9

Flags: needinfo?(smaug)
Flags: needinfo?(glaebhoerl)
Flags: needinfo?(fedja)
Flags: needinfo?(fdoty)

Redirect a needinfo that is pending on an inactive user to the triage owner.
:fdoty, since the bug has recent activity, could you please find another way to get the information or close the bug as INCOMPLETE if it is not actionable?

For more information, please visit BugBot documentation.

Flags: needinfo?(fedja) → needinfo?(fdoty)

Closing this bug for lack of response from reporter. If you believe this has been closed in error, please file a new bug with reference to this one.

Status: UNCONFIRMED → RESOLVED
Closed: 1 year ago
Flags: needinfo?(smaug)
Flags: needinfo?(glaebhoerl)
Flags: needinfo?(fdoty)
Resolution: --- → INACTIVE