Closed Bug 1551990 Opened 5 months ago Closed 5 months ago

[DoS/Hang] Firefox hangs on wake up from OS X Power Nap (parent process is sleeping, content processes are not, periodically scheduling e.g. network requests)

Categories

(Core :: Widget: Cocoa, defect, P1, critical)

Unspecified
macOS
defect

Tracking

()

RESOLVED FIXED
mozilla69
Tracking Status
firefox68 --- fixed
firefox69 --- fixed

People

(Reporter: mayhemer, Assigned: haik)

References

(Depends on 1 open bug)

Details

(Keywords: hang)

Attachments

(2 files)

Originally reported here:
https://bugzilla.mozilla.org/show_bug.cgi?id=1540736#c19 and on

(In reply to Honza Bambas (:mayhemer) from bug 1540736 comment #30)

(In reply to Florian Quèze [:florian] from bug 1540736 comment #28)

I think this bug is specifically related to how Firefox handles (or rather doesn't handle) the Apple Power Nap feature which is in the middle between having a machine asleep or working.

Sounds like a very good theory! Reading about the feature (never heard about it before, me not having any osx device) it seems like OSX is leaving some services running, so that the CPU is not in S3/5 at all but still in S1. When the parent process is trapped and other processes still go, then this can happen, yes.

Captured profiles:
https://perfht.ml/2DYjY94
https://perfht.ml/2EalQvx (probably more interesting to look at first)

Sev:crit because this can kill/DoS the machine on waking up.

Please fix the component if this should be elsewhere, but this is really OSX specific.

Note this could have something to do with sandboxing (my theory why content processes are still running - the system may not have privs/doesn't see them to put them to proper semi-sleep).

If this is sandboxing then haik is the best person to look at OS X problems.

Flags: needinfo?(haftandilian)

Jim, does anyone on your team may have ideas how to diagnose and fix this? I believe this should be given a higher priority as well.

Flags: needinfo?(jmathies)
Keywords: hang
Summary: Firefox processes react inconsistently to OS X Power Nap (parent process is sleeping, content processes are not) → [DoS/Hang] Firefox hangs on wake up from OS X Power Nap (parent process is sleeping, content processes are not, periodically scheduling e.g. network requests)

(In reply to Honza Bambas (:mayhemer) from comment #3)

Jim, does anyone on your team may have ideas how to diagnose and fix this? I believe this should be given a higher priority as well.

I'm looking into this.

Assignee: nobody → haftandilian
Flags: needinfo?(jmathies)
Flags: needinfo?(haftandilian)
Priority: -- → P1

No root cause yet, but some debugging progress.

To confirm which processes are getting to run while the computer is in sleep mode, I have a thread in each process that prints out the $PID and date to stdout every second and can see that when the computer is automatically briefly woken up from sleep in Power Nap mode, content processes are getting to run a lot more than the parent process (about 3 to 4 times as much using this crude measurement.)

This reproduces for me with/without the content sandbox enabled.

I've also found that macOS is App Nap'ing the parent process, but not child processes. This can be observed in the Energy tab in Activity Monitor after hiding Nightly for some period of time. App Nap is documented[1] as reducing process priority, I/O throttling, and timer throttling. I don't know if App Nap is used during Power Nap[2] wakeups, but that could explain the problem.

A similar issue was reported[3] in Chromium and the fix used was to disable App Nap.

  1. App Nap documentation
    https://developer.apple.com/library/archive/documentation/Performance/Conceptual/power_efficiency_guidelines_osx/AppNap.html
  2. Power Nap documentation
    https://support.apple.com/en-us/HT204032
  3. Similar Chromium issue
    https://bugs.chromium.org/p/chromium/issues/detail?id=871235#c6
See Also: → 1554417
Duplicate of this bug: 1554417

What I've determined (empirically) about what's happening here is that when an application is hidden or during Power Nap wakeups, the App Nap feature is used to throttle the application. The OS will put the parent process into App Nap, but not our child processes. During App Nap, child processes get to run more than the parent process. With App Nap disabled, the imbalance doesn't happen.

To opt out of App Nap for the parent process, the NSAppSleepDisabled preference must be set. I experimented with setting NSAppSleepDisabled and NSSupportsAppNap in Firefox's Info.plist, but couldn't come up with an Info.plist setting that disabled App Nap for this process.

We can use the standardUserDefaults setObject method or the registerDefaults method to set this. registerDefaults doesn't appear to write to disk so it seems preferable. The value needs to be early enough during startup to take effect for this instance of the application. Setting the value in SetupMacApplicationDelegate() where we set other standardUserDefaults values is too late.

I'll file a follow up bug to investigate a better approach to App Nap. Specifically to determine if we re-enable App Nap in a way that avoids this problem and saves power.

Disable App Nap in Firefox to avoid macOS App Nap'ing the parent process while child processes continue to run normally.

Attached patch Debugging CodeSplinter Review

Patch that creates a thread in parent and child processes which logs the date every second. When the parent process is App Nap'd, the logs show the the child processes continuing to log the date each second, but the parent process has periods where it stops logging the date. Without App Nap, the parent and child processes log the date each second consistently.

$ MOZ_LOG_FILE="/tmp/appnap.log" MOZ_LOG="AppNapLog:4,sync" ./mach run --macos-open

I see in comment 5 that Chrome has taken the approach of disabling App Nap; have you checked what happens if child processes have been declared as supporting app nap? Maybe we could support it across all of our processes? There may be legitimate reasons why we don't want to support App Nap that I can't think of though.

Flags: needinfo?(haftandilian)

(In reply to Stephen A Pohl [:spohl] from comment #13)

I see in comment 5 that Chrome has taken the approach of disabling App Nap; have you checked what happens if child processes have been declared as supporting app nap? Maybe we could support it across all of our processes? There may be legitimate reasons why we don't want to support App Nap that I can't think of though.

At first, I thought child processes might not be getting AppNap'd due to sandboxing and I experimented with adding sandbox access to the appsleep service and then completely disabling sandboxing. I found that it was not caused by sandboxing. I think child processes do not get AppNap'd because they do not have their own window. From Activity Monitor, I can see that most service-type processes and child processes do not get AppNap'd.

I have not tried explicitly enabling AppNap in child processes using the NSAppSleepDisabled=false option because I assumed that wouldn't work, but I agree we should look into this more to see if we can get some power savings by enabling AppNap in a way that doesn't cause this bug, but I think it's a bigger project and this problem requires a timely fix. If we could enable AppNap for our child processes, we would need a mechanism to signal to the OS that the child process should come out of AppNap. In my tests, just having Firefox hidden for a few minutes is enough for the OS to AppNap the process and I'm not sure we want to slowdown JS at that point. It might be possible to use the "activity" method described here[1] in child processes to indicate to the OS when a child process should/shouldn't be AppNap'd and I think that's one of the things we should investigate.

Another option would be for the parent to detect when it is not in the foreground and tell child processes to slow themselves down or simply lower their scheduling priority. But we'd have to make sure this doesn't cause problems for pages that are loaded at that time.

  1. https://developer.apple.com/library/archive/documentation/Performance/Conceptual/power_efficiency_guidelines_osx/PrioritizeWorkAtTheAppLevel.html#//apple_ref/doc/uid/TP40013929-CH36-SW1
Flags: needinfo?(haftandilian)

(In reply to Haik Aftandilian [:haik] from comment #10)

I'll file a follow up bug to investigate a better approach to App Nap. Specifically to determine if we re-enable App Nap in a way that avoids this problem and saves power.

I filed bug 1556558 "Investigate enabling macOS AppNap to save power".

Per Stephen's question in comment 14 about supporting AppNap in all processes, I have been experimenting with getting child processes to enter AppNap. So far, I haven't had any luck. I couldn't get child processes to be AppNap'd by either adding <key>NSSupportsAppNap</key><true/> to the plugin-container Info.plist or by setting NSAppSleepDisabled in standardUserDefaults.

Pushed by haftandilian@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5ad3201e605e
Firefox hangs on wake up from OS X Power Nap r=spohl

Will request uplift after verifying on Nightly.

Flags: needinfo?(haftandilian)
Status: NEW → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla69

Comment on attachment 9068619 [details]
Bug 1551990 - Firefox hangs on wake up from OS X Power Nap r?spohl

Beta/Release Uplift Approval Request

  • User impact if declined: On Mac, if the computer is put in sleep mode while Firefox is open and macOS Power Nap is enabled, when the computer is woken up from sleep, Firefox may hang for several minutes using high CPU. The problem can also happen when Firefox is hidden and macOS puts Firefox into App Nap mode. See also bug 1540736 where this problem was originally reported.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce:
  1. Open Firefox and load a web page
  2. Hide Firefox (command-H)
  3. Open Activity Monitor and switch to the "Energy" pane in Activity Monitor and make sure the "App Nap" column is visible.
  4. Wait 10 minutes.
  5. Ensure that Firefox has a "No" in the App Nap column.
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The change is small and easy to understand. The new code disables App Nap for the parent process by setting the standardUserDefaults "NSAppSleepDisable" preference to false during startup.
  • String changes made/needed: None
Flags: needinfo?(haftandilian)
Attachment #9068619 - Flags: approval-mozilla-release?
Attachment #9068619 - Flags: approval-mozilla-beta?
Flags: qe-verify+

Comment on attachment 9068619 [details]
Bug 1551990 - Firefox hangs on wake up from OS X Power Nap r?spohl

macos fix, approved for 68.0b8

Attachment #9068619 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
QA Whiteboard: [qa-triaged]

Hi all, thank you for addressing this. Now that 67.0.2 is being prepared to address an unrelated bug, could this please also be uplifted to mozilla-release to make it out in time with the dot release?

I've tried to reproduce this bug using the steps but I only get high CPU(100+% -80%) on affected Nightly and on fixed Firefox 68.0b8 I'm getting 30-40% CPU then decrease.
I've also tried with hiding Firefox and put mac on sleep mode.

Flags: needinfo?(haftandilian)

Have larger number of google documents open in various tabs (note that all has to be loaded, when restored from session w/o activating each tab you will not get the expected behavior) before putting the machine to sleep (to power nap). If this is not fixed, firefox should be mostly unresponsive after waking up (let it sleep an hour or so, at least)

I would like to see it to be verified on beta first and have a minimum bake time on beta and nightly before evaluating an uplift to release. So I will not consider the uplift for the 67.0.2 release that we ship next week but I am keeping the request open for a potential 67.0.3.

Hi Pascal, please reconsider for a 67.0.2 uplift -- this is a serious performance problem impacting everyone using the macOS release and should be mitigated in the short term. Seems it's already been verified on nightly (and 68.0b8 from reading the QA note) and will have the same effect on release. You're fighting against the operating system here; one has to opt-out from App Nap to stop it. When macOS users note that Firefox "uses too much energy" or "causes poor battery life" this critical bug is being conflated with many other unrelated ones, making performance troubleshooting matters worse.

Some notes for Haik or others still verifying this:

If you are seeing inconsistent results then you may need to invoke the following NSProcessInfo API:

(In reply to brymaster from comment #28)

If you are seeing inconsistent results then you may need to invoke the following NSProcessInfo API:

With the fix that is now in Nightly and Beta, verifying that App Nap was disabled via Activity Monitor was reliable. (An earlier version of the fix I posted for testing was not reliably disabling App Nap, but that problem has been addressed.)

However, we need to verify that the fix addresses the reported problem, not just that App Nap is disabled.

Additional notes relevant to verification:

  • Power Nap must be enabled.
  • The original problem occurred after the computer was put to sleep for "a day or so" (per bug 1540736 comment 0).
  • The problem is more likely to occur if the computer is plugged in while sleeping.
Flags: needinfo?(haftandilian)

We were unable to reproduce this bug, we tried all steps and on different machines.
Since we were unable to reproduce it in order to confirm the fix, I’m dropping the qe+ flag.
Vicky, could you please confirm this is fixed on your side using latest beta and latest nightly?

Flags: qe-verify+
Flags: needinfo?(mconley)
Flags: needinfo?(mconley) → needinfo?(vchin)

Comment on attachment 9068619 [details]
Bug 1551990 - Firefox hangs on wake up from OS X Power Nap r?spohl

We're not going to ship another 67 dot release with 68 coming next week.

Attachment #9068619 - Flags: approval-mozilla-release? → approval-mozilla-release-

I'm still encountering this bug every now and then, on Nightly. I'm on 10.12. Here's a profile with lots of network requests from the content process which were waiting for the parent process for 11.1 hours: https://perfht.ml/2NTLZWK

Also still occurring for me on Firefox 68.0. Perhaps the NSProcessInfo API I mentioned in comment #28 needs to be used?

macOS 10.14.5
MacBookPro11,5

Before an 8 hour sleep (closing the lid), Firefox had 4 windows open from a previous session, with 1400 inactive/unloaded tabs and 12 active ones (low energy impact in about:performance).

In my zip file:
https://transfer.sh/SlOoc/FF68%20High%20CPU.zip

  • A Time Profiler .trace file saved from Xcode's Instruments, plain text also viewable here: https://pastebin.com/ZKGGqjh9 (opening the .trace in Time Profiler is recommended for ease of navigating)

  • A sample of the process from Activity Monitor, plain text also viewable here: https://pastebin.com/jtSvyRsr

Captured profile:
https://perfht.ml/2YNgPBv

(In reply to brymaster from comment #33)

Also still occurring for me on Firefox 68.0. Perhaps the NSProcessInfo API I mentioned in comment #28 needs to be used?

Thanks for the reports. I'll look into using NSProcessInfo API as a follow-on fix. Will file a new bug.

@markus, @brymaster, do you still see the main Firefox process entering App Nap in Activity Monitor? There is an "App Nap" column that can be turned on or off to see the status. From my testing, if you hide Firefox and do something else for a few minutes, Firefox enters App Nap (without the fix that was landed and is in 68.)

@haik, From testing Firefox 68 - I hid Firefox for 20 minutes and it did not enter App Nap. I then closed the lid for 80 minutes with Firefox still hidden and when I reopened the lid the Firefox main process still had not entered App Nap (and CPU use was low). So it is strange that this problem with high CPU usage is sometimes still happening after longer durations (long Firefox session, perhaps open for 24-48 hours and then the Mac sleeping for 8+ hours). It makes me think that there could be another issue besides which API is being used to opt-out of App Nap.

Maybe some more info to help troubleshoot.

  • I left Firefox open and not hidden/minimized with a tab opened from Twitter that I knew would constantly be updating, https://twitter.com/search?q=twitter&src=typd
  • I then closed the lid on my laptop for 11 hours.
  • Reopened the lid and found Firefox to be using low CPU, so I closed the tab to Twitter since there appeared to be no problems.
  • Shortly after doing this, maybe 5-10 minutes, I noticed Firefox had high CPU usage again!

Process sample: https://pastebin.com/SGc18ssT

Here are profiles I captured after noticing the high CPU use (NOTE: I've noticed that no Network data gets captured).

Here's another profile where I reopened that Twitter URL while I was experienced the high CPU use.

One more after I closed that Twitter tab while CPU use is still high.

Wonder if twitter.com/i/jot is a problem?

I haven't encountered this again yet, but that's probably because I haven't had my MBP sleep for a longer duration while connected to power since my last report. Usually I have it disconnected from power during the night. When this problem happened for me on Sunday, it was after I had had it connected to power overnight.

Yes, I should note I keep mine connected to power but that is not the only time I have experienced it.

Blocks: 1567018

Since a fix (possibly ineffective) was landed for this bug, I've filed bug 1567018 for follow-up work needed on this. Please use that bug for any instances of the hang that have this fix -- disabling AppNap.

No longer blocks: 1567018
Depends on: 1567018

I haven't experienced a hang recently. Clearing NI.

Flags: needinfo?(vchin)
You need to log in before you can comment on or make changes to this bug.