Closed Bug 1672791 Opened 5 years ago Closed 3 years ago

Unhandled network requests accumulate during system sleep, clog event loop after wake-up

Categories

(Core :: Networking, defect, P2)

All
macOS
defect

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: mstange, Unassigned)

References

Details

(Whiteboard: [necko-triaged])

(broken out from bug 1567018)

Profile: https://share.firefox.dev/3jnkdwg (taken during unresponsive time after wakeup)

Steps to reproduce:

  1. Be on a Macbook Pro. In the system preferences under Battery -> Power Adapter (or Energy Saver on older versions), make sure "Enable Power Nap while plugged into a power adapter" is checked.
  2. Open a Google Docs document and turn the tab into an App Tab so that it doesn't get throttled if it's in the background.
  3. Connect your Macbook Pro to power.
  4. Close the lid of the Macbook Pro.
  5. Wait for a few hours.
  6. Open the lid of the Macbook Pro, and try to use Firefox.

Expected results:

Firefox should be responsive as soon as possible. Switching tabs should completely quickly, typing in the address bar should show results quickly, initiating new page loads should work.

Actual results:

For a minute or so, Firefox is in a pseudo-hung state.
It shows hover effects and responds to clicks and and keyboard input, but any asynchronous effects of those input effects don't complete until the hung state is over.

Analysis

The profile shows that the event loop is clogged by network events. And the network panel shows that we were initiating network requests at a rate of one every 10 seconds, while the machine was "sleeping". However, none of those network requests got handled until the wakeup.

We send SLEEP and WAKE notifications when the system goes to sleep and wakes up.
The socket transport service and nsIOService seem to listen to those notifications and respond in some way, probably by suspending network requests (but I haven't read the code): SLEEP, WAKE

So it seems we're getting into this state because we allow web content to initiate new network requests while our network handling is suspended.
So it seems we have two choices:

  1. Either we don't suspend, and handle network requests normally during power-nap pseudo-sleep. This will stop the requests from accumulating.
  2. Or we don't allow web page code to run during power-nap pseudo-sleep so that it can't initiate new network requests. This will ensure that there aren't any network requests to begin with.

We could even do both: Try to stop web page code from running during sleep, but in case we have holes in the system and some network requests get through anyway, handle them normally.

It would be good to find out what other browsers do.

Actually, it seems that the every-10-seconds network request is coming from GMail, not from Google Docs.

The thing that confuses me the most about this is the fact that our process is running at all during sleep. All the information I can find online about Power Nap mentions that it's only used by Apple apps and not accessible to other apps.

Jens, can you find someone to look into this? Otherwise, in ~2 weeks I will have time to look at this.

Flags: needinfo?(jstutte)

:mstange: perhaps it's worth sharing the information about Power Nap. I'm only aware of https://support.apple.com/en-us/HT204032, but that's not a binding technical specification. It doesn't explicitly mention that a web-browser is allowed to perform task.

Flags: needinfo?(jstutte) → needinfo?(kershaw)

I did an experiment on my macbook and here is what I found.

  1. The sleep and wake notifications here are not received at all, which means the socket transport service is not aware of the sleep state.
  2. I wrote a simple xhr html to test if the network request works in sleep mode and it seems that every 30 minutes the macos wakes up and works for a while and than goes to sleep again. I am not really sure about this. I'll do more tests to confirm.
  3. I can't reproduce the pseudo-hung state for now. I'll keep trying and see if I can get the networking log.
Assignee: nobody → kershaw
Severity: -- → S3
Flags: needinfo?(kershaw)
Priority: -- → P2
Whiteboard: [necko-triaged]

Per https://support.apple.com/en-us/HT204032 quite a few operations should still work while the system is asleep.
Should we treat foreground tabs as background tabs and stop playing audio/video?

I wonder what other browsers do.

On bug 1415923 comment 53 we got another recorded profile while Firefox is in that state. Maybe it's helpful.

Also lots of users affected by that actually can confirm that switching users back and forth fixes the problem immediately.

Kershaw, do you expect your patch in bug 1415923 to solve this bug and bug 1567018 as well?

Flags: needinfo?(kershaw)

(In reply to Markus Stange [:mstange] from comment #7)

Kershaw, do you expect your patch in bug 1415923 to solve this bug and bug 1567018 as well?

I am not entirely sure about this. The profile in comment #0 looks like not the same as the log I uploaded in bug 1415923 comment#82.
I'd like to see the http log captured at the point when the OS is wake-up to confirm if this is the same case.

Flags: needinfo?(kershaw)

I'm experiencing something very similar to this, but on a 2020 27" iMac and with Power Nap disabled. I'm running 85.0.2 on macOS 11.2.1.

This bug intuitively feels like what's happening in my case.

My Mac never properly goes to sleep because I'm sharing the WiFi over Ethernet (for a device that requires constant network connection, and doesn't have WiFi).

I use my Mac for my job, only - basically 9-5, Mon-Fri. Overnight, the Mac feels like it takes a little while to wake up, but that's broadly expected. A little longer than I'd expect from a brand-new iMac, but not long enough to cause an issue. Firefox is by far the biggest application (200+ tabs - yes, I know). On a Friday, I just get up from my desk and walk away. My Mac then goes to some kind of sleep. I'm running Big Sur, and haven't changed any power settings since the Mac was new. When I get into work on a Monday morning (so ~60 hours since getting up and leaving), Firefox takes ages to start responding. This morning, I left it 10 mins before giving up and restarting FF.

Some observations:

  • Google Mail was giving an "unable to connect, retrying in...." banner, suggesting that FF wasn't entirely stopped, but that it couldn't access the network.
  • When I tried to switch tabs after ~5 mins, it did slowly happen, but the experience was very slow. Reloads did seem to get through, slowly, but it was like trying to use a fast, but congested, network.
  • I have a lot of open Google Docs tabs. I understand that these make lots of requests (1 every 10 secs, minimum)
  • Activity Monitor showed very low CPU usage for FF

Happy to capture any info, traces etc next Monday morning if helpful, but I don't actually know how to do that, so will need guidance.

I was unable to use the steps that were described in the description to reproduce it, however looks like Rob is able to help here which is nice.

Sounds like what we need now is a HTTP logging, so kershaw can verify it.

kershaw, can you please verify that this is what you need https://developer.mozilla.org/en-US/docs/mozilla/debugging/http_logging#using_aboutnetworking ?

Flags: needinfo?(kershaw)

(In reply to Sean Feng [:sefeng] from comment #10)

I was unable to use the steps that were described in the description to reproduce it, however looks like Rob is able to help here which is nice.

Sounds like what we need now is a HTTP logging, so kershaw can verify it.

kershaw, can you please verify that this is what you need https://developer.mozilla.org/en-US/docs/mozilla/debugging/http_logging#using_aboutnetworking ?

Yes, please create the logging. If possible start the logging before putting a computer to sleep and stop the logging 1-3 minute after waking up the computer.

Please not that a log may contain information you do not want to share, cookie and URI you have visited. to minimize this, please use a clean profile and do not log into online accounts. You can also send the log via email and not publish it here.

Flags: needinfo?(kershaw)

Rob, do you mind follow Dragana's instructions to get a HTTP logging for us?

Flags: needinfo?(rob)

Is this bug still an issue?

Assignee: kershaw → nobody

I haven't seen it myself after switching to an M1 machine, and I also haven't seen any other reports of it recently.

Redirect a needinfo that is pending on an inactive user to the triage owner.
:dragana, since the bug has high priority, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(rob) → needinfo?(dd.mozilla)
Status: NEW → RESOLVED
Closed: 3 years ago
Flags: needinfo?(dd.mozilla)
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.