Closed Bug 1646976 Opened 4 years ago Closed 3 years ago

High CPU load on macOS during file download

Categories

(Firefox :: Downloads Panel, defect, P2)

77 Branch
defect

Tracking

()

RESOLVED INCOMPLETE
Performance Impact low
Tracking Status
firefox81 --- affected

People

(Reporter: jonas, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: perf:resource-use, power, Whiteboard: [necko-triaged])

Attachments

(6 files)

Attached file Instruments.trace.zip

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:77.0) Gecko/20100101 Firefox/77.0

Steps to reproduce:

  1. Start firefox in safe mode with fresh profile (also works if using without safe mode and non-fresh profile)
  2. Download large-ish file (3GiB)

I obtained a trace using the macOS "Instruments" application in the hope that this helps debugging the issue. I can reproduce this all the time so if I should try it with some other tool I'm happy to do so.

Actual results:

Download works fine but 1 of 4 CPUs is at 100% load all the time (makes fans spin on MacBook)

Expected results:

Downloads should not be so CPU heavy

Top result from profiler:

3.99 s 59.1% 0 s -[NSApplication _nextEventMatchingEventMask:untilDate:inMode:dequeue:] 0x268c81

Bugbug thinks this bug should belong to this component, but please revert this change in case of error.

Component: Untriaged → Widget: Cocoa
Product: Firefox → Core

Jonas H, thanks for taking the time to report this issue, can you please capture a performance profile and attach it to this bug?

You can get more info on how to install and use the Cleopatra add-on (that helps you get the performance profile) by going to:
https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Profiling_with_the_Built-in_Profiler
https://perf-html.io/

Please note that this addon works only with FF Nightly, so this means you need to be able to reproduce the issue on Nightly first. You can download the latest Nightly from
https://www.mozilla.org/en-US/firefox/channel/desktop/

Flags: needinfo?(jonas)
Flags: needinfo?(jonas)

Markus, where should I direct reports with performance profiles like these?

Severity: -- → S3
Flags: needinfo?(mstange.moz)
Priority: -- → P3

Core :: Performance is a good place. Or if enough information is known that we know that the bug is already in the correct component, then adding "[qf]" to the whiteboard will also put the bug into the Perf Triage queue.

Flags: needinfo?(mstange.moz)

Thank you!

Component: Widget: Cocoa → Performance

Could you try this in Firefox Nightly (rather than release) and get another Instruments recording? This will have more useful stacks and symbols.

Flags: needinfo?(jonas)

Is there a way to upload a 20MiB file? Otherwise, can I send it to your e-mail address, Markus?

Flags: needinfo?(jonas)
Attached file Instruments3.tar.bz2

I have another file that's too large for upload.

(In reply to Jonas H. from comment #9)

Is there a way to upload a 20MiB file? Otherwise, can I send it to your e-mail address, Markus?

You can try sending it to my bugzilla email address but I'm not sure that attachments larger than 20MB will make it through. But the other profile you uploaded looks like it has all the necessary information, thanks!

In the profile from attachment 9163591 [details], 45% of the CPU usage is spent inside the following stack:

  19 libfreebl3.dylib 21832.0  SHA256_Compress_Generic
  18 libfreebl3.dylib 21825.0  SHA256_Update_Generic
  17 libsoftokn3.dylib 21825.0  NSC_DigestUpdate
  16 libnss3.dylib 21825.0  PK11_DigestOp
  15 XUL 21825.0  mozilla::net::DigestOutputStream::Write(char const*, unsigned int, unsigned int*)
  14 XUL 21825.0  nsStreamCopierIB::ConsumeInputBuffer(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*)
  13 XUL 21825.0  nsPipeInputStream::ReadSegments(nsresult (*)(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*), void*, unsigned int, unsigned int*)
  12 XUL 21825.0  nsStreamCopierIB::DoCopy(nsresult*, nsresult*)
  11 XUL 21825.0  nsAStreamCopier::Process()
  10 XUL 21825.0  nsAStreamCopier::Run()

6% are spent in the write system call, another 6% in recvfrom and poll, a bit in memcpy, and a fair bit in starting and stopping threads of a threadpool in nsStreamTransportService::Dispatch which I filed as bug 1652898.

The sha256 hashing was added in bug 829832 and seems to be used for "reputation checking": https://searchfox.org/mozilla-central/rev/82c04b9cad5b98bdf682bd477f2b1e3071b004ad/toolkit/components/downloads/DownloadIntegration.jsm#494-509
I've never seen Firefox block a download on macOS so I'm not sure if this hash is actually used these days.

Jonas, what is your download speed?

If I profile a download on my machine, I get a very different CPU distribution. My download speed is 3.0MB/s, and I see 50% of the CPU usage in the poll system call and only 1.5% in SHA256_Compress_Generic on my machine. Actually, it seems like poll is keeping an entire core busy, which seems bad... but that's a different issue from the one reported in this bug, and I've filed it as bug 1652901.

Status: UNCONFIRMED → NEW
Component: Performance → Networking: File
Ever confirmed: true

The component has been changed since the backlog priority was decided, so we're resetting it.
For more information, please visit auto_nag documentation.

Priority: P3 → --

Can you try this download? https://transfer.sh/iCiI5/Instruments2.tar.bz2

I am not sure if the Instruments3 profile is correct; it didn't show CPU spikes in the Instruments UI.

On good days I have 400Mbit/s download, but mostly it's somewhere in the 30–100Mbit/s range. I did the benchmarks with a localhost download though; the speed I got there was around 85MiB/s while wget gets around 600MiB/s.

(In reply to Jonas H. from comment #14)

Can you try this download? https://transfer.sh/iCiI5/Instruments2.tar.bz2

The download worked but the file doesn't contain the necessary information; please use the "Time Profiler" tool as before. This will contain call stacks of Firefox C++ code.

Attached file localhost.zip
Attached file wifi.zip

Added 2 new files:

localhost.zip: unthrottled download from localhost (webfsd); same setup as "Instruments3" and "Instruments2" files. ~80–85MiB/s download speed. ~150% CPU load according to htop all the time.
wifi.zip: download speed throttled to "WiFi" using Firefox developer tools. ~3.7MiB/s download speed. ~140% CPU load for the first few seconds, then ~30% CPU load.

Flags: needinfo?(mstange.moz)

Thanks. localhost.zip looks the same as comment 10; 43% of CPU in SHA256 hashing. wifi.zip looks different; only 5% of the CPU usage is from SHA256 hashing, and much of the rest spent in overhead from the devtools network monitor.

Flags: needinfo?(mstange.moz)

This more looks like something for PSM or even NSS.

Component: Networking: File → Security: PSM

Maybe this NS_AsyncCopy needs a larger chunk size? https://searchfox.org/mozilla-central/rev/72a1334982cadde0ca8b3d3583877afea80f5639/netwerk/base/BackgroundFileSaver.cpp#601
I'm not surprised that hashing a 3GiB file takes some CPU time - short of somehow making sha256 faster, nothing we change in PSM or NSS will improve this situation.

Component: Security: PSM → Networking

Here's a new profile with latest nightly. Not sure if it has any new info; I was just annoyed by Firefox draining my battery for a simple download so I did another profile.

https://share.firefox.dev/3fuozjf

(Oops, pressed submit too early.)

Download was ~15GB file from Internet Archive. You can see from the CPU usage graphs that download speed varied a lot (they are proportional). The download attempt profiled in the Instruments profile was 200MB in ~150s, so a mean speed of 1.33MB/s. Even at this rate Firefox was by far the most energy consuming application running on the system, according to Activity Monitor :-(

Attached file Instruments.trace.zip2

Had to split profile into two chunks because of upload limit. Please use cat to obtain the zip file. MD5 checksums:

baefc791578ef73b51462c5626605531 Instruments.trace.zip (orig)
1b713ff9de6ae730a5d4e3941d46d79b Instruments.trace.zip1
c51bf6b088fb192cf23cdacd0192e950 Instruments.trace.zip2

P2 as this is reproducible consistently, but not for majority of users.

Priority: -- → P2
Whiteboard: [necko-triaged]

Note that I can see a similar high CPU load of the main process when downloading the following file:
https://www.microsoft.com/en/microsoft-teams/download-app#desktopAppDownloadregion

My download rate is around 5MB/s, and checking the download with Safari it only uses at maximum 10% of CPU and not >100%.

On my machine, with today's Nightly, on macOS, it seems like there might be extra CPU usage if the build gets into a certain "state": After a fresh restart, downloading takes about 70%-90% CPU. But once I get into that other state, I see CPU usage of > 150%. In the Gecko profiler the difference is how frequently we tick the refresh driver: If we only tick it whenever the percentage display in the downloads button's progress bar is updated, then we see the lower CPU usage. But sometimes we tick the refresh driver on every vsync while the download is going on, because we think there is some other animation. This causes the higher CPU usage.
I'm not sure what "other animation" that is, and how to get into that state.

The 70% are roughly 40% main thread, 15% socket thread (recv + TLS decryption), 15% hashing on a background thread.

Interesting. So as workaround I could move the download button from the toolbar into the overflow menu. Doing that the CPU load decreases to around 35%. That's way lesser but still a lot more compared to Safari.

Whiteboard: [necko-triaged] → [necko-triaged][qf]
Whiteboard: [necko-triaged][qf] → [necko-triaged][qf:resource:p3]
Component: Networking → Toolbars and Customization
Product: Core → Firefox
Component: Toolbars and Customization → Downloads Panel
Performance Impact: --- → P3
Whiteboard: [necko-triaged][qf:resource:p3] → [necko-triaged]

(In reply to Markus Stange [:mstange] from comment #30)

On my machine, with today's Nightly, on macOS, it seems like there might be extra CPU usage if the build gets into a certain "state": After a fresh restart, downloading takes about 70%-90% CPU. But once I get into that other state, I see CPU usage of > 150%. In the Gecko profiler the difference is how frequently we tick the refresh driver: If we only tick it whenever the percentage display in the downloads button's progress bar is updated, then we see the lower CPU usage. But sometimes we tick the refresh driver on every vsync while the download is going on, because we think there is some other animation. This causes the higher CPU usage.
I'm not sure what "other animation" that is, and how to get into that state.

Your description here reminds me bug 1742393, that was fixed 5 months ago.

Is there anything actionable here other than bug 1709175?

(In reply to Florian Quèze [:florian] from comment #32)

Your description here reminds me bug 1742393, that was fixed 5 months ago.

Interesting, I agree it does sound like that.

Is there anything actionable here other than bug 1709175?

Probably not.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: