Closed Bug 1665318 Opened 4 years ago Closed 3 years ago

In about:processes refresh, ResidentUniqueDistinguishedAmount is slow

Categories

(Core :: Performance, defect, P3)

Tracking

RESOLVED FIXED
94 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox-esr91 --- wontfix
firefox92 --- wontfix
firefox93 --- wontfix
firefox94 --- fixed

People

(Reporter: Yoric, Assigned: florian)

References

(Regression)

Details

(Keywords: regression)

Attachments

(1 file)

No description provided.

Fission M7 Beta

Severity: -- → S3
Fission Milestone: --- → M7
Priority: -- → P3

Frankly, at this stage, I don't know how to act upon the information.

Fission Milestone: M7 → MVP
Component: DOM: Content Processes → Performance
Summary: In about:process refresh for macOS, ResidentUniqueDistinguishedAmount is slow → In about:processes refresh for macOS, ResidentUniqueDistinguishedAmount is slow

Removing Fission milestone tracking for about:processes work.

Fission Milestone: MVP → ---

Here's a profile of it: https://share.firefox.dev/3cHEGvU

The Mac implementation of ResidentUniqueDistinguishedAmount seems very slow, especially the mach_vm_region calls at https://searchfox.org/mozilla-central/rev/54f37fc1ac0f98b590af51e01ce82bb74179bf63/xpcom/base/nsMemoryReporterManager.cpp#480

Nika, in bug 1652813 you asked for the resident unique value to be shown instead of the resident one (which I assume was less expensive to compute). Do you have a suggestion about what we could do here?

Flags: needinfo?(nika)
Regressed by: 1652813
Has Regression Range: --- → yes

Maybe Andrew could also help here.

Flags: needinfo?(continuation)

I don't have any idea, sorry.

Flags: needinfo?(continuation)
Assignee: nobody → florian
Status: NEW → ASSIGNED

The slowness isn't limited to Mac.

I profiled the "GetProcInfo" runnables running on the StreamTrans threads on Mac/Linux/Windows. On Mac, ResidentUniqueDistinguishedAmount dominates the profile (more than 98% of the samples). On Linux it's a large part of the time too. On Windows it doesn't dominate, because thread enumeration is very slow, but it's still about 40% of the time, so still worth optimizing.

Ideally we would get values from the operating system without doing any computations, and these values would match what's shown in the system's task manager. On Mac I found an API to do that; it gives results in a few microseconds, compared to the 50ms ResidentUniqueDistinguishedAmount was taking. On Linux it seems the GNOME System Monitor shows in its memory column the result of subtracting the shared memory from the resident size; we can do that too, and it's cheap. On Windows I couldn't find an API to access the value shown in the "Memory" column of the task manager. The value shown there is the "Working Set - Private" counter, and the table at https://docs.microsoft.com/en-us/previous-versions/windows/desktop/legacy/aa965225(v=vs.85)#process-memory-performance-information lists 'None' as the API for it. It seems the best value we can get cheaply is "Private Bytes". This is the value shown by the "Process Hacker" tool, and also by the Chrome task manager, so it should be good enough.
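To illustrate the Linux approach described above (resident size minus shared memory), here is a minimal sketch. It assumes the values are read from /proc/self/statm, whose first three fields are the total, resident, and shared sizes in pages; the comment does not say which source the patch actually uses. The function names PrivateBytesFromStatm and CurrentProcessPrivateBytes are hypothetical and are not taken from the Firefox tree.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <optional>
#include <sstream>
#include <string>
#include <unistd.h>  // sysconf (POSIX)

// Hypothetical helper, not from the actual patch.
// Takes the contents of a statm line ("size resident shared ...", in pages)
// and returns (resident - shared) converted to bytes.
std::optional<uint64_t> PrivateBytesFromStatm(const std::string& statmLine,
                                              uint64_t pageSizeBytes) {
  assert(pageSizeBytes > 0);
  std::istringstream fields(statmLine);
  uint64_t sizePages = 0, residentPages = 0, sharedPages = 0;
  if (!(fields >> sizePages >> residentPages >> sharedPages)) {
    return std::nullopt;  // Line did not start with three numeric fields.
  }
  if (sharedPages > residentPages) {
    return std::nullopt;  // Inconsistent input; avoid unsigned underflow.
  }
  return (residentPages - sharedPages) * pageSizeBytes;
}

// Hypothetical wrapper: a single small file read and no per-region walk.
// Assumes a Linux-style /proc filesystem is mounted.
std::optional<uint64_t> CurrentProcessPrivateBytes() {
  std::ifstream statm("/proc/self/statm");
  std::string line;
  if (!statm || !std::getline(statm, line)) {
    return std::nullopt;
  }
  long pageSize = sysconf(_SC_PAGESIZE);
  if (pageSize <= 0) {
    return std::nullopt;
  }
  return PrivateBytesFromStatm(line, static_cast<uint64_t>(pageSize));
}
```

Splitting the parsing from the file read keeps the arithmetic easy to check in isolation. The Mac and Windows paths are not sketched here because the comment does not name the Mac API.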

Summary: In about:processes refresh for macOS, ResidentUniqueDistinguishedAmount is slow → In about:processes refresh, ResidentUniqueDistinguishedAmount is slow
Pushed by fqueze@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/dd1b040b050a
reduce the overhead of collecting memory information for about:processes, r=dthayer.
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 94 Branch
Flags: needinfo?(nika)