Closed Bug 1668636 Opened 4 years ago Closed 21 days ago

Add an AWSY test for parent process overhead

Categories

(Testing :: AWSY, task, P3)

Tracking

(Fission Milestone:Future)

RESOLVED INCOMPLETE

People

(Reporter: mccr8, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [fxp])

Attachments

(2 files, 1 obsolete file)

The AWSY base test measures how much memory we use for an empty content process. However, each content process also adds overhead in other processes. We should have a test for this, at least for the parent process, so that we can avoid regressions.

My current plan is something like this:
a) Open a web page with a bunch of same-origin iframes, and record an about:memory report.
b) Open a web page with a bunch of different-origin iframes, and record an about:memory report.
c) Extract relevant measurements from the reports, and report the difference (compared to the current tests which report a median).
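
Step (c) could be sketched roughly as below. This is a minimal illustration, not the AWSY implementation: it assumes each about:memory report has already been parsed into a dictionary with a "reports" list whose entries carry "process", "path" and "amount" fields, and that the parent process label starts with "Main Process". The function names and the sample paths are hypothetical.

```python
from collections import defaultdict


def sum_by_prefix(report, process_prefix, prefixes):
    """Sum the amounts under each path prefix for one process.

    Assumed report shape (not taken from the bug):
    {"reports": [{"process": str, "path": str, "amount": int}, ...]}
    """
    totals = defaultdict(int)
    for entry in report["reports"]:
        if not entry["process"].startswith(process_prefix):
            continue
        path = entry["path"]
        for prefix in prefixes:
            # Match the node itself or anything below it in the tree.
            if path == prefix or path.startswith(prefix + "/"):
                totals[prefix] += entry["amount"]
    return dict(totals)


def diff_reports(same_origin, different_origin, process_prefix, prefixes):
    """Return different-origin minus same-origin for each prefix."""
    before = sum_by_prefix(same_origin, process_prefix, prefixes)
    after = sum_by_prefix(different_origin, process_prefix, prefixes)
    return {p: after.get(p, 0) - before.get(p, 0) for p in prefixes}
```

Calling `diff_reports(report_a, report_b, "Main Process", ["resident", "js-main-runtime"])` would give one signed delta per measurement, which is the number the plan proposes reporting instead of a median.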

I'm not entirely sure what the relevant measurements should be. Here are some interesting values in a diff I made locally:

1.30 MB (100.0%) -- explicit
├──1.08 MB (83.25%) ── heap-unclassified
├──0.15 MB (11.54%) -- heap-overhead
│ ├──0.12 MB (09.28%) ── bin-unused
│ ├──0.02 MB (01.36%) ── bookkeeping
│ └──0.01 MB (00.90%) ── page-cache

0.98 MB ── heap-allocated
1.00 MB ── heap-mapped
1.58 MB ── resident
0.88 MB ── resident-peak
0.23 MB ── resident-unique
0.59 MB ── shmem-mapped
16.52 MB ── vsize

The current base process measurement tracks resident-unique, heap-unclassified, js-main-runtime and explicit, which seems like a reasonable starting point for thinking about what to track.

We might want to use resident instead of resident-unique, because the general way we've been doing things is to attribute the non-unique part of resident to the parent. shmem-mapped might be interesting, but I guess it gets included in one of the other measurements.

Note that js-main-runtime isn't included in the diff above. It is actually lower with Fission enabled in my single test sample (realms is larger, runtime and zones are lower):
-0.04 MB (100.0%) -- js-main-runtime
├──-0.04 MB (113.01%) ── runtime
├───0.02 MB (-49.83%) ++ realms
└──-0.01 MB (36.82%) ++ zones

Conceptually, there could be JS overhead per content process, but if the difference is so low, maybe we don't want to track it because it would be very noisy.

I talked to kmag a bit about this, and it sounds like resident, heap-unclassified, js-main-runtime and explicit are a good set of measurements to try. The JS measurement isn't showing much now, but if we add a lot of JS process actors it could start being a factor. Using resident instead of resident-unique will let us get a measurement for shared memory. The shmem-mapped measurement is specific to IPC shared memory, so it might not capture everything.

Severity: -- → N/A
Status: NEW → ASSIGNED
Fission Milestone: --- → M7
Blocks: fission
Priority: P2 → P1
Here's my rough prototype. You run it with:

./mach awsy-test testing/awsy/awsy/test_base_memory_usage.py

It takes a while to run, because it sits there doing nothing for a long time. Probably the idle time can be reduced. It'll create two about:memory logs, and print out the location of each in a line starting with "checkpoint created, stored in". In a regular Firefox session, you can use the load and diff function to look at the results.
Attachment #9194472 - Attachment is obsolete: true

There's some remaining work here, but this at least works as a proof-of-concept. It does something like this: starts the browser with Fission, waits for a while, opens a webpage with 9 same-origin iframes, waits, takes a measurement, closes the origin page, opens a webpage with 9 different-origin iframes, waits, and takes a measurement. Because both measurements are taken with Fission, if there's some overhead for a single origin when Fission is enabled (due to SHIP or whatever) it won't be measured.

  1. It needs to be split into its own test. Right now, I just hacked up the existing test.

  2. Whatever scalars are interesting need to be extracted from the reports, their difference taken, then divided by the expected number of processes. There's some existing code for extracting scalars, so the actual programming of this shouldn't be hard.

  3. There's some stuff in the diff that doesn't look related to process overhead. Maybe mostly this loading.svg thing? The diff looked better than I remembered.

  4. The stability of this should be assessed.
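
Item 2 above amounts to a subtraction and a division per scalar. A minimal sketch follows, assuming the scalars from each report have already been extracted into plain dictionaries; the function name and the sample values in the usage note are made up for illustration and are not the existing AWSY extraction code.

```python
def per_process_overhead(same_origin, different_origin, process_count):
    """Per-process delta for each scalar present in both measurements.

    same_origin / different_origin map a scalar name to its value
    (any consistent unit). process_count is the expected number of
    content processes to divide by; it is an input, not something
    this sketch detects.
    """
    if process_count <= 0:
        raise ValueError("process_count must be positive")
    return {
        name: (different_origin[name] - same_origin[name]) / process_count
        for name in same_origin
        if name in different_origin
    }
```

For example, `per_process_overhead({"explicit": 10.0}, {"explicit": 12.0}, 10)` yields 0.2 per process for the hypothetical `explicit` values given. Scalars missing from either measurement are skipped rather than treated as zero, so a reporter that disappears between runs does not show up as a spurious saving.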

Here's an extract of some non-trivial bits of the diff of parent process memory. There are 10 "web" content processes involved, so this is about 146 KB per process, which is around 1% of the child process overhead.

1.46 MB (100.0%) -- explicit
├──1.55 MB (106.71%) ── heap-unclassified
├──-0.33 MB (-22.59%) ++ heap-overhead
├──0.13 MB (08.67%) -- images
│ ├──0.13 MB (08.64%) -- chrome/vector/used/progress=18f
│ │ ├──0.12 MB (08.06%) ++ image(480x16, chrome://browser/skin/tabbrowser/loading.svg)/locked/types=1/surface(960x32, svgContext:[ viewport=(480x16) contextPaint=( fill=ff0d0c0c fillOpa=1 strokeOpa=1 ) ])
├──0.06 MB (04.34%) -- js-non-window
│ ├──0.06 MB (04.32%) -- zones/zone(0xNNN)
│ │ ├──0.06 MB (03.88%) ++ realm([System Principal], shared JSM global)/classes

I also have a patch in my stack for bug 1683911 that stops the browser from discarding the OS file worker after 30 seconds.

Given that the bulk of the heap overhead is heap-unclassified, it would probably be good to look at what that is, which will require using DMD. I'm not sure how clean DMD diffs will be. I'd expect that it is mostly IPC overhead, but who knows. The same-origin case has 33 IPC channels, as compared to the different-origin case, which has 132.

Here's a diff for the heap-unclassified. I limited to 3 stack frames to try to collapse things together as much as possible.

The total was about 1.5MB of heap-unclassified. 256KB of stuff wasn't sampled, though I could redo this with full stacks to get better numbers. About 500KB of that is IPC channels. Another few hundred KB of IPC actors. There's also a bit of APZ/layers kind of stuff, which makes sense.
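
As a rough sanity check on the IPC channel figure, the numbers in this thread can be combined into a per-channel estimate. This is only back-of-the-envelope arithmetic: it assumes the roughly 500KB attributed to IPC channels is caused entirely by the extra channels in the different-origin case (132 versus 33), which the DMD data does not confirm. The function name is hypothetical.

```python
def per_channel_kb(ipc_channel_kb, channels_same_origin, channels_different_origin):
    """Approximate heap cost per additional IPC channel, in KB."""
    extra_channels = channels_different_origin - channels_same_origin
    if extra_channels <= 0:
        raise ValueError("different-origin case must have more channels")
    return ipc_channel_kb / extra_channels


# Figures quoted in the comments: ~500KB of channels, 33 vs 132 channels.
estimate = per_channel_kb(500, 33, 132)
print(f"about {estimate:.1f} KB per extra channel")
```

With those inputs the estimate comes out at roughly 5 KB per additional channel, which would need full-stack DMD data to confirm.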

Bug 1639922 is an existing bug we have on file for adding memory reporting for IPC.

See Also: → 1639922

This isn't needed right now, but we should get the test ready at some point, so moving to MVP.

Fission Milestone: M7 → MVP

:mccr8, are you still working on this task?

Flags: needinfo?(continuation)

Not actively. It seems like a good idea to have the test around in case people want to run it given that I already wrote most of the test, but the parent process overhead is so low I don't think we'll want to run it regularly. I got bogged down in trying to untangle the different variants of AWSY that sort of but not entirely share code.

Flags: needinfo?(continuation)
Assignee: continuation → nobody
Severity: N/A → S3
Status: ASSIGNED → NEW
Priority: P1 → P2

(In reply to Andrew McCreight [:mccr8] from comment #10)

> Not actively. It seems like a good idea to have the test around in case people want to run it given that I already wrote most of the test, but the parent process overhead is so low I don't think we'll want to run it regularly. I got bogged down in trying to untangle the different variants of AWSY that sort of but not entirely share code.

In that case, we can move this bug from Fission MVP to Future.

Severity: S3 → N/A
Fission Milestone: MVP → Future
Priority: P2 → P3
Whiteboard: [fxp]

Probably not worth the effort. Kind of neat to see the overhead but I don't think we ever had any concerted effort to remove the parent overhead.

Status: NEW → RESOLVED
Closed: 21 days ago
Resolution: --- → INCOMPLETE