Add an AWSY test for parent process overhead
Categories
(Testing :: AWSY, task, P3)
Tracking
(Fission Milestone: Future)
People
(Reporter: mccr8, Unassigned)
References
(Blocks 1 open bug)
Details
(Whiteboard: [fxp])
Attachments
(2 files, 1 obsolete file)
patch, 15.41 KB
text/plain, 45.23 KB
The AWSY base test measures how much memory we use for an empty content process. However, there is also per-content process overhead in other processes. We should have a test for this, for at least the parent process, so that we can avoid regressions.
My current plan is something like this:
a) Open a web page with a bunch of same-origin iframes, and record an about:memory report.
b) Open a web page with a bunch of different-origin iframes, and record an about:memory report.
c) Extract the relevant measurements from the reports, and report the difference (unlike the current tests, which report a median).
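Roughly, steps a) through c) would look something like the sketch below. This is just an outline; open_page_and_settle() and save_memory_report() are hypothetical stand-ins for the existing AWSY harness machinery, not real AWSY APIs.

def measure_iframe_overhead(open_page_and_settle, save_memory_report,
                            same_origin_url, cross_origin_url):
    """Return paths to the two saved about:memory reports."""
    open_page_and_settle(same_origin_url)           # a) same-origin iframes
    same = save_memory_report("same-origin")        # dump about:memory
    open_page_and_settle(cross_origin_url)          # b) different-origin iframes
    cross = save_memory_report("different-origin")  # dump about:memory
    return same, cross                              # c) diff these afterwards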
I'm not entirely sure what the relevant measurements should be. Here are some interesting values in a diff I made locally:
1.30 MB (100.0%) -- explicit
├──1.08 MB (83.25%) ── heap-unclassified
├──0.15 MB (11.54%) -- heap-overhead
│ ├──0.12 MB (09.28%) ── bin-unused
│ ├──0.02 MB (01.36%) ── bookkeeping
│ └──0.01 MB (00.90%) ── page-cache
0.98 MB ── heap-allocated
1.00 MB ── heap-mapped
1.58 MB ── resident
0.88 MB ── resident-peak
0.23 MB ── resident-unique
0.59 MB ── shmem-mapped
16.52 MB ── vsize
The current base process measurement tracks resident-unique, heap-unclassified, js-main-runtime and explicit, which seems like a reasonable starting point for thinking about what to track.
We might want to use resident instead of resident-unique, because the general way we've been doing things is to attribute the non-unique part of resident to the parent. shmem-mapped might be interesting, but I guess it gets included in one of the other measurements.
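For reference, here's a rough sketch of pulling a few of these parent-process scalars out of a saved memory report (the gzipped JSON that about:memory writes). The report format here is an assumption, and the real AWSY code already has helpers for this; note also that explicit and heap-unclassified are derived by about:memory rather than appearing as single entries in the JSON.

import gzip
import json

SCALARS = ("resident", "resident-unique", "heap-allocated", "vsize")

def parent_scalars(report_path):
    with gzip.open(report_path, "rt", encoding="utf-8") as f:
        data = json.load(f)
    values = {}
    for report in data["reports"]:
        # Only look at parent process entries.
        if not report["process"].startswith("Main Process"):
            continue
        if report["path"] in SCALARS:
            values[report["path"]] = report["amount"]  # bytes
    return values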
Note that js-main-runtime isn't included in the diff above. It is actually lower with Fission enabled in my single test sample (realms is larger, runtime and zones are lower):
-0.04 MB (100.0%) -- js-main-runtime
├──-0.04 MB (113.01%) ── runtime
├───0.02 MB (-49.83%) ++ realms
└──-0.01 MB (36.82%) ++ zones
Conceptually, there could be JS overhead per content process, but if the difference is this low, maybe we don't want to track it because it would be very noisy.
Reporter
Comment 1 • 4 years ago
I talked to kmag a bit about this, and it sounds like resident, heap-unclassified, js-main-runtime, and explicit are a good set of measurements to try. The JS measurement isn't showing much now, but if we add a lot of JS process actors it could start being a factor. Using resident instead of resident-unique will let us get a measurement for shared memory. shmem-mapped is specific to IPC shared memory, so it might not capture everything.
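As a rough illustration using the sample diff from comment 0 (treating the deltas as directly comparable, which is an assumption), the shared portion would just be resident minus resident-unique:

resident = 1.58         # MB, from the diff in comment 0
resident_unique = 0.23  # MB
shared = resident - resident_unique
print(f"~{shared:.2f} MB of the resident diff is shared with other processes")
# prints ~1.35 MB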
Reporter
Comment 4 • 4 years ago
There's some remaining work here, but this at least works as a proof of concept. It does roughly the following: start the browser with Fission enabled, wait for a while, open a web page with 9 same-origin iframes, wait, take a measurement, close that page, open a web page with 9 different-origin iframes, wait, take another measurement. Because both measurements are taken with Fission enabled, any overhead that applies even to a single origin under Fission (due to SHIP or whatever) won't be measured.
- It needs to be split into its own test. Right now, I just hacked up the existing test.
- Whatever scalars are interesting need to be extracted from the reports, their difference taken, and then divided by the expected number of processes (see the sketch after this list). There's some existing code for extracting scalars, so the actual programming of this shouldn't be hard.
- There's some stuff in the diff that doesn't look related to process overhead, maybe mostly this loading.svg thing? The diff looked better than I remembered.
- The stability of this should be assessed.
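Here's a sketch of the second item above. per_process_overhead() would consume dicts like the ones the parent_scalars() sketch earlier produces, and extra_processes is whatever number of additional content processes the different-origin page is expected to spawn; all names here are hypothetical.

def per_process_overhead(same_origin, different_origin, extra_processes):
    """Take per-scalar differences and normalize to bytes per extra process."""
    return {name: (different_origin[name] - same_origin[name]) / extra_processes
            for name in same_origin if name in different_origin}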
Here's an extract of some non-trivial bits of the diff of parent process memory. There are 10 "web" content processes involved, so this is about 146 KB per process, which is around 1% of the per-child-process overhead.
1.46 MB (100.0%) -- explicit
├──1.55 MB (106.71%) ── heap-unclassified
├──-0.33 MB (-22.59%) ++ heap-overhead
├──0.13 MB (08.67%) -- images
│ ├──0.13 MB (08.64%) -- chrome/vector/used/progress=18f
│ │ ├──0.12 MB (08.06%) ++ image(480x16, chrome://browser/skin/tabbrowser/loading.svg)/locked/types=1/surface(960x32, svgContext:[ viewport=(480x16) contextPaint=( fill=ff0d0c0c fillOpa=1 strokeOpa=1 ) ])
├──0.06 MB (04.34%) -- js-non-window
│ ├──0.06 MB (04.32%) -- zones/zone(0xNNN)
│ │ ├──0.06 MB (03.88%) ++ realm([System Principal], shared JSM global)/classes
I also have a patch in my stack for bug 1683911 that stops the browser from discarding the OS file worker after 30 seconds.
Given that the bulk of the heap overhead is heap-unclassified, it would probably be good to look at what that is, which will require using DMD. I'm not sure how clean DMD diffs will be. I'd expect it's mostly IPC overhead, but who knows. The same-origin case has 33 IPC channels, compared to 132 in the different-origin case.
Reporter
Comment 6 • 4 years ago
Here's a diff of the heap-unclassified. I limited it to 3 stack frames to try to collapse things together as much as possible.
The total was about 1.5 MB of heap-unclassified. 256 KB of stuff wasn't sampled, though I could redo this with full stacks to get better numbers. About 500 KB of that is IPC channels, and another few hundred KB is IPC actors. There's also a bit of APZ/layers kind of stuff, which makes sense.
Reporter
Comment 7 • 4 years ago
Bug 1639922 is an existing bug we have on file for adding memory reporting for IPC.
Comment 8 • 4 years ago
This isn't needed right now, but we should get the test ready at some point, so moving to MVP.
Reporter
Comment 10 • 4 years ago
Not actively. It seems like a good idea to have the test around in case people want to run it given that I already wrote most of the test, but the parent process overhead is so low I don't think we'll want to run it regularly. I got bogged down in trying to untangle the different variants of AWSY that sort of but not entirely share code.
Comment 11 • 4 years ago
(In reply to Andrew McCreight [:mccr8] from comment #10)
> Not actively. It seems like a good idea to have the test around in case people want to run it given that I already wrote most of the test, but the parent process overhead is so low I don't think we'll want to run it regularly. I got bogged down in trying to untangle the different variants of AWSY that sort of but not entirely share code.
In that case, we can move this bug from Fission MVP to Future.
Reporter
Comment 12 • 21 days ago
Probably not worth the effort. It was kind of neat to see the overhead, but I don't think we ever had any concerted effort to remove the parent overhead.