Closed Bug 1463569 Opened 6 years ago Closed 5 years ago

Tooling for investigating content process chrome JS memory usage

Categories

(Core :: JavaScript Engine, enhancement, P2)

enhancement

Tracking

()

RESOLVED FIXED
mozilla68
Fission Milestone M2
Tracking Status
firefox68 --- fixed

People

(Reporter: mccr8, Assigned: mccr8)

References

(Blocks 1 open bug, )

Details

(Whiteboard: [2/28] Active work underway. [overhead:noted])

Attachments

(2 files, 6 obsolete files)

I'm looking at some scripts to analyze GC logs to try to figure out where our chrome JS memory usage is going. This bug is mostly about sharing my intermediate results so other people can look at them. I don't know what the right component is.
The script I have right now is part of my heap graph repository. The script is g/dom_tree.py. It loads in a GC log (which can be generated by clicking on "save concise" in about:memory), and then it computes a dominator tree for all JS objects in memory. (The entry points to the graph are the roots, which includes gray roots. It would be more accurate if it included the CC graph and also took into account CC roots.)

Right now, the script filters out the nodes that dominate the most other nodes, and outputs a .dot file, which can be passed to graphviz to get a PDF.
Attached file Example output (obsolete) —
Here's an example of what the output looks like. I've filtered out all shapes from the graph (skipShape is true in the script), and also all objects that dominate less than 40 objects ("domLimit" in the script).

The number after the descriptor for an object is the number of object it dominates, either immediately or transitively. If no number is present that means that all of the children are displayed in the graph, so you can add them up to see what the total is for the node.

For instance, "ContentProcessMessageManager 1095" means that the content process message manager is holding alive about a thousand objects, which is not too surprising.

An entry like "NSVO LoginManagerContent.jsm 60" roughly means that the local variables for the script LoginManagerContent.jsm dominate 60 other objects. (This is a hack that just looks at the __URI__ property on the NSVO.)

An entry like "script PdfStreamConverter.jsm:860 67" means that the script at line 860 in that JSM dominates 67 objects in the graph.

Nodes that are hex values in the graph give the address of a JS object. This isn't useful for somebody just looking at the graph, but if you have the GC edges files, you can look up the object and try to figure out where it is coming from. For instance, "0x7f6c2c9096a0 173" looks like it is the Printing object declared in the browser-content.js script, declared here:
  https://searchfox.org/mozilla-central/rev/eb51d734a8c06ac5973291be9bcc74ed4e999bca/toolkit/content/browser-content.js#493
Attached file threshold 500, shapes included (obsolete) —
This is what it looks like if you hide everything that dominates less than 500 nodes and include shapes. Snakes, snakes everywhere! On the up side, increasing the threshold like this lets you see what the nodes are that dominate the most other nodes (self hosting global, message manager stuff, some window global, a backstage pass, 3 different document prototypes, "WebExtensionPolicy", and browser-content.js.
Also, keep in mind that these logs are for the entire content process, so they included content memory. The only page I have loaded is example.com, so it should be minimal. (I also have the new tab page disabled, because that adds a ton of memory usage, but should not be relevant with process-per-domain.)
I basically copied over the JS GC thing reporting code into the DumpJSHeap code, so each object gets its size logged. It differs a little from the reporting code because all of the memory attributable to that object is combined, whereas with the reporting code it will get split up. (This is getting very close to what the UbiNode stuff does, I believe.)

I also updated my dom_tree.py analysis scipt to take this into account, so it will report the total size of all entrained objects. (I also made it so that the output for the dot format is no longer a hard coded path, and made the default output text, because the graphs are so simple.)
Attached file sample text output (obsolete) —
Here's some sample output, with a threshold set to 1000 bytes. This is my attempt at ASCII art:

| Function AndroidAppender (1560 bytes)
|-- 0x7facfef8f700 (1432 bytes)
|---- 0x7facfef8f720 (1056 bytes)

Note that the size is cumulative, so the 1560 bytes for AndroidAppender (Android?!?!) includes the 1432 bytes for the object 0x7facfef8f700.
I added a new option --script-split. First, it tries to figure out if each object is associated with a script file (like content.js) or not. There are only two cases for this right now. First, a script object has a script URL, so it uses that. Second, an NSVO for a JSM (basically a special kind of scope object we use for JSMs) has the __URL__ property set to the URL of the script, so it uses that.

Once that is done, if there are objects from exactly one script file, it associates that entire dominator tree with that script file. If there are objects from more than one script file, it labels them as "multiple scripts". An example of this is the ContentFrameMessageManager. If there are not objects from any script, it labels them as "no script found". An example of this is the window global. It is possible that in some of these cases we could do a better job of guessing the script.

Here's an example of the output of for a script (with some redundant lines elided with [...]):

ExtensionUtils.jsm (30136 bytes)
------------------------------
| script ExtensionUtils.jsm:436 (2344 bytes)

| NSVO ExtensionUtils.jsm (2240 bytes)

| script ExtensionUtils.jsm:379 (1920 bytes)
|-- Function promiseObserved/< (1168 bytes)
|---- script ExtensionUtils.jsm:380 (1000 bytes)

etc.

This means that the total retained size of dominator trees associated with ExtensionUtils.jsm was 30kb.
Assignee: nobody → continuation
Attached file sample split-by-script text output (obsolete) —
This is with an m-c from today, so it includes Felipe's various patches (such as bug 1462673) to load things more lazily.
Priority: -- → P3
Here's an updated sample of the output, after Felipe and Kris landed a bunch of patches to make some things loaded more lazily. browser-content.js is down from 124112 bytes to 60480, so that content-sessionStore.js is now instead the second largest script in terms of memory (after self hosted code). content.js is down from 64904 bytes to 33632. RemotePageManager.jsm is down from 51520 bytes to 30248.
Attachment #8979747 - Attachment is obsolete: true
Attachment #8979749 - Attachment is obsolete: true
Attachment #8980751 - Attachment is obsolete: true
Attachment #8981681 - Attachment is obsolete: true
Attached file actually updated
Felipe pointed out that I actually just reuploaded the same log. Oops.
Attachment #8986284 - Attachment is obsolete: true
Whiteboard: [overhead:noted]
Fission Milestone: --- → M2
Priority: P3 → P2
Whiteboard: [overhead:noted] → [2/28] Active work underway. [overhead:noted]

I was spending a ton of time working on some weird refactoring of code to try to reuse existing code from StatsCellCallback as much as possible... but it looks like the JS::ubi::Concrete<T>::size() methods already do exactly what I need! Oops. They do have the same kind of problem of code duplication, but given that the work's already been done, I think it is okay to use it.

Status: NEW → ASSIGNED

If the environment variable MOZ_GC_LOG_SIZE is set, then GC logs will
include some extra data that indicates how much memory the GC thing
uses, including extra data hanging off of it. This can be used
calculate finer grained information about the memory usage of scripts.

Attachment #8980746 - Attachment is obsolete: true
Pushed by amccreight@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8fed1a9805c5
Add a way to log the sizes of GC things to GC logs. r=sfink
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla68
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: