Investigate memory usage of TelemetryStopwatch in content processes

RESOLVED FIXED in Firefox 65

Status

()

defect
P1
normal
RESOLVED FIXED
9 months ago
5 months ago

People

(Reporter: janerik, Assigned: kmag)

Tracking

(Blocks 2 bugs)

unspecified
mozilla65
Points:
2
Dependency tree / graph

Firefox Tracking Flags

(firefox65 fixed)

Details

(Whiteboard: [overhead:28K])

Attachments

(2 attachments)

(Reporter)

Description

9 months ago
This module somehow has quite some memory in use in a content process (~ 11 kB).
We need to find out where and how to eliminate that usage.

One comment from :chutten:

> The `Timers._timers` map never empties so it tends to end up storing concrete references to a dozen or so WeakMaps. 
> Not sure that accounts for the full 11kB, but it might be a chunk of it.
(Reporter)

Updated

9 months ago
Blocks: 1475556
Priority: -- → P3

Updated

7 months ago
Assignee: nobody → chutten
Status: NEW → ASSIGNED
Priority: P3 → P1

Updated

7 months ago
Points: --- → 2

Comment 1

7 months ago
I applied the patch from bug 1463569 (`git checkout -b bug1482091-fissionStopwatch '{central/branches/default/tip@2018-05-25}'` to check out an old enough m-c to then `git am` the patch and then rebase to latest m-c)

I then loaded about:memory and told it to save concise reports, which it did to /tmp/

I cloned amccreight/heapgraph and ran python g/dom_tree.py on the gc-edges file and it dumped a lot of stuff to stdout. Capturing it in a file I could then grep around for TelemetryStopwatch:

script TelemetryStopwatch.jsm:384 3.1k ( TelemetryStopwatchImpl.timeElapsed(histogram, object, key, aCanceledOkay) )
script TelemetryStopwatch.jsm:411 1.7k ( TelemetryStopwatchImpl.finish(histogram, object, key, aCanceledOkay) )
script TelemetryStopwatch.jsm:363 1.5k ( TelemetryStopwatchImpl.start(histogram, object, key, {inSeconds} = {}) )
script TelemetryStopwatch.jsm:66  1.5k ( Timers.put )
NSVO TelemetryStopwatch.jsm 1.4k
script TelemetryStopwatch.jsm:95  1.4k ( Timers.delete )
script TelemetryStopwatch.jsm:86  1.2k ( Timers.has )
script TelemetryStopwatch.jsm:55  1.2k ( Timers.get )
script TelemetryStopwatch.jsm:46  1.2k ( Timers._validTypes )


From this it seems as though all the costs are in bytecode? I assume that's what "script" means.

jan-erik, am I interpreting this correctly? Are we expected to optimize for generated bytecode size?
Flags: needinfo?(jrediger)
(Reporter)

Comment 2

7 months ago
Hey :mccr8, as the one that wrote the dom_tree script, could you shine some light on the details of what these numbers mean and what we can do?
So far I was able to split up our scripts to reduce memory usage, but for this one we're already pretty minimal.
We can't avoid loading TelemetryStopwatch in content processes.
Flags: needinfo?(jrediger) → needinfo?(continuation)
The first entry means that there's a blob of JS stuff (objects, scripts, etc.) where there's some single thing holding the rest of it alive. The total size of all of that stuff is 3.1kb. Somewhere in that blob is the script for TelemetryStopwatchImpl.timeElapsed(), and there is no other script in that blob.

That file does look pretty trimmed down. You could try looking at the GC edges file for the object that contains "TelemetryStopwatchImpl.timeElapsed" in the description and see what it is pointing to. Maybe that will help. My only guess is that maybe those error strings are the cause of some of the memory? It seems suspicious that the 3 methods with errors are showing up, and the one with two errors has twice as much memory. You could try commenting all of that out and see what happens.

Thanks for looking at this. I'm sorry I haven't updated the patch in bug 1463569 in a while. I should put that on my to do list.
Flags: needinfo?(continuation)

Comment 4

6 months ago
I'm not sure I'm reading this right, but here we go. I found the `script` and `lazyScript` lines for TelemetryStopwatch.jsm:384

There are 21 atom lines, four scopes, one sourceObject and one lazyScript.
The first atoms are 
 35585 0x7f9489a63060 B string <atom: length 6> Timers SIZE:: 32
 24254 0x7f948c325c20 B string <permanent atom: length 3> get SIZE:: 32
 35602 0x7f9489a63280 B string <atom: length 15> _suppressErrors SIZE:: 32
 22326 0x7f948c315940 B string <permanent atom: length 2> Cu SIZE:: 32
 30060 0x7f948c3927a0 B string <atom: length 11> reportError SIZE:: 32
 30666 0x7f948c3a3400 B string <atom: length 48> TelemetryStopwatch: requesting elapsed time for  SIZE:: 96
 30667 0x7f948c3a3420 B string <atom: length 35> nonexisting stopwatch. Histogram: " SIZE:: 80
 30659 0x7f948c3a3320 B string <atom: length 3> ",  SIZE:: 32
 30668 0x7f948c3a3440 B string <atom: length 6> key: " SIZE:: 32
 19621 0x7f948c300480 B string <permanent atom: length 1> " SIZE:: 32
 30095 0x7f948c392c00 B string <atom: length 3> now SIZE:: 32
...I presume it just goes on like this with string atoms for each part of the error strings individually pulled out.

What's interesting to me is the presence of things that aren't strings, like "Cu" and "_suppressErrors".

I don't think there's much to be done if all the allocations are overhead for scopes and string pieces. We could try to optimize for shorter error messages, I suppose, but that seems kinda odd.

For a source file 16k in size, with most of its source reproduced as strings in the engine and the rest covered in pointers, should we really be surprised that it is 11k?

I'm going to mark this Invalid as there doesn't seem to be much worth doing here.
Status: ASSIGNED → RESOLVED
Last Resolved: 6 months ago
Resolution: --- → INVALID
(Assignee)

Updated

6 months ago
Assignee: chutten → kmaglione+bmo
Status: RESOLVED → REOPENED
Resolution: INVALID → ---

Comment 5

6 months ago
:kmag, could you give me a quick precis of why this has been reopened? I'm eager to learn what I might have missed.
Flags: needinfo?(kmaglione+bmo)
(Assignee)

Updated

6 months ago
Whiteboard: [overhead:28K]
(Assignee)

Comment 6

6 months ago
This implementation is considerably more memory efficient than the existing JS
implementation. Migrating to it fully saves us about 28K per base content
process.
Attachment #9020262 - Attachment description: Bug 1482091: Part 1 - Add native TelemetryStopwatch implementation. r=chutten,mccr8 → Bug 1482091: Part 1 - Add native TelemetryStopwatch implementation. r=chutten,mccr8,gfritzsche
(Assignee)

Comment 8

6 months ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/a19c6b3e0402d16a77185f82d9fedab83a7ca52e
Bug 1482091: Part 1 - Add native TelemetryStopwatch implementation. r=chutten,mccr8

https://hg.mozilla.org/integration/mozilla-inbound/rev/afd0a13bcfff4c92ad2497ec687f466484481667
Bug 1482091: Part 2 - Remove TelemetryStopwatch.jsm in favor of native implementation. r=chutten
(Assignee)

Comment 10

6 months ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/28a19b7290ab288a1cb2cbf6d49f905cecc9682b
Bug 1482091: Part 1 - Add native TelemetryStopwatch implementation. r=chutten,mccr8

https://hg.mozilla.org/integration/mozilla-inbound/rev/bc03f101937e373fdaca735f3f356960a5239bc0
Bug 1482091: Part 2 - Remove TelemetryStopwatch.jsm in favor of native implementation. r=chutten
(Assignee)

Comment 12

6 months ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/b3e02a455dc587e31302c594e952594c1cf4d64b
Bug 1482091: Part 1 - Add native TelemetryStopwatch implementation. r=chutten,mccr8

https://hg.mozilla.org/integration/mozilla-inbound/rev/26b4dcdfd6274fb7f648a5a66bfc398b1c28ab92
Bug 1482091: Part 2 - Remove TelemetryStopwatch.jsm in favor of native implementation. r=chutten
(Assignee)

Comment 13

6 months ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/dfd04bbbf00d65f7a474ccbf800878f69799e7b6
Bug 1482091: Follow-up: Whitelist errors in another fullscreen test. r=backouts-are-terrible-for-blame-walking-and-should-always-be-a-last-resort
(Assignee)

Comment 14

6 months ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/cad4c41bf7fefbc2f2142ce9d57c7b0a0ee8c4a8
Bug 1482091: Follow-up: Whitelist errors in yet another fullscreen test. r=bustage
(Assignee)

Updated

5 months ago
Flags: needinfo?(kmaglione+bmo)
You need to log in before you can comment on or make changes to this bug.