Memory telemetry for DevTools-related memory leaks
Categories
(DevTools :: Framework, enhancement, P2)
Tracking
(Not tracked)
People
(Reporter: Harald, Unassigned)
References
(Blocks 1 open bug)
Details
We need to measure the memory impact of DevTools sessions over time.
The assumptions are:
- Leaks occur during general usage but especially page refreshes and live reloading.
- Some leaks are limited to the toolbox lifespan and are collected after closing, while some leaked data is never collected.
- Some recent leaks have been identified on the content process, while we also know about frontend leaks on the parent process.
The best case would be to collect memory size on both the parent and content processes. Relatedly, the current memory_total probe is only collected for the parent process: https://probes.telemetry.mozilla.org/?search=memory_total&view=detail&probeId=histogram%2FMEMORY_TOTAL
We could take memory snapshots 1) after opening DevTools 2) before closing DevTools and 3) after closing and GC. An MVP could report just the memory diff from 2) to 3).
Since memory measurements can be expensive, we probably want to limit collection to Nightly and DevEdition.
One open question would be if we want to force GC to get accurate measurements after closing DevTools.
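To make the three proposed snapshots concrete, here is a minimal sketch of the diffs they allow. It is illustrative Python rather than DevTools code: `SessionSnapshots`, `freed_on_close`, and `retained_after_close` are hypothetical names, the byte values are made up, and the second metric is an assumption about what one might additionally derive, not something proposed in this bug.

```python
from dataclasses import dataclass


@dataclass
class SessionSnapshots:
    """Memory sizes in bytes at the three proposed points (hypothetical structure)."""
    after_open: int      # 1) after opening DevTools
    before_close: int    # 2) before closing DevTools
    after_close_gc: int  # 3) after closing DevTools and a GC


def freed_on_close(s: SessionSnapshots) -> int:
    """MVP metric: the diff from 2) to 3), i.e. memory released by closing the toolbox."""
    return s.before_close - s.after_close_gc


def retained_after_close(s: SessionSnapshots) -> int:
    """Assumed extra metric: memory still held at 3) compared with 1)."""
    return s.after_close_gc - s.after_open


# Made-up example values, in bytes.
session = SessionSnapshots(after_open=500_000_000,
                           before_close=800_000_000,
                           after_close_gc=550_000_000)
print(freed_on_close(session))        # 250000000
print(retained_after_close(session))  # 50000000
```

Whether snapshot 3) is meaningful without a forced GC is exactly the open question above; the sketch only shows the arithmetic once the three values exist.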
Comment 1•5 years ago
I haven't looked at the code yet, but the MEMORY_TOTAL telemetry page references bug 1198209 which makes it sound like maybe the value is the total of the parent process plus all content processes.
Comment 2•5 years ago
(In reply to :Harald Kirschner :digitarald from comment #0)
> We could take memory snapshots 1) after opening DevTools 2) before closing DevTools and 3) after closing and GC. An MVP could report just the memory diff from 2) to 3).
I think that the most important use case to cover would be page reload.
The snapshots you mentioned may not help track these leaks easily?
I'm wondering if we could record memory only just before the page reload starts.
Knowing when the page reload is done is easy, but it is a bit challenging to know when the tools are done processing the page reload.
Only these three tools support tracking that correctly.
I'm not a telemetry expert, but if we can record each individual measurement for all Firefox instances, we could see whether the memory only goes up and, if it does, by how much. We might have to force a GC. The engine should do some every now and then. If you do many page reloads, a GC will surely be triggered.
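As a rough illustration of the "does memory only go up, and by how much" check, here is a minimal Python sketch. It assumes one memory size (in bytes) is recorded just before each page reload; the function name `summarize_reload_samples` and the returned keys are hypothetical, not an existing telemetry API.

```python
from typing import Sequence


def summarize_reload_samples(samples: Sequence[int]) -> dict:
    """Summarize memory sizes (bytes) recorded just before each page reload."""
    # Difference between each consecutive pair of samples.
    deltas = [later - earlier for earlier, later in zip(samples, samples[1:])]
    return {
        # True only if there are at least two samples and every step increased.
        "only_goes_up": len(deltas) > 0 and all(d > 0 for d in deltas),
        # Growth from the first to the last sample (0 for an empty series).
        "total_growth": samples[-1] - samples[0] if samples else 0,
        # Largest single increase between two consecutive reloads.
        "max_step": max(deltas, default=0),
    }
```

A series such as `[100, 90, 130]` would report `only_goes_up` as false even though it grew overall, which is the kind of dip an unforced GC between reloads could produce.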
> One open question would be if we want to force GC to get accurate measurements after closing DevTools.
While it sounds important to force a GC when recording toolbox closing, we may find other metrics, like page reload, where a forced GC might not be necessary. Instead, we could rely on the platform running a GC on its own, since the action we are measuring should trigger one.
Having said that, I'm happy to experiment with many approaches here.
Is it easy to experiment with Nightly metrics and tweak the recording code in Nightly after, let's say, one week of Nightly data?
Reporter
Comment 3•5 years ago
> I think that the most important use case to cover would be page reload. The snapshots you mentioned may not help track these leaks easily?
Would it make sense to monitor the memory diff between before and after a reload? Are you concerned about memory that DevTools retains even after reloads, or about how much memory is leaked within a page session?
> The snapshots you mentioned may not help track these leaks easily?
It would not highlight reloads specifically. But if memory accumulates between reloads, it would show in the diff between 1) and 2), just with reload count as the denominator. The overarching measurement would be used for general leak tracking, while the reload-specific one makes it easier to optimize those specific leaks. Would that work?
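The "reload count as the denominator" idea can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name `growth_per_reload` is hypothetical, and it presumes the reload count for the session is recorded alongside snapshots 1) and 2).

```python
from typing import Optional


def growth_per_reload(after_open: int, before_close: int,
                      reload_count: int) -> Optional[float]:
    """Average memory growth (bytes) per reload between snapshots 1) and 2).

    Returns None for sessions without reloads, where the ratio is undefined.
    """
    if reload_count <= 0:
        return None
    return (before_close - after_open) / reload_count


# Made-up example: 300 MB accumulated over 3 reloads.
print(growth_per_reload(500_000_000, 800_000_000, 3))  # 100000000.0
```

An average like this cannot tell a steady per-reload leak from a single large jump, which is where a reload-specific measurement would add detail.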
> Is it easy to experiment with Nightly metrics and tweak the recording code in Nightly after, let's say, one week of Nightly data?
Depends on how noisy the data is and how much of it we need to gain confidence. Should be fine. As long as it's Nightly-only, we can experiment.
Reporter
Comment 4•5 years ago
Alex, going back to the decision on what the MVP should be here: which of the discussed telemetry options do you think will give us the biggest bang for the buck and should be tackled first?
Comment 5•5 years ago
Bugbug thinks this bug should belong to this component, but please revert this change in case of error.
Comment 6•3 years ago
I'm not sure we will have cycles anytime soon to introduce new telemetry probes.