Closed
Bug 1315044
Opened 8 years ago
Closed 8 years ago
about:debugging leaks memory in child process
Categories
(DevTools :: about:debugging, defect, P1)
Tracking
(firefox49 wontfix, firefox50 wontfix, firefox51 wontfix, firefox52 wontfix, firefox-esr52 wontfix, firefox53 fixed)
RESOLVED
FIXED
Firefox 53
People
(Reporter: bugzilla.mozilla.org, Assigned: ochameau)
References
(Blocks 1 open bug)
Details
(Keywords: memory-leak, Whiteboard: [MemShrink:P2])
Attachments
(2 files, 1 obsolete file)
Spun off from bug 1308332 comment 40 and following.
I'm seeing an increased memory footprint of the SDK in child processes.
Memory report from 2016-10-29 (the build date is probably several days older):
├───83.42 MB (15.42%) -- js-non-window
│ ├──51.91 MB (09.60%) -- zones
│ │ ├──37.21 MB (06.88%) -- zone(0x1e64d7a6000)
│ │ │ ├──30.82 MB (05.70%) -- (223 tiny)
│ │ │ │ ├───4.92 MB (00.91%) ++ compartment([System Principal], Addon-SDK (from: resource://gre/modules/commonjs/toolkit/loader.js:414))
This is from the 2016-11-02 Nightly:
├────701.46 MB (53.11%) -- js-non-window
│ ├──670.12 MB (50.74%) -- zones
│ │ ├──660.08 MB (49.97%) -- zone(0x1f2b6b86000)
│ │ │ ├──285.96 MB (21.65%) -- compartment([System Principal], Addon-SDK (from: resource://gre/modules/commonjs/toolkit/loader.js:414))
│ │ │ │ ├──129.76 MB (09.82%) -- classes
│ │ │ │ │ ├───72.74 MB (05.51%) -- class(Function)/objects
│ │ │ │ │ │ ├──69.44 MB (05.26%) ── gc-heap [487]
│ │ │ │ │ │ └───3.30 MB (00.25%) ── malloc-heap/slots [487]
│ │ │ │ │ ├───33.47 MB (02.53%) -- class(Object)/objects
│ │ │ │ │ │ ├──23.88 MB (01.81%) ── gc-heap [487]
│ │ │ │ │ │ └───9.59 MB (00.73%) ++ malloc-heap
│ │ │ │ │ ├───23.46 MB (01.78%) -- class(<non-notable classes>)/objects
│ │ │ │ │ │ ├──19.55 MB (01.48%) ── gc-heap [487]
│ │ │ │ │ │ └───3.91 MB (00.30%) ++ malloc-heap
│ │ │ │ │ └────0.10 MB (00.01%) ++ (2 tiny)
│ │ │ │ ├──121.88 MB (09.23%) -- scripts
│ │ │ │ │ ├───76.28 MB (05.78%) ── gc-heap [487]
│ │ │ │ │ └───45.60 MB (03.45%) ── malloc-heap/data [487]
│ │ │ │ ├───17.94 MB (01.36%) ++ (2 tiny)
│ │ │ │ └───16.37 MB (01.24%) -- type-inference
│ │ │ │ ├──16.37 MB (01.24%) ── object-type-tables [485]
│ │ │ │ └───0.01 MB (00.00%) ── allocation-site-tables
│ │ │ ├───90.99 MB (06.89%) -- compartment([System Principal], resource://gre/modules/commonjs/toolkit/loader.js)
│ │ │ │ ├──78.97 MB (05.98%) -- classes
│ │ │ │ │ ├──56.64 MB (04.29%) -- class(Object)/objects
│ │ │ │ │ │ ├──37.46 MB (02.84%) ── malloc-heap/slots
│ │ │ │ │ │ └──19.19 MB (01.45%) ── gc-heap
│ │ │ │ │ └──22.33 MB (01.69%) ++ (7 tiny)
│ │ │ │ └──12.01 MB (00.91%) ++ (2 tiny)
│ │ │ ├───80.90 MB (06.13%) -- shapes
│ │ │ │ ├──53.61 MB (04.06%) -- gc-heap
│ │ │ │ │ ├──44.35 MB (03.36%) ── tree
│ │ │ │ │ └───9.26 MB (00.70%) ++ (2 tiny)
│ │ │ │ └──27.29 MB (02.07%) -- malloc-heap
│ │ │ │ ├──22.76 MB (01.72%) ── tree-tables
│ │ │ │ └───4.53 MB (00.34%) ++ (2 tiny)
│ │ │ ├───74.41 MB (05.63%) -- scopes
│ │ │ │ ├──41.16 MB (03.12%) ── malloc-heap
│ │ │ │ └──33.25 MB (02.52%) ── gc-heap
│ │ │ ├───58.86 MB (04.46%) ++ (205 tiny)
│ │ │ ├───33.36 MB (02.53%) ── type-pool
│ │ │ ├───21.39 MB (01.62%) -- object-groups
│ │ │ │ ├──21.18 MB (01.60%) ── gc-heap
│ │ │ │ └───0.21 MB (00.02%) ── malloc-heap
│ │ │ └───14.21 MB (01.08%) ++ compartment([System Principal], [anonymous sandbox] (from: resource://gre/modules/commonjs/toolkit/loader.js -> resource://devtools/shared/builtin-modules.js:185))
│ │ └───10.05 MB (00.76%) ++ (15 tiny)
I'm also seeing increased CC pause times
CC(T+54785.3)[content] max pause: 3420ms, total time: 3838ms, slices: 38, suspected: 171, visited: 19669 RCed and 4615878 GCed, collected: 167 RCed and 912 GCed (912|0|0 waiting for GC)
ForgetSkippable 8 times before CC, min: 1 ms, max: 6 ms, avg: 1 ms, total: 15 ms, max sync: 0 ms, removed: 663
It does seem like the number of SDK compartments has gone up a lot:
old:
547 (100.0%) -- js-main-runtime-compartments
├──338 (61.79%) ++ user
└──209 (38.21%) -- system
├──182 (33.27%) ++ (176 tiny)
└───27 (04.94%) ── [System Principal], outOfProcessTabChildGlobal [27]
new:
2,228 (100.0%) -- js-main-runtime-compartments
├──2,112 (94.79%) -- system
│ ├────487 (21.86%) ── [System Principal], Addon-SDK (from: resource://gre/modules/commonjs/toolkit/loader.js:414) [487]
│ ├────485 (21.77%) ── [System Principal], [anonymous sandbox] (from: resource://gre/modules/commonjs/toolkit/loader.js -> resource://devtools/shared/builtin-modules.js:185) [485]
│ ├────485 (21.77%) ── [System Principal], [anonymous sandbox] (from: resource://gre/modules/commonjs/toolkit/loader.js -> resource://devtools/shared/builtin-modules.js:209) [485]
│ ├────485 (21.77%) ── [System Principal], [anonymous sandbox] (from: resource://gre/modules/commonjs/toolkit/loader.js -> resource://devtools/shared/builtin-modules.js:24) [485]
│ └────170 (07.63%) ++ (159 tiny)
└────116 (05.21%) ++ user
Comment 1•8 years ago
The Addon SDK loader changes were uplifted to 50/51 as well. It would be good to know if they're also affected.
status-firefox50: --- → ?
status-firefox51: --- → ?
status-firefox52: --- → affected
Keywords: regression
Comment 2•8 years ago
I don't see anything in telemetry to suggest that this is a wide scale regression. In particular, metrics like GC and CC pauses have been similar, or slightly improved:
https://telemetry.mozilla.org/new-pipeline/evo.html#!aggregates=median!5th-percentile!95th-percentile&cumulative=0&end_date=null&keys=&max_channel_version=nightly%252F52&measure=CYCLE_COLLECTOR_MAX_PAUSE&min_channel_version=nightly%252F52&product=Firefox&sanitize=1&sort_keys=submissions&start_date=null&trim=1&use_submission_date=0
It would be good to have more details about the add-ons you have installed, and the first build that you see this regression in.
> and the first build that you see this regression in.
I don't have STR yet, as I only noticed this after a day of uptime, so I can't narrow down the range. I'll check whether it happens immediately after startup or builds up over time after the next Nightly update.
> In particular, metrics like GC and CC pauses have been similar, or slightly improved:
Can that be filtered by uptime and by profiles that have SDK add-ons? If the issue accumulates, it may get drowned out in aggregate stats that include sessions not matching those criteria.
> It would be good to have more details about the add-ons you have installed,
As far as I can tell the SDK consumers are
* https://addons.mozilla.org/en-US/firefox/addon/ipfs-gateway-redirect/
* https://addons.mozilla.org/en-US/firefox/addon/new-tab-override/
* devtools
Disabling the add-ons and clicking "Minimize memory usage" does not free up the compartments.
Comment 4•8 years ago
(In reply to The 8472 from comment #3)
> > In particular, metrics like GC and CC pauses have been similar, or slightly improved:
>
> Can that be filtered by uptime and profiles that have sdk addons? If the
> issue accumulates then maybe it gets drowned out in aggregate stats that
> contain sessions not matching those criteria.
Not easily, but even if it were limited to long uptimes and profiles with SDK
add-ons, I'd expect to see it reflected in the 95th percentile of at least
some metrics.
> > It would be good to have more details about the add-ons you have installed,
>
> As far as I can tell the SDK consumers are
>
> * https://addons.mozilla.org/en-US/firefox/addon/ipfs-gateway-redirect/
> * https://addons.mozilla.org/en-US/firefox/addon/new-tab-override/
> * devtools
A full list of extensions and their versions would be helpful. Even if they're
not SDK add-ons, other add-ons may be involved in the issue. As much other
information as you can provide from about:support (or other preferences
you've changed that aren't reflected there) would also be helpful.
> Disabling the addons + minimize memory usage does not free up the
> compartments.
That's suspicious, but not entirely surprising. As far as I can tell, the
SDK never nukes the sandboxes for its child process loader when extensions are
disabled, so if they're leaking, there's a good chance they'd stay leaked.
Comment 5•8 years ago
It might be notable that the ipfs-gateway-redirect add-on is one of only about 4 or 5 add-ons on AMO that use remoteRequire directly.
Updated•8 years ago
Whiteboard: [MemShrink]
I have restarted my Firefox session and the number of compartments in the child processes is low at the moment. So it's probably something that accumulates over time. I'll keep observing it.
(In reply to Kris Maglione [:kmag] from comment #5)
> It might be notable that the ipfs-gateway-redirect add-on is one of only
> about 4 or 5 add-ons on AMO that use remoteRequire directly.
I wrote that part for e10s compatibility. There is something that might be a semi-leak in there, but not to the extent of the [485] SDK duplicates in comment 0, since each leak would require the add-on to be turned off and on again.
Comment 7•8 years ago
(In reply to The 8472 from comment #6)
> (In reply to Kris Maglione [:kmag] from comment #5)
> > It might be notable that the ipfs-gateway-redirect add-on is one of only
> > about 4 or 5 add-ons on AMO that use remoteRequire directly.
>
> I wrote that part for e10s compatibility. There is something that might be a
> semi-leak in there, but not to the extent of the [485] SDK duplicates in
> comment 0, since for each leak the addon would need to be turned off and on
> again.
Oh, sorry, I didn't notice that you added compartment counts to this bug. Yeah, something's definitely wrong there. But I still suspect it may have something to do with that add-on calling remoteRequire directly, especially since the impact of these changes on that metric is pretty clearly in the other direction:
https://telemetry.mozilla.org/new-pipeline/evo.html#!aggregates=median!mean!5th-percentile!25th-percentile!75th-percentile!95th-percentile&cumulative=0&end_date=2016-10-17&keys=!__none__!__none__&max_channel_version=nightly%252F52&measure=MEMORY_JS_COMPARTMENTS_SYSTEM&min_channel_version=nightly%252F52&processType=true&product=Firefox&sanitize=1&sort_keys=submissions&start_date=2016-10-17&trim=1&use_submission_date=0
Either way, I can't reproduce it, and I still don't know nearly enough about your configuration to really try.
Comment 8•8 years ago
(In reply to The 8472 from comment #6)
> I have restarted my FF session and the # of compartments in the child
> processes is low at the moment. So it's probably something that accumulates
> over time. I'll keep observing it.
You could try opening and closing windows six or so times and see if anything goes up. That's a common pattern to trigger leaks that are slowly growing.
I think I have narrowed down the issue, and it may not be related to installed add-ons at all.
1. open some tabs so child processes exist
2. open about:debugging
3. repeatedly switch back and forth between "add-ons" and "workers"
4. count sdk compartments in the child process via about:memory
On a profile with only the built-in add-ons, this still causes the compartment count to go up.
I did leave about:debugging open overnight, and it occasionally refreshes, or something like that.
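For what it's worth, here is a hypothetical way to script step 4 instead of reading about:memory by hand (a sketch only; the report paths and the "Addon-SDK" substring match are assumptions based on the output quoted above, and it needs a chrome context such as the Browser Console):

```js
// Count compartment memory reports that mention the SDK loader.
const mgr = Cc["@mozilla.org/memory-reporter-manager;1"]
              .getService(Ci.nsIMemoryReporterManager);

let sdkCompartments = 0;
mgr.getReports(
  function handleReport(process, path, kind, units, amount, description) {
    // Compartment counts are reported under js-main-runtime-compartments/...
    if (path.startsWith("js-main-runtime-compartments/system/") &&
        path.includes("Addon-SDK")) {
      sdkCompartments += amount;
    }
  },
  null,
  function onFinish() {
    console.log("SDK compartments across all processes:", sdkCompartments);
  },
  null,
  /* anonymize */ false
);
```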
Comment 10•8 years ago
Yup, I can reproduce it with those steps. Thanks.
Comment 11•8 years ago
I just tested and was able to reproduce this in an older version of Aurora without the changes from bug 1308332, so it seems unrelated.
No longer blocks: 1308332
Component: General → Developer Tools: about:debugging
Product: Add-on SDK → Firefox
Summary: Increased SDK footprint in child processes → about:debugging leaks memory in child process
Blocks: dt-leak
Assignee
Comment 12•8 years ago
Thanks a ton for the STR!
It looks like the Workers tab in about:debugging creates tons of compartments.
Assignee: nobody → poirot.alex
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Assignee
Comment 15•8 years ago
Every single layer in the DevTools onion misses some caching.
Here, we end up creating a brand new SDK loader every time the Workers panel is opened.
The precise issue is in content-server.jsm, which creates a new loader and a new debugger server on each call. That code assumes a ContentActor is only ever created once per process, but nothing actually enforces that.
With these two patches, the situation is cleared up on the backend side. I imagine there are also things to improve in the client API, and surely in about:debugging. It also looks like about:debugging never closes its client; I think we should do that to ensure things get cleaned up when it is closed.
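To illustrate the caching idea, here is a minimal sketch (not the actual patch; the helper name and the exact setup calls are assumptions):

```js
// Keep the DevTools loader and debugger server in module-level state so
// repeated ContentActor creation reuses them instead of spawning a fresh
// set of compartments each time.
let gLoader = null;
let gServer = null;

function ensureContentServer() {
  if (!gLoader) {
    // Only the first call creates the loader and server for this process.
    const { DevToolsLoader } =
      Cu.import("resource://devtools/shared/Loader.jsm", {});
    gLoader = new DevToolsLoader();
    gServer = gLoader.require("devtools/server/main").DebuggerServer;
    gServer.init();
  }
  // Later calls share the same server; only the per-connection transport
  // setup still happens every time.
  return gServer;
}
```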
Updated•8 years ago
Blocks: 1225473
status-firefox49: --- → wontfix
Updated•8 years ago
Whiteboard: [MemShrink] → [MemShrink:P2]
Updated•8 years ago
Keywords: regression
Did you mean to request a review? You posted to MozReview but no reviewer is set. Just want to make sure it's not lost!
Flags: needinfo?(poirot.alex)
Assignee
Comment 17•8 years ago
It is not lost. I want to have a test before proceeding with these fixes.
Flags: needinfo?(poirot.alex)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Assignee
Comment 20•8 years ago
I only test client.getProcess caching. I would like to test that the loader/debugger server gets cleaned up on client disconnection, but there is no observer service notification being sent. It looks like that would require poking at content-server.jsm internals; is that reasonable to do in a test?
Updated•8 years ago
Status: NEW → ASSIGNED
Priority: -- → P1
Comment 21•8 years ago
mozreview-review
Comment on attachment 8807556 [details]
Bug 1315044 - Cache ContentActor forms to prevent creating new one when calling RootActor.getProcess multiple times.
https://reviewboard.mozilla.org/r/90706/#review91766
Great, these patches combined seem to solve this particular issue! However (as I think you noticed as well), it seems we leave the server alive when closing about:debugging. So if you close and reopen about:debugging repeatedly, we keep stacking more compartments in the child. Can you file a follow-up for this?
::: devtools/server/tests/mochitest/test_getProcess.html:109
(Diff revision 2)
> });
> }
>
> + // Assert that calling client.getProcess against the same process id is
> + // returning the same actor.
> + function reGetProcess(firstActor, id) {
Maybe `getProcessAgain`? `reGet` reads a bit strangely...
Attachment #8807556 - Flags: review?(jryans) → review+
Comment 22•8 years ago
mozreview-review
Comment on attachment 8807555 [details]
Bug 1315044 - Prevent loading multiple loaders and debugger servers when creating multiple ContentActors.
https://reviewboard.mozilla.org/r/90704/#review91662
Nit: ContentActor's -> ContentActors in commit message
Looks good, but would be even better with a test! Can you use the SDK loader's unload observer notification "sdk:loader:destroy" to build a test for this?
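Something along these lines might work, in a mochitest (a rough sketch only; the message name and exact wiring are illustrative, not the final test code):

```js
// Process script: observe the SDK loader's unload notification in the child
// and report back to the parent side of the test.
const script = "data:,new " + function () {
  const { Services } =
    Components.utils.import("resource://gre/modules/Services.jsm", {});
  Services.obs.addObserver(function observer() {
    Services.obs.removeObserver(observer, "sdk:loader:destroy");
    sendAsyncMessage("test:loader-destroyed");
  }, "sdk:loader:destroy", false);
};

const onLoaderDestroyed = new Promise(resolve => {
  Services.ppmm.addMessageListener("test:loader-destroyed", resolve);
});
Services.ppmm.loadProcessScript(script, true);

// Closing the client should unload the child-process loader, which fires the
// notification and resolves the promise above.
client.close();
onLoaderDestroyed.then(() => {
  Services.ppmm.removeDelayedProcessScript(script);
  ok(true, "Loader destroyed in the content process");
});
```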
Attachment #8807555 - Flags: review?(jryans) → review+
Also, we should add a similar destruction path in the child.js frame script for tab debugging: maybe once all connections are closed, we would shut down the server in the child?
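Hypothetically, something along these lines in child.js (the bookkeeping names are made up, not existing code):

```js
// Track live debugging connections and tear down the child-side server once
// the last one goes away, so its compartments can be collected.
let connections = 0;

function onConnectionCreated() {
  connections++;
}

function onConnectionClosed() {
  connections--;
  if (connections === 0) {
    DebuggerServer.destroy();
    // The loader could also be unloaded here, freeing its compartments.
  }
}
```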
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Assignee
Comment 29•8 years ago
Last three revisions on MozReview:
* address first batch of comments
* rebase
* add the last changeset, to be merged into the first changeset. This adds a test for loader destruction.
Comment 30•8 years ago
mozreview-review
Comment on attachment 8810451 [details]
Bug 1315044 - Test loader destruction.
https://reviewboard.mozilla.org/r/92774/#review92820
Thanks, looks like a good addition to the test!
::: devtools/server/tests/mochitest/test_getProcess.html:141
(Diff revision 1)
> + Services.ppmm.loadProcessScript(script, true);
> + client.close();
> +
> + onLoaderDestroyed.then(function () {
> + Services.ppmm.removeDelayedProcessScript(script);
> + info("Loader destroyed in the content process\n");
Nit: don't need \n with `info`
Attachment #8810451 - Flags: review?(jryans) → review+
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Assignee
Updated•8 years ago
Attachment #8810451 - Attachment is obsolete: true
Comment 33•8 years ago
Pushed by apoirot@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f57c3052af5a
Cache ContentActor forms to prevent creating new one when calling RootActor.getProcess multiple times. r=jryans
https://hg.mozilla.org/integration/autoland/rev/eca8f585871b
Prevent loading multiple loaders and debugger servers when creating multiple ContentActors. r=jryans
Comment 34•8 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/f57c3052af5a
https://hg.mozilla.org/mozilla-central/rev/eca8f585871b
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
status-firefox53: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 53
Comment 35•8 years ago
I guess this is going to ride the train with 53.
Updated•8 years ago
status-firefox-esr52: --- → wontfix
Updated•6 years ago
Product: Firefox → DevTools