Closed
Bug 1459404
Opened 6 years ago
Closed 6 years ago
Memory leak (via ChildAPIManager) when asynchronous extension API is used
Categories
(WebExtensions :: Storage, defect, P1)
Tracking
(firefox61 wontfix, firefox62 verified, firefox63 verified)
VERIFIED
FIXED
mozilla63
People
(Reporter: robwu, Assigned: robwu)
References
Details
(Whiteboard: [MemShrink:P1])
Attachments
(7 files)
2.21 KB,
application/zip
|
Details | |
60.90 KB,
application/gzip
|
Details | |
3.52 MB,
text/x-log
|
Details | |
59 bytes,
text/x-review-board-request
|
mixedpuppy
:
review+
lizzard
:
approval-mozilla-beta+
|
Details |
59 bytes,
text/x-review-board-request
|
mixedpuppy
:
review+
lizzard
:
approval-mozilla-beta-
|
Details |
59 bytes,
text/x-review-board-request
|
mixedpuppy
:
review+
lizzard
:
approval-mozilla-beta-
|
Details |
256.40 KB,
image/png
|
Details |
When calling browser.storage.sync.get (and set?), then the jsonSandbox (created via [1], [2] and [3]) is leaked when the page is closed: chrome.storage.sync.get({booleanValue: true}, function(items) {}); If a temporary add-on ID is used (i.e. manifest.json does not have applications.gecko.id), then the content script sandbox is leaked too, but only if a property of the resolved storage object is accessed: chrome.storage.sync.get({booleanValue: true}, function(items) { void items.booleanValue; }); chrome.storage.sync.get({booleanValue: true}, function(items) {}); // ^ The last does not leak the content script sandbox, the first does. STR. 1. Put page.html (in leakjsonSandbox.zip) on a server, e.g. at localhost 2. Load the extension (leakjsonSandbox.zip). 3. Open about:memory 4. Open page.html 5. Follow the instructions at page.html (basically create 10 frames, then remove those frames). In about:memory, the leaked sandboxes can be found by searching for "leaky, moz". To reproduce the content script sandbox leakage, unzip the extension, edit manifest.json and remove the "applications" section. Reproduced in Firefox 59.0.2 and Firefox Nightly 20180504220115 [1]: https://searchfox.org/mozilla-central/rev/c0d81882c7941c4ff13a50603e37095cdab0d1ea/toolkit/components/extensions/ExtensionCommon.jsm#259-268 [2]: https://searchfox.org/mozilla-central/rev/53afcfdbabed96883126d0ebbcac499b358e32f2/toolkit/components/extensions/child/ext-storage.js#66-81 [3]: https://searchfox.org/mozilla-central/rev/53afcfdbabed96883126d0ebbcac499b358e32f2/toolkit/components/extensions/child/ext-storage.js#119-121
Assignee | ||
Comment 1•6 years ago
|
||
Memory report from about:debugging, generated on Firefox Nightly 20180504220115 on Linux as follows: 1. Unzip the attachment from https://bugzilla.mozilla.org/show_bug.cgi?id=1459404 2. Start a server at port 3000 at the directory of step 1. 3. web-ext run -f firefox-nightly -u about:memory -u http://localhost:3000/page.html 4. Click on "Leak me" 5. Click on "Clean me". 6. Switch to "about:memory" and click on "Minimize memory usage" 7. Click on "Measure and save" See attachment for the result. If you load it at about:memory and use Ctrl-F "leaky, moz", then you will find 10 leaked compartments.
Assignee | ||
Comment 2•6 years ago
|
||
CC logs for the process containing localhost:3000/page.html, after running the STR from comment 1 and clicking "Save verbose" at about:memory.
Updated•6 years ago
|
Whiteboard: [MemShrink]
Updated•6 years ago
|
Whiteboard: [MemShrink] → [MemShrink:P1]
Assignee | ||
Comment 3•6 years ago
|
||
I've found the root cause, and will fix it as part of bug 1462121. When the frames in the demo are unloaded, the pagehide event is triggered: https://searchfox.org/mozilla-central/rev/00dd116638001772fe354b081353b73f1cad405d/toolkit/components/extensions/ExtensionCommon.jsm#164-173,176 The "// Put this off until the next tick." logic does not work, because the JS stack is empty when pagehide is triggered, and therefore immediately after dispatching "pagehide" the microtask is run, causing this.active to be set to false. Since the frame is being removed, the inner window is eventually destroyed, which triggers the close handler: https://searchfox.org/mozilla-central/rev/00dd116638001772fe354b081353b73f1cad405d/toolkit/components/extensions/ExtensionCommon.jsm#178,181-184 Since this.active was set to false before, the pageshow/pagehide listeners aren't removed. When I suppress the pagehide handler in ExtensionCommon.jsm, then the leak does not occur.
Assignee | ||
Comment 4•6 years ago
|
||
To show that the above analysis is correct, modify the test case (leakjsonSandbox.zip) as follows: Edit manifest.json and change: "run_at": "document_start" to "run_at": "document_end" Edit page.html, and insert this JS snippet (e.g. before </script>): addEventListener("pagehide", (event) => { event.stopImmediatePropagation(); }); Then run the STR from comment 0 (or comment 1). When I search for "leaky," in about:memory, then I don't get any results, which means that the memory leak does not happen if the pagehide logic is suppressed: https://searchfox.org/mozilla-central/rev/00dd116638001772fe354b081353b73f1cad405d/toolkit/components/extensions/ExtensionCommon.jsm#164-173
Updated•6 years ago
|
Product: Toolkit → WebExtensions
Assignee | ||
Comment 5•6 years ago
|
||
This turns out to be unrelated to bug 1462121. The test case that I used in comment 4 is bogus; Locally I changed the name to something else than "leaky", and then it is not surprising that "leaky," doesn't appear in about:memory any more. I've found that the leak is caused by the unbalanced event listener in https://searchfox.org/mozilla-central/rev/6f86cc3479f80ace97f62634e2c82a483d1ede40/toolkit/components/extensions/ExtensionChild.jsm#883-884 When both of these listeners are unregistered at context closure, the object is GC'd. I'm currently developing a patch with regression tests.
No longer depends on: 1462121
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 8•6 years ago
|
||
Explanation of bug for reviewers: Every instance of BaseContext lazily initializes a ChildAPIManager when an extension API is accessed ("browser" / "chrome"): https://searchfox.org/mozilla-central/source/toolkit/components/extensions/ExtensionContent.jsm#663,669 https://searchfox.org/mozilla-central/source/toolkit/components/extensions/ExtensionPageChild.jsm#271,277 https://searchfox.org/mozilla-central/source/toolkit/components/extensions/ExtensionPageChild.jsm#322,328 In the constructor of ChildAPIManager, event listeners are registered (on a message manager) but never unregistered: https://searchfox.org/mozilla-central/rev/6f86cc3479f80ace97f62634e2c82a483d1ede40/toolkit/components/extensions/ExtensionChild.jsm#883-884 Because the ChildAPIManager cannot be GC'd, and the original context is reachable via the "context" member of the ChildAPIManager, the context is kept alive, and subsequently also any other member of the context, such as the jsonSandbox from the top of this bug. This bug results in wasted memory that can only be reclaimed when the tab is closed (likely when the message manager is destroyed).
Priority: -- → P1
Assignee | ||
Updated•6 years ago
|
Summary: Memory leak when storage.sync API is used → Memory leak (via ChildAPIManager) when asynchronous extension API is used
Comment 9•6 years ago
|
||
mozreview-review |
Comment on attachment 8992686 [details] Bug 1459404 - Ensure that ChildAPIManager can be garbage-collected. https://reviewboard.mozilla.org/r/257544/#review264472 ::: toolkit/components/extensions/test/xpcshell/test_ext_contexts_gc.js:31 (Diff revision 1) > + }); > +} > + > +async function assertContextReleased(contentPage, description) { > + await contentPage.spawn(description, async assertionDescription => { > + // Force GC, see https://searchfox.org/mozilla-central/rev/b0275bc977ad7fda615ef34b822bba938f2b16fd/testing/talos/talos/tests/devtools/addon/content/damp.js#84-98 please also reference the explanation directly since you have to dig through some files, and a non-perma-link to find the reference. bug 610166#c12
Attachment #8992686 -
Flags: review?(mixedpuppy) → review+
Comment 10•6 years ago
|
||
mozreview-review |
Comment on attachment 8992687 [details] Bug 1459404 - Add regression test for GC of ExtensionPageContextChild https://reviewboard.mozilla.org/r/257546/#review264476 I suppose its fine, but splitting this out was unecessary.
Attachment #8992687 -
Flags: review?(mixedpuppy) → review+
Comment hidden (mozreview-request) |
Assignee | ||
Comment 12•6 years ago
|
||
I added a separate patch for the comment change so that there is a single commit with a useful commit message that links the two comments in the code together (good for blame). Also, I decided to separate the first two patches because the second test-only patch feels a bit less reliable than the one in the first test due to the artificial wait with a magic number. I will ask for an uplift of the first patch only, which has the fix and a regression test.
Comment 13•6 years ago
|
||
mozreview-review |
Comment on attachment 8992751 [details] Bug 1459404 - Add permalink of explanation to garbageCollection logic https://reviewboard.mozilla.org/r/257608/#review264478 Just a comment update, so I'll skip passing this to a peer for that section.
Attachment #8992751 -
Flags: review?(mixedpuppy) → review+
Comment 14•6 years ago
|
||
Pushed by rob@robwu.nl: https://hg.mozilla.org/integration/autoland/rev/0f5d5144aeee Ensure that ChildAPIManager can be garbage-collected. r=mixedpuppy https://hg.mozilla.org/integration/autoland/rev/333c6bee4e9c Add regression test for GC of ExtensionPageContextChild r=mixedpuppy https://hg.mozilla.org/integration/autoland/rev/0cf499a0fee9 Add permalink of explanation to garbageCollection logic r=mixedpuppy
Comment 15•6 years ago
|
||
Backed out 3 changesets (bug 1459404) for xpcshell failures in toolkit/components/extensions/test/xpcshell/test_ext_contexts_gc.js on a CLOSED TREE Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=0cf499a0fee9540c2e98a5df5b487cd7be0f11c3&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=usercancel&filter-resultStatus=runnable&selectedJob=188705841 Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=188705841&repo=autoland&lineNumber=13111 Backout push: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=76d032c35a0a017ac8c1350ff3700f3462803391&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=usercancel&filter-resultStatus=runnable
Flags: needinfo?(rob)
Assignee | ||
Comment 16•6 years ago
|
||
In all cases (Windows), the tests themselves do pass, but the test fail anyway because the test runner fails at removing the temporary test extension: E.g. https://treeherder.mozilla.org/logviewer.html#?job_id=188705841&repo=autoland&lineNumber=13335-13348 09:43:37 INFO - (xpcshell/head.js) | test test_ExtensionPageContextChild_in_toplevel finished (2) 09:43:37 INFO - "CONSOLE_MESSAGE: (error) [JavaScript Error: "[Exception... "Component returned failure code: 0x80520015 (NS_ERROR_FILE_ACCESS_DENIED) [nsIFile.remove]" nsresult: "0x80520015 (NS_ERROR_FILE_ACCESS_DENIED)" location: "JS frame :: resource://gre/modules/Extension.jsm :: cleanupGeneratedFile/< :: line 1815" data: no]"] 09:43:37 INFO - cleanupGeneratedFile/<@resource://gre/modules/Extension.jsm:1815:7 09:43:37 INFO - _do_main@Z:\\task_1531904253\\build\\tests\\xpcshell\\head.js:219:3 09:43:37 INFO - _execute_test@Z:\\task_1531904253\\build\\tests\\xpcshell\\head.js:533:5 09:43:37 INFO - @-e:1:1 09:43:37 INFO - " 09:43:37 INFO - (xpcshell/head.js) | test run_next_test 5 finished (1) 09:43:37 INFO - exiting test 09:43:37 INFO - Error: Found unexpected files in temporary directory: generated-extension.xpi at resource://testing-common/AddonTestUtils.jsm:408 09:43:37 INFO - init/<@resource://testing-common/AddonTestUtils.jsm:408:15 09:43:37 INFO - _execute_test/<@Z:\task_1531904253\build\tests\xpcshell\head.js:597:28 09:43:37 INFO - async*_execute_test@Z:\task_1531904253\build\tests\xpcshell\head.js:594:4 09:43:37 INFO - @-e:1:1 Bah :(
Flags: needinfo?(rob)
Assignee | ||
Comment 17•6 years ago
|
||
This looks unexpected: PID 8852 | --DOMWINDOW == 6 (0967C400) [pid = 4292] [serial = 11] [outer = 00000000] [url = about:neterror?e=fileNotFound&u=moz-extension%3A//f342662a-dbd8-4a43-9158-e264c64e7975/iframe.html&c=UTF-8&f=regular&d=Firefox%20can%E2%80%99t%20find%20the%20file%20at%20moz-extension%3A//f342662a-dbd8-4a43-9158-e264c64e7975/iframe.html.] Apparently the last test has an unnecessary <iframe>. I'll remove it and push without further awaiting review (since this <iframe> does clearly not belong there). diff --git a/toolkit/components/extensions/test/xpcshell/test_ext_contexts_gc.js b/toolkit/components/extensions/test/xpcshell/test_ext_contexts_gc.js index 342496854950..7c84ab123fc6 100644 --- a/toolkit/components/extensions/test/xpcshell/test_ext_contexts_gc.js +++ b/toolkit/components/extensions/test/xpcshell/test_ext_contexts_gc.js @@ -180,7 +180,6 @@ add_task(async function test_ExtensionPageContextChild_in_toplevel() { files: { "toplevel.html": ` <!DOCTYPE html><meta charset="utf8"> - <iframe src="iframe.html"></iframe> <script src="script.js"></script> `, "script.js": "browser.test.sendMessage('extensionPageLoaded');", If the test failure is fixed with this bug, then I think that we need to investigate why failure to load an extension page keeps the file locked.
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment 20•6 years ago
|
||
Pushed by rob@robwu.nl: https://hg.mozilla.org/integration/autoland/rev/47ec1288baef Ensure that ChildAPIManager can be garbage-collected. r=mixedpuppy https://hg.mozilla.org/integration/autoland/rev/26ded4c81f03 Add regression test for GC of ExtensionPageContextChild r=mixedpuppy https://hg.mozilla.org/integration/autoland/rev/c028d93bbf3f Add permalink of explanation to garbageCollection logic r=mixedpuppy
Comment 21•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/47ec1288baef https://hg.mozilla.org/mozilla-central/rev/26ded4c81f03 https://hg.mozilla.org/mozilla-central/rev/c028d93bbf3f
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
status-firefox63:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
Assignee | ||
Comment 22•6 years ago
|
||
I have measured the impact of the memory leak. The impact of the leak is 14 kB per frame/document (until the tab is closed). 180 kB leaked per frame without the patch. 166 kB leaked per frame with patch. 138 kB leaked per frame when the "browser" extension API object is not lazily initialized (regardless of patch) (so we might have another memleak). Details: For each scenario I did the following: - Open 200 frames (by clicking on a button in a test page). - Scenario-specific: * "nonexisting" property access: access the "nonexisting" property in the global namespace. * "browser" property access: access the "browser" property in the global namespace (which triggers lazy initialization of the object that caused the memory leak) - Click "Minimize memory usage" three times at about:memory - Take a heap snapshot with the Memory tab of the Browser toolbox and note the size. - Repeat the above five times. - Calculate the deltas, discard the two extreme values. - Calculate the average of the three remaining values (average of medians). Results (unless stated otherwise, tested with unpatched/buggy Firefox Nightly build ID 20180708100828): Open 200 frames, access global property "nonexisting" and remove all frames: 27.44 MB per 200 moz-extension:-frames 27.81 MB per 200 file:-frames (for reference, with e10s disabled) 27.58 MB per 200 moz-extension:-frames (Firefox Nightly build ID 20180719100142, with patch) Open 200 frames, access global property "browser" and remove all frames (this reproduces the root cause of the bug): 36.01 MB per 200 moz-extension:-frames 33.12 MB per 200 moz-extension:-frames (Firefox Nightly build ID 20180719100142, with patch)
Assignee | ||
Comment 23•6 years ago
|
||
Comment on attachment 8992686 [details] Bug 1459404 - Ensure that ChildAPIManager can be garbage-collected. Approval Request Comment [Feature/Bug causing the regression]: API manager of WebExtensions framework does not remove event listeners upon context destruction (e.g. at document unload). [User impact if declined]: Memory leak in extension pages and non-extension pages with a content script, at least 14 kB per frame until the tab is closed (see https://bugzilla.mozilla.org/show_bug.cgi?id=1459404#c22 ) [Is this code covered by automated tests?]: Yes. [Has the fix been verified in Nightly?]: Yes. [Needs manual test from QE? If yes, steps to reproduce]: No, covered by automated tests (and manually verified). [List of other uplifts needed for the feature/fix]: N/A [Is the change risky?]: No. [Why is the change risky/not risky?]: The fix itself consists of two extra lines. [String changes made/needed]: N/A
Attachment #8992686 -
Flags: approval-mozilla-beta?
Updated•6 years ago
|
status-firefox61:
--- → fix-optional
status-firefox62:
--- → affected
Comment 24•6 years ago
|
||
Note, verification usually should not be done by the developer working on the fix or the reviewer. In this case, given the complexity of the verifying, I'll take it as pretty good evidence this fix works!
Comment 25•6 years ago
|
||
Comment on attachment 8992686 [details] Bug 1459404 - Ensure that ChildAPIManager can be garbage-collected. Fix for a memory leak, let's uplift for beta 11. Thanks for adding new tests.
Attachment #8992686 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 26•6 years ago
|
||
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #25) > Comment on attachment 8992686 [details] > Bug 1459404 - Ensure that ChildAPIManager can be garbage-collected. > > Fix for a memory leak, let's uplift for beta 11. > Thanks for adding new tests. Hi, should i uplift just the approved part of this bug or wait for the other two to be approved for beta as well?
Flags: needinfo?(lhenry)
Assignee | ||
Comment 27•6 years ago
|
||
(In reply to Andreea Pavel [:apavel] from comment #26) > (In reply to Liz Henry (:lizzard) (needinfo? me) from comment #25) > > Comment on attachment 8992686 [details] > > Bug 1459404 - Ensure that ChildAPIManager can be garbage-collected. > > > > Fix for a memory leak, let's uplift for beta 11. > > Thanks for adding new tests. > > Hi, should i uplift just the approved part of this bug or wait for the other > two to be approved for beta as well? Just the approved one (this features the fix and tests). The other two are a test-only patch (adding more extensive test coverage), and a comment update, respectively.
Flags: needinfo?(lhenry)
Comment 28•6 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/52dd0b769e22
Comment 29•6 years ago
|
||
:robwu thanks for clarifying that for me.
Updated•6 years ago
|
Attachment #8992687 -
Flags: approval-mozilla-beta+
Updated•6 years ago
|
Attachment #8992751 -
Flags: approval-mozilla-beta+
Updated•6 years ago
|
Attachment #8992687 -
Flags: approval-mozilla-beta+ → approval-mozilla-beta-
Updated•6 years ago
|
Attachment #8992751 -
Flags: approval-mozilla-beta+ → approval-mozilla-beta-
Comment 30•6 years ago
|
||
I was able to reproduce this issue on Firefox 61.0.1 (20180713213322) under Win 7 64-bit and Mac OS X 10.13.3. This issue is verified as fixed on Firefox 63.0a1(20180704003137) and Firefox 62.0b15 (20180806191531) under Win 7 64-bit and Mac OS X 10.13.3. Please see the attached screenshot.
Updated•6 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•