Closed Bug 1256621 Opened 9 years ago Closed 9 years ago

Detached iframe content is kept alive, how to debug and free up?

Categories

(Web Compatibility :: Site Reports, defect)

defect
Not set
blocker

Tracking

(firefox48 affected)

RESOLVED FIXED

People

(Reporter: jujjyl, Unassigned)

Attachments

(3 files)

We are running a test harness that spawns the individual tests in an iframe, one test at a time (the previous iframe is killed and replaced with the new one). However, the iframes are not being freed/GC'd, which causes a huge memory leak, and one can't run the suite twice without crashing the browser with an OOM. about:memory shows

2,042.32 MB (100.0%) -- explicit
├──1,019.35 MB (49.91%) -- window-objects
│  ├──1,006.89 MB (49.30%) -- top(none)/detached
│  │  ├────494.36 MB (24.21%) -- window(http://localhost:5000/assets/games/heroesofparagon.html?playback)
│  │  │    ├──486.50 MB (23.82%) ++ js-compartment(http://localhost:5000/assets/games/heroesofparagon.html?playback, about:blank)
│  │  │    └────7.86 MB (00.38%) ++ (3 tiny)
│  │  ├────470.62 MB (23.04%) -- window(http://localhost:5000/assets/games/suntemple.html?playback)
│  │  │    ├──462.76 MB (22.66%) ++ js-compartment(http://localhost:5000/assets/games/suntemple.html?playback, about:blank)
│  │  │    └────7.86 MB (00.38%) ++ (3 tiny)
│  │  └─────41.90 MB (02.05%) -- window(http://localhost:5000/assets/games/sponza.html?playback)
│  │        ├──39.10 MB (01.91%) ++ js-compartment(http://localhost:5000/assets/games/sponza.html?playback, about:blank)
│  │        └───2.81 MB (00.14%) ++ (3 tiny)
│  └─────12.47 MB (00.61%) ++ top(http://localhost:5000/#/features.html, id=2147483649)

which should all be freed (the user is navigating features.html, and the page no longer has <iframe>s in the DOM). How does one debug and figure out what is pinning down these detached frames?
Attached verbose CC edges log of the site.
Attached verbose GC log of the site.
Severity: normal → blocker
Resolving this is critical for Mozilla GDC messaging, any help would be appreciated! Query me directly for access to the site for testing if necessary.
The <iframe> is created like this:

    scope.$evalAsync(function(){
      scope.$parent.testState = "in-progress";
      scope.activeGameKey = scope.activeGameKey || keys[0];
      var width = scope.games[scope.activeGameKey].width;
      var height = scope.games[scope.activeGameKey].height;
      var url = scope.games[scope.activeGameKey].url + '?playback';
      var html = '<iframe src="' + url + '" width="' + width + '" height="' + height + '" onload="console.log(\'iframe loaded.\');" frameBorder=\"0\"></iframe>';
      $('#game-container').html(html);

with the previous <iframe> element being killed by the last line.

Luke suspected that it might be the Firefox navigation cache pinning the iframe window objects down, but that does not feel like the case: shouldn't those be reclaimable via the about:memory GC/CC/Minimize memory usage buttons? Luke also suggested adding a magic onload handler to the iframe so that it would not get cached in the page navigation cache, but that does not seem to resolve it, so this is probably not a navigation cache issue.
:smaug, I know you know how to read the CC logs in and out, anything you can catch there?
Flags: needinfo?(bugs)
Flags: needinfo?(bugs)
Which version of FF should be used for testing? Can you reproduce in Nightlies?
An onload handler doesn't affect bfcache in any way. It is adding a listener for unload or beforeunload events that disables bfcache. But Gecko uses bfcache only for top-level navigations.
I'm still not sure what is actually leaking, but the following looks a bit suspicious:

via mCallback :
0x12ce7dcc0 [Function n.event.add/r.handle]
    --[fun_environment]--> 0x136d41080 [Call <no private>]
    --[enclosing_environment]--> 0x123e06c40 [Call <no private>]
    --[I]--> 0x133148700 [HTMLHtmlElement <no private>]

That hints that some event listener ends up keeping the root element of a document alive, which then keeps tons of other stuff alive.
That seems to be some keydown listener, added to nsDocument normal (xhtml) http://localhost:5000/assets/games/suntemple.html?playback, but then that document is kept alive via

via mIncumbentJSGlobal :
0x1348672e0 [Window <no private>]
    --[CLASS_OBJECT(Array)]--> 0x1348ed100 [HTMLDocument <no private>]

this takes some time...
Oh, gah. Right, it is a WantAllTraces=true graph. Could you provide both concise and verbose CC/GC logs generated right when you see the leak?
Flags: needinfo?(jujjyl)
Attached file memory-report.json.gz
about:memory report after all tests have run and back in the results page.
Concise and verbose CC&GC logs can be found at http://clb.demon.fi/dump/gc_cc_logs.tar.gz (33MB, a bit too big to attach in bugzilla)
Flags: needinfo?(jujjyl)
Thank you so much for helping to resolve - as of this morning, Mozilla went live with the announcement: https://blog.mozilla.org/blog/2016/03/14/mozilla-pushes-the-web-to-new-levels-as-a-platform-for-games/ , so getting the site to run smoothly is a top priority now! :( :* :)
Issues that are "top priority" should be filed more than 4 hours before release.
Kyle, if it helps, the site is not released yet.
Sorry, let me clarify: the site is announced in the blog post, but it is not yet live to the audience; that will happen "later this week". I do still agree this is last minute (I have had visibility into the site for less than 24 hours, and it is being built by an external agency), and I do apologize for reporting last-minute items like this.
FWIW, bug 951491 is a previous ghost window I fixed in an asm.js demo.
See Also: → 951491
The issue shows up in both stable and Nightly, and on both Windows and OS X. My suspicion is that there is an event handler or something similar being registered from parent to child, but I'm not sure how to figure out whether that is the case, or how to rule it out.
Looking at the attached log, the window reflector for heroesofparagon.html is being held alive via a field called mScriptOwner, which looks like some indexed DB thing: http://mxr.mozilla.org/mozilla-central/ident?i=mScriptOwner&tree=mozilla-central&filter= (The sponza.html window is being held by a callback like the other one Olli pointed out.)
Oh, so if you run find_roots.py with the -bro option (to ignore things like CCed objects that only mark the object gray) then you get this as the retaining path for the heroesofparagon.html reflector:

via mIncumbentJSGlobal :
0x11e0d2060 [Window <no private>]
    --[stopGame]--> 0x136d87f40 [Function .link/window.stopGame]
    --[fun_environment]--> 0x1235b0f00 [Call <no private>]
    --[scope]--> 0x170013e80 [Object <no private>]
    --[games]--> 0x11e0d44e0 [Object <no private>]
    --[heroesofparagon]--> 0x133131100 [Object <no private>]
    --[results]--> 0x136d8cc60 [Proxy <no private>]
    --[private]--> 0x123fa5160 [Object <no private>]
    --[shape]--> 0x12ad67ce0 [shape]
    --[base]--> 0x11eacc060 [base_shape]
    --[global]--> 0x11e0d2b00 [Window <no private>]

So, there's some "games" object that has an array of things that includes a pointer to the window for heroes of Paragon. That looks like it could be useful. The "games" object is in the scope of some function.
In other words, the "games" object (I assume the same one as in comment 4) just has a pointer to the window for heroesofparagon.html. I think 0x11e0d2060 is the reflector for the window http://localhost:5000/#/features.html which I assume is the main window.
Also, using the regular dev tools dominators stuff probably will produce more easily useful results than low level CC logs. :) https://developer.mozilla.org/en-US/docs/Tools/Memory/Dominators_view
I'm going to move this to Tech Evangelism because it sounds like a page leak rather than a DOM issue.
Component: DOM: Core & HTML → Desktop
Product: Core → Tech Evangelism
For sponza, the path looks like this:

via mIncumbentJSGlobal :
0x11e0d2060 [Window <no private>]
    --[stopGame]--> 0x136d87f40 [Function .link/window.stopGame]
    --[fun_environment]--> 0x1235b0f00 [Call <no private>]
    --[scope]--> 0x170013e80 [Object <no private>]
    --[games]--> 0x11e0d44e0 [Object <no private>]
    --[sponzadynamicshadows]--> 0x136d91f80 [Object <no private>]
    --[results]--> 0x12ce58840 [Proxy <no private>]
    --[private]--> 0x13483fb20 [Object <no private>]
    --[shape]--> 0x12b85f998 [shape]
    --[base]--> 0x123f25498 [base_shape]
    --[global]--> 0x11e0d21a0 [Window <no private>]

For suntemple, the retaining path looks like this:

via mIncumbentJSGlobal :
0x11e0d2060 [Window <no private>]
    --[stopGame]--> 0x136d87f40 [Function .link/window.stopGame]
    --[fun_environment]--> 0x1235b0f00 [Call <no private>]
    --[scope]--> 0x170013e80 [Object <no private>]
    --[games]--> 0x11e0d44e0 [Object <no private>]
    --[suntemple]--> 0x133131200 [Object <no private>]
    --[results]--> 0x12b8c0fc0 [Proxy <no private>]
    --[private]--> 0x136d921c0 [Object <no private>]
    --[shape]--> 0x12ee2c510 [shape]
    --[base]--> 0x12b88f240 [base_shape]
    --[global]--> 0x1348672e0 [Window <no private>]
For completeness, the way I generated these was:

1. First do "grep nsGlobalWindow cc-edges.log | grep inner". This gives a list of the inner global windows. Three of them are for the three leaking games.

2. Pick one of the leaking inner windows, like 0x14b93d400, and run find_roots [1] on it like this: "python ~/heapgraph/find_roots.py cc-edges.log 0x14b93d400". This will tell you that that inner window is being held alive by a JS object, 0x1348672e0.

3. Take that JS object and pass it to find_roots, with the GC edges file, and the -bro option (the -bro option means to ignore the roots from CCed stuff that don't "really" keep the object alive): "python ~/heapgraph/find_roots.py gc-edges.log -bro 0x1348672e0"

[1] find_roots.py is from here: https://github.com/amccreight/heapgraph
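For anyone who would rather script step 1 than chain greps, here is a rough JavaScript (Node) sketch of the same filter. The function name findInnerWindows and the sample lines are made up for illustration, and the guess that a node line starts with a hex address is an assumption about the log layout, not something taken from the heapgraph tools.

```javascript
// Keep every line that mentions both "nsGlobalWindow" and "inner",
// mirroring: grep nsGlobalWindow cc-edges.log | grep inner
function findInnerWindows(logText) {
  return logText
    .split('\n')
    .filter((line) => line.includes('nsGlobalWindow') && line.includes('inner'))
    .map((line) => {
      // Assumption: the line begins with the object's hex address.
      const match = line.match(/^(0x[0-9a-fA-F]+)/);
      return { address: match ? match[1] : null, line };
    });
}

// Hypothetical sample input, NOT real log output.
const sampleLog = [
  '0x1000 nsGlobalWindow inner http://example.invalid/game.html',
  '0x2000 nsGlobalWindow outer',
  '0x3000 HTMLHtmlElement',
].join('\n');

const innerWindows = findInnerWindows(sampleLog);
console.log(innerWindows.map((w) => w.address));
```

With a real log you would read the file first, for example with fs.readFileSync('cc-edges.log', 'utf8'), and then hand one of the returned addresses to find_roots.py as in step 2.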
Thank you so much for all the help here! With the instructions from comment 25, I was able to locate and fix. Now the site is properly releasing child iframes. Btw, when looking at this issue through the dominator tree view in Memory developer tab, I was not able to make heads or tails out of this, but Andrew's tool above looks like a much more efficient way to debug these. We should totally have that as a built-in provided tool for developers!
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Here's a real-world test case for the dev tools memory stuff, Nick, if you want to check it out. It sounds like Jukka wasn't able to figure it out for whatever reason.
Flags: needinfo?(nfitzgerald)
Is the testcase without the fix hosted somewhere? I'd be interested in talking about ways the memory tool could improve, and also playing with Real World test cases.

The dominators view is more for identifying the largest containers in your app, not identifying leaks so much. Leaks are a weakness of the tool currently. The dominators view is currently the only place where we have individuals rather than aggregates/groups, so it is currently the only place we can talk retaining paths (since retaining paths apply to individuals) at the moment.

The plan is to have some way to go from the "aggregates" view to a list of all individuals aggregated in a particular group, and then we can talk retaining paths of those individuals from that group. So in the case of a leaking window, you could filter the "aggregates" view to just "Window", then show the list of all individual windows, then inspect the retaining paths of each window.

:smaug also says that some kind of view for showing/aggregating all cross compartment edges might be nice for the specific case of leaking windows.
Flags: needinfo?(nfitzgerald) → needinfo?(jujjyl)
There's a ton of memory being used inside the leaked window though, so I would expect this to show up on a dominator view. Does it not account for memory inside other compartments?
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #29)
> Does it not account for memory inside other compartments?

It does account for all other debuggee compartments (which iframes would be) but not things from non-debuggee compartments/zones. See https://dxr.mozilla.org/mozilla-central/source/dom/webidl/ThreadSafeChromeUtils.webidl?from=ThreadSafeChromeUtils.webidl#60-90

It is possible that there was more than one edge into the iframe's compartment, at which point the iframe wouldn't be a dominator itself and its contents would rise up in the tree.
Sent testcase to Nick on IRC. In the end it turned out that the single object passed from child iframe to parent window via a direct function call "top.stopGame(someObjectCreatedInChildIFrame);" was the only thing retaining the iframes alive, and JSON serializing the object through to the parent fixed it.
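To illustrate the shape of that fix, here is a minimal sketch runnable in Node. It is not the site's actual code. Node's vm module stands in for the child iframe (an analogy only, since a vm context is not a browser window), and the names stopGameByReference, stopGameSerialized, and the result fields are hypothetical.

```javascript
const vm = require('vm');

// Stand-in for the child iframe: a separate global with its own built-ins.
// (Assumption: a vm context is only an analogy for an iframe's window.)
const childContext = vm.createContext({});
const childResults = vm.runInContext(
  '({ fps: 60, frameTimes: [16.6, 16.7] })', // hypothetical result data
  childContext
);

// Parent-side storage, loosely modelled on the "games" object seen in the
// retaining paths. The field names are illustrative.
const games = { suntemple: { results: null } };

// Leaky variant: the parent stores the child's object itself, so the
// parent now holds a reference into the child's global.
function stopGameByReference(results) {
  games.suntemple.results = results;
}

// Fixed variant: the child passes a string and the parent parses it,
// so the stored object is created in the parent's own global.
function stopGameSerialized(json) {
  games.suntemple.results = JSON.parse(json);
}

stopGameByReference(childResults);
const storedByReference = games.suntemple.results;
const referenceIsForeign =
  Object.getPrototypeOf(storedByReference) !== Object.prototype;

stopGameSerialized(JSON.stringify(childResults));
const storedCopy = games.suntemple.results;
const copyIsLocal = Object.getPrototypeOf(storedCopy) === Object.prototype;

console.log({ referenceIsForeign, copyIsLocal });
```

In the browser case the child would call something like top.stopGame(JSON.stringify(someObjectCreatedInChildIFrame)), so that only a string crosses the frame boundary and nothing in the parent points back into the detached iframe.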
Flags: needinfo?(jujjyl)
Product: Tech Evangelism → Web Compatibility