Closed Bug 951491 Opened 12 years ago Closed 12 years ago

Monster Madness always leaves a ghost window

Categories

(Core :: DOM: Core & HTML, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla30
Tracking Status
firefox28 --- verified
firefox29 --- verified
firefox30 --- verified
b2g-v1.3 --- fixed
b2g-v1.3T --- fixed
b2g-v1.4 --- fixed

People

(Reporter: luke, Assigned: mccr8)

References

Details

(Whiteboard: [MemShrink:P2][games:p1][qa-])

Attachments

(1 file, 1 obsolete file)

If I load the Monster Madness demo http://www.playverse.com/Anonplayer/0-a2aadd1b76e14d0e848ea1de18dca4e8 close the tab, open about:memory, and click "Minimize Memory Usage", the associated window (containing the 512MB heap ArrayBuffer) is shown as "top(none)/ghost/window". After that, the window seems to never go away (at least not after 10 minutes and several "Minimize" clicks). Loading the game again leaks a second window. This seems like a pretty serious leak. The Citadel demo uses the same engine but doesn't show any leaks. The difference is I think Monster Madness does a lot more with the DOM and WebAPIs (WebAudio, WebSockets, and IndexedDB, at least) which is why I'm tentatively filing under DOM in hopes that CC-knowing folk will point out the edge keeping the window alive.
Ouch. That's a bad window to leak. From bug 951489: │ │ ├──838,010,064 B (24.10%) -- ghost/window(http://www.playverse.com/Anonplayer/0-a2aadd1b76e14d0e848ea1de18dca4e8) mccr8, do you think you could investigate?
Assignee: nobody → continuation
Hmm, can't get that to load in a debug build to check whether it is a runtime leak only or shutdown.
Looks like a shutdown leak. Something is keeping a reference to the window until we close. (shutdown leaks are annoying, harder to debug than runtime leaks)
This will probably be tricky to figure out, because there's a missing reference to the window itself. Olli said disabling WebVTT didn't fix it.
If it's any help, I am able to reproduce this ghost window by navigating to the page and then closing the tab as soon as the splash screen appears. If you're fast enough, that is before asm.js even kicks in: 311.52 MB (100.0%) -- explicit ├──131.50 MB (42.21%) -- window-objects │ ├───75.21 MB (24.14%) -- top(none)/ghost/window(http://www.playverse.com/Anonplayer/0-a2aadd1b76e14d0e848ea1de18dca4e8) │ │ ├──74.57 MB (23.94%) -- js-compartment(http://www.playverse.com/Anonplayer/0-a2aadd1b76e14d0e848ea1de18dca4 │ │ │ ├──73.17 MB (23.49%) -- objects │ │ │ │ ├──72.18 MB (23.17%) -- malloc-heap │ │ │ │ │ ├──71.98 MB (23.11%) ── elements/non-asm.js │ │ │ │ │ └───0.19 MB (00.06%) ── slots │ │ │ │ └───1.00 MB (00.32%) ++ (2 tiny) │ │ │ └───1.40 MB (00.45%) ++ (5 tiny) │ │ └───0.64 MB (00.20%) ++ (3 tiny)
Whiteboard: [MemShrink][games] → [MemShrink:P2][games]
The link in comment 0 no longer works -- I now get an almost blank page that says "Anonymous accounts are not enabled for this game!" More generally, to make progress with this we need to reduce the test case, e.g. disabling the various advanced features in order to work out the cause. (The fact that the problem occurs immediately is a big clue, and should let us strip out big chunks of code.) Martin, do we have contacts with the Monster Madness people?
Flags: needinfo?(mbest)
I was hoping this might be WebRTC related, as there's already a known leak there that has a patch, but from the blog post it sounds like it doesn't use it. ( https://hacks.mozilla.org/2013/12/monster-madness-creating-games-on-the-web-with-emscripten/ )
I think I am seeing the same issue on another asm.js-using page, which is much smaller: http://www.visagetechnologies.com/HTML5/Samples/VirtualEyewearTryOn/eyewear.html Loading the page, then closing it, shows it remain in about:memory forever, eventually marked as a ghost window. Note that that page uses WebRTC for camera input, not sure if that's relevant or not.
> Note that that page uses WebRTC for camera input, not sure if that's > relevant or not. In that case, it might be the same as bug 919244, and possibly unrelated to this bug...
(In reply to Alon Zakai (:azakai) from comment #9) (That's a great demo!) At least in Nightly, the window and asm.js heap seem to go away without even clicking "Minimize Memory Usage" after I click "Measure" several times. I'll email the monster madness people and see if we can't get a link for work here.
I built using the patch in 919244 (wow, mach is fast!), and it did *not* resolve the issue. So seems to not be that other bug, and could be the same issue as Monster Madness here. luke: odd that it does not happen for you, it is 100% consistent on my machine, including different profiles, etc.
That is weird; I've run it several times. (I also noticed an unrelated webrtc shutdown crash: bug 957837.)
That page certainly leaks here, and nsIDOMGetUserMediaSuccessCallback is a root in the CC graph, so something is keeping that alive.
Even more so, nsIDOMGetUserMediaSuccessCallback is _the_ root of the huge graph of 137176 objects. Nothing else is keeping that all alive.
> That page certainly leaks here, and nsIDOMGetUserMediaSuccessCallback is a > root in the CC graph, so something is keeping that alive. Are you talking about Monster Madness or the new test case from azakai? Can we file a separate bug for the latter?
Filed bug 957888 for my testcase.
I was talking about azakai's test case.
I noticed there are a huge number of CallbackObjects with references to the window that is leaking. Those in turn are being held alive by mListenerManager, through some paths that involve some JS crypto library. Maybe I'll try clearing the event listener manager on the ghost window and seeing if that causes the page to go away. If that works, then I can log all of the releases to see what the extra reference is.
Andrew, Olli, Luke and I have info about how to access the game again, though it's not yet public. Please email if you want access for testing.
Flags: needinfo?(mbest)
Whiteboard: [MemShrink:P2][games] → [MemShrink:P2][games:p1]
Hey guys, happy to put you in contact with the Monster Madness team. I'll drop them a email with this bug number.
(In reply to Martin Best (:mbest) from comment #21) Hi Martin, sorry for not updating the bug, but I got a live private link from the devs and already sent it to Olli, Nick and Andrew.
It turns out that what is happening here is that the WorkerPrivateParent has a field mSynchronizeRunnable which is not CCed, and holds a pointer to the window. If we tell the CC about that window pointer, as this patch does, then there is no ghost window. Presumably this is not really okay, because the runnable is some kind of threadsafe object, but maybe it is. It could also be the case that mSynchronizeRunnable should be killed off when we shut the page, or something. That seems like a more likely cause of the leak.
Component: DOM → DOM: Workers
Blocks: 966762
No longer blocks: 966762
Depends on: 966762
It looks like it isn't workers after all. An XHR spins up a nsResumeTimeoutsEvent for our window, then we call FreeInnerObjects() on the window, then the resume timeouts event fires, which calls nsGlobalWindow::ResumeTimeouts, which fails to realize the FreeInnerObjects() has been called, and it ends up in WorkerPrivateParent<mozilla::dom::workers::WorkerPrivate>::SynchronizeAndResume, which creates a new SynchronizeAndResumeRunnable, which ends up holding alive the window. Returning early from ResumeTimeouts if mListenerManager is nullptr seems to do fix the leak, though surely there's a better way to detect FreeInnerObjects being called. khuey says we should also "add assertions for inner window correctness in the worker code".
Component: DOM: Workers → DOM
I think we should just CheckInnerWindowCorrectness in ResumeTimeouts. Does that sound reasonable?
Flags: needinfo?(bzbarsky)
I tried bailing out of ResumeTimeouts when either the outer window was null, or the inner of the outer wasn't equal to |this|, but there was still a leak.
Looks like we don't really have any good flag for FreeInnerObjects. mChromeEventHandler is closest, but it is really for something else. But why does WorkerPrivateParent<mozilla::dom::workers::WorkerPrivate>::SynchronizeAndResume really do anything if mozilla::dom::workers::CancelWorkersForWindow(this); has been called?
Yes, we could check the parent status in Resume. It does seem like the window should be swallowing these calls though.
Checking for the window being the current inner in ResumeTimeouts may make sense, yes.
Flags: needinfo?(bzbarsky)
Update: the Monster Madness demo is public again after a round of bugfixes: http://www.playverse.com/Anonplayer/0-a2aadd1b76e14d0e848ea1de18dca4e8
This is a bit of a sledgehammer approach, suggested by jst. We simply store a flag on nsGlobalWindow that checks if we've called FreeInnerObjects(). If we have, don't do anything in ResumeTimeouts(). Most of what ResumeTimeouts() does is neutralized by FreeInnerObjects(), but some things, like workers, some stuff with game pad and system events, isn't, as far as I can see, so it seems like it is a good idea to just squelch this altogether, rather than relying on everybody enforcing an uncheckable non-local property. https://tbpl.mozilla.org/?tree=Try&rev=746603e4f8af
Attachment #8377907 - Flags: review?(bzbarsky)
Comment on attachment 8377907 [details] [diff] [review] Ensure ResumeTimeouts() after FreeInnerObjects() does nothing. Document that the member is only used by inner windows, please. r=me
Attachment #8377907 - Flags: review?(bzbarsky) → review+
Updated the comment. Thanks for the reviews! https://hg.mozilla.org/integration/mozilla-inbound/rev/fc5ea4b0b114
Backed out on suspicion of causing Valgrind leaks. https://hg.mozilla.org/integration/mozilla-inbound/rev/9abc3581f256
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
\o/
Attachment #8369718 - Attachment is obsolete: true
Comment on attachment 8377907 [details] [diff] [review] Ensure ResumeTimeouts() after FreeInnerObjects() does nothing. [Approval Request Comment] Bug caused by (feature/regressing bug #): old User impact if declined: leaks on Monster Madness when closing it before it loads, which is the first game to be commercially available using asm.js or something like that. Testing completed (on m-c, etc.): it has been on m-c for a few days Risk to taking this patch (and alternatives if risky): pretty low I think String or IDL/UUID changes made by this patch: none
Attachment #8377907 - Flags: approval-mozilla-beta?
Attachment #8377907 - Flags: approval-mozilla-aurora?
Comment on attachment 8377907 [details] [diff] [review] Ensure ResumeTimeouts() after FreeInnerObjects() does nothing. this is our last week to take speculative (and non-tracked) landings on branches so let's give it a go and we have time to backout next week if any unforeseen issues arise.
Attachment #8377907 - Flags: approval-mozilla-beta?
Attachment #8377907 - Flags: approval-mozilla-beta+
Attachment #8377907 - Flags: approval-mozilla-aurora?
Attachment #8377907 - Flags: approval-mozilla-aurora+
Paul, can you please test this on tomorrow's Nightly and Aurora builds? This will also need to be tested in Firefox 28.0b7 once we get builds.
Flags: needinfo?(paul.silaghi)
Keywords: verifyme
QA Contact: paul.silaghi
It looks like public beta testing is down again. Luke can you email Paul with info about how to run it? I'm not sure how to run it at the moment.
Flags: needinfo?(luke)
Unfortunately, I don't have any special links anymore. However, I did personally test the link, when it was live, and the ghost window problem was fixed.
Flags: needinfo?(luke)
(In reply to Luke Wagner [:luke] from comment #44) > Unfortunately, I don't have any special links anymore. However, I did > personally test the link, when it was live, and the ghost window problem was > fixed. So does this need testing? Is there a demo available for us to test? I'm thinking the answer to both is 'no'.
No and no.
Thanks Luke. I'm marking verified fixed based on your comment 44.
Status: RESOLVED → VERIFIED
Flags: needinfo?(paul.silaghi)
Keywords: verifyme
Whiteboard: [MemShrink:P2][games:p1] → [MemShrink:P2][games:p1][qa-]
See Also: → 1256621
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: