Closed Bug 1061202 Opened 5 years ago Closed 5 years ago

Increased memory use window-objects top(none)/detached, leaks the world

Categories

(Core :: DOM: Core & HTML, defect, major)

Tracking

RESOLVED FIXED
mozilla35
blocking-b2g 2.1+
Tracking Status
firefox33 --- unaffected
firefox34 + fixed
firefox35 + fixed
b2g-v2.0 --- unaffected
b2g-v2.0M --- unaffected
b2g-v2.1 --- fixed
b2g-v2.2 --- fixed

People

(Reporter: jmjjeffery, Assigned: bzbarsky)

References

Details

(Keywords: memory-leak, regression, Whiteboard: [MemShrink:P1])

Attachments

(8 files, 1 obsolete file)

Attached file memory-report.json.gz
Noticed a significant increase in memory use with my normal set of tabs: after a few hours, memory use balloons to a gig or more, where it normally runs 500-600 MB on my system.

good: 20140829122301 9b3fbc370631
bad:  20140829123201 2a354048f964 

421.67 MB (100.0%) -- explicit
├──256.32 MB (60.79%) -- window-objects
│  ├───85.32 MB (20.23%) -- top(none)/detached
│  │   ├──70.68 MB (16.76%) ++ (67 tiny)
│  │   ├──10.15 MB (02.41%) -- window(http://forums.mozillazine.org/viewforum.php?f=23)
│  │   │  ├───9.98 MB (02.37%) -- js-compartment(http://forums.mozillazine.org/viewforum.php?f=23, about:blank)
│  │   │  │   ├──8.36 MB (01.98%) ++ classes
│  │   │  │   └──1.62 MB (00.38%) ++ (4 tiny)
│  │   │  └───0.17 MB (00.04%) ++ (3 tiny)

Note the large amount of window-objects and the large number of (tiny) entries; those (tiny) entries are mostly linked to Google ads.

STR:  Load the 'bad' build as noted above:
Open in tabs several mozillazine forums
http://forums.mozillazine.org/viewforum.php?f=23
http://forums.mozillazine.org/viewforum.php?f=38
http://forums.mozillazine.org/viewforum.php?f=7

I have a total of six open all the time.

Reload the pages in those tabs several times.  The more they are reloaded/refreshed, the more you will see in the Windows Task Manager that memory use grows and is not reduced over time/reloads.

Tested on win7 x64, I have the 'new cache' disabled, but I've not tested with it enabled.  

Over time as the memory use increases the browser becomes sluggish.
Opening 'about:memory' will cause the browser to go 'not responding' for a brief period before recovering and loading the data.

Need help finding the regression from m-i builds.
Your regression range consists of a b2g-inbound merge, which seems unlikely.

If you could provide the verbose CC/GC logs we could investigate those.  They may contain private information, so if you don't want to post them on the bug email me @mozilla.com and we'll figure something out.
Files are too big to upload; even zipped, one was over 34 MB.

I don't have a dropbox account - suggestions ?
Andrew, can you make arrangements to get the files from Jim and analyze them?
Flags: needinfo?(continuation)
Jim, could you create the logs earlier, when you first start seeing something leaking?
Can you reproduce in safe mode?  It could be an addon.

(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #3)
> Andrew, can you make arrangements to get the files from Jim and analyze them?

Sure.

Jim, like Olli said, if you could get a log from when you first start seeing top(none)/detached window-objects that could be useful, as the logs won't be quite so large, but should still show the problem.

It would be quite handy to have an addon that periodically checks for ghost-windows. :)
Flags: needinfo?(continuation)
Sorry, I'm not having any luck getting the verbose gc.log below 10 MB :(
Attachment #8482363 - Attachment is obsolete: true
After retesting the potential regression range, I still stand by my post above.

It does not take many reloads of the tabs to push memory use, as monitored in the Task Manager, to 600 MB+, whereas with the good build it stays around my normal average of about 420 MB for this tab set.
I've noticed a whole lot of ghost windows today.

Will attach memory reports from before and after minimize.
I can reproduce this by visiting http://online.wsj.com/articles/why-i-just-cant-become-chinese-1409333549.

This is a regression from bug 1011738, which added a strong edge from the observer service to documents at https://hg.mozilla.org/mozilla-central/rev/7c682fb2f547#l2.13.  Commenting that out makes this go away.
This makes nightly basically unusable.  We need a fix ASAP.
Assignee: nobody → fabrice
Blocks: 1011738
blocking-b2g: --- → 2.1+
(In reply to Jim Jeffery not reading bug-mail 1/2/11 from comment #10)
> after retesting the potential regression range, I still stand by my post
> above.

Yep, you were correct.
Flags: needinfo?(continuation)
Keywords: mlk
OS: Windows 7 → All
Hardware: x86_64 → All
Whiteboard: [MemShrink]
Hrm.  Are we ending up with documents that get OnPageShow but never OnPageHide?

I'll look into why tomorrow, on the link from comment 15...  Because we really do need some "this document is not being rendered anymore" notification, and I was pretty sure OnPageHide was it.
It appears so, yes.  Commenting out the AddObserver call makes the leak go away, so RemoveObserver is clearly not getting called.
bz, any progress? This is bad...
Flags: needinfo?(bzbarsky)
Summary: Increased memory use window-objects top(none)/detached → Increased memory use window-objects top(none)/detached, leaks the world
Whiteboard: [MemShrink] → [MemShrink:P1]
We end up calling OnPageShow twice on the same document (once for the about:blank, once for the document.close()) without an intermediate OnPageHide.  That adds us to the observer service twice, but later removes only one entry, so we leak.

Sadly this doesn't show up as a shutdown leak for some reason.  :(
Flags: needinfo?(bzbarsky)
It's not a shutdown leak because the observer service dumps all the observers at shutdown.
Assignee: fabrice → bzbarsky
Status: NEW → ASSIGNED
> It's not a shutdown leak because the observer service dumps all the observers at shutdown.

Gah.  :(
Comment on attachment 8483248 [details] [diff] [review]
Make sure to not double-add app-theme-changed observers when a document has OnPageShow called on it twice without an OnPageHide call in between

Review of attachment 8483248 [details] [diff] [review]:
-----------------------------------------------------------------

Assert that we're not observing in the dtor?

r=me
Attachment #8483248 - Flags: review?(khuey) → review+
> Assert that we're not observing in the dtor?

Ah, that would catch it at shutdown time, good point.
Added:

  MOZ_ASSERT(!mObservingAppThemeChanged,
             "Document leaked to shutdown, then the observer service dropped "
             "its ref to us so we were able to go away.");

in ~nsDocument.
Target Milestone: --- → mozilla35
(In reply to Boris Zbarsky [:bz] from comment #28)
> https://hg.mozilla.org/integration/mozilla-inbound/rev/483f0a6a540e

Testing with this m-i build seems to have fixed the leaks for me; things are now running normally here as far as memory use goes.
Jim, thank you for testing that, and for reporting and narrowing down this thing!
Comment on attachment 8483248 [details] [diff] [review]
Make sure to not double-add app-theme-changed observers when a document has OnPageShow called on it twice without an OnPageHide call in between

Approval Request Comment
[Feature/regressing bug #]: Bug 1011738
[User impact if declined]: Nasty leaks.
[Describe test coverage new/current, TBPL]: We added some asserts that should
   help catch issues if any arise....
[Risks and why]: Risk should be quite low.
[String/UUID change made/needed]: None.
Attachment #8483248 - Flags: approval-mozilla-aurora?
Looks like mochitest-bc in fact leaks documents to shutdown, triggering the assert I added in ~nsDocument.  It's also triggering, in the same function:

###!!! ASSERTION: Destroying a currently-showing document: '!mIsShowing', file /builds/slave/m-in-osx64-d-00000000000000000/build/content/base/src/nsDocument.cpp, line 1602

which is even worse...

I'll downgrade to an NS_ASSERTION from a MOZ_ASSERT for now so mochitest-bc can keep ignoring the fact that it's leaking shit.  :(
I filed bug 1062357 on the mochitest-bc bustage and repushed this with a NS_ASSERTION: https://hg.mozilla.org/integration/mozilla-inbound/rev/10276ebe5711
Blocks: 1062415
https://hg.mozilla.org/mozilla-central/rev/10276ebe5711
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Attachment #8483248 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Duplicate of this bug: 1062415
Duplicate of this bug: 1064288
FWIW, I just got a telemetry regression alert about this:

> This alert was generated because the distribution of the histogram GHOST_WINDOWS
> has changed on the 2014-08-30. Please have a look at the following plot:
> http://vitillo.github.io/cerberus/dashboard/#2014-08-30GHOST_WINDOWS
Hi Nicholas, I'm not sure what the telemetry alert means - should there be a new bug filed? Or is it a suggestion that this may not be as fixed as one might hope?    Those are some cool charts though. Wow.
Flags: needinfo?(n.nethercote)
> Hi Nicholas, I'm not sure what the telemetry alert means - should there be a
> new bug filed? Or is it a suggestion that this may not be as fixed as one
> might hope?    Those are some cool charts though. Wow.

The telemetry alert arrived 8 or 9 days after the date of the regression. During that time the bug was fixed. So the alert is basically a late confirmation that there was a regression.

There hasn't yet been a subsequent telemetry alert telling us that this patch has fixed the problem. But the alerts seem to happen only occasionally, and the most recent data for GHOST_WINDOWS is still from 2014-08-30. Hopefully when the next one comes in it'll show the regression has been fixed.

So, basically, everything looks reasonable for the moment. Sorry for the confusion.
Flags: needinfo?(n.nethercote)
Component: DOM → DOM: Core & HTML