Last Comment Bug 833077 - window.open() memory leak?
: window.open() memory leak?
Status: RESOLVED DUPLICATE of bug 833796
[MemShrink:P1]
: mlk
Product: Firefox OS
Classification: Client Software
Component: General (show other bugs)
: unspecified
: ARM Gonk (Firefox OS)
: P1 critical (vote)
: ---
Assigned To: Justin Lebar (not reading bugmail)
:
:
Mentors:
Depends on: 833796
Blocks: 808607
  Show dependency treegraph
 
Reported: 2013-01-21 12:58 PST by Michael Vines [:m1] [:evilmachines]
Modified: 2015-02-27 12:46 PST (History)
16 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
tef+


Attachments
test app demonstrating the issue (8.01 KB, application/x-gzip)
2013-01-21 12:58 PST, Michael Vines [:m1] [:evilmachines]
no flags Details
baseline about-memory followed by usage after 20min (10.64 KB, text/plain)
2013-01-21 14:53 PST, Michael Vines [:m1] [:evilmachines]
no flags Details
output of get_gc_cc_log.py (3.32 MB, application/x-tgz)
2013-01-22 12:34 PST, Michael Vines [:m1] [:evilmachines]
no flags Details
cc logs from build w/ marionette disabled after 303 iterations (3.49 MB, application/x-tgz)
2013-01-22 19:21 PST, Michael Vines [:m1] [:evilmachines]
no flags Details

Description Michael Vines [:m1] [:evilmachines] 2013-01-21 12:58:29 PST
Created attachment 704662 [details]
test app demonstrating the issue

We see unbounded memory growth when window.open() is used repeatedly in an app.  This is impacting long term stability test, as ultimately the app using window.open() is LMKed.

STR:
1. Load attached app, and press "start" in the app. It will open and close a window every 2 seconds.
2. Watch memory growth via |b2g-ps|
Comment 1 Michael Vines [:m1] [:evilmachines] 2013-01-21 12:59:19 PST
(fingers crossed that this is a bug in the test app itself !)
Comment 2 Henrik Skupin (:whimboo) 2013-01-21 13:09:50 PST
Michael, can you please run https://github.com/mozilla-b2g/B2G/blob/master/tools/get_about_memory.py after a while? Copy the link for about:memory you will see in the console, open it in Firefox Nightly, and copy&paste its contents to an attachment. Thanks.
Comment 3 Michael Vines [:m1] [:evilmachines] 2013-01-21 14:53:22 PST
Created attachment 704697 [details]
baseline about-memory followed by usage after 20min

Memory usage for the processes in the system remain roughly the same during this period of time.
Comment 4 Justin Lebar (not reading bugmail) 2013-01-21 15:34:58 PST
Okay, the ~30mb increase in memory in this process is all in the system JS
compartment.

One thing I notice here is the increase from [96] to [245] copies of the system
compartment reports.  I don't know what this means, exactly, but I bet njn
does.

> ├───5.85 MB (20.15%) ── unused-gc-things [96]

vs.

> ├──15.05 MB (24.70%) ── unused-gc-things [245]

It's possible we're not collecting things hard enough, or that we're leaking
something (in Gaia or in Gecko).  What happens if you do get_about_memory.py
--minimize (or, if you're doing this manually, you send SIGRT1)?

In general, Gecko is optimized for use-cases which appear on the web, and
optimizing for use-cases which don't commonly appear is obviously of limited
benefit to the platform.  Calling a cross-process window.open once every two
seconds is obviously not a common use-case.  So while I agree that we need to
get QC's tests unblocked, if there's any way to change the tests to rely only
on behavior that's actually common on the web, that's likely the easiest way
forward.
Comment 5 Michael Vines [:m1] [:evilmachines] 2013-01-21 16:02:20 PST
Common use case aside, what we seem to be seeing here is that the gc never reclaims the memory and eventually the LMK kills the whole process.  Memory pressure events aren't seeming to help either.  

|get_about_memory.py --minimize| doesn't seem to do anything to reduce the process memory usage.  

I see '######################### form.js loaded' every time we open a new window, maybe something leaky in there?
Comment 6 Justin Lebar (not reading bugmail) 2013-01-21 17:15:56 PST
> I see '######################### form.js loaded' every time we open a new window, maybe 
> something leaky in there?

Not to suggest that form.js isn't leaking, but we load lots of stuff whenever we create a new frame.  form.js just happens to be noisy about it.

> |get_about_memory.py --minimize| doesn't seem to do anything to reduce the process memory 
> usage.

Okay, so we're actually holding something alive.  I wonder if there's a simple Gaia leak of the popup iframes.
Comment 7 Nicholas Nethercote [:njn] 2013-01-21 17:17:06 PST
> One thing I notice here is the increase from [96] to [245] copies of the
> system
> compartment reports.  I don't know what this means, exactly, but I bet njn
> does.
> 
> > ├───5.85 MB (20.15%) ── unused-gc-things [96]
> 
> vs.
> 
> > ├──15.05 MB (24.70%) ── unused-gc-things [245]

In the first case, 96 compartments have a non-zero amount of unused-gc-things.  In the second case, 245 do.  Basically, it just shows that a bunch of GC things (objects, strings, etc) have been GC'd, but in a fragment-y kind of way that doesn't cause their 4 KiB arenas to become empty and thus collectable.

A better question:  why do we have 10s or 100s of "[System Principal]" compartments, and why is that number increasing?  I thought khuey fixed that...
Comment 8 Justin Lebar (not reading bugmail) 2013-01-21 17:19:17 PST
Michael, can you provide the output of get_gc_cc_log.py (SIGRT2)?  I see a removeChild call in apps/system/js/popup_manager.js, but it's still possible that it's leaking the popup frame somehow.
Comment 9 Justin Lebar (not reading bugmail) 2013-01-21 17:24:24 PST
> A better question:  why do we have 10s or 100s of "[System Principal]" compartments, and 
> why is that number increasing?  I thought khuey fixed that...

I should have been more specific with my question, although you sort of answered it.

Apparently bug 798491 changed

> ├───7.08 MB (00.83%) ++ compartment([System Principal], chrome://browser/content/browser.xul)
> ├───5.05 MB (00.59%) ++ compartment([System Principal], resource://bar...)
> ├───4.42 MB (00.52%) ++ compartment([System Principal], resource://foo...)

into one compartment([System Principal]) line.  But then I don't get why we have 250 separate memory reporters for some line items.

Even in my desktop browser, I have

> ├───7.08 MB (00.83%) -- compartment([System Principal], chrome://browser/content/browser.xul)
> ├──4.89 MB (00.57%) ++ gc-heap
> ├──1.04 MB (00.12%) ── script-data [33]
> ├──0.52 MB (00.06%) ── shapes-extra/compartment-tables [32]
> ├──0.30 MB (00.04%) ── objects-extra/slots [32]
> ├──0.27 MB (00.03%) ── other-sundries [33]

I imagine that probably holds the key to understanding this bit.
Comment 10 Justin Lebar (not reading bugmail) 2013-01-21 17:25:29 PST
Sorry, I mis-indented the browser.xul part above.  Everything after the first line in the final quote in comment 9 should be indented one level.
Comment 11 Michael Vines [:m1] [:evilmachines] 2013-01-21 18:14:49 PST
(In reply to Justin Lebar [:jlebar] from comment #8)
> Michael, can you provide the output of get_gc_cc_log.py (SIGRT2)?  

Yes, but not before noonish tomorrow PST unfortunately.  AFK.   I attached the test app that reproduces this issue to the bug though.
Comment 12 Nicholas Nethercote [:njn] 2013-01-21 21:34:12 PST
> But then I don't get why we
> have 250 separate memory reporters for some line items.

By far the most likely explanation is that we have 250 compartments with that name.  And the compartment measurements support this theory:

Before:

108 (100.0%) -- js-compartments
├───97 (89.81%) ── system
└───11 (10.19%) ── user

After:

255 (100.0%) -- js-compartments
├──246 (96.47%) ── system
└────9 (03.53%) ── user

Something isn't right.  Based on this evidence I'd suspect that bug 798491 didn't actually merge those compartments, but just gave them all the same name.  But that doesn't square with prior reports about that change, which said that it greatly reduced memory consumption and improved performance.  So I don't know what's going on.
Comment 13 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2013-01-21 21:47:48 PST
Bug 798491 doesn't merge DOM compartments, only component and module compartments.
Comment 14 Nicholas Nethercote [:njn] 2013-01-21 21:49:57 PST
> Bug 798491 doesn't merge DOM compartments, only component and module
> compartments.

Here's the output:

│  │  │  ├──52.14 MB (85.58%) -- compartment([System Principal])
│  │  │  │  ├──34.38 MB (56.44%) -- gc-heap
│  │  │  │  │  ├──15.05 MB (24.70%) ── unused-gc-things [245]

What kind of compartment could have a system principal, but not be merged by bug 798491?
Comment 15 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2013-01-21 21:51:04 PST
(In reply to Nicholas Nethercote [:njn] from comment #14)
> What kind of compartment could have a system principal, but not be merged by
> bug 798491?

Chrome windows, sandboxes, maybe some other stuff that I'm not thinking of right now.
Comment 16 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2013-01-21 21:51:39 PST
Is the leak reproducible on desktop b2g/gaia?  If so a CC log would be great.
Comment 17 Hugh Nougher [:Hughman] 2013-01-21 23:23:05 PST
Sounds very similar to bug 822162 which I reported over a month ago.
Comment 18 Justin Lebar (not reading bugmail) 2013-01-22 06:36:01 PST
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #16)
> Is the leak reproducible on desktop b2g/gaia?  If so a CC log would be great.

We can get those logs from b2g devices too; comment 8.
Comment 19 Justin Lebar (not reading bugmail) 2013-01-22 06:36:55 PST
(In reply to Hugh Nougher [:Hughman] from comment #17)
> Sounds very similar to bug 822162 which I reported over a month ago.

Could be, but the setup here in B2G is /quite/ different.  Each "popup" does not load browser.xul, but instead loads a remote iframe...
Comment 20 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2013-01-22 09:02:54 PST
(In reply to Justin Lebar [:jlebar] from comment #18)
> (In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #16)
> > Is the leak reproducible on desktop b2g/gaia?  If so a CC log would be great.
> 
> We can get those logs from b2g devices too; comment 8.

woo!
Comment 21 Michael Vines [:m1] [:evilmachines] 2013-01-22 12:34:24 PST
Created attachment 705062 [details]
output of get_gc_cc_log.py

(In reply to Justin Lebar [:jlebar] from comment #8)
> Michael, can you provide the output of get_gc_cc_log.py (SIGRT2)?  

Attached now, captured after ~300 cycles
Comment 22 Nicholas Nethercote [:njn] 2013-01-22 14:52:19 PST
So there are two issues here:

- why do we have ~100 system compartments at the start?

- why are we creating all these new system compartments?
Comment 23 Andrew McCreight [:mccr8] 2013-01-22 16:22:35 PST
I told jlebar how to adjust this to adjust the CC dumping so we get a more complete graph. In the mean time, I started looking at the GC heap. I noticed there are 369 copies of chrome://specialpowers/content/specialpowersAPI.js:53, so I wonder how related it might be to that other test harness leak.
Comment 24 Justin Lebar (not reading bugmail) 2013-01-22 16:47:08 PST
> I wonder how related it might be to that other test harness leak.

That was bug 825802, for those following along at home.

Mike, are you running this app inside a test harness, or are you running in a perfectly vanilla build?  I'd be surprised if SpecialPowers.js were involved at all in a vanilla build.
Comment 25 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2013-01-22 16:49:05 PST
Why is SpecialPowers.js anywhere near here?

(Midaired with jlebar :-P)
Comment 26 Michael Vines [:m1] [:evilmachines] 2013-01-22 17:23:27 PST
Just running the attached app directly from the gaia homescreen, no test harness attached.  Marionette is enabled in this build though.
Comment 27 Michael Vines [:m1] [:evilmachines] 2013-01-22 19:12:03 PST
Tried the app on build with Marionette disabled and memory leak is still observed.  :(
Comment 28 Justin Lebar (not reading bugmail) 2013-01-22 19:15:18 PST
If you still have it sitting around, it would be helpful if you could gather the CC log from this Marionette-less build...
Comment 29 Michael Vines [:m1] [:evilmachines] 2013-01-22 19:21:17 PST
Created attachment 705208 [details]
cc logs from build w/ marionette disabled after 303 iterations
Comment 30 Justin Lebar (not reading bugmail) 2013-01-22 19:23:46 PST
Comment on attachment 705208 [details]
cc logs from build w/ marionette disabled after 303 iterations

Okay, no special powers here.  I guess that's...good.
Comment 31 Justin Lebar (not reading bugmail) 2013-01-22 19:31:36 PST
It seems that we have 248 copies of BrowserElementChild loaded.  This jives well with the idea that we're leaking frames.  Or perhaps we're not leaking frames and we still don't free the BrowserElementChild frame script (I think this is less likely, but still possible).

To wit, if I run

> $ xzcat gc-edges.548.log.xz | grep BrowserElement  | sed 's/0x[a-fA-F0-9]\+ /0xdeadbeef/' | sort | uniq -c | sort -n

I see 

>    248 > B BrowserElementChild
>    248 > B BrowserElementPromptService
>    248 B script chrome://global/content/BrowserElementChild.js:1242
>    248 B script chrome://global/content/BrowserElementChild.js:1248
>    248 B script chrome://global/content/BrowserElementChild.js:1254
>    [...]
>    496 B Function BrowserElementChild
>    498 B Function BrowserElementChild.prototype._activateNextPaintListener
>    498 B Function BrowserElementChild.prototype._addMozAfterPaintHandler
>    498 B Function BrowserElementChild.prototype._buildMenuObj
>    [...]

The first number in each line comes from uniq; that's how many times the line appears in the output.
Comment 32 Henrik Skupin (:whimboo) 2013-01-23 02:23:03 PST
(In reply to Michael Vines [:m1] from comment #26)
> Just running the attached app directly from the gaia homescreen, no test
> harness attached.  Marionette is enabled in this build though.

Given that the specialpowers leak should have been fixed on b2g18 which build were you using here? One which included that fix? If yes it's probably worth spawning a new bug on the remaining issue.
Comment 33 Olli Pettay [:smaug] 2013-01-23 03:14:22 PST
A common way to leak chrome JS is to add callbacks to some service, like observerservice, and
never removing those.

And message manager scripts run with chrome privs, so compartments are probably leaked
because BrowserElementChild ends up leaking.
Comment 34 Justin Lebar (not reading bugmail) 2013-01-23 06:25:35 PST
BrowserElementChild is careful to add only weak-ref observers.  Do I have to remove system event listeners, or does the CC handle that?

If it's not BEC leaking in this way, it could be any of the other frame scripts.  Presumably the CC log should give us some insight into this...
Comment 35 Justin Lebar (not reading bugmail) 2013-01-23 14:46:35 PST
Based on bug 833796 comment 4, which says that this bug is fixed by bug 833796, I'm going to forward dupe this.  If that turns out to be wrong, we can either un-dupe or file a new bug.

*** This bug has been marked as a duplicate of bug 833796 ***

Note You need to log in before you can comment on or make changes to this bug.