window.open() memory leak?

RESOLVED DUPLICATE of bug 833796

Status

Firefox OS
General
P1
critical
RESOLVED DUPLICATE of bug 833796
4 years ago
2 years ago

People

(Reporter: m1, Assigned: Justin Lebar (not reading bugmail))

Tracking

({mlk})

unspecified
ARM
Gonk (Firefox OS)
Dependency tree / graph

Firefox Tracking Flags

(blocking-b2g:tef+)

Details

(Whiteboard: [MemShrink:P1])

Attachments

(4 attachments)

Created attachment 704662 [details]
test app demonstrating the issue

We see unbounded memory growth when window.open() is used repeatedly in an app.  This is impacting long term stability test, as ultimately the app using window.open() is LMKed.

STR:
1. Load attached app, and press "start" in the app. It will open and close a window every 2 seconds.
2. Watch memory growth via |b2g-ps|
(Reporter)

Comment 1

4 years ago
(fingers crossed that this is a bug in the test app itself !)
Michael, can you please run https://github.com/mozilla-b2g/B2G/blob/master/tools/get_about_memory.py after a while? Copy the link for about:memory you will see in the console, open it in Firefox Nightly, and copy&paste its contents to an attachment. Thanks.
Keywords: mlk
(Reporter)

Comment 3

4 years ago
Created attachment 704697 [details]
baseline about-memory followed by usage after 20min

Memory usage for the processes in the system remain roughly the same during this period of time.
(Assignee)

Comment 4

4 years ago
Okay, the ~30mb increase in memory in this process is all in the system JS
compartment.

One thing I notice here is the increase from [96] to [245] copies of the system
compartment reports.  I don't know what this means, exactly, but I bet njn
does.

> ├───5.85 MB (20.15%) ── unused-gc-things [96]

vs.

> ├──15.05 MB (24.70%) ── unused-gc-things [245]

It's possible we're not collecting things hard enough, or that we're leaking
something (in Gaia or in Gecko).  What happens if you do get_about_memory.py
--minimize (or, if you're doing this manually, you send SIGRT1)?

In general, Gecko is optimized for use-cases which appear on the web, and
optimizing for use-cases which don't commonly appear is obviously of limited
benefit to the platform.  Calling a cross-process window.open once every two
seconds is obviously not a common use-case.  So while I agree that we need to
get QC's tests unblocked, if there's any way to change the tests to rely only
on behavior that's actually common on the web, that's likely the easiest way
forward.
(Assignee)

Updated

4 years ago
Whiteboard: [MemShrink]
(Reporter)

Comment 5

4 years ago
Common use case aside, what we seem to be seeing here is that the gc never reclaims the memory and eventually the LMK kills the whole process.  Memory pressure events aren't seeming to help either.  

|get_about_memory.py --minimize| doesn't seem to do anything to reduce the process memory usage.  

I see '######################### form.js loaded' every time we open a new window, maybe something leaky in there?
(Assignee)

Comment 6

4 years ago
> I see '######################### form.js loaded' every time we open a new window, maybe 
> something leaky in there?

Not to suggest that form.js isn't leaking, but we load lots of stuff whenever we create a new frame.  form.js just happens to be noisy about it.

> |get_about_memory.py --minimize| doesn't seem to do anything to reduce the process memory 
> usage.

Okay, so we're actually holding something alive.  I wonder if there's a simple Gaia leak of the popup iframes.
> One thing I notice here is the increase from [96] to [245] copies of the
> system
> compartment reports.  I don't know what this means, exactly, but I bet njn
> does.
> 
> > ├───5.85 MB (20.15%) ── unused-gc-things [96]
> 
> vs.
> 
> > ├──15.05 MB (24.70%) ── unused-gc-things [245]

In the first case, 96 compartments have a non-zero amount of unused-gc-things.  In the second case, 245 do.  Basically, it just shows that a bunch of GC things (objects, strings, etc) have been GC'd, but in a fragment-y kind of way that doesn't cause their 4 KiB arenas to become empty and thus collectable.

A better question:  why do we have 10s or 100s of "[System Principal]" compartments, and why is that number increasing?  I thought khuey fixed that...
(Assignee)

Comment 8

4 years ago
Michael, can you provide the output of get_gc_cc_log.py (SIGRT2)?  I see a removeChild call in apps/system/js/popup_manager.js, but it's still possible that it's leaking the popup frame somehow.
(Assignee)

Comment 9

4 years ago
> A better question:  why do we have 10s or 100s of "[System Principal]" compartments, and 
> why is that number increasing?  I thought khuey fixed that...

I should have been more specific with my question, although you sort of answered it.

Apparently bug 798491 changed

> ├───7.08 MB (00.83%) ++ compartment([System Principal], chrome://browser/content/browser.xul)
> ├───5.05 MB (00.59%) ++ compartment([System Principal], resource://bar...)
> ├───4.42 MB (00.52%) ++ compartment([System Principal], resource://foo...)

into one compartment([System Principal]) line.  But then I don't get why we have 250 separate memory reporters for some line items.

Even in my desktop browser, I have

> ├───7.08 MB (00.83%) -- compartment([System Principal], chrome://browser/content/browser.xul)
> ├──4.89 MB (00.57%) ++ gc-heap
> ├──1.04 MB (00.12%) ── script-data [33]
> ├──0.52 MB (00.06%) ── shapes-extra/compartment-tables [32]
> ├──0.30 MB (00.04%) ── objects-extra/slots [32]
> ├──0.27 MB (00.03%) ── other-sundries [33]

I imagine that probably holds the key to understanding this bit.
(Assignee)

Comment 10

4 years ago
Sorry, I mis-indented the browser.xul part above.  Everything after the first line in the final quote in comment 9 should be indented one level.
(Reporter)

Comment 11

4 years ago
(In reply to Justin Lebar [:jlebar] from comment #8)
> Michael, can you provide the output of get_gc_cc_log.py (SIGRT2)?  

Yes, but not before noonish tomorrow PST unfortunately.  AFK.   I attached the test app that reproduces this issue to the bug though.
> But then I don't get why we
> have 250 separate memory reporters for some line items.

By far the most likely explanation is that we have 250 compartments with that name.  And the compartment measurements support this theory:

Before:

108 (100.0%) -- js-compartments
├───97 (89.81%) ── system
└───11 (10.19%) ── user

After:

255 (100.0%) -- js-compartments
├──246 (96.47%) ── system
└────9 (03.53%) ── user

Something isn't right.  Based on this evidence I'd suspect that bug 798491 didn't actually merge those compartments, but just gave them all the same name.  But that doesn't square with prior reports about that change, which said that it greatly reduced memory consumption and improved performance.  So I don't know what's going on.
Bug 798491 doesn't merge DOM compartments, only component and module compartments.
> Bug 798491 doesn't merge DOM compartments, only component and module
> compartments.

Here's the output:

│  │  │  ├──52.14 MB (85.58%) -- compartment([System Principal])
│  │  │  │  ├──34.38 MB (56.44%) -- gc-heap
│  │  │  │  │  ├──15.05 MB (24.70%) ── unused-gc-things [245]

What kind of compartment could have a system principal, but not be merged by bug 798491?
(In reply to Nicholas Nethercote [:njn] from comment #14)
> What kind of compartment could have a system principal, but not be merged by
> bug 798491?

Chrome windows, sandboxes, maybe some other stuff that I'm not thinking of right now.
Is the leak reproducible on desktop b2g/gaia?  If so a CC log would be great.
Sounds very similar to bug 822162 which I reported over a month ago.
(Assignee)

Comment 18

4 years ago
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #16)
> Is the leak reproducible on desktop b2g/gaia?  If so a CC log would be great.

We can get those logs from b2g devices too; comment 8.
(Assignee)

Comment 19

4 years ago
(In reply to Hugh Nougher [:Hughman] from comment #17)
> Sounds very similar to bug 822162 which I reported over a month ago.

Could be, but the setup here in B2G is /quite/ different.  Each "popup" does not load browser.xul, but instead loads a remote iframe...
(In reply to Justin Lebar [:jlebar] from comment #18)
> (In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #16)
> > Is the leak reproducible on desktop b2g/gaia?  If so a CC log would be great.
> 
> We can get those logs from b2g devices too; comment 8.

woo!
(Reporter)

Comment 21

4 years ago
Created attachment 705062 [details]
output of get_gc_cc_log.py

(In reply to Justin Lebar [:jlebar] from comment #8)
> Michael, can you provide the output of get_gc_cc_log.py (SIGRT2)?  

Attached now, captured after ~300 cycles
Attachment #705062 - Attachment mime type: text/plain → application/x-tgz
So there are two issues here:

- why do we have ~100 system compartments at the start?

- why are we creating all these new system compartments?
Whiteboard: [MemShrink] → [MemShrink:P1]
I told jlebar how to adjust this to adjust the CC dumping so we get a more complete graph. In the mean time, I started looking at the GC heap. I noticed there are 369 copies of chrome://specialpowers/content/specialpowersAPI.js:53, so I wonder how related it might be to that other test harness leak.
(Assignee)

Comment 24

4 years ago
> I wonder how related it might be to that other test harness leak.

That was bug 825802, for those following along at home.

Mike, are you running this app inside a test harness, or are you running in a perfectly vanilla build?  I'd be surprised if SpecialPowers.js were involved at all in a vanilla build.
Why is SpecialPowers.js anywhere near here?

(Midaired with jlebar :-P)
(Reporter)

Comment 26

4 years ago
Just running the attached app directly from the gaia homescreen, no test harness attached.  Marionette is enabled in this build though.
(Reporter)

Comment 27

4 years ago
Tried the app on build with Marionette disabled and memory leak is still observed.  :(
(Assignee)

Comment 28

4 years ago
If you still have it sitting around, it would be helpful if you could gather the CC log from this Marionette-less build...
(Reporter)

Comment 29

4 years ago
Created attachment 705208 [details]
cc logs from build w/ marionette disabled after 303 iterations
(Assignee)

Comment 30

4 years ago
Comment on attachment 705208 [details]
cc logs from build w/ marionette disabled after 303 iterations

Okay, no special powers here.  I guess that's...good.
(Assignee)

Comment 31

4 years ago
It seems that we have 248 copies of BrowserElementChild loaded.  This jives well with the idea that we're leaking frames.  Or perhaps we're not leaking frames and we still don't free the BrowserElementChild frame script (I think this is less likely, but still possible).

To wit, if I run

> $ xzcat gc-edges.548.log.xz | grep BrowserElement  | sed 's/0x[a-fA-F0-9]\+ /0xdeadbeef/' | sort | uniq -c | sort -n

I see 

>    248 > B BrowserElementChild
>    248 > B BrowserElementPromptService
>    248 B script chrome://global/content/BrowserElementChild.js:1242
>    248 B script chrome://global/content/BrowserElementChild.js:1248
>    248 B script chrome://global/content/BrowserElementChild.js:1254
>    [...]
>    496 B Function BrowserElementChild
>    498 B Function BrowserElementChild.prototype._activateNextPaintListener
>    498 B Function BrowserElementChild.prototype._addMozAfterPaintHandler
>    498 B Function BrowserElementChild.prototype._buildMenuObj
>    [...]

The first number in each line comes from uniq; that's how many times the line appears in the output.
(In reply to Michael Vines [:m1] from comment #26)
> Just running the attached app directly from the gaia homescreen, no test
> harness attached.  Marionette is enabled in this build though.

Given that the specialpowers leak should have been fixed on b2g18 which build were you using here? One which included that fix? If yes it's probably worth spawning a new bug on the remaining issue.
A common way to leak chrome JS is to add callbacks to some service, like observerservice, and
never removing those.

And message manager scripts run with chrome privs, so compartments are probably leaked
because BrowserElementChild ends up leaking.
(Assignee)

Comment 34

4 years ago
BrowserElementChild is careful to add only weak-ref observers.  Do I have to remove system event listeners, or does the CC handle that?

If it's not BEC leaking in this way, it could be any of the other frame scripts.  Presumably the CC log should give us some insight into this...
(Assignee)

Updated

4 years ago
Depends on: 833796
Assignee: nobody → justin.lebar+bug

Updated

4 years ago
blocking-b2g: tef? → tef+
(Assignee)

Comment 35

4 years ago
Based on bug 833796 comment 4, which says that this bug is fixed by bug 833796, I'm going to forward dupe this.  If that turns out to be wrong, we can either un-dupe or file a new bug.
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 833796
(Reporter)

Updated

3 years ago
See Also: → bug 998504
(Reporter)

Updated

2 years ago
See Also: → bug 1137875
You need to log in before you can comment on or make changes to this bug.