High memory usage on plus.google.com if "load images automatically" is off

RESOLVED INCOMPLETE

Status

7 years ago
11 months ago

People

(Reporter: hksonngan, Assigned: smaug)

Tracking

({crash, memory-leak, testcase})

Trunk
crash, memory-leak, testcase
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [MemShrink:P2], URL)

Attachments

(4 attachments)

(Reporter)

Description

7 years ago
Created attachment 555964 [details]
memhog.jpg

User Agent: Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.215 Safari/535.1

Steps to reproduce:

When I open plus.google.com, memory usage climbs rapidly (js and heap-unclassified) and the memory is never freed. Crash ID: bp-b2d240cf-e397-41f7-be0a-98b2c2110825


Actual results:

CPU usage stays at 50%, and memory keeps growing and is never freed. Eventually it exhausts my memory and Firefox crashes.


Expected results:

When I close the tab that had plus.google.com open, Firefox returns to normal. I don't see this with Chrome or IE.
(Reporter)

Comment 1

7 years ago
When I am logged out, this never happens. But when I am logged in, memory grows until it runs out. I tried with 3 accounts and got the same error.
(Reporter)

Updated

7 years ago
Keywords: crash
(Reporter)

Comment 2

7 years ago
OK, I found that the error only occurs when I turn off "Load images automatically".
Can you please try it again in the Firefox safemode :
http://support.mozilla.com/en-US/kb/Safe+Mode
See also Bug 588742. Maybe the JS is not coded properly for the case where the "Load images automatically" option is turned off.
(Reporter)

Comment 5

7 years ago
(In reply to Matthias Versen (Matti) from comment #3)
> Can you please try it again in the Firefox safemode :
> http://support.mozilla.com/en-US/kb/Safe+Mode

The same problem occurs in safe mode, and I get the same with FF 7 and 5.
(Reporter)

Comment 6

7 years ago
Using AdBlock, I found that if I block this XML request
https://plus.google.com/_/socialgraph/lookup/circles/?ct=2&m=true&_reqid=200697&rt=j
memory usage is normal.
The problem may be in function zv, but it's hard to understand because Google's code is obfuscated.

In FF 7 the memory is released after a threshold, but CPU usage stays at 25% with 4 cores or 50% with 2 cores.
I don't see this with IE or Chrome.
(Reporter)

Comment 7

7 years ago
Created attachment 556264 [details]
error call xml request in google plus code
Whiteboard: [memshrink]
Whiteboard: [memshrink] → [memshrink:P3]
We made this a MemShrink:P3 because it requires you to turn off "load images automatically", which isn't terribly common.
Summary: memory hog like crazy in Firefox 6 with plus.google.com → High memory usage in Firefox 6 with plus.google.com if "load images automatically" is off
Summary: High memory usage in Firefox 6 with plus.google.com if "load images automatically" is off → High memory usage on plus.google.com if "load images automatically" is off

Comment 9

7 years ago
This is very easily reproducible.  I left Firefox 6 running with one Google+ tab with image loading disabled, and a few hours later, I caught it with a 10GB virtual address space, holding 6GB of it live in memory.  Unfortunately my system was swapping like crazy at that time, so I couldn't get a copy of about:memory.  Then I restarted Firefox, and by just leaving it in the same state for 10-20 minutes, I saw that heap-unclassified is growing by ~100MB every few minutes.  I'm going to test this on trunk to see if I can get more useful information from about:memory.

Comment 10

7 years ago
It's heap-unclassified on trunk too.  I'll try to gather some information from gdb...

Comment 11

7 years ago
OK, I have a minimized testcase now, look at the URL field.  To reproduce the bug, just disable loading images automatically from the content tab of the preferences window, and load the page in the URL field, and wait.  The heap-unclassified memory continues to grow indefinitely, until Firefox crashes.

I think we should re-evaluate the priority of this bug based on the information I've gathered so far.
Status: UNCONFIRMED → NEW
Component: General → Image Blocking
Ever confirmed: true
Keywords: mlk, testcase
Product: Firefox → Core
QA Contact: general → image-blocking
Whiteboard: [memshrink:P3] → [memshrink]
Version: 6 Branch → Trunk

Comment 12

7 years ago
Upon further investigation, this doesn't have anything to do with image blocking or imagelib.  The objects that we're allocating are nsDOMEvents.  Something is holding on to nsDOMEvent objects, and we keep allocating more and more of these events as we're processing the error event.
Component: Image Blocking → Event Handling
QA Contact: image-blocking → events

Comment 13

7 years ago
I think the bug is happening because we just fail to gc/cc when the tab containing the testcase is open...
I wonder whether we can suppress the load when under onerror firing our src is set to itself...  That's just broken code.  :(
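The pattern described above (an error handler that re-assigns `src` to the same failing URL) can be sketched as a toy model. This is not Gecko code and not the actual testcase: `FakeImage`, `SetSrc`, and `Run` are hypothetical names, and the model simply assumes that every load fails, as it would with image loading disabled.

```cpp
#include <functional>
#include <queue>
#include <string>

// Hypothetical model (not Gecko code): an image whose loads always fail,
// as if image loading were disabled in the browser.
class FakeImage {
public:
    // Called once for every failed load, like an onerror handler.
    std::function<void(FakeImage&)> onerror;

    // Assigning src always queues a new load attempt, even for the same URL.
    void SetSrc(const std::string& url) {
        mSrc = url;
        mPendingLoads.push(url);
    }

    const std::string& Src() const { return mSrc; }

    // Processes queued loads until the queue is empty or maxTurns is reached.
    // Returns the number of error events that were dispatched.
    long Run(long maxTurns) {
        long errorEvents = 0;
        while (!mPendingLoads.empty() && errorEvents < maxTurns) {
            mPendingLoads.pop();
            ++errorEvents;  // one error event per failed load
            if (onerror) {
                onerror(*this);
            }
        }
        return errorEvents;
    }

private:
    std::string mSrc;
    std::queue<std::string> mPendingLoads;
};
```

With a handler such as `[](FakeImage& img) { img.SetSrc(img.Src()); }`, `Run` only stops at the `maxTurns` cap, which mirrors a load that never finishes. Without a handler, a single error event is dispatched and the queue drains.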

Comment 15

7 years ago
(In reply to Boris Zbarsky (:bz) from comment #14)
> I wonder whether we can suppress the load when under onerror firing our src
> is set to itself...  That's just broken code.  :(

Do you suspect that it's the never-ending load operation which is causing the nsDOMEvent objects to not be collected?

Comment 16

7 years ago
Boris: ping?

Comment 17

7 years ago
Boris: reping?

Comment 18

7 years ago
Actually CCing Boris this time.
> Do you suspect that it's the never-ending load operation which is causing the nsDOMEvent
> objects to not be collected?

No.  It's just causing lots of objects to be allocated.

Why they're not being collected... is a good question.  I tried reproducing using the url in the url field in a current debug build, and I see memory bounce around but keep coming down to the same value (about 48MB), presumably when we GC.

If you can reproduce with that testcase, do you want to see whether we're getting GC/CC happening (via the console logging for those)?  If we _are_ and if events are not being collected, then we have a problem.  If we're not, then we need to fix that.

Comment 20

7 years ago
IIRC I could repro this (but I needed to leave the browser running for some time in order to see this).  Would you mind pointing me to where I need to add those console dumps?
Flip the "javascript.options.mem.log" pref to true, and that will cause each GC and CC to log a message to your error console.

Comment 22

7 years ago
(In reply to Boris Zbarsky (:bz) from comment #21)
> Flip the "javascript.options.mem.log" pref to true, and that will cause each
> GC and CC to log a message to your error console.

Hmm, that doesn't seem to do the trick for me.  Do I need to have a special build?  (I tried this on a debug build)

Comment 23

7 years ago
(In reply to Ehsan Akhgari [:ehsan] from comment #22)
> (In reply to Boris Zbarsky (:bz) from comment #21)
> > Flip the "javascript.options.mem.log" pref to true, and that will cause each
> > GC and CC to log a message to your error console.
> 
> Hmm, that doesn't seem to do the trick for me.  Do I need to have a special
> build?  (I tried this on a debug build)

Turns out that I was looking at stdout, and not the Error Console!  Testing this now.

Comment 24

7 years ago
Created attachment 561562 [details]
Screenshot of the error console

So, here's what happens.  We do run CC and GC runs regularly, but CC doesn't end up collecting anything.  Which is what causes those objects to not be freed, I think.
That's pretty odd looking.  Not only is the CC not collecting anything, but it also seems a little weird that it is running so many times in a row.  Though I see it running 3 times in a row without a GC in my browser, so maybe it isn't that weird.

Ehsan, could you make a cycle collector dump for this?  Instructions here:

https://wiki.mozilla.org/Performance:Leak_Tools#Cycle_collector_heap_dump

I can take a look at it to see what the CC is doing.

Comment 26

7 years ago
Of course!  How long do you want me to leave Firefox running?  (i.e., how big do you want the graph to be?)
Any time you see this weird CC behavior where it isn't freeing anything, just run the command to get the graph.  Smaller is probably better but it doesn't really matter, I just want a sample where it isn't finding anything.  I don't know if I'll find anything, but maybe.

Comment 28

7 years ago
Created attachment 561591 [details]
cc-edges-1.log

I noticed that as soon as I evaluated the code to dump CC edges, I got something like this in the error console:

CC timestamp: foo, collected: 2892 (2892 waiting for GC), suspected: 657, duration: 831ms.

Letting Firefox run after it would generate more "collected: 0" messages.
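To spot runs of "collected: 0" messages without reading the console by hand, the counters can be pulled out of each line. The following is a minimal sketch that assumes every message has exactly the shape quoted above; `CcLogEntry` and `ParseCcLogLine` are hypothetical helper names, not part of Firefox.

```cpp
#include <regex>
#include <string>

// Counters extracted from one cycle collector console message.
struct CcLogEntry {
    long collected = 0;
    long waitingForGc = 0;
    long suspected = 0;
    long durationMs = 0;
};

// Returns true and fills `out` if `line` contains a message of the form
// "collected: N (M waiting for GC), suspected: S, duration: Dms".
// The message shape is assumed from the single example quoted in this bug.
bool ParseCcLogLine(const std::string& line, CcLogEntry& out) {
    static const std::regex pattern(
        R"(collected: (\d+) \((\d+) waiting for GC\), suspected: (\d+), duration: (\d+)\s*ms)");
    std::smatch match;
    if (!std::regex_search(line, match, pattern)) {
        return false;
    }
    out.collected = std::stol(match[1].str());
    out.waitingForGc = std::stol(match[2].str());
    out.suspected = std::stol(match[3].str());
    out.durationMs = std::stol(match[4].str());
    return true;
}
```

A caller could count consecutive entries with `collected == 0` to flag the behavior seen in the screenshot.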
Oh, that's interesting.  So, when you do a CC dump, it does a CC.  But it doesn't do a regular CC, it actually disables an optimization where the CC doesn't trace through marked JS objects.  This optimization isn't supposed to keep objects from being found, so maybe this is a CC bug.

Did the memory usage go down to expected levels after you created the log and the GC has run?  You could also try creating a log a few times and see if the later ones free up anything, and if the memory goes down after a GC.

Failing that, if you are willing to rebuild the browser again, you could patch it so that logging doesn't turn off the optimization.  To do this, go into xpcom/base/nsCycleCollector.cpp and change this:

        mFlags |= nsCycleCollectionTraversalCallback::WANT_DEBUG_INFO |
                  nsCycleCollectionTraversalCallback::WANT_ALL_TRACES;

to this:

        mFlags |= nsCycleCollectionTraversalCallback::WANT_DEBUG_INFO;
I'll look at the log file, Ehsan, and see if I find anything interesting.  I looked at a randomish sample of 6 of the garbage objects. 4 of them were nsGenericElement (XUL) label.  One was a JS Object (XULElement), and one was a nsGenericElement (XUL) vbox.
I whipped up a little script to analyze the classes of the garbage objects in that log you put up, Ehsan:

one each: nsBaseContentList, nsBindingManager, nsDocument (xhtml) about:blank, nsFrameSelection, nsGenericElement (xhtml) body, nsGenericElement (xhtml) head, nsGenericElement (xhtml) html, nsNodeInfo (XUL) slider, nsNodeInfo (XUL) spacer, nsNodeInfo (XUL) thumb, nsNodeInfo ([none]) #document, nsNodeInfo (xhtml) body, nsNodeInfo (xhtml) head, nsNodeInfo (xhtml) html, nsNodeInfoManager

2 each: nsGenericElement (XUL) slider, nsGenericElement (XUL) thumb
10: nsTypedSelection
49: nsGenericElement (XUL) hbox
75 each: nsGenericDOMDataNode, nsGenericElement (XUL) description, nsGenericElement (XUL) image
98: nsGenericElement (XUL) spacer
101: nsEventListenerManager
151: nsGenericElement (XUL) vbox
392: nsGenericElement (XUL) label
421: nsGenericElement (XUL) box
515: XPCWrappedNative (XULElement)
911: JS Object (XULElement) (global=129560f38)

So there's a lot of XUL stuff.  I don't know if that is helpful or not.
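A tally like the one above could be produced along these lines. The script actually used is not shown in this bug, and the log layout here is an assumption made for illustration: node lines of the form `<address> [rc=N] <class name>`, edge lines starting with `>`, a `==========` separator, and result lines of the form `<address> [garbage]`. `CountGarbageByClass` is a hypothetical name.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>  // lets callers feed an in-memory log via std::istringstream
#include <string>
#include <unordered_map>

// Counts garbage objects per class name.
// Assumed layout (not confirmed by the attachment in this bug):
//   <addr> [rc=N] <class name>   node line, before the separator
//   > <addr> <edge name>         edge line, ignored here
//   ==========                   separator between graph and results
//   <addr> [garbage]             result line for an object the CC would free
std::map<std::string, int> CountGarbageByClass(std::istream& log) {
    std::unordered_map<std::string, std::string> classOf;  // address -> class
    std::map<std::string, int> counts;
    bool inResults = false;
    std::string line;
    while (std::getline(log, line)) {
        if (line.compare(0, 10, "==========") == 0) {
            inResults = true;
            continue;
        }
        if (line.empty() || line[0] == '>') {
            continue;  // skip blank lines and edges
        }
        const std::size_t open = line.find(" [");
        if (open == std::string::npos) continue;
        const std::size_t close = line.find(']', open);
        if (close == std::string::npos) continue;

        const std::string addr = line.substr(0, open);
        const std::string tag = line.substr(open + 2, close - open - 2);
        if (!inResults) {
            // Everything after "] " is treated as the class name.
            classOf[addr] = close + 2 <= line.size() ? line.substr(close + 2)
                                                     : std::string();
        } else if (tag == "garbage") {
            const auto it = classOf.find(addr);
            ++counts[it != classOf.end() ? it->second
                                         : std::string("(unknown)")];
        }
    }
    return counts;
}
```

Keying the result on the full class string keeps entries such as `nsGenericElement (XUL) label` and `nsGenericElement (XUL) vbox` separate, which matches the granularity of the list above.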
For me, there were only something like 31 objects collected in the dump heap CC, not 2892.  I'm not sure what the difference there is.  31 probably isn't enough to get excited about, and could possibly be explained by the junk required to run the dump itself.

I'll set it up to always log the CC, with and without WANT_ALL_TRACES (which is what the dump normally does) and try to compare them.

As Ehsan observed, almost all of the growth is in heap-unclassified, so DMD might do some good here.  The JS heap grows a little, too, but not nearly at the same rate.
Depends on: 676724
Assignee: nobody → continuation
OS: Windows XP → All
Hardware: x86 → All
Whiteboard: [memshrink] → [MemShrink]

Comment 33

7 years ago
I used Instruments to try to get a sense of what's being allocated here.  Most of the memory is going to native event objects, which our DOM event objects are holding on to.  What the test case effectively does is cause error events to be fired over and over again, but the page is not really holding on to anything.  So this is an instance where we fall into a cycle of allocating objects and not collecting them at all.

jst asked me to assign this to smaug for investigation, which is what I'm doing right now.
Assignee: continuation → Olli.Pettay
Whiteboard: [MemShrink] → [MemShrink:P2]
(Assignee)

Comment 34

7 years ago
I can't reproduce the high memory usage using the URL.
Memory usage goes up a bit, and then back down.
Does it continually load for you?  That's what happens to me, the little spinny loader thing just goes on forever and ever.
(Assignee)

Comment 36

7 years ago
It does seem to load all the time, since cpu usage is high.

...but investigating some more.
(Assignee)

Comment 37

7 years ago
So I added a simple printf and counter to nsDOMEvent ctor and dtor and we do create lots of
events, as expected, but we also do release those, as far as I see.
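The counting approach described here can be sketched with a stand-in class. This is not the actual patch to nsDOMEvent: `CountedEvent` and its methods are hypothetical names, and only the constructor/destructor counting pattern is the point.

```cpp
#include <atomic>
#include <cstdio>

// Hypothetical stand-in for nsDOMEvent that counts constructions and
// destructions so that leaked instances show up as a growing "live" number.
class CountedEvent {
public:
    CountedEvent() { ++CreatedCounter(); }
    CountedEvent(const CountedEvent&) { ++CreatedCounter(); }
    ~CountedEvent() { ++DestroyedCounter(); }

    static long Created() { return CreatedCounter().load(); }
    static long Destroyed() { return DestroyedCounter().load(); }

    // Objects that were constructed but have not been destroyed yet.
    static long Live() { return Created() - Destroyed(); }

    // Prints the current totals, e.g. from a timer or at shutdown.
    static void Report() {
        std::printf("events created=%ld destroyed=%ld live=%ld\n",
                    Created(), Destroyed(), Live());
    }

private:
    // Function-local statics avoid needing out-of-class definitions.
    static std::atomic<long>& CreatedCounter() {
        static std::atomic<long> counter{0};
        return counter;
    }
    static std::atomic<long>& DestroyedCounter() {
        static std::atomic<long> counter{0};
        return counter;
    }
};
```

If `Live()` keeps returning to a small value while `Created()` keeps climbing, events are being released, which is consistent with the observation in this comment.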
AFAICT we no longer have an option in about:preferences to disable loading images automatically, and smaug couldn't repro, so I'm going to close this. If someone wants to reopen it, it should be a MemShrink:P3.
Status: NEW → RESOLVED
Last Resolved: 11 months ago
Resolution: --- → INCOMPLETE