Closed Bug 701443 Opened 13 years ago Closed 11 years ago

amctv.com causes long CC times, zombie compartment, high heap-unclassified

Categories

(Core :: XPCOM, defect)

defect
Not set
normal

Tracking

()

RESOLVED INCOMPLETE

People

(Reporter: mccr8, Assigned: mccr8)

References

()

Details

Attachments

(1 file)

dvander had a long-running session that was experiencing some similar symptoms to what Asa was seeing in Bug 700645.  He had second-long CC times, and a high amount of heap-unclassified.  Not as bad as Asa, only around 50%.  He thinks he knows what page is causing this [1].  He closed every tab one at a time, and the long CC times persisted.  Interestingly, after he closed the suspect tab, his CC times doubled, permanently, to about 2100ms.  Maybe there's a bunch of gunk that was being marked black when it was live, but when the page was closed it all ended up gray, so the CC had to trace through it?  The suspect tab also caused a zombie compartment taking 131mb, and he had 350mb of heap-unclassified.

In terms of addons, dvander is only using AdblockPlus and about:telemetry, and he only had the latter going for a short while after I asked him to install it.

He also had the suspect page running in Chrome for about a month, and its memory usage had risen to 2gb, so it sounds like there is an actual leak in the page.  In Firefox, I think he said that it took about a week for the browser to become unusable.

He sent me a cycle collector log dump, which I have started analyzing.  Here are the top 10 most commons kinds of elements:

  295228 nsGenericElement (xhtml) div
  214124 nsGenericDOMDataNode
   95280 JS Object (Object)
   79587 JS Object (Array)
   60417 nsGenericElement (xhtml) span
   49583 JS Object (Function)
   49527 nsGenericElement (xhtml) a
   47587 nsEventListenerManager
   44175 nsDOMCSSAttributeDeclaration
   28892 nsDOMAttributeMap

This looks similar to, but not quite the same as the IRCcloud profile, where there were a hundred times as many nsGenericDOMDataNode and "nsGenericElement (xhtml) a" as anything else.  span were holding onto them.

[1] http://blogs.amctv.com/breaking-bad/2011/06/season-4-music.php
He was running a browser he built himself, on Tuesday, so after Peterv's various leak fixes that landed on Monday.
Top 10 holders of nsGenericDOMDataNodes:
   64346 nsGenericElement (xhtml) span
   29323 nsGenericElement (xhtml) div
   28755 nsDOMAttribute
   28474 nsGenericElement (xhtml) a
   16142 nsGenericElement (xhtml) td
   12472 nsGenericElement (xhtml) tr
   12166 EditTxn
    6255 nsGenericElement (xhtml) select
    6103 nsGenericElement (xhtml) option
    3383 nsGenericElement (xhtml) tbody

Spans!!!
Hmm, 12k+ EditTxn's seems really odd. Ehsan, what causes us to create EditTxn objects?
Same list, except for divs:
  545480 nsGenericElement (xhtml) div
  211310 JS Object (HTMLDivElement)
   53354 nsGenericElement (xhtml) span
   31398 nsGenericElement (xhtml) a
   29323 nsGenericDOMDataNode
   12122 nsDOMCSSAttributeDeclaration
    9828 nsGenericElement ([none]) #document-fragment
    8219 EditTxn
    1793 nsRange
    1136 nsGenericElement (xhtml) td

distribution of the sizes of spans (number on the left of the : is the size, number on the right is the number of spans with that size):  {2: 652, 3: 34532, 4: 4492, 5: 8218, 6: 11127, 7: 611, 8: 311, 9: 424, 42: 2, 11: 9, 12: 91, 13: 30, 44: 2, 16: 2, 15: 2, 28: 2, 10: 3}
Many small ones, a few around 40.

same thing for divs:
{128: 1, 2: 93630, 3: 114245, 4: 42151, 5: 677, 6: 11386, 7: 20260, 8: 11123, 9: 322, 10: 18, 11: 92, 12: 7, 13: 111, 14: 9, 15: 133, 16: 29, 17: 17, 18: 14, 19: 15, 20: 1, 21: 3, 23: 8, 24: 4, 25: 3, 26: 1, 27: 3, 29: 15, 32: 1, 33: 7, 34: 1, 35: 1, 40: 1, 44: 2, 45: 2, 46: 2, 55: 1, 132: 1, 443: 1, 61: 1, 140: 1, 207: 1, 90: 1, 59: 1, 109: 1, 114: 1, 124: 1}
Again, lots of small ones.
(In reply to Johnny Stenback (:jst, jst@mozilla.com) from comment #3)
> Hmm, 12k+ EditTxn's seems really odd. Ehsan, what causes us to create
> EditTxn objects?

There are actually a total of 17669 EditTxns.
I'd like to leave this open until I confirm that 702813 is the underlying cause here (well, an underlying cause...), but we can remove the MemShrink tag for the time being.
Whiteboard: [MemShrink]
Here's the sizes of the connected subgraphs in dvander's graph.  The format m=n(k) means that there are n blobs of size m, for a total of k nodes in subgraph of size m.

1=821(821), 2=9924(19848), 3=84(252), 4=145(580), 5=9(45), 6=10(60), 7=8(56), 9=3(27), 10=10278(102780), 11=4(44), 13=2(26), 15=1(15), 16=2(32), 22=2(44), 28=1(28), 29=1(29), 32=1(32), 33=907(29931), 34=411(13974), 35=130(4550), 36=6822(245592), 37=653(24161), 38=84(3192), 39=468(18252), 40=53(2120), 41=73(2993), 42=117(4914), 43=5(215), 44=6(264), 45=31(1395), 46=3(138), 47=2(94), 48=1(48), 51=1(51), 56=1(56), 58=1(58), 59=1(59), 63=1(63), 66=1(66), 167=1(167), 226=1(226), 242=1(242), 266=1(266), 1759=1(1759), 193192=1(193192),

The largest thing here is the one huge blob, but there are even more nodes in subgraphs of size 36, for whatever reason, and a fair number in subgraphs of size 10.
Okay, I forgot that this is actually not the raw graph, but a graph with various things stripped out, including childless nodes and JS...
Here are the first 20 subgraphs of size 10 and 36 from this CC log.  Notice that they are almost all the same (well, within a size class).  As with the stats I gave in the previous comment, this is with nsNodeInfo edges, garbage nodes, JS nodes, nsJSEventListeners and childless nodes stripped from the graph.
When I don't strip the 1.36 million JS nodes from the graph, then almost all of the graph is contained in a single almost 2 million node subgraph.  So, unsurprisingly, JS is somehow gluing together all of the subgraphs into a giant blob.

On the other hand, when I remove marked JS nodes (there are a million that the CC had in its graph), which corresponds more closely to the normal CC behavior, then the largest blob is "only" 400k nodes, and we have 123k nodes in subgraphs of size 12 and 252k nodes in subgraphs of size 37, which is basically the same behavior as from comment 7.
Hopefully the CC speedup work has helped here.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → INCOMPLETE
It hasn't. I can still reproduce 10-20 second browser pause times just by leaving that url open for 2-3 days and then clicking on the tab. I don't know if that deserves a separate bug though because these days my overall browsing experience doesn't seem to be hindered (yet).
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---
David, could you check whether those are GC or CC pauses, and if CC pauses,
create CC log using the script in the first gray box
https://wiki.mozilla.org/Performance:Leak_Tools#Cycle_collector_heap_dump
(note, the log may contain URLs of visited pages).
Send the log to me and Andrew.
Over the weekend the page crawled up to using 2GB of memory. It looks like the browser was doing multiple ~500ms CCs and GCs in a row so I got 2-3 second lockups. I got a ~3 minute lockup trying to click on the tab.

I tried to profile it but accidentally killed the process, so I'll post back in a few days.
Depends on: 795128
Reopen if this is still a problem, I guess.
Status: REOPENED → RESOLVED
Closed: 12 years ago11 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: