Last Comment Bug 658672 - reduce BlockSize for NodePool and EdgePool in cycle collector
: reduce BlockSize for NodePool and EdgePool in cycle collector
Status: RESOLVED FIXED
[MemShrink:P2]
:
Product: Core
Classification: Components
Component: XPCOM (show other bugs)
: Other Branch
: All All
: -- normal (vote)
: mozilla8
Assigned To: Andrew McCreight [:mccr8]
:
Mentors:
Depends on:
Blocks: 638299 mslim-fx5+
  Show dependency treegraph
 
Reported: 2011-05-20 14:56 PDT by Andrew McCreight [:mccr8]
Modified: 2011-07-26 03:56 PDT (History)
11 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
crude patch to track memory usage and fragmentation in CC (6.82 KB, patch)
2011-06-11 14:21 PDT, Andrew McCreight [:mccr8]
no flags Details | Diff | Splinter Review
CC mem usage log for basic browsing around (techcrunch, cnn, facebook, etc.) (1.76 KB, text/plain)
2011-06-11 14:27 PDT, Andrew McCreight [:mccr8]
no flags Details
log mem usage for nodes and edges in CC (6.11 KB, patch)
2011-06-17 15:57 PDT, Andrew McCreight [:mccr8]
no flags Details | Diff | Splinter Review
reduce edge and node block size by 4 to reduce fragmentation (1.67 KB, patch)
2011-06-17 16:23 PDT, Andrew McCreight [:mccr8]
no flags Details | Diff | Splinter Review
reduce edge and node block size by a factor of 4 to reduce fragmentation (2.21 KB, patch)
2011-06-20 10:04 PDT, Andrew McCreight [:mccr8]
peterv: review+
Details | Diff | Splinter Review

Description Andrew McCreight [:mccr8] 2011-05-20 14:56:18 PDT
In the cycle collector, edges and nodes are each allocated in giant blocks.  Currently, the node blocks contain 32768 nodes and the edge blocks contain 65536.  At some point this may have made sense, but currently this seems way too large, and we're probably wasting a lot of space on internal fragmentation.  I suspect that pruning out marked JS objects has drastically reduced the number of objects in the CC graph.

With a few tabs open, there are around 14k-44k nodes and 25k-135k edges.  Even with the Membuster benchmark, I am only seeing around 129k-300k nodes and 360k-580 edges.  Nodes are about 5 pointers large, and edges are a single pointer.  If the number of nodes is randomly distributed, we're getting 16k nodes worth of internal fragmentation on average, which is more than the number of nodes we allocate in some cases.  That's about half a meg on a 64 bit system.

Drawbacks of smaller blocks:
- worsened locality: scanning across nodes and edges may jump around more
- more calls to malloc and free (but maybe faster due to smaller blocks?)
- increased space: a couple of pointers per block (not too much)

I'm not sure what a good block size would be.  Half the current size?  A quarter?  Even smaller?  I suppose there are diminishing returns on shrinking the block size.  Seems worth investigating, at least.
Comment 1 Nicholas Nethercote [:njn] 2011-05-20 16:08:01 PDT
(In reply to comment #0)
> 
> I'm not sure what a good block size would be.  Half the current size?  A
> quarter?  Even smaller?  I suppose there are diminishing returns on
> shrinking the block size.  Seems worth investigating, at least.

My suggestion: come up with a half-plausible metric, and then try a range of sizes.
Comment 2 Andrew McCreight [:mccr8] 2011-06-11 14:21:17 PDT
Created attachment 538718 [details] [diff] [review]
crude patch to track memory usage and fragmentation in CC
Comment 3 Andrew McCreight [:mccr8] 2011-06-11 14:27:32 PDT
Created attachment 538721 [details]
CC mem usage log for basic browsing around (techcrunch, cnn, facebook, etc.)

This is a log of memory usage for edges and nodes in the cycle collector.  The other major usage is the purple buffer, but that only allocates blocks of size 128, so fragmentation probably isn't too bad there.

In this run, fragmentation (edge frag + node frag) is around 800k-1.4m in a cycle collection.

Fragmentation is usually, but not always, worse for nodes than edges.

The number of blocks allocated for edges might be an interesting stat to look at with telemetry (# of node blocks can be inferred from # of nodes).
Comment 4 Andrew McCreight [:mccr8] 2011-06-17 15:57:55 PDT
Created attachment 540166 [details] [diff] [review]
log mem usage for nodes and edges in CC

My previous patch calculated used memory incorrectly (for the final block, it was counting unused memory, not used memory!).
Comment 5 Andrew McCreight [:mccr8] 2011-06-17 16:11:46 PDT
I think dividing the block sizes by 4 might be a reasonable compromise between block size and fragmentation.  We'll only get about 38 blocks for membuster-sized graphs, which doesn't seem too bad, and they are still pretty huge, so cache effects hopefully won't be too bad.

In terms of fragmentation, we can consider the worst case to basically be the size of a block (if we just barely kick over into a new block).

nodes:  32 bytes each
(current) 32768 per block * 32 = 1024k of fragmentation in worst case
(div 4)    8192 per block * 32 =  256k in worst case

edges:  8 bytes each
(current) 65536 per block * 8 = 524k of fragmentation in worst case
(div 4)   16384 per block * 8 = 131k of fragmentation in worst case

So we'd go from 1.5megs of fragmentation to 387k total, in the worst case.  The block size is also around the size of the smallest graphs (ignoring mostly empty shutdown CCs), so fragmentation will "only" be around 100%, instead of 300% or whatever.
Comment 6 Andrew McCreight [:mccr8] 2011-06-17 16:23:13 PDT
Created attachment 540175 [details] [diff] [review]
reduce edge and node block size by 4 to reduce fragmentation

Here's the patch.  It is extremely intricate.

I'll try server it, because you never know.
Comment 7 Andrew McCreight [:mccr8] 2011-06-18 10:51:05 PDT
It passed a try server run.  It doesn't show it on the page for some reason, but I retriggered the Android Mochitest 2 and it passed.

http://tbpl.mozilla.org/?tree=Try&rev=faa1ae6fed84
Comment 8 Andrew McCreight [:mccr8] 2011-06-20 10:04:37 PDT
Created attachment 540513 [details] [diff] [review]
reduce edge and node block size by a factor of 4 to reduce fragmentation

Minor update: I changed the comment that talks about "millions" of PtrInfo to instead refer to "hundreds of thousands" of PtrInfo, to reflect the current state of affairs that I have observed, and that drove this change.
Comment 9 Andrew McCreight [:mccr8] 2011-07-13 09:19:19 PDT
Review ping.  This is a pretty simple patch.
Comment 10 Peter Van der Beken [:peterv] - away till Aug 1st 2011-07-25 01:46:24 PDT
Comment on attachment 540513 [details] [diff] [review]
reduce edge and node block size by a factor of 4 to reduce fragmentation

Review of attachment 540513 [details] [diff] [review]:
-----------------------------------------------------------------

Sure, let's try this.
Comment 11 Andrew McCreight [:mccr8] 2011-07-25 10:08:08 PDT
http://hg.mozilla.org/integration/mozilla-inbound/rev/4a2758e01ea1
Comment 12 Marco Bonardo [::mak] 2011-07-26 03:56:20 PDT
http://hg.mozilla.org/mozilla-central/rev/4a2758e01ea1

Note You need to log in before you can comment on or make changes to this bug.