Note: There are a few cases of duplicates in user autocompletion which are being worked on.

eliminate mLastChild field from PtrInfo

RESOLVED FIXED in mozilla7

Status

()

Core
XPCOM
RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: mccr8, Assigned: mccr8)

Tracking

Other Branch
mozilla7
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [MemShrink:P2])

Attachments

(2 attachments, 4 obsolete attachments)

(Assignee)

Description

6 years ago
Andreas pointed out to me that the mLastChild field isn't really needed in PtrInfo.  Instead, you just look at the mFirstChild field of the next node.  Indeed, this appears to be the standard representation of static graphs.

This saves one pointer per PtrInfo.  When running the MemBuster benchmark, there are about 100k to 300k nodes per cycle collection, so this could save a couple of megs of memory.

Andreas had a patch for this in Bug 548733 a year ago, but that also messes around with EdgePool, I think to lay the groundwork to support insertion (to support WeakMaps, but I guess it wasn't needed in the end?), but I think a much simpler patch can eliminate mLastChild.  I'm currently running a patch on try, I'll clean it up and attach it.
(Assignee)

Comment 1

6 years ago
Separately, I think we should consider reducing BlockSize for the NodePool.  Currently there are 32k nodes allocated at a time.  With the total numbers of nodes I'm seeing, there could be 10% to 25% internal fragmentation, which seems like quite a bit.  My guess is that the numbers weren't tuned after the removal of marked JS objects, which eliminates a huge number of nodes.

I'm also not really clear on why we have both mInternalRefs and mRefCount.  I feel like we could steal 2 more bits from the count to distinguish GC and RC objects, then just keep the ref count.  This would save another 4 bytes per node.  It might make some logging a little less informative, but it doesn't look too bad.
Blocks: 640457
(Assignee)

Updated

6 years ago
Assignee: nobody → continuation
(Assignee)

Comment 2

6 years ago
Created attachment 533794 [details] [diff] [review]
first attempt at a patch

Store mLastChild in mFirstChild of the next node.  Add an extra node to each block to handle this.  I added getter/setter methods from LastChild, so it can be accessed mostly as before.  I should probably add similar getter/setters for FirstChild so it doesn't look as asymmetric.  I'm also not sure if I should put explicit inline annotations on the new methods or not.
(In reply to comment #1)
> Separately, I think we should consider reducing BlockSize for the NodePool. 
> Currently there are 32k nodes allocated at a time.

I've seen that come up in profiles numerous times.  It did seem like surprisingly large chunk of memory and I wondered if it could be improved... custom allocators are usually suspect in my book.
(Assignee)

Comment 4

6 years ago
The patch breaks on the same Moth test on 3 platforms.  I'll have to look into it.
http://tbpl.mozilla.org/?tree=Try&rev=b7df081948ab
(Assignee)

Updated

6 years ago
Blocks: 638299

Comment 5

6 years ago
++andrew, thanks for looking into this
(Assignee)

Comment 6

6 years ago
Created attachment 534315 [details] [diff] [review]
eliminate mLastChild from PtrInfo

Eliminates the mLastChild pointer from PtrInfo, which will reduce space usage.  The mFirstChild field is made private, and getters and setters are added for the first and last child.

One unfortunate aspect of this implementation is that we are still doing just as many stores as before: we set both the first and last child for every node, while we only need to set one of them, except at the end of a block.  In a conventional implementation of a static graph, all nodes are allocated in a single block, so you can just have a FinishNodes() method that fills out the single dummy node.  In the Fx implementation, there are multiple blocks, which requires one dummy node per block.  Probably the best way to fix this would be to add a reference to the EdgePool to the NodePool builder, and then set the last child of a block in NodePool::Builder::Add when you move to the next block.  I'll try to think about whether the added complexity is worth it or not.  I don't think the extra stores are going to cost any additional cache misses, because the same address basically gets the same value written to it twice in a row.

I did some try runs with this patch.  After removing DEBUG_CC, I had a clean run except for OSX Debug Moth which was having intermittent oranges that showed up on somebody else's TM push, but after updating to the latest version, that Moth ran twice without any orange.
Attachment #533794 - Attachment is obsolete: true
(Assignee)

Comment 7

6 years ago
Created attachment 534882 [details] [diff] [review]
eliminate mLastChild from PtrInfo

Should just be some whitespace cleanup compared to the previous version of this patch.
Attachment #534315 - Attachment is obsolete: true
Attachment #534882 - Flags: review?(peterv)
(Assignee)

Comment 8

6 years ago
Created attachment 534885 [details] [diff] [review]
remove redundant stores of child pointers

Only set the child of the next node if we are as far as we are going to go in the current block.  This will be the case if we are at the end of the current block (which required adding a new predicate AtBlockEnd for NodePool enumerators), or if there are no more items in the queue.

This change is a bit awkward, because GCGraphBuilder doesn't know where we are in the NodePool, and nsCycleCollector doesn't know where we are in the EdgePool.  To solve this, I test in nsCycleCollector if we should store the last child, then pass the result of the test into Traverse.  There's a fair amount of redundant testing here between GetNext, AtBlockEnd, and the two calls to IsDone, but because it is all inlined hopefully the compiler can do something smart with it.

Part 1 can be used with or without this, depending on whether trading a redundant store to a location in the cache for (possibly) a few comparisons is worth the little bit of complexity.  I'll put this up for review if it passes a single platform try run.
(Assignee)

Comment 9

6 years ago
Comment on attachment 534885 [details] [diff] [review]
remove redundant stores of child pointers

Try run on this patch came back green for Linux64 opt and debug.
Attachment #534885 - Flags: review?(peterv)
Comment on attachment 534882 [details] [diff] [review]
eliminate mLastChild from PtrInfo

Review of attachment 534882 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/base/nsCycleCollector.cpp
@@ +531,5 @@
> +
> +    // this PtrInfo must be part of a NodePool
> +    EdgePool::Iterator LastChild()
> +    {
> +        return (this+1)->mFirstChild;

Spaces around +.

@@ +542,5 @@
> +
> +    // this PtrInfo must be part of a NodePool
> +    void SetLastChild(EdgePool::Iterator aLastChild)
> +    {
> +        (this+1)->mFirstChild = aLastChild;

Spaces around +.
Attachment #534882 - Flags: review?(peterv) → review+
Comment on attachment 534885 [details] [diff] [review]
remove redundant stores of child pointers

Review of attachment 534885 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/base/nsCycleCollector.cpp
@@ +1456,5 @@
>      {
>          return AddNode(s, aParticipant);
>      }
>  #endif
> +    void Traverse(PtrInfo* aPtrInfo, PRBool blockFinalNode);

Arguments are normally prefixed with 'a' in this file, so aBlockFinalNode

@@ +1804,1 @@
>      }

Could avoid doing the queue.IsDone() check for every call to Traverse if you you add a function to the graph builder that does: |mCurrPi->SetLastChild(mEdgeBuilder.Mark());|. Call it in the while-loop if queue.AtBlockEnd() and once after the while-loop if mGraph.mRootCount > 0. Up to you.
Attachment #534885 - Flags: review?(peterv) → review+
(Assignee)

Comment 12

6 years ago
In theory the compiler should be able to merge those redundant queue.IsDone() checks, but I'll try refactoring it a bit and see how it looks, thanks.
(Assignee)

Comment 13

6 years ago
That actually looks pretty nice.  I'll tryserver it and see how it goes.
>    NodePool::Enumerator queue(mGraph.mNodes);
>    while (!queue.IsDone()) {
>        PtrInfo *pi = queue.GetNext();
>        builder.Traverse(pi);
>        if (queue.AtBlockEnd())
>            builder.SetLastChild();
>    }
>    if (mGraph.mRootCount > 0)
>        builder.SetLastChild();
Whiteboard: [MemShrink:P2]
(Assignee)

Comment 14

6 years ago
Created attachment 538540 [details] [diff] [review]
part 1: eliminate mLastChild from PtrInfo

Addressed peterv's comments.
Attachment #534882 - Attachment is obsolete: true
Attachment #538540 - Flags: review?(continuation)
Attachment #538540 - Flags: checkin?
(Assignee)

Comment 15

6 years ago
Created attachment 538542 [details] [diff] [review]
part 2: eliminate redundant stores of PtrInfo child pointers

Addressed peterv's comments.

Carrying forward his r+ on both of these patches.  I tried them out on try and it came out okay.
Attachment #534885 - Attachment is obsolete: true
Attachment #538542 - Flags: review?(continuation)
Attachment #538542 - Flags: checkin?
(Assignee)

Updated

6 years ago
Attachment #538540 - Flags: review?(continuation) → review+
(Assignee)

Updated

6 years ago
Attachment #538542 - Flags: review?(continuation) → review+
(Assignee)

Updated

6 years ago
Keywords: checkin-needed
http://hg.mozilla.org/mozilla-central/rev/6c24a513319a
http://hg.mozilla.org/mozilla-central/rev/3f40708323d9
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → mozilla7

Updated

6 years ago
Attachment #538540 - Flags: checkin?

Updated

6 years ago
Attachment #538542 - Flags: checkin?
what turned out to be the actual savings by this patch?
(Assignee)

Comment 18

6 years ago
It saves 1 pointer per node in the CC graph.  With a few tabs open, there's around 15k to 44k nodes in the graph.  With MemBuster, there are 100k to 300k nodes.  The CC graph only exists for the duration of the cycle collection, so it doesn't exist for that long.
You need to log in before you can comment on or make changes to this bug.