Closed Bug 1488413 Opened 6 years ago Closed 6 years ago

Investigate if CycleCollector could use a cache on top of the graph's hashtable

Categories

(Core :: XPCOM, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla64
Tracking Status
firefox64 --- fixed

People

(Reporter: smaug, Assigned: smaug)

Details

Attachments

(1 file, 1 obsolete file)

Hashtable lookups are slow. Especially the access in CCGraphBuilder::AddNode shows up in profiles (these days it is inlined, but it is still there).

Initial results hint that a cache on top of hashtable could reduce hashtable usage. This is based on the guess that when traversing the objects, we often end up dealing with same objects during a short period of time.

The cache would use some memory, but only while CC is running.
Summary: Investigate if CycleColletor could use a cache on top of the graph hashtable → Investigate if CycleCollector could use a cache on top of the graph hashtable
Attached patch graph_cache.diff (obsolete) — Splinter Review
Based on local testing this helps quite a bit.
Cache hit rate in child process is > 50% and >40% in parent and performance profiles look quite different with this.


Hopefully the static_assert is ok for all the platforms.

remote: View your change here:
remote:   https://hg.mozilla.org/try/rev/8b5bedad2300830b45a2a63ed6b9b054233f3141
remote: 
remote: Follow the progress of your build on Treeherder:
remote:   https://treeherder.mozilla.org/#/jobs?repo=try&revision=8b5bedad2300830b45a2a63ed6b9b054233f3141
remote: recorded changegroup in replication log in 0.012s
Attachment #9006277 - Flags: review?(continuation)
Summary: Investigate if CycleCollector could use a cache on top of the graph hashtable → Investigate if CycleCollector could use a cache on top of the graph's hashtable
Comment on attachment 9006277 [details] [diff] [review]
graph_cache.diff

Review of attachment 9006277 [details] [diff] [review]:
-----------------------------------------------------------------

This is a neat idea! I never thought about how there might be this kind of pattern we could take advantage of.

Did you see any performance improvement on something like the big CC you get when cleaning up the HTML spec or TechCrunch?

It looks like the patch is more or less implementing an ad-hoc fixed-size hash table. Is the expected benefit of the patch because the memory locality is a lot better when looking up in a small inline hash table rather than the big hash table?

Is there some existing standard data structure that we could use here? I guess you can't just use a regular hash table because you want to displace existing entries rather than extending out a bucket or growing the table. Failing that, I'd really like you to define a new data structure here (HashCache maybe?), with lookupForAdd() and remove() methods, so we don't have all of this hashing and lookup logic scattered around in the regular cycle collector code.
Attachment #9006277 - Flags: review?(continuation) → review-
I see AddNode often taking 25% of CC slice without the patch and ~15% with the patch.

We have exactly similar approaches elsewhere because our hashtables are slow.
There is no standard data structure for this.
(In reply to Olli Pettay [:smaug] from comment #3)
> We have exactly similar approaches elsewhere because our hashtables are slow.

If this is a common pattern, it would be worth filing a separate bug about consolidating them into a single implementation that can be shared. (That doesn't need to block this bug.)
Comment on attachment 9006277 [details] [diff] [review]
graph_cache.diff

Would you be ok taking this until we have some generic data structure for this.

Need to look at all the cases where we use the approach and figure out what the generic API would look like. At least the underlying hashtable is different, and sizes and hash calculation...
(Atoms use this, and ContentLists and something else, can't recall all cases.)
Attachment #9006277 - Flags: review?(continuation)
(In reply to Olli Pettay [:smaug] from comment #5)
> Would you be ok taking this until we have some generic data structure for
> this.

What I'm interested in here is something that is just the exact same code as you have now, except wrapped up in a HashCache class (no template parameters or anything).

You'd just have three methods (oops, not 2 like I said before):

  // This either returns a pointer if present, or an index, if it isn't.
  Variant<PtrInfo*, uint32_t> lookupForAdd(void* aPtr);

  void add(uint32_t aIndex, PtrInfo* aPtrInfo);

  void remove(void* aPtr);
Ended up using Variant even though it makes code quite unreadable, since it doesn't hint what the values mean.
Attachment #9006277 - Attachment is obsolete: true
Attachment #9006277 - Flags: review?(continuation)
Attachment #9006349 - Flags: review?(continuation)
Comment on attachment 9006349 [details] [diff] [review]
graph_cache_2.diff

Review of attachment 9006349 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks.
Attachment #9006349 - Flags: review?(continuation) → review+
And does that still compile
remote: View your change here:
remote:   https://hg.mozilla.org/try/rev/e557aafeb0611a4a38a650f9b97bcf7ebf830688
remote: 
remote: Follow the progress of your build on Treeherder:
remote:   https://treeherder.mozilla.org/#/jobs?repo=try&revision=e557aafeb0611a4a38a650f9b97bcf7ebf830688
remote: recorded changegroup in replication log in 0.014s
Pushed by opettay@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e1e44835bf32
To improve performance, CycleCollector could use a cache on top of the graph's hashtable, r=mccr8
(In reply to Olli Pettay [:smaug] from comment #5)
> Comment on attachment 9006277 [details] [diff] [review]
> graph_cache.diff
> 
> Would you be ok taking this until we have some generic data structure for
> this.
> 
> Need to look at all the cases where we use the approach and figure out what
> the generic API would look like. At least the underlying hashtable is
> different, and sizes and hash calculation...
> (Atoms use this, and ContentLists and something else, can't recall all
> cases.)

We do something similar for eTLD [1], nsNodeInfoManager [2], and as mentioned atoms [3].

[1] https://searchfox.org/mozilla-central/rev/5a18fb5aeeec99f1ca1c36a697082c221189a3b9/netwerk/dns/nsEffectiveTLDService.h#50-73
[2] https://searchfox.org/mozilla-central/rev/5a18fb5aeeec99f1ca1c36a697082c221189a3b9/dom/base/nsNodeInfoManager.h#166
[3] https://searchfox.org/mozilla-central/rev/5a18fb5aeeec99f1ca1c36a697082c221189a3b9/xpcom/ds/nsAtomTable.cpp#224-226
And in ContentList https://searchfox.org/mozilla-central/rev/5a18fb5aeeec99f1ca1c36a697082c221189a3b9/dom/base/nsContentList.cpp#173
(I added 3 of those 4 :p )
https://hg.mozilla.org/mozilla-central/rev/e1e44835bf32
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
(In reply to Olli Pettay [:smaug] from comment #12)
> And in ContentList
> https://searchfox.org/mozilla-central/rev/
> 5a18fb5aeeec99f1ca1c36a697082c221189a3b9/dom/base/nsContentList.cpp#173
> (I added 3 of those 4 :p )

Filed bug 1491151.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: