Closed Bug 1379282 Opened 6 years ago Closed 6 years ago

Improve XPCOM's pointer hashing functions for pointers to neighboring memory locations

Categories

(Core :: XPCOM, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla56
Tracking Status
firefox56 --- fixed

People

(Reporter: ehsan.akhgari, Assigned: ehsan.akhgari)

References

Details

Attachments

(1 file, 1 obsolete file)

The simplistic shift-based hashing function creates a lot of collisions
for pointers pointing to arrays as it doesn't do a great job at distributing
the data randomly based on the input bytes.
The simplistic shift-based hashing function creates a lot of collisions
for pointers pointing to arrays as it doesn't do a great job at distributing
the data randomly based on the input bytes.
Attachment #8884426 - Flags: review?(bugs)
Blocks: 1377999
Comment on attachment 8884426 [details] [diff] [review]
Improve XPCOM's pointer hashing functions for pointers to neighboring memory locations

Perhaps erahm can review too?
Attachment #8884426 - Flags: review?(erahm)
Comment on attachment 8884426 [details] [diff] [review]
Improve XPCOM's pointer hashing functions for pointers to neighboring memory locations

I guess rs+. Hard to say whether that hashing function is particularly good, but can't probably be worse than what the current one is.
Attachment #8884426 - Flags: review?(bugs) → review+
Comment on attachment 8884426 [details] [diff] [review]
Improve XPCOM's pointer hashing functions for pointers to neighboring memory locations

Review of attachment 8884426 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/ds/PLDHashTable.cpp
@@ -71,5 @@
>  
>  /* static */ PLDHashNumber
>  PLDHashTable::HashVoidPtrKeyStub(const void* aKey)
>  {
> -  return uintptr_t(aKey) >> 2;

The old hash here is clearly terrible!

::: xpcom/ds/nsPointerHashKeys.h
@@ +40,5 @@
> +    auto ptr = reinterpret_cast<uintptr_t>(aKey);
> +    uint32_t low = ptr & 0xffffffff;
> +    uint32_t hi = (ptr & 0xffffffff00000000) >> 32;
> +
> +    uint32_t hash_hi = mozilla::HashGeneric(hi);

You should be able to call HashGeneric() directly on aKey. The comments in HashFunctions.h say HashGeneric works with "uint32_t, types which can be implicitly cast to uint32_t, data pointers, and function pointers."
Assignee: nobody → ehsan
(In reply to Nicholas Nethercote [:njn] from comment #4)
> ::: xpcom/ds/nsPointerHashKeys.h
> @@ +40,5 @@
> > +    auto ptr = reinterpret_cast<uintptr_t>(aKey);
> > +    uint32_t low = ptr & 0xffffffff;
> > +    uint32_t hi = (ptr & 0xffffffff00000000) >> 32;
> > +
> > +    uint32_t hash_hi = mozilla::HashGeneric(hi);
> 
> You should be able to call HashGeneric() directly on aKey. The comments in
> HashFunctions.h say HashGeneric works with "uint32_t, types which can be
> implicitly cast to uint32_t, data pointers, and function pointers."

Sure, but the problem with doing that is that you'd get a 32-bit number back, which you'd need to expand into a 64-bit hash key, which means you'd need to synthesize 32 bits of the key from the other 32 which seems poor.  This is why I would like to fix bug 1377947, once that happens, the #ifdef branch here can simply be removed.  Until then I'd prefer have better hash key distribution here.

(Also HashGeneric() is extremely unsafe to use in the face of 64-bit input because of this choice: <https://searchfox.org/mozilla-central/rev/f1472e5b57fea8d4a7bbdd81c44e46958fe3c1ce/mfbt/HashFunctions.h#164>.  If you screw things up and for example cast the void* to a uintptr_t before passing it to HashGeneric(), it will "hash" it to 32-bits by dropping the high 32-bits.  :-(  So I prefer to not play with that fire.  I should probably file that bug also.)
(In reply to :Ehsan Akhgari (Away 7/10-7/17; needinfo please, extremely long backlog) from comment #5)
> (Also HashGeneric() is extremely unsafe to use in the face of 64-bit input
> because of this choice:
> <https://searchfox.org/mozilla-central/rev/
> f1472e5b57fea8d4a7bbdd81c44e46958fe3c1ce/mfbt/HashFunctions.h#164>.  If you
> screw things up and for example cast the void* to a uintptr_t before passing
> it to HashGeneric(), it will "hash" it to 32-bits by dropping the high
> 32-bits.  :-(  So I prefer to not play with that fire.  I should probably
> file that bug also.)

(Filed bug 1379290 with a fix.)
FWIW I am no longer 100% sure that the fix to bug 1377333 was the right thing to do.  Perhaps we should back that out, and just pass the pointer to HashGeneric() directly here.
Comment on attachment 8884426 [details] [diff] [review]
Improve XPCOM's pointer hashing functions for pointers to neighboring memory locations

Review of attachment 8884426 [details] [diff] [review]:
-----------------------------------------------------------------

I'm going to r- for now until we have a bit more discussion. I don't think this is fundamentally wrong I just want to make sure we're fixing the right thing.

::: xpcom/ds/nsPointerHashKeys.h
@@ -34,5 @@
>  
>    static KeyTypePointer KeyToPointer(KeyType aKey) { return aKey; }
>    static PLDHashNumber HashKey(KeyTypePointer aKey)
>    {
> -    return uintptr_t(aKey) >> 2;

I think we might be missing the point here, this is supposed to be a key that gets hashed, we don't really need to do the hashing here. Just something rather unique that makes quick comparisons easy. For pointers that can just be the pointer.

IIRC the general idea is that for allocated things the allocator is always going to give you at least 4-byte aligned addresses (so the lower 2-bits are zero), we chop those off in hopes of adding more entropy (or something like that). Underneath the hood we expect PLDHashTable to do the actual hashing.

First it takes the |HashKey| value and multiplies by a golden ratio and masks some bits [1], ideally so we get better distribution. Next it uses a 2-stage hash to index into the hash table [2,3] and iterates if necessary [4].

It's possible we could improve that algorithm, but it should be functioning okay. Are you actually seeing collisions or is this all hypothetical? Can you reference some code that has problems with this?

[1] http://searchfox.org/mozilla-central/rev/b7e531410ebc1c7d74a78e86a894e69608db318c/xpcom/ds/PLDHashTable.cpp#506-521
[2] http://searchfox.org/mozilla-central/rev/b7e531410ebc1c7d74a78e86a894e69608db318c/xpcom/ds/PLDHashTable.cpp#250-254
[3] http://searchfox.org/mozilla-central/rev/b7e531410ebc1c7d74a78e86a894e69608db318c/xpcom/ds/PLDHashTable.cpp#256-277
[4] http://searchfox.org/mozilla-central/rev/b7e531410ebc1c7d74a78e86a894e69608db318c/xpcom/ds/PLDHashTable.cpp#371-402
Attachment #8884426 - Flags: review?(erahm) → review-
To answer my own question it looks like this is related to bug 1377999, comment 11.
(In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment #9)
> To answer my own question it looks like this is related to bug 1377999,
> comment 11.

Yes, this isn't a theoretical fix, it wins several points on Speedometer on a build of Firefox that is affected by that issue.  Does that make you reconsider your review?  :-)
Flags: needinfo?(erahm)
Comment on attachment 8884426 [details] [diff] [review]
Improve XPCOM's pointer hashing functions for pointers to neighboring memory locations

Review of attachment 8884426 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/ds/nsPointerHashKeys.h
@@ +43,5 @@
> +
> +    uint32_t hash_hi = mozilla::HashGeneric(hi);
> +
> +    return uintptr_t(mozilla::HashGeneric(low)) |
> +           (uintptr_t(hash_hi) << 32);

You WONTFIX'd expanding PLDHashNumber to 64-bits in bug 1377333, so you should probably remove this left shift << 32.
The simplistic shift-based hashing function creates a lot of collisions
for pointers pointing to arrays as it doesn't do a great job at distributing
the data randomly based on the input bytes.
Attachment #8887982 - Flags: review?(erahm)
Attachment #8884426 - Attachment is obsolete: true
Comment on attachment 8887982 [details] [diff] [review]
Improve XPCOM's pointer hashing functions for pointers to neighboring memory locations

Review of attachment 8887982 [details] [diff] [review]:
-----------------------------------------------------------------

Given the improvement you saw I'm going to r+ this, I'm still concerned that PLDHashTable's internal hash performs so poorly (and that we're double, maybe triple hashing at this point).

::: xpcom/ds/nsPointerHashKeys.h
@@ +35,5 @@
>  
>    static KeyTypePointer KeyToPointer(KeyType aKey) { return aKey; }
>    static PLDHashNumber HashKey(KeyTypePointer aKey)
>    {
> +    return mozilla::HashGeneric(aKey);

Side note: We use the shift-hash in PLDHashTable's default void ptr hash [1] as well, I wonder if we should change that in a follow up.

[1] http://searchfox.org/mozilla-central/search?q=symbol:_ZN12PLDHashTable18HashVoidPtrKeyStubEPKv&redirect=false
Attachment #8887982 - Flags: review?(erahm) → review+
(In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment #13)
> Given the improvement you saw I'm going to r+ this, I'm still concerned that
> PLDHashTable's internal hash performs so poorly (and that we're double,
> maybe triple hashing at this point).

I think you are perhaps misunderstanding how PLDHashTable performs its hashing, since there is no double hashing involved.

Let me try to explain:

 * For example, let's say the user calls PLDHashTable::Search().
 * ComputeKeyHash() will call the function I'm modifying in the patch here (which does one level of hashing) in https://searchfox.org/mozilla-central/rev/a83a4b68974aecaaacdf25145420e0fe97b7aa22/xpcom/ds/PLDHashTable.cpp#511 and multiplies the result by the golden ratio (why?! this makes little sense to me.  I think perhaps you're thinking of this as a real hash function, but it isn't, since it's not changing the distribution of the input data in any way.  Perhaps the code really meant to use this well-known trick <https://searchfox.org/mozilla-central/rev/88180977d79654af025158d8ebeb8c2aa11940eb/mfbt/HashFunctions.h#78>, but at any rate what it is doing now is as good as not doing anything whatsoever.  Indeed if we removed the multiplication on that line nothing would really change about the quality of our hashtable implementation, as can easily be demonstrated in the pathological case I ran into in bug 1377999, comment 11.)
 * This hash value is then passed to Hash1() which, despite its name, computes the index into the open addressing table using the precomputed hash.  This index is what Hash1() returns. https://searchfox.org/mozilla-central/rev/a83a4b68974aecaaacdf25145420e0fe97b7aa22/xpcom/ds/PLDHashTable.cpp#251

Basically, before my patch, the hash function that we use is the shift to right by 2 bits function.  After, it is mozilla::HashGeneric().  The former is a terrible hash function, because it doesn't randomly distribute the input space onto the output space, the latter is a much better hash function.

> ::: xpcom/ds/nsPointerHashKeys.h
> @@ +35,5 @@
> >  
> >    static KeyTypePointer KeyToPointer(KeyType aKey) { return aKey; }
> >    static PLDHashNumber HashKey(KeyTypePointer aKey)
> >    {
> > +    return mozilla::HashGeneric(aKey);
> 
> Side note: We use the shift-hash in PLDHashTable's default void ptr hash [1]
> as well, I wonder if we should change that in a follow up.

Ah yeah, that is broken in the exact same way.  Will submit a follow-up.
Pushed by eakhgari@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5eeb5b9f118b
Improve XPCOM's pointer hashing functions for pointers to neighboring memory locations; r=erahm
(In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from comment #14)
> > ::: xpcom/ds/nsPointerHashKeys.h
> > @@ +35,5 @@
> > >  
> > >    static KeyTypePointer KeyToPointer(KeyType aKey) { return aKey; }
> > >    static PLDHashNumber HashKey(KeyTypePointer aKey)
> > >    {
> > > +    return mozilla::HashGeneric(aKey);
> > 
> > Side note: We use the shift-hash in PLDHashTable's default void ptr hash [1]
> > as well, I wonder if we should change that in a follow up.
> 
> Ah yeah, that is broken in the exact same way.  Will submit a follow-up.

Err, what am I talking about?!  My patch already fixes it. ;-)
(In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from comment #14)
> (In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment
> #13)
> > Given the improvement you saw I'm going to r+ this, I'm still concerned that
> > PLDHashTable's internal hash performs so poorly (and that we're double,
> > maybe triple hashing at this point).
> 
> I think you are perhaps misunderstanding how PLDHashTable performs its
> hashing, since there is no double hashing involved.

I get how it works (see comment 8). I'm not saying it's working correctly, just that it's silly we're doing rather involved hash now and then a rather poor attempt at improving distribution (what I'm referring to as the second hash) at a lower level which we've long assumed was good enough.

I'm not convinced it does such a poor job in a generic sense, but maybe you've hit a pathological case where a series of 8-byte aligned 64-bit pointers don't get great distribution. My guess is this probably used to be okay when pointers were 32-bit, it's possible just adopting what the spider monkey hash does (shifting the lower zero bits out and xor'ing the upper bits in) would help out and be slightly less heavy handed change. Anyhow all of that can be a follow up investigation and you certainly don't need to do it :)
Flags: needinfo?(erahm)
(In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment #17)
> (In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from
> comment #14)
> > (In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment
> > #13)
> > > Given the improvement you saw I'm going to r+ this, I'm still concerned that
> > > PLDHashTable's internal hash performs so poorly (and that we're double,
> > > maybe triple hashing at this point).
> > 
> > I think you are perhaps misunderstanding how PLDHashTable performs its
> > hashing, since there is no double hashing involved.
> 
> I get how it works (see comment 8). I'm not saying it's working correctly,
> just that it's silly we're doing rather involved hash now and then a rather
> poor attempt at improving distribution (what I'm referring to as the second
> hash) at a lower level which we've long assumed was good enough.

Ah yeah OK fair enough.  So we are probably in violent agreement there!

> I'm not convinced it does such a poor job in a generic sense, but maybe
> you've hit a pathological case where a series of 8-byte aligned 64-bit
> pointers don't get great distribution. My guess is this probably used to be
> okay when pointers were 32-bit, it's possible just adopting what the spider
> monkey hash does (shifting the lower zero bits out and xor'ing the upper
> bits in) would help out and be slightly less heavy handed change. Anyhow all
> of that can be a follow up investigation and you certainly don't need to do
> it :)

FWIW I looked into it a bit more.  This problem wasn't only an issue with addresses with 8-byte differences, all you'd need was for them to be relatively close to each other, aka most of the incoming bits being the same.  In general multiplication alone doesn't really redistribute the incoming bits that well, so the distribution of the output space won't change much, multiplication + shift however works super well (and in fact is suggested as an effective universal hashing algorithm for hashing integers <https://arxiv.org/abs/1504.06804>).

I may spend some more time looking into this and see if I can convince myself that I should remove that multiplication altogether in a follow-up.  It's kind of hard to know 100% certainly if that is the right thing to do, I'm not quite sure how to evaluate the impact of such a change.  Doing something like what SpiderMonkey does is another option but I suspect hashing once using HashGeneric() should be all that is needed really.

(My future grand plans also include eventually switching HashGeneric to use universal hashing to improve our hashing distribution for all of our hashtables but I have yet to figure out how to test the impact of such a change...)
FWIW this comment seems to explain the reason for this multiplication: https://searchfox.org/mozilla-central/rev/3a3af33f513071ea829debdfbc628caebcdf6996/js/public/Utility.h#529  It is used to add entropy to the high bits in the number.
(In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from comment #19)
> FWIW this comment seems to explain the reason for this multiplication:
> https://searchfox.org/mozilla-central/rev/
> 3a3af33f513071ea829debdfbc628caebcdf6996/js/public/Utility.h#529  It is used
> to add entropy to the high bits in the number.

So it's this part that has me I'm all sorts of confused:

>     * This value is especially good if values with consecutive hash codes
>     * are stored in a hash table; see Knuth for details.

That seems to *not* be the case for you.
https://hg.mozilla.org/mozilla-central/rev/5eeb5b9f118b
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
(In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment #20)
> (In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from
> comment #19)
> > FWIW this comment seems to explain the reason for this multiplication:
> > https://searchfox.org/mozilla-central/rev/
> > 3a3af33f513071ea829debdfbc628caebcdf6996/js/public/Utility.h#529  It is used
> > to add entropy to the high bits in the number.
> 
> So it's this part that has me I'm all sorts of confused:
> 
> >     * This value is especially good if values with consecutive hash codes
> >     * are stored in a hash table; see Knuth for details.
> 
> That seems to *not* be the case for you.

Yes, that comment seems wrong to me...  :/
(In reply to Ehsan from comment #14)
> Basically, before my patch, the hash function that we use is the shift to
> right by 2 bits function.  After, it is mozilla::HashGeneric().  The former
> is a terrible hash function, because it doesn't randomly distribute the
> input space onto the output space, the latter is a much better hash function.

I guess what I don't understand is why the multiplication by the golden ratio:
https://searchfox.org/mozilla-central/rev/6326724982c66aaeaf70bb7c7ee170f7a38ca226/xpcom/ds/PLDHashTable.cpp#507-522
didn't fix that distribution problem.  That's what it's supposed to do.
You need to log in before you can comment on or make changes to this bug.