Closed Bug 1377949 Opened 7 years ago Closed 7 years ago

Switch SpiderMonkey to use 64-bit hash keys on 64-bit platforms

Categories

(Core :: JavaScript Engine, enhancement)

Type: enhancement
Priority: Not set
Severity: normal

Tracking


RESOLVED WONTFIX

People

(Reporter: ehsan.akhgari, Unassigned)

References

Details

See bug 1377333 where I did this for PLDHashTable. SpiderMonkey's HashNumber <https://searchfox.org/mozilla-central/rev/5bb6dfcb3355a912c20383578bc435c1d4ecf575/js/public/Utility.h#505> could use similar treatment. The work I am going to do will make this as simple as switching HashNumber to be size_t and fixing the SpiderMonkey code that assumes HashFunctions.h APIs return 32-bit numbers.
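As a rough sketch of the direction comment 0 describes (the macro test and alias below are illustrative only; the real typedef lives in js/public/Utility.h):

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical sketch: HashNumber tracks the platform word size instead of
// being a fixed uint32_t, so 64-bit platforms get 64-bit hash keys.
#if UINTPTR_MAX > 0xFFFFFFFFu
using HashNumber = uint64_t; // 64-bit hash keys on 64-bit platforms
#else
using HashNumber = uint32_t; // unchanged on 32-bit platforms
#endif

static_assert(sizeof(HashNumber) == sizeof(size_t),
              "HashNumber should track the platform word size");
```

The remaining work is then auditing SpiderMonkey code that assumes the HashFunctions.h APIs return 32-bit numbers.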
So it turns out that without doing this, my patches cause assertions like https://searchfox.org/mozilla-central/rev/a3a739de04ee6134c11546568a33dbb6a6a29907/js/src/jsatominlines.h#173 to fire because we'd end up comparing a 64-bit number against a trimmed 32-bit version of it. Is there any interest in fixing this bug on the JS side alongside my patches in bug 1377947? (Not sure who specifically is the right person to ask...) Without that, I'm not sure how to proceed there.
The atom assertion is easy to fix: atoms store a HashNumber, but fortunately we have 32 bits of padding we could steal (see NormalAtom and FatInlineAtom). We could use an #ifdef to only add the padding on 32-bit.

HashTableEntry in js/public/HashTable.h stores the HashNumber; we have to be careful not to regress memory usage. I think in most cases the data following it will be pointer-aligned, though, so there should be no difference in size.

If this was an improvement for PLDHashTable it's definitely worth trying for JS. We have some super perf-sensitive hash tables that might benefit.
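A minimal layout sketch of the #ifdef-padding idea (struct and field names here are illustrative, not the real NormalAtom definition): on 32-bit targets the atom keeps explicit padding next to the hash, and on 64-bit targets the widened hash occupies that space instead, so the size is unchanged either way.

```cpp
#include <cstdint>

using HashNumber = uintptr_t; // hypothetical: word-sized hash, per this bug

// Hypothetical layout sketch. On 32-bit: 4-byte hash + 4-byte padding; on
// 64-bit: 8-byte hash, no padding. Total size is the same in both modes
// relative to today's layout, and the data after the hash stays aligned.
struct NormalAtomSketch {
  HashNumber hash_;   // 4 bytes on 32-bit, 8 bytes on 64-bit
#if UINTPTR_MAX <= 0xFFFFFFFFu
  uint32_t padding_;  // 32-bit only: preserves the original size/alignment
#endif
  void* chars_;       // stand-in for the rest of the atom's fields
};

static_assert(sizeof(NormalAtomSketch) % sizeof(void*) == 0,
              "data after the hash stays pointer-aligned");
```

The same pointer-alignment observation is what makes HashTableEntry likely to stay the same size: the widened hash fills slack that alignment would have inserted anyway.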
Flags: needinfo?(ehsan)
(In reply to Jan de Mooij [:jandem] from comment #2)
> The atom assertion is easy to fix - atoms store a HashNumber but fortunately
> we have 32 bits of padding we could steal (see NormalAtom and FatInlineAtom).
> We could use an #ifdef to only add the padding on 32 bit.
>
> HashTableEntry in js/public/HashTable.h stores the HashNumber, we have to be
> careful not to regress memory usage. I think in most cases the data
> following it will be pointer-aligned though so there should be no difference
> in size.
>
> If this was an improvement for PLDHashTable it's definitely worth trying for
> JS. We have some super perf sensitive hash tables that might benefit.

The impact of this is a bit hard to judge, unfortunately. For example, I saw some improvements in the performance of the cycle collector hashtable, which tends to get really large, as a result of the fix for bug 1377333, and this is along the same lines. The situation right now is that for some data going into PLDHashTables on 64-bit builds (such as data hashed on pointers), it is very sad to go down to 32 bits and then back up to 64 bits, as this increases the chances of hashtable collisions. (We also have some unfortunate bugs in HashFunctions.h: if you pass something like an int64_t or a uint64_t to HashGeneric, as far as I can tell we currently hash it by dropping the high 32 bits, which of course could be fixed in other ways too.) Generally I do expect this to improve the performance of things that currently suffer from poor hashing (such as hashtables with too many collisions).

I have looked a lot at the performance of PLDHashTable but not at all at JS's equivalent. A quick skim of <https://searchfox.org/mozilla-central/rev/e8f4f51cd543f203e9cb861cecb7545ac43c836c/js/public/HashTable.h#1395> shows we also use open addressing and double hashing there, so basically the same mistakes (or choices, depending on how you look at it!) have been made in SpiderMonkey as well. :-) So I would expect the same observations to mostly carry over, assuming there are large hashtables with keys such as pointers. But even if not, hashtable performance is a huge problem on the Gecko side, so help with this is certainly appreciated for Gecko.

In places where we do want 32-bit hashes, for reasons such as memory usage or compatibility with some size constraints, I've been condensing the 64-bit hashes down to 32 bits with a simple formula like |(hash64 >> 32) ^ (hash64 & 0x00000000ffffffff)|, and I think that's OK as long as it happens at the very end, after we have computed the hash and right before we store/consume it. We can use this trick in the cases above too. I've already used it in the yet-to-be-posted patches I have for bug 1377947.
Flags: needinfo?(ehsan)
Turns out we won't need to do this because bug 1377333 was wontfixed. Sorry for the extra noise. :-/
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX