Closed Bug 524632 Opened 15 years ago Closed 15 years ago

nanojit: utilise odd-numbered slots in CseFilter hash table

Tracking

()

Status:

RESOLVED FIXED

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

Details

(Whiteboard: fixed-in-nanojit, fixed-in-tamarin, fixed-in-tracemonkey)

Attachments

(1 file)

patch 15 years ago Nicholas Nethercote [inactive] 7.78 KB, patch	edwsmith : review+	Details \| Diff \| Splinter Review

Nicholas Nethercote [inactive]

Assignee

Description

•

15 years ago

CseFilter uses a hash set, call LInsHashSet. It has a size that is a power of two and so the modulo arithmetic in the hash calculation can be done with a bitmask. The bitmask used is this: const uint32_t bitmask = (cap - 1) & ~0x1; The '& ~0x1' forces the LSB to zero, which forces the hash values to be even, which means that the odd-numbered slots in the hash table are never used. This leads to more collisions and premature table resizing. There's apparently no hidden subtlety underlying the calculation, it's just been botched.

Brendan Eich [:brendan]

Comment 1

•

15 years ago

IIRC division (remainder) hash performs poorly without prime table size, compared to multiplicative hash, which is strength-reduced to a multiply feeding into an unsigned right shift instead of &. See jsdhash.h and the Knuth ref cited there. The multiply does cost, though. /be

Nicholas Nethercote [inactive]

Assignee

Comment 2

•

15 years ago

Is this the reference you mean? * Double hashing, a la Knuth 6. It's too terse for me to understand. However, nanojit/LIR.cpp has a different reference, one which claims to best Knuth's best: // inlined/separated version of SuperFastHash // This content is copyrighted by Paul Hsieh, For reference see : http://www.azillionmonkeys.com/qed/hash.html Having said that, the point of this bug is merely to fix an obvious performance bug in the existing code. If people want to argue about changing the hashing algorithm that can go in a new bug.

Brendan Eich [:brendan]

Comment 3

•

15 years ago

I'm not talking about hash functions (how you map a key to an integer), rather how you map that integer to an array index. But yeah, separate bug. I will file it if it's not already on file -- it may have been filed already against our old Knuth-based stuff. /be

Nicholas Nethercote [inactive]

Assignee

Comment 4

•

15 years ago

This doesn't really depend on bug 502778, but if this change is committed first it will cause all sorts of conflicts, so I'm marking it as dependent.

Depends on: 502778

Nicholas Nethercote [inactive]

Assignee

Comment 5

•

15 years ago

Attached patch patch — Details — Splinter Review

This patch tweaks the hash tables in three ways: - The bogus bitmask is fixed, so odd-numbered slots are utilised. This reduced the number of probes required by a factor of 6.4x for 3d-raytrace.js. - I've changed the load factor from 0.50 (well, it was effectively 1.0 because the odd-numbered slots weren't being used, which caused lots of probing) to 0.75. This resulted in roughly 50% more probes but half as many (expensive) table expansions. I played with some other load factors, it's hard to evaluate accurately because any differences are well below SunSpider's noise level... 0.75 seemed pretty good based on the counts of the relevant operations. (In order to make these measurements, I inserted some instrumentation that isn't in the attached patch.) - I've changed the quadratic probe function. When fiddling with the load factor, I was sometimes getting infinite loops, which was worrying. I'm now using the sequence h, h+1, h+3, h+6, h+10 which is "full period" for power-of-two sized tables, ie. it's guaranteed to explore all slots and thus terminate. It may also exhibit better locality as the steps are smaller than the old probe function. Again, perf differences due to changes in the probe function are below the noise threshold, but the termination guarantee makes this worth doing. With all three changes, I've seen SS speedups of 0--15ms... SS is being very noisy for me at the moment. But good speedups (eg. 3%, 6%) are seen in 3d-raytrace and crypto-md5 which are the tests that should benefit most. And Cachegrind confirms that many fewer instructions are executed in the find*() functions.

Attachment #410727 - Flags: review?(edwsmith)

Edwin Smith

Comment 6

•

15 years ago

Comment on attachment 410727 [details] [diff] [review] patch 0.75 sounds fine, In other similar hashtables and hashsets in tamarin we've settled on a load factor of 0.8 (similarly easy math)

Attachment #410727 - Flags: review?(edwsmith) → review+

Nicholas Nethercote [inactive]

Assignee

Comment 7

•

15 years ago

http://hg.mozilla.org/projects/nanojit-central/rev/df9b5fec8e0c

Whiteboard: fixed-in-nanojit

Steven Johnson

Comment 8

•

15 years ago

http://hg.mozilla.org/tamarin-redux/rev/d1bb609bb3fe

Whiteboard: fixed-in-nanojit → fixed-in-nanojit fixed-in-tamarin

Nicholas Nethercote [inactive]

Assignee

Comment 9

•

15 years ago

http://hg.mozilla.org/tracemonkey/rev/3e1ec91bf418

Whiteboard: fixed-in-nanojit fixed-in-tamarin → fixed-in-nanojit, fixed-in-tamarin, fixed-in-tracemonkey

Robert Sayre

Comment 10

•

15 years ago

http://hg.mozilla.org/mozilla-central/rev/3e1ec91bf418

Status: ASSIGNED → RESOLVED

Closed: 15 years ago

Resolution: --- → FIXED

You need to log in before you can comment on or make changes to this bug.

Bugzilla

nanojit: utilise odd-numbered slots in CseFilter hash table

Categories

(Core :: JavaScript Engine, defect)

Tracking

()

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

Details

(Whiteboard: fixed-in-nanojit, fixed-in-tamarin, fixed-in-tracemonkey)

Crash Data

Security

(public)

User Story

Attachments

(1 file)

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Attachment

General

Description

File Name

Content Type