Closed
Bug 515812
Opened 15 years ago
Closed 15 years ago
Double hashing template
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: jorendorff, Assigned: luke)
References
Details
(Whiteboard: fixed-in-tracemonkey)
Attachments
(5 files, 2 obsolete files)
21.34 KB,
patch
|
Details | Diff | Splinter Review | |
63.37 KB,
patch
|
jorendorff
:
review+
|
Details | Diff | Splinter Review |
648 bytes,
patch
|
jorendorff
:
review+
|
Details | Diff | Splinter Review |
4.95 KB,
patch
|
dvander
:
review+
|
Details | Diff | Splinter Review |
6.91 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
Just about every hash table we create in js/src has pointer-sized keys and values. Sometimes the key is the value. Double hashing should win for these tables.
The attachment is a first cut. Clearly needs tuning judging by early results (below-- note the weird curve, suggesting a silly fixed cost-- the test case is from bug 506410 comment 3 and is the kind of case where double hashing should win). Something is wrong, but I don't have time for it at the moment.
Outer 100000, Inner 100
[unordered_map] 1.29971
[HashMap] 0.80612
[HashSet] 0.76645
Outer 1000, Inner 10000
[unordered_map] 1.39445
[HashMap] 0.47037
[HashSet] 0.36216
Outer 100, Inner 100000
[unordered_map] 1.58923
[HashMap] 0.86247
[HashSet] 0.60625
![]() |
Assignee | |
Comment 1•15 years ago
|
||
This is an initial version of a fairly direct translation of JS_DHashTable algorithm into slightly more modern C++ style. In my benchmarks, it is 1.5x - 3.2x faster than JS_DHashTable. Basic use looks like:
typedef js::HashMap<int, double> HM;
HM hm(cx); // default allocator uses JSContext::malloc
if (!hm.init()) // fallible initialization :-(
return false;
HM::AddPtr p = hm.lookupForAdd(1);
if (p)
cout << p->key << ' ' << p->value << '\n';
else
if (!hm.add(p, 1, 3.0))
return false;
if (HM::Ptr p = hm.lookup(1))
hm.remove(p);
for (HM::Range r = hm.all(); !r.empty(); r.popFront())
cout << r.front().key << ' ' << r.front().value << '\n';
with shorthand:
if (!hm.put(5, 30.2))
return false;
hm.has(5);
hm.remove(5);
Syntactic and other comments welcome!
A few notes on this implementation vs. the one in attachment 399878 [details] [diff] [review]:
- I borrowed the HashTable/HashMap/HashSet design, so there is also a HashSet :)
- I initially dropped the whole collision-bit business (since it adds a small overhead to all operations), however one of my synthetic benchmarks from bug 506410 showed js::HashMap to be 30% slower than JS_DHashTable. Shark showed that practically all this extra time was spent following collision paths. Adding a collision-bit resulted in a 2.5x speedup, so I'm going leaving it in.
- Recomputing hashes (instead of storing them) is not an option when hashing is expensive. I initially added a compile-time constant to the HashingPolicy that allowed the user to either request cached hashes or provide an interface, like attachment 399878 [details] [diff] [review], for detecting/marking live/free/tombstoned values. In general, this did not show up in benchmarks except for specific iteration sizes where, presumably, the 2x smaller version allowed the table to fit in cache. When I added the collision-bit, I took this whole feature out since the two do not go well together.
- Having an inline buffer would avoid the need for a fallible constructor/init. However, after working with them in bug 506410, it seemed like a waste of 16x2x4 bytes per hash table since they hash tables are usually long-lived and big.
Assignee: general → lw
Status: NEW → ASSIGNED
Comment 2•15 years ago
|
||
(In reply to comment #1)
> for (HM::Range r = hm.all(); !r.empty(); r.popFront())
> cout << r.front().key << ' ' << r.front().value << '\n';
The Range class (or its version) needs a method to remove the key under enumeration. That is used in a few places with the current hashtable enumeration callbacks.
![]() |
Assignee | |
Comment 3•15 years ago
|
||
(In reply to comment #2)
> (In reply to comment #1)
> > for (HM::Range r = hm.all(); !r.empty(); r.popFront())
> > cout << r.front().key << ' ' << r.front().value << '\n';
>
> The Range class (or its version) needs a method to remove the key under
> enumeration. That is used in a few places with the current hashtable
> enumeration callbacks.
Right, I had been thinking about this, but there were some issues, so I left it out of this initial "release". But you're right; this should be dealt with now.
Simply sticking a "removeFront" member in either the table or the Range runs into the problem that the Range is invalidated if the table is resized (even if the Range is re-pointed to the new table). To address this, JS_DHashTableEnumerate avoids resizing until the enumeration is complete, and then possibly resizes/rehashes. However, with external iteration, the closest thing we have is the Range destructor. This could work, but then the Range destructor is a lot more "active" than the average destructor, which might lead to unexpected behavior.
This behavior could be made more explicit by having two enumeration types:
struct Range { /* unchanged */ };
struct Enumeration : Range { void removeFront(); void endEnumeration(); };
Range all() const;
Enumeration beginEnumeration();
and having ~Enumeration() call endEnumeration() if it hasn't already been.
Thoughts, anyone?
![]() |
Assignee | |
Comment 4•15 years ago
|
||
(In reply to comment #3)
Upon a little more discussion with dvander, I'm further inclined toward the Enumeration/beginEnumeration strategy, since it allows Range to have a copy constructor and Enumeration to hide its copy constructor.
![]() |
Assignee | |
Comment 5•15 years ago
|
||
This patch includes the ability to remove during enumeration like so:
HashSet<int> hs;
for (HashSet<int>::Enum e(hs.enumerate()); !e.empty(); e.popFront())
if (dontlike(e.front()))
e.removeFront();
As a use case, the patch also replaces JS_DHashTable usage in jstracer.cpp and jsarray.cpp. In each case, the resulting code is, IMHO, easier to read. None of these are on hot paths so TSS is not affected.
Attachment #415527 -
Attachment is obsolete: true
![]() |
Assignee | |
Comment 6•15 years ago
|
||
Looking at the patch in bug 453929 (thanks Brendan), I noticed a slight generalization was needed to capture existing JS_DHashTable usage where the type of the value used to lookup an entry was different from the type of the key stored in the entry. This is easy to do with an extra "Lookup" associated type in the template hashing policy.
The patch try's fine; about ready for review now.
Attachment #415900 -
Attachment is obsolete: true
Comment 7•15 years ago
|
||
Comment on attachment 416317 [details] [diff] [review]
added "lookup type"
Great patch, it's like when your silver-age comic heroes are reimagined by a non-wanker hawt new artist, with new costumes, gnarlier powers, and more sax&violins ;-) -- stylish!
> TraceRecorder::guardShape(LIns* obj_ins, JSObject* obj, uint32 shape, const char* guardName,
> LIns* map_ins, VMSideExit* exit)
> {
> // Test (with add if missing) for a remembered guard for (obj_ins, obj).
>+ GuardedShapeTable::AddPtr p = guardedShapeTable.lookupForAdd(obj_ins);
>+ if (p) {
>+ JS_ASSERT(p->value == obj);
> return RECORD_CONTINUE;
>+ } else {
>+ if (!guardedShapeTable.add(p, obj_ins, obj))
>+ return RECORD_ERROR;
>+ }
Nit: else after return non-sequitur.
/be
![]() |
Assignee | |
Updated•15 years ago
|
Attachment #416317 -
Flags: review?(brendan)
Comment 8•15 years ago
|
||
Comment on attachment 416317 [details] [diff] [review]
added "lookup type"
I like what I've read, but I'm a terrible reviewer. I only play at C++ hacker now and then. Jason, could you give this a proper review? Thanks, or bounce back if you can't get to it soonish.
/be
Attachment #416317 -
Flags: review?(brendan) → review?(jorendorff)
Reporter | ||
Comment 9•15 years ago
|
||
Comment on attachment 416317 [details] [diff] [review]
added "lookup type"
In jstracer.cpp:
>+ if (!set.init())
>+ abort();
Call OutOfMemoryAbort just for consistency?
Also, this patch has some tab characters in jstracer.h.
r=me, sorry for the slow review.
Attachment #416317 -
Flags: review?(jorendorff) → review+
![]() |
Assignee | |
Comment 10•15 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Whiteboard: fixed-in-tracemonkey
Comment 11•15 years ago
|
||
Bugs aren't supposed to be marked RESOLVED FIXED until checked into mozilla-central.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
![]() |
Assignee | |
Comment 12•15 years ago
|
||
Oops, got a bit carried away.
![]() |
Assignee | |
Comment 13•15 years ago
|
||
Apparently stdarg.h was being included for debug builds, but not opt.
http://hg.mozilla.org/tracemonkey/rev/584af5971ccf
![]() |
Assignee | |
Comment 14•15 years ago
|
||
Disregard comment 13, meant for a different bug.
![]() |
Assignee | |
Comment 15•15 years ago
|
||
Backed out, as this patch seems to cause PGO to die
http://hg.mozilla.org/tracemonkey/rev/fcf42ef466a1
![]() |
Assignee | |
Comment 16•15 years ago
|
||
This patch fixes the unintuitive behavior of the HashMap::put convenience function to overwrite when the key is found.
Attachment #427373 -
Flags: review?(jorendorff)
Reporter | ||
Updated•15 years ago
|
Attachment #427373 -
Flags: review?(jorendorff) → review+
![]() |
Assignee | |
Comment 17•15 years ago
|
||
![]() |
Assignee | |
Comment 18•15 years ago
|
||
Initially I was returning const Entry & to HashMap users thinking that 'mutable' would allow mutation of Entry::value, but, I failed to realize, this only works for Entry members.
Attachment #427801 -
Flags: review?(dvander)
![]() |
||
Updated•15 years ago
|
Attachment #427801 -
Flags: review?(dvander) → review+
Comment 19•15 years ago
|
||
The patch eliminates the need to call an extra enumerate() function when looping over hashtable elements changing usage pattern from:
for (HashMapType::Enum e(hashmap.enumerate()); !e.empty(); e.popFront()) ...
to less verbose version:
for (HashMapType::Enum e(hashmap); !e.empty(); e.popFront()) ...
Attachment #429099 -
Flags: review?(lw)
![]() |
Assignee | |
Comment 20•15 years ago
|
||
Comment on attachment 429099 [details] [diff] [review]
less verbose enumeration
Shorter and simpler; what's not to like. Thanks!
Attachment #429099 -
Flags: review?(lw) → review+
Comment 21•15 years ago
|
||
https://hg.mozilla.org/tracemonkey/rev/7a2c87fe31b7 - the enumeration change from the comment 19.
Comment 22•15 years ago
|
||
The same change that was done for HashTable::Enum could also be done for HashTable::Range allowing to do non-deleting enumerations over Hashtable without calling the all method changing a pattern like:
for (HM::Range r = h.all(); !r.empty(); r.popFront()) ...
into
for (HM::Range r(h); !r.empty(); r.popFront()) ...
luke, any objections?
![]() |
Assignee | |
Comment 23•15 years ago
|
||
(In reply to comment #22)
While that is attractive since it makes Range like Enum, they are rather different: an Enum object has identity and its lifetime has semantic meaning while a Range is a normal value type like an iterator that can be initialized and copied around with =. So, to me, its fine that they look different; they are different. Also, for value types, people generally seem to prefer initializer syntax over constructor syntax.
Comment 24•15 years ago
|
||
Enumerate implicitly seems to suggest exhaustiveness; Range doesn't to me. First change is nice, second doesn't quite seem right to me. But I'm not looking too closely here, maybe I'm missing the flavor of the second case.
Comment 25•15 years ago
|
||
Small problems with warnings in the Range class.
I might silence with #pragma for now but this needs fixing...
#
js\src\jshashtable.h(164) : warning C4512: 'js::detail::HashTable<T,HashPolicy,AllocPolicy>::Range' : assignment operator could not be generated
#
with
#
[
#
T=JSObject *const ,
#
HashPolicy=js::HashSet<JSObject *>::SetOps,
#
AllocPolicy=js::ContextAllocPolicy
#
]
#
js\src\jshashtable.h(134) : see declaration of 'js::detail::HashTable<T,HashPolicy,AllocPolicy>::Range'
#
with
#
[
#
T=JSObject *const ,
#
HashPolicy=js::HashSet<JSObject *>::SetOps,
#
AllocPolicy=js::ContextAllocPolicy
#
]
#
js\src\jshashtable.h(176) : see reference to class template instantiation 'js::detail::HashTable<T,HashPolicy,AllocPolicy>::Range' being compiled
#
with
#
[
#
T=JSObject *const ,
#
HashPolicy=js::HashSet<JSObject *>::SetOps,
#
AllocPolicy=js::ContextAllocPolicy
#
]
#
js\src\jshashtable.h(537) : see reference to class template instantiation 'js::detail::HashTable<T,HashPolicy,AllocPolicy>::Enum' being compiled
#
with
#
[
#
T=JSObject *const ,
#
HashPolicy=js::HashSet<JSObject *>::SetOps,
#
AllocPolicy=js::ContextAllocPolicy
#
]
#
js\src\jshashtable.h(873) : see reference to class template instantiation 'js::detail::HashTable<T,HashPolicy,AllocPolicy>' being compiled
#
with
#
[
#
T=JSObject *const ,
#
HashPolicy=js::HashSet<JSObject *>::SetOps,
#
AllocPolicy=js::ContextAllocPolicy
#
]
#
js/src/jscntxt.h(1361) : see reference to class template instantiation 'js::HashSet<T>' being compiled
#
with
#
[
#
T=JSObject *
#
]
![]() |
Assignee | |
Comment 26•15 years ago
|
||
Ah, thanks Mike. No default operator= when one member is a const.
http://hg.mozilla.org/tracemonkey/rev/bc0a50164c21
Comment 28•15 years ago
|
||
Status: REOPENED → RESOLVED
Closed: 15 years ago → 15 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•