Closed Bug 32184 Opened 25 years ago Closed 24 years ago

PL_HashTableEnumerateEntries isn't stable over PL_HashTableRawLookup

Categories

(NSPR :: NSPR, defect, P3)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: beard, Assigned: wtc)

References

Details

Attachments

(1 file)

If a client of PLHashTable calls PL_HashTableEnumerateEntries(), and the 
PLHashEnumerator function calls PL_HashTableRawLookup(), then not all of the 
entries in the PLHashTable will be successfully visited. The reason is that 
PL_HashTableRawLookup() "splays" the hash bucket chains, to provide MRU caching.

Does this qualify as a bug in PL_HashTableEnumerateEntries() or is this a bug in 
our usage of it? This was discovered to cause PLHashEntry leaks when using 
PL_HashTableEnumerateEntries() to delete all of the entries of XPCOM's service 
hashtable.

A possible fix for this problem would be to add state to PLHashTable to indicate 
that an enumeration is taking place, in order to inhibit the splaying behavior. 
Here's a patch that I've written that corrects the problem:

Index: mozilla/nsprpub/lib/ds/plhash.h
===================================================================
RCS file: /cvsroot/mozilla/nsprpub/lib/ds/plhash.h,v
retrieving revision 3.4
diff -u -2 -r3.4 plhash.h
--- plhash.h	1999/04/21 21:37:48	3.4
+++ plhash.h	2000/03/17 02:34:52
@@ -75,4 +75,5 @@
     const PLHashAllocOps *allocOps;     /* allocation operations */
     void                *allocPriv;     /* allocation private data */
+    PRBool                enumerating;
 #ifdef HASHMETER
     PRUint32              nlookups;       /* total number of lookups */

Index: mozilla/nsprpub/lib/ds/plhash.c
===================================================================
RCS file: /cvsroot/mozilla/nsprpub/lib/ds/plhash.c,v
retrieving revision 3.3
diff -u -2 -r3.3 plhash.c
--- plhash.c	1999/04/21 21:37:48	3.3
+++ plhash.c	2000/03/17 02:35:09
@@ -133,4 +133,5 @@
     ht->allocOps = allocOps;
     ht->allocPriv = allocPriv;
+    ht->enumerating = PR_FALSE;
     return ht;
 }
@@ -180,4 +181,7 @@
     while ((he = *hep) != 0) {
         if (he->keyHash == keyHash && (*ht->keyCompare)(key, he->key)) {
+            /* don't reorder buckets during enumeration. */
+            if (ht->enumerating)
+                return hep;
             /* Move to front of chain if not already there */
             if (hep != hep0) {
@@ -361,5 +365,10 @@
     int rv, n = 0;
     PLHashEntry *todo = 0;
+    PRBool wasEnumerating;
 
+    /* guard against side-effects that break enumeration. */
+    wasEnumerating = ht->enumerating;
+    ht->enumerating = PR_TRUE;
+    
     nbuckets = NBUCKETS(ht);
     for (i = 0; i < nbuckets; i++) {
@@ -369,4 +378,5 @@
             n++;
             if (rv & (HT_ENUMERATE_REMOVE | HT_ENUMERATE_UNHASH)) {
+                PR_ASSERT(*hep  == he);
                 *hep = he->next;
                 if (rv & HT_ENUMERATE_REMOVE) {
@@ -388,4 +398,5 @@
         PL_HashTableRawRemove(ht, hep, he);
     }
+    ht->enumerating = wasEnumerating;
     return n;
 }
JSHashTable (mozilla/js/src/jshash.c) has exactly the same "feature" in it (the 
two pieces of code are nearly identical).
jshash.c was cloned by fur from NSPR plhash.c, which was NSPR1.0 prhash.c when 
kipp and I wrote it in 1995.

There are no enumerators I know of that do lookups.  I'm curious why XPCOM's 
enumerator needed to look for a key.

The patch looks good, better if you use PRPackedBool (JSPackedBool).  But I 
don't think jshash.c needs this sort of patch, over against a warning comment.  
Time spent on jshash.c might be better spent elsewhere in js/src and js/js2 ;-)

/be
The patch adds a new member to the PLHashTable structure.
Because the definition of the PLHashTable structure is
exposed in the public plhash.h header, this patch breaks
binary compatibility.  If we accept this patch, we will
have to bump the NSPR major version number.

I'm afraid that you'll have to rewrite your code to work
around this limitation.
I think it is better to avoid calls to PL_HashTableRawLookup() from the 
enumerator function, if feasible.

Otherwise, a different solution that doesn't break backward compatibility would 
be to define a new lookup function similar to PL_HashTableRawLookup, except that 
it doesn't reorder hash entries. The client code can then be modified to call 
the new lookup function, instead of PL_HashTableRawLookup, from the enumerator 
function.

I don't understand.  Are you telling me that there are deployments of NSPR out 
there that expect to upgrade just the binary library without recompiling?  And if 
so, how many?  How many times have they sustained an upgrade of just the NSPR 
library, and to what effect?  What's wrong with bumping the major version number?  
If this is a bug, which it seems to me that it is, it should be fixed, yes?  We 
could work around it, but it's better just to have a hash table that works.

Binary compatibility is not an end in itself.  Tell me the real cost of fixing 
this bug so we can weigh it against the value.  If the bug cost is greater than 
the value, we'll use a work-around.  If the value is greater than the cost, 
you'll implement the fix.
From an engineering perspective, bumping a version number isn't a big deal, but 
from a product perspective, it carries certain expectations. Saying we went to 
nspr5 just to fix a hashtable bug seems a bit excessive. I'm sure we can find an 
unused bit in the existing PLHashTable structure somewhere, or find another way 
to solve this problem.
Yes, there are NSPR clients that expect binary compatibility
in a minor version or patch level upgrade.

I like Srinivas's suggestion: adding a new lookup function
that doesn't reorder hash entries.  The reordering of hash
entries in the lookup function is not obvious and has caught
our other clients by surprise before.

How about adding these two new lookup functions?
PL_HashTableLookupConst
PL_HashTableRawLookupConst

The 'Const' in the function names suggests that
these function do not modify the internals of
the hash table (inspired by C++ const member
functions).

If we follow our policy strictly, we'll need to
bump the minor version number (i.e., NSPR 4.1)
when we add new functions.  So it'd be great if
you could avoid calling PL_HashTableRawLookup
from the enumerator function.  Alternatively,
you could implement the non-reordering lookup
function in your code for now (which is possible
since all the hash table data types are exposed)
and then switch to PL_HashTableRawLookupConst
when NSPR 4.1 is released.
Reassigned.
Assignee: srinivas → wtc
Adding an element to a structure at the end doesn't break binary compatibility, 
as long as the API clients correctly use the PL_NewHashTable/PL_HashTableDestroy 
functions. So, I'd argue we can do this legitimately, as clients should only be 
storing PLHashTables as pointers.
Does your hash enumerator function really need to
call PL_HashTableRawLookup?  Is it possible to call
PL_HashTableLookup instead?  If so, then we just
need to add a non-reordering version of
PL_HashTableLookup:
PR_EXTERN(void *)
PL_HashTableLookupConst(const PLHashTable *ht, const void *key);

Cf. PL_HashTableLookup:
PR_EXTERN(void *)
PL_HashTableLookup(PLHashTable *ht, const void *key);

If you really need to call PL_HashTableRawLookup, then
we'll have to add a non-reordering version of it too:
PR_EXTERN(PLHashEntry *const *)
PL_HashTableRawLookupConst(const PLHashTable *ht, PLHashNumber keyHash,
                           const void *key);

Note the 'const' qualifiers in the return type and
the first argument.

Cf. PL_HashTableRawLookup:
PR_EXTERN(PLHashEntry **)
PL_HashTableRawLookup(PLHashTable *ht, PLHashNumber keyHash, const void *key);
Status: NEW → ASSIGNED
Attached patch Proposed patch.Splinter Review
I don't like this approach. Just fix the existing API. My change is MUCH less 
drastic, and doesn't complicate the API.
Adding the 'const' versions of the hash table lookup
functions will also fix the other problem -- that
lookup is not a const operation.  You would think that
once you populate a table with entries, multiple threads
should be able to do lookups without having to lock the
table. This is not true with the current implementation of
the lookup functions because of the hash entry reordering.
So, you end up having to lock the table even though
you know the table's contents won't change.

I checked in patch 6906.
/cvsroot/mozilla/nsprpub/lib/ds/plhash.h, revision 3.5
/cvsroot/mozilla/nsprpub/lib/ds/plhash.c, revision: 3.4
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Target Milestone: --- → 4.1
Blocks: 32185
No longer blocks: 32185
But are hashtables aren't really thread-safe anyhow. Adding an element to a 
hashtable doesn't use locks, so we'd be forced to use global external locks 
anyway.
Blocks: 32185
Merged the new "const" plhash lookup functions into the
NSPRPUB_CLIENT_BRANCH.
/cvsroot/mozilla/nsprpub/lib/ds/plhash.h, revision 3.4.72.1
/cvsroot/mozilla/nsprpub/lib/ds/plhash.c, revision 3.3.72.1

Merged the new "const" plhash lookup functions into the
NSPRPUB_RELEASE_4_0_BRANCH.
Checking in plhash.h;
/cvsroot/mozilla/nsprpub/lib/ds/plhash.h, revision 3.4.58.1
/cvsroot/mozilla/nsprpub/lib/ds/plhash.c, revision 3.3.58.1
Target Milestone: 4.1 → 4.0.1
Target Milestone: 4.0.1 → 4.0.2
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: