Last Comment Bug 763636 - GC: Do not OOM when rekeying a hashtable
: GC: Do not OOM when rekeying a hashtable
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: Trunk
: All All
: -- normal (vote)
: mozilla16
Assigned To: Terrence Cole [:terrence]
:
: Jason Orendorff [:jorendorff]
Mentors:
Depends on: 759991
Blocks: 744270
  Show dependency treegraph
 
Reported: 2012-06-11 12:59 PDT by Terrence Cole [:terrence]
Modified: 2012-06-16 06:51 PDT (History)
5 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
v0 (10.89 KB, patch)
2012-06-13 14:36 PDT, Terrence Cole [:terrence]
luke: review+
Details | Diff | Splinter Review

Description Terrence Cole [:terrence] 2012-06-11 12:59:52 PDT
When we rekey a table during GC, we can potentially fill it up with tombstones.  This means that we (may) need to rebuild the table after the rekeying, or we will iloop on our next lookup, because we have violated the table's invariants.  Unfortunately, our current rebuild depends on mallocing a new table and copying to it: this can fail, resulting in OOM during GC, which is completely invalid.

What we should do here is rebuild the table in-place to kill the unneeded tombstones.  This should be fairly trivial to implement, based on the earlier rekey method I wrote that worked on the full table at once.
Comment 1 Terrence Cole [:terrence] 2012-06-12 16:46:22 PDT
I have finally convinced myself that the state above is guaranteed to have at least one tombstone value that is not protecting any collision path and could thus be removed.  Here is my reasoning:

Given hashtable HT with size S.
Given n entries with collision paths C0-Cn and nT tombstones (ts).
Given "E crosses i" is defined as: the set of entries in E that has a collision path which transits the location i in HT.
Given N as the size of the set E in the above.

Iff an insertion will insert into the first free (not removed entry) then we can have a worst case situation that looks like:
  N=1 for 1 loc in HT
  N=2 for 1 loc in HT
  ...
  N=n for 1 loc in HT

In my head, I have been calling this the "Towers of Hanoi" (ToH) property: if you think of a collision path as a layer, then there is one long layer, one layer one element shorter, one layer two elements shorter etc.  These layers must be stacked: the collision path that flows through n elements must also flow over all elements in the collision path of length n-1, obviously.

Alert readers will note that this view is not accurate: insertion (and putNew now) inserts into the first removed entry, not the first free entry.  Because of this, the above gives us an extreme edge case: the reality is always rosier than this.

Given our worst case as above.
Given a new entry e to insert into HT, making the size n'.
Given that N' is the size of "E crosses i" after insertion.

  Insertion of e will fall into a ts s.t. N<=n at loc i.
  N'<=n'-1, because e is not on its own collision path
  Therefore the maximum N after any putNew is n-1.
  Therefore there exists a loc j with N=0 because of the ToH property.

I.e if you would have had an entry that flowed through nT-1 tombstones to get to the nT'th entry, then the insertion would have just happened into the 0th tombstone, leaving us with N=n-1 as a max.

Resolution:
  Case 0: j is a ts -> find it and setFree on it
  Case 1: j is an entry ->
    N=0 for this entry, therefore it created not collision path,
    Therefore the real Nmax is n-2
    Therefore there exist 2 entries with N=0

I've also been running a fuzzer against the trivial rehashing I'm doing to see if it basically works: we're through several hundred thousand seeds of a 64K element table.  I'm not sure what the implications of this proof are for that algorithm: I need to do a bit more thinking to see if it is applicable or if I should change to a new algorithm.
Comment 2 Terrence Cole [:terrence] 2012-06-12 17:25:17 PDT
One small typo fix.
The statement:
Given "E crosses i" is defined as: the set of entries in E that has a collision path which transits the location i in HT.
Should read:
Given "E crosses i" is defined as: the set of entries in HT that have a collision path which transits the location i in HT, not including the entry that may be stored at i.
Comment 3 Terrence Cole [:terrence] 2012-06-12 18:25:03 PDT
I have convinced myself that the algorithm I have is not okay:

* When we move an entry we only set collision bits on filled entries.
* Worst case: every entry we visit sets the collision bits on all filled slots in the table, then moves to a non-tombstoned slot.
* The 1st entry sets n-1 collision bits (it doesn't collide with itself), the second entry sets 1 collision, the third through nth repeat what the second has done.
* Thus, the number of collided cells at the end of the algorithm will be: n-1 + n-1 = 2*n-2
* The table can be 0.75 full for a size of 4n/3.
* If the table is less than 1/2 full (2*n), then there will be >=2 isFree() slots, otherwise we may run out.
Comment 4 Terrence Cole [:terrence] 2012-06-13 08:59:38 PDT
> Resolution:
>   Case 0: j is a ts -> find it and setFree on it
>   Case 1: j is an entry ->
>     N=0 for this entry, therefore it created not collision path,
>     Therefore the real Nmax is n-2
>     Therefore there exist 2 entries with N=0

Sorry, this is wrong.  The unstated assumption is that the fill rate is less than 0.5.  If this is not true, then there is no guarantee that the entry with N=0 is a tombstone.  This path is a deadend.  I have some ideas, but I need to do some benchmarking first.
Comment 5 Terrence Cole [:terrence] 2012-06-13 12:04:23 PDT
Steve came up with a working algorithm and proof that it works.

The analysis above is an existence proof: for our table, there exists some initial layouts s.t. collision bits are required on all non-live slots to maintain the hashtable's invariants.

Clearly, however, if we just reallocate the table and re-insert all entries in a completely arbitrary ordering, then we will never run into a situation where all empty cell are tombstones.  Steve's algorithm uses the collision bits to tell whether an entry is "inserted" or not and simply uses the current cell as swap space so that we can always make an insertion into a cell that is filled, but not "inserted" yet.  What makes this work is the post-processing pass which sets all of the collision bits correctly.  Thus, what you end up with is a table layout exactly equivalent to what you would have gotten with insertion into a new table with an arbitrary ordering of the original elements.

Steve's algorithm is simple:
1) clear collision bits
2) for (src = first entry; src != last entry;)
       for (tgt = first link of collision chain of src;; doublehash)
           if (tgt is empty or collision is unset)
               Swap(src, tgt)
               set collision on tgt
       if (src is empty)
           src++
3) clear collision bits
4) for (pos = each non-empty entry in table)
       for (chain = first link in collision chain; chain != pos; doublehash)
           set collision on chain

Simple, fast, and quite elegant.
Comment 6 Steve Fink [:sfink] [:s:] 2012-06-13 12:36:24 PDT
(In reply to Terrence Cole [:terrence] from comment #5)
> What makes this work is the
> post-processing pass which sets all of the collision bits correctly.  Thus,
> what you end up with is a table layout exactly equivalent to what you would
> have gotten with insertion into a new table with an arbitrary ordering of
> the original elements.

Actually, I think the post-processing pass is optional.

> Steve's algorithm is simple:
> 1) clear collision bits
> 2) for (src = first entry; src != last entry;)
>        for (tgt = first link of collision chain of src;; doublehash)
>            if (tgt is empty or collision is unset)
>                Swap(src, tgt)
>                set collision on tgt
>        if (src is empty)
>            src++
> 3) clear collision bits
> 4) for (pos = each non-empty entry in table)
>        for (chain = first link in collision chain; chain != pos; doublehash)
>            set collision on chain

Technically, steps #3 and #4 are optional. If you omit them, then the table is in the state as if you started with an empty table and inserted the entries in 'src' order, and then did a lookupForAdd on a fictional entry that happened to traverse all filled entries on its lookup chain (setting all of their collision bits), but then did not actually add the entry. (It really only needs to visit all entries that collided, not the ones whose initial hash slot was available, but it doesn't matter.)

Previously, I thought that after performing steps 3 and 4, more collision bits could have been set than "properly" would have been with the sequence of insertions in 'src' order. (Which would been ok, because you can always use the fictional lookupForAdd trick to produce a valid sequence of operations to construct the exact table.) But that's wrong; this *is* the exact set of collision bits resulting from inserting into an empty table in 'src' order.

Proof by contradiction: say you have a collision bit that is set in step 4 that wouldn't have been set from a 'src'-ordered insertion sequence. That means it was set due to a collision with an entry that shouldn't have existed yet, namely it came later in the insertion order and so we erroneously marked it as colliding when in fact the slot would have been empty. But if it had been empty, we would have inserted into it, so the live entries would have been in a different order.
Comment 7 Terrence Cole [:terrence] 2012-06-13 14:36:42 PDT
Created attachment 632888 [details] [diff] [review]
v0

And the implementation is pretty much trivial.
Comment 8 Luke Wagner [:luke] 2012-06-14 09:24:26 PDT
Comment on attachment 632888 [details] [diff] [review]
v0

Review of attachment 632888 [details] [diff] [review]:
-----------------------------------------------------------------

Beautiful!

So I see that you chose to skip steps 3 and 4.  This results in extra collision bits which could result in more tombstones, which would lead to more rehashes and more probing, so the question is how the two balance.  I am happy to delay this measurement until we have everything working so we can measure real workloads, but could you put a small comment at the end saying something to the effect of: TODO: this algorithm leaves collision bits on *all* elements, even if they are on no collision path. We should measure the effects of this (more probing, more rehashing) and consider adding an extra step here to accurately recompute collision bits.

::: js/public/HashTable.h
@@ +649,5 @@
> +                    tgt->setCollision();
> +                    break;
> +                }
> +
> +                DoubleHash dh = hash2(keyHash, hashShift);

I believe this statement can be hoisted out of the loop.
Comment 9 Terrence Cole [:terrence] 2012-06-14 14:03:26 PDT
It might be worth making checkOverRemoved check if we are completely filled with tombstones instead of merely overloaded before running this algorithm.  In cases where the table is almost but not quite full, subsequent missing lookups are going to be slow, but the next add will empty out the tombstones in the correct way.  Worth benchmarking with the others at least.
Comment 10 Terrence Cole [:terrence] 2012-06-14 16:09:01 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/35c79f50d85f
Comment 11 Terrence Cole [:terrence] 2012-06-14 18:36:19 PDT
And backed out in:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4bf6b709f383

Not sure what changed since I tested this yesterday evening, will look into it tomorrow.
Comment 12 Terrence Cole [:terrence] 2012-06-15 13:24:49 PDT
I just totally borked this one: should quite obviously be Swap(*src, *tgt), not Swap(src, tgt).

https://hg.mozilla.org/integration/mozilla-inbound/rev/bd016e0569f5
Comment 13 Ryan VanderMeulen [:RyanVM] 2012-06-16 06:51:27 PDT
https://hg.mozilla.org/mozilla-central/rev/bd016e0569f5

Note You need to log in before you can comment on or make changes to this bug.