Closed Bug 927705 Opened 6 years ago Closed 6 years ago

Allow greater than 8M capacity in pldhash

Categories

(Core :: XPCOM, defect)

x86_64
Linux

Tracking


RESOLVED FIXED
mozilla27

People

(Reporter: njn, Assigned: njn)

References

Details

Attachments

(4 files, 3 obsolete files)

A pldhash cannot have a capacity of more than 1<<24 (~16 million).

In bug 902922, this caused the cycle collector to lose track of some stuff and cause some leaks.

The 1<<24 limit is due to the use of 8-bit values for minAlphaFrac and maxAlphaFrac (see bug 902922 comment 46 for details).  These could be made larger, but then sizeof(PLDHashTable) would increase on 32-bit platforms from 32 bytes to slightly more than 40 bytes, which, if heap-allocated, jemalloc would round up to 48.  Which would be sad-making because we have a lot of pldhashes.

But it turns out that nobody uses the customizable minAlpha/maxAlpha values;  everyone uses the default values (0.25 and 0.75).  So let's just remove the customizability.  This simplifies the implementation, reduces sizeof(PLDHashTable), and allows the 1<<24 limit to be raised.
Blocks: 902922
This patch removes the (unused) ability to adjust the min/max loads of a
pldhash table.  This will allow us to increase the maximum capacity of a table
beyond 1<<24 without increasing sizeof(PLDHashTable).

The patch makes the following specific changes.

- It removes PL_DHashTableSetAlphaBounds().

- It removes PLDHashTable's {min,max}AlphaFrac members.

- It changes the {MIN,MAX}_LOAD macros into inline functions.

- It renames PL_DHASH_DEFAULT_{MIN,MAX}_ALPHA as PL_DHASH_{MIN,MAX}_ALPHA.
  This caused a clash with the existing, unused, 2-arg PL_DHASH_MIN_ALPHA
  macro, so I removed it.

- It removes the PL_DHASH_DEFAULT_CAPACITY and PL_DHASH_CAPACITY macros.
  Although potentially useful, they are unused.

- Stat:  2 files changed, 17 insertions(+), 102 deletions(-)
Attachment #818206 - Flags: review?(brendan)
This patch lifts the max capacity of a pldhash table from 16M (1<<24) to 1B
(1<<30).  I added a comment explaining why I chose that number.
Attachment #818209 - Flags: review?(brendan)
Summary: Allow more than (1<<24) entries in pldhash → Allow greater than (1<<24) capacity in pldhash
Comment on attachment 818206 [details] [diff] [review]
(part 1) - Remove PL_DHashTableSetAlphaBounds() and the supporting machinery.

>+static inline uint32_t MaxLoad(uint32_t size) {
>+    return PL_DHASH_MAX_ALPHA * size;
>+}
>+static inline uint32_t MinLoad(uint32_t size) {
>+    return PL_DHASH_MIN_ALPHA * size;
> }

How about
  return size - (size >> 2);
and
  return size >> 2;
to avoid the floating-point arithmetic? Or are compilers smart enough to optimize these functions?
> How about
>   return size - (size >> 2);
> and
>   return size >> 2;
> to avoid the floating-point arithmetic?

Nice idea.  I measured how often these functions are called.  In about 30 seconds worth of browsing they were called over 2 million times, which is probably enough to be worth optimizing.
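For concreteness, here is a sketch (not the actual patch) of the integer versions suggested above, assuming the default 0.25/0.75 alphas:

```cpp
#include <cstdint>

// Hypothetical stand-ins for pldhash's load-threshold helpers.  With the
// alphas fixed at 0.25 and 0.75, both multiplications reduce to shifts:
// size * 0.75 == size - size/4, and size * 0.25 == size/4.
static inline uint32_t MaxLoad(uint32_t size) {
    return size - (size >> 2);  // size * 0.75, no floating point
}

static inline uint32_t MinLoad(uint32_t size) {
    return size >> 2;           // size * 0.25
}
```

Since pldhash sizes are always powers of two, both results are exact; the shifts also sidestep the int/float/int conversions entirely rather than hoping the compiler strength-reduces them.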
This version has the additional change that emk suggested -- it uses bit
arithmetic to implement the multiplications by 0.25 and 0.75.

It removes PL_DHASH_{MIN,MAX}_ALPHA in the process;  they were in pldhash.h but
that information didn't need to be exposed anyway.
Attachment #818249 - Flags: review?(brendan)
Attachment #818206 - Attachment is obsolete: true
Attachment #818206 - Flags: review?(brendan)
And I just checked js/public/HashTable.h.  It has hard-coded 0.25 min and 0.75 max loadings.  It also uses the same 8-bit fixed point arithmetic as pldhash (see s{Min,Max}AlphaFrac) and so has the same 1<<24 capacity limit.
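To illustrate the shared limitation (the precise details are in bug 902922 comment 46), here is a hypothetical sketch of 8-bit fixed-point load arithmetic of this general shape: an alpha f is stored as frac = f * 256, and the threshold is (capacity * frac) >> 8, so the intermediate product must fit in a uint32_t:

```cpp
#include <cstdint>

// Illustrative 8-bit fixed-point load threshold, in the style of
// pldhash's {min,max}AlphaFrac and HashTable.h's s{Min,Max}AlphaFrac.
// frac is the alpha scaled by 256 (e.g. 0.75 -> 192).  The product
// capacity * frac is computed in uint32_t, so capacities above 1<<24
// risk overflowing it: (1u << 25) * 192 exceeds UINT32_MAX.
static inline uint32_t LoadThreshold(uint32_t capacity, uint8_t frac) {
    return (capacity * uint32_t(frac)) >> 8;
}
```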
I just looked into pldhash's behaviour in these extreme cases some more, and
learned some things worth sharing.

Let's consider the pre-patch case, where PL_DHASH_SIZE_LIMIT is 1<<24 (16M).

- I thought the table could grow to a capacity of 16M.  But we fail to grow if
  the new capacity would be >= 16M.  So the max capacity is actually 8M.

  I guess the use of |LIMIT| in the name should have twigged me to this, but
  I'm used to |LIMIT| meaning |MAX + 1|, not |MAX * 2|.  I'm tempted to change
  that test to > 16M and rename the constant PL_DHASH_MAX_SIZE.  (This would
  nicely mirror PL_DHASH_MIN_SIZE.)

- I thought that if we failed to grow when the loading hits 0.75 the table
  insertion would immediately fail.  But that doesn't happen.  If growing
  fails, we still add the new element so long as the loading doesn't hit 1.0.
  But as the loading gets close to 1.0, there will be ever more collisions and
  so insertions will get slower.  (Indeed, they absolutely crawl near the end.)
  With the new, more generous capacity limit, this behaviour is less likely to
  be hit, but it may have caused some slowness in large CC operations, i.e.
  those dealing with between 6M and 8M elements.

- PLDHashAllocTable() takes a byte size, which is a uint32_t.  There's no
  protection against overflow of that value.  With the old capacity limit of
  8M, you'd need to have 512-byte entries to overflow.  With the new capacity
  limit, it's much easier.  (Note that the smallest common entry size is two
  words, which is eight bytes on 32-bit platforms.)  Knowing that, the max
  capacity of 1<<30 is almost certainly too high.  1<<27 might be more
  reasonable.

We really need a unit test for this stuff.  Does one already exist?
This version:

- Changes the max capacity to 1<<26.

- Renames PL_DHASH_SIZE_LIMIT as PL_DHASH_MAX_SIZE and adjusts the size tests,
  to make things more obvious.

- Adds overflow checks when computing the byte size of a new entry store.
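The added check could look something like this sketch (the function name and failure convention are illustrative, not the actual patch): widen to 64 bits for the multiply and reject any product that doesn't fit back into a uint32_t.

```cpp
#include <cstdint>

// Illustrative overflow-checked byte-size computation for a new entry
// store.  Returns 0 on overflow; a real table never has
// capacity * entrySize == 0, so 0 is a safe failure sentinel.
static uint32_t EntryStoreSize(uint32_t capacity, uint32_t entrySize) {
    uint64_t nbytes = uint64_t(capacity) * uint64_t(entrySize);
    return nbytes <= UINT32_MAX ? uint32_t(nbytes) : 0;
}
```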
Attachment #818781 - Flags: review?(brendan)
Attachment #818209 - Attachment is obsolete: true
Attachment #818209 - Flags: review?(brendan)
This version actually compiles.

BTW, I'm working on a unit test.  That'll be part 3.
Attachment #818788 - Flags: review?(brendan)
Attachment #818781 - Attachment is obsolete: true
Attachment #818781 - Flags: review?(brendan)
Comment on attachment 818788 [details] [diff] [review]
(part 2) - Increase pldhash's max capacity from 1<<23 to 1<<26.

>-/* Table size limit, do not equal or exceed. */
>-#undef PL_DHASH_SIZE_LIMIT
>-#define PL_DHASH_SIZE_LIMIT     ((uint32_t)1 << 24)
>+/*
>+ * Table size limit; do not equal or exceed.  The max capacity used to be 1<<23

"Table size limit; do not exceed"?
> "Table size limit; do not exceed"?

Good catch.  I've fixed it locally.  Maybe I should ask you to do the reviews...
Comment on attachment 818249 [details] [diff] [review]
(part 1) - Remove PL_DHashTableSetAlphaBounds() and the supporting machinery.

Brendan said he's busy, and suggested jorendorff for reviews.
Attachment #818249 - Flags: review?(brendan) → review?(jorendorff)
Summary: Allow greater than (1<<24) capacity in pldhash → Allow greater than 8M capacity in pldhash
Attachment #818788 - Flags: review?(brendan) → review?(jorendorff)
Attachment #818249 - Flags: review?(jorendorff) → review+
Comment on attachment 818788 [details] [diff] [review]
(part 2) - Increase pldhash's max capacity from 1<<23 to 1<<26.

Review of attachment 818788 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/glue/pldhash.h
@@ +32,5 @@
> + * Table size limit; do not equal or exceed.  The max capacity used to be 1<<23
> + * but occasionally that wasn't enough.  Making it much bigger than 1<<26
> + * probably isn't worthwhile -- tables that big are kind of ridiculous.  Also,
> + * the growth operation will (deliberately) fail if (capacity * entrySize)
> + * overflows a uint32_t, and entrySize is usually at least 8 bytes.

Is it "usually" or "always"?
Attachment #818788 - Flags: review?(jorendorff) → review+
> Is it "usually" or "always"?

Yeah, "always".  I'll fix.
The only thing I'm unhappy about is that test_pldhash_grow_to_max_capacity()
takes about 40s in a debug build on my fast desktop machine :(  That is
excessive, but I don't see how else to test the capacity limit.
Attachment #818905 - Flags: review?(jorendorff)
That's surprisingly quick, considering all those calls through function pointers. :-P

It would be OK with me to make that a white-box test and add a debug-only pldhash API to speedily pack the table with stuff.

I'll review today. Please file a followup bug for HashTable.h.
Comment on attachment 818905 [details] [diff] [review]
(part 3) - Add a C++ unit test for pldhash.

Review of attachment 818905 [details] [diff] [review]:
-----------------------------------------------------------------

Well, the tests look great. And looking back at the pldhash code I see this test forces the code to operate in an "over MaxLoad" region that it normally doesn't. So it's a good test to have.

r=me. But it seems a shame to burn 40sec/build on this.

I bet eliminating that region (propagating ChangeTable failure to the caller as an error, rather than swallowing it and overloading the table) would make this test run a lot faster, and maybe that's worth doing for its own sake. What do you think?

::: xpcom/tests/TestPLDHash.cpp
@@ +12,5 @@
> +// which are unlikely to be hit during normal execution.
> +
> +namespace TestPLDHash {
> +
> +#include <malloc.h>

Is this right? #include <malloc.h> inside the namespace?
Attachment #818905 - Flags: review?(jorendorff) → review+
> Well, the tests look great. And looking back at the pldhash code I see this
> test forces the code to operate in an "over MaxLoad" region that it normally
> doesn't. So it's a good test to have.
> 
> r=me. But it seems a shame to burn 40sec/build on this.
> 
> I bet eliminating that region (propagating ChangeTable failure to the caller
> as an error, rather than swallowing it and overloading the table) would make
> this test run a lot faster, and maybe that's worth doing for its own sake.
> What do you think?

The new max capacity is 64M and filling up the last 1M takes almost 2/3 of the time.  Doing the change you suggest would probably get the time down to 10s and would avoid a performance fault.  I'll try doing it on Monday. 


> > +namespace TestPLDHash {
> > +
> > +#include <malloc.h>
> 
> Is this right? #include <malloc.h> inside the namespace?

Oh, that was only needed for some temporary debugging code (where I used malloc_usable_size()).  I'll remove it.
This stops the table from being overloaded past 75% full when it can't get any
bigger.  In a debug -O0 build, this reduces the test time from 45s to 10s.  In
an opt build, it reduces the test time from 29s to 5.5s.  Good enough, I hope.

I'll merge these two changes into parts 2 and 3 before landing.
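In outline, the change makes insertion respect MaxLoad even when growing fails; a hypothetical sketch of the decision (the names and signature are illustrative, not the actual patch):

```cpp
#include <cstdint>

static inline uint32_t MaxLoad(uint32_t capacity) {
    return capacity - (capacity >> 2);  // capacity * 0.75
}

// Illustrative insertion gate.  Previously, a failed grow still allowed
// inserts until the table was 100% full (with ever-worsening collision
// chains); with this change, once the 75% threshold is reached a failed
// grow means the insert fails too.
static bool MayInsert(uint32_t entryCount, uint32_t capacity,
                      bool growSucceeded) {
    if (entryCount + 1 <= MaxLoad(capacity))
        return true;       // still at or under MaxLoad, no grow needed
    return growSucceeded;  // never overload past MaxLoad
}
```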
Attachment #819523 - Flags: review?(jorendorff)
Comment on attachment 819523 [details] [diff] [review]
(part 2a/3a) - Don't overload a table beyond 75% full.

Review of attachment 819523 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks.
Attachment #819523 - Flags: review?(jorendorff) → review+
> Please file a followup bug for HashTable.h.

What should that bug cover?

1. HashTable.h doesn't have adjustable min/max loads, so there's nothing to do there.

2. HashTable.h has a 16M element limit and 8-bit fixed point arithmetic to compute min/max loads, but it's not currently causing a problem.

3. AFAICT, HashTable.h never overloads beyond 75% full (search for |RehashFailed|).

4. HashTable.h can get a uint32_t overflow when computing the store size (in createTable()), e.g. if |capacity| is 16M and sizeof(Entry) is 256 or more.

Point 4 is the only one that seems like a real problem at the moment.
It's good to see that 3 is the case.

I meant 2 and 4. It's not super pressing, but it's a code hygiene thing. There shouldn't be gratuitous differences between HashTable and jsdhash.
Something I just thought of... I have a vague recollection of bsmedberg or dbaron telling me that the layout of the PLDHashTable class members is relevant to binary compatibility somehow.  Is this true?  If so, part 1 just broke that binary compatibility, though I could get it back by reinstating two dummy uint8_t fields.
Flags: needinfo?(dbaron)
Flags: needinfo?(benjamin)
It's not relevant across releases.
Flags: needinfo?(dbaron)
Oh, I just discovered this:

  JS_STATIC_ASSERT((sMaxCapacity * sizeof(Entry)) <= UINT32_MAX);

which prevents overflow.  HashTable.h can do this thanks to its use of templates.  pldhash cannot, though it could have a dynamic assertion in PL_DHashTableInit() to complain about entry sizes that might cause overflow... but then the test I added would fail that assertion in debug builds.  So leaving overflow as a dynamic on-insertion failure seems reasonable.
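For comparison, a sketch of how a templated table can enforce this at compile time, in the spirit of HashTable.h's JS_STATIC_ASSERT (the names here are hypothetical):

```cpp
#include <cstdint>

// Because the entry type is a template parameter, sizeof(Entry) is known
// at compile time, so the overflow condition can be checked with a
// static_assert rather than at runtime.  pldhash, which takes entrySize
// as a runtime argument, has no equivalent option.
template <typename Entry>
struct SizedTable {
    static constexpr uint32_t sMaxCapacity = uint32_t(1) << 24;
    static_assert(uint64_t(sMaxCapacity) * sizeof(Entry) <= UINT32_MAX,
                  "sMaxCapacity * sizeof(Entry) must fit in uint32_t");
};
```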
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #25)
> It's not relevant across releases.

Great, thanks!


(In reply to Jason Orendorff [:jorendorff] from comment #23)
> There shouldn't be gratuitous differences between HashTable and jsdhash.

I just filed bug 929620 for this.
Comment on attachment 818788 [details] [diff] [review]
(part 2) - Increase pldhash's max capacity from 1<<23 to 1<<26.

Review of attachment 818788 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/glue/pldhash.h
@@ +30,5 @@
>  
> +/*
> + * Table size limit; do not equal or exceed.  The max capacity used to be 1<<23
> + * but occasionally that wasn't enough.  Making it much bigger than 1<<26
> + * probably isn't worthwhile -- tables that big are kind of ridiculous.  Also,

Tables larger than 1<<23 seemed ridiculous 12 years ago and then we leaked and we didn't know about it...

Could you file a follow-up bug for removing the remaining size limit?

Either that, or advertise really loudly that pldhash is a 26-bit hash table, not a 32-bit hash table.
> Could you file a follow-up bug for removing the remaining size limit?

No.  PLDHashNumber is a uint32_t.  Changing that to uint64_t would affect huge amounts of code, and also slow down very frequent hash operations.  Also, it's still possible for an insertion to fail due to failure to resize the table.  So code that does insertion always needs to be able to handle failure.  (Even if the CC doesn't handle it very well.)
 
> Either that, or advertise really loudly that pldhash is a 26-bit hash table,
> not a 32-bit hash table.

Why do you assume a hash table can hold 1<<32 elements?
Flags: needinfo?(benjamin)
(In reply to Benoit Jacob [:bjacob] from comment #29)
> Either that, or advertise really loudly that pldhash is a 26-bit hash table,
> not a 32-bit hash table.

Perhaps by writing to stderr when it happens.

To be fair, the pldhash code *did* advertise its limitations, at runtime, by returning false. If it also wrote to stderr, would it have saved you any time debugging?

> Why do you assume a hash table can hold 1<<32 elements?

I think that's fair. A vector can't hold 1<<32 elements, on a 32-bit platform. It's not a reasonable expectation.
(In reply to Jason Orendorff [:jorendorff] from comment #31)
> I think that's fair. A vector can't hold 1<<32 elements, on a 32-bit
> platform. It's not a reasonable expectation.

(In reply to Nicholas Nethercote [:njn] from comment #30)
> Why do you assume a hash table can hold 1<<32 elements?

I wasn't talking about 1<<32 *elements* but about 1<<32 *bytes*. Vectors can definitely grow arbitrarily close to 1<<32 bytes on 32-bit platforms, until they run out of address space or out of memory.

I've never looked into internals of hashtables so maybe that is not a reasonable expectation for hashtables, but before bug 103990 landed, there was no MAX_SIZE constant around this code... so was this code then able to grow much closer to 1<<32 bytes, or was it unsafe?

In the use case discussed here (the cycle collector, bug 902922) even 1<<32 bytes would at best buy us time (maybe decades!), but that use case will eventually want to grow beyond 1<<32 bytes on 64-bit machines. There already are many workstations with far more than 4G of RAM today, so we're probably not far from it being feasible to build a CC graph exceeding 4G. So if we want the CC graph to be stored in a hashtable, then we'll need a hashtable whose byte size is a size_t, no?

(In reply to Jason Orendorff [:jorendorff] from comment #31)
> To be fair, the pldhash code *did* advertise its limitations, at runtime, by
> returning false. If it also wrote to stderr, would it have saved you any
> time debugging?

A little bit perhaps, but not nearly as much as a plain crash in a debug build; and also, I understand that this is not the job of the data container (pldhash) but rather the job of the user code (the CC). Attachment 817853 [details] [diff] makes the CC MOZ_ASSERT on these errors, and that will save a lot of time. I even wonder if this should crash even release builds, so that we know when this happens in the wild (and nobody wants to use a browser on which they hit multi-gigabyte leaks anyway), I'll file a separate bug for that.
Oh, above I didn't understand that the 8M limit we were talking about here was a number-of-elements, not a number-of-bytes.
Another thing that motivated my choice of 1<<26 for the capacity limit was testability.  Creating a code path that fails incredibly rarely is a recipe for bugs because that code path is unlikely to be tested.  With 1<<26 as the limit I could write a test that filled up the table without taking a ridiculous amount of time;  ~10s on a debug build.