Closed Bug 1038601 Opened 10 years ago Closed 10 years ago

Shrink js::HashTable

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla33

People

(Reporter: n.nethercote, Assigned: n.nethercote)

Details

Attachments

(1 file, 3 obsolete files)

js::HashTable can be made smaller.
Attached patch Shrink js::HashTable (obsolete) — Splinter Review
Attachment #8456030 - Flags: review?(luke)
Comment on attachment 8456030 [details] [diff] [review]
Shrink js::HashTable

Review of attachment 8456030 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/public/HashTable.h
@@ +997,2 @@
>      uint32_t    removedCount;   // removed entry sentinels in table
> +    uint16_t    gen;            // entry storage generation number

I'm a little bit worried that this increases the probability of the ABA problem (between reading generation() and checking again, UINT16_MAX mutations occur).  For small hash tables and repeated inserts/removes, this could feasibly happen.

Truly, I've never liked this generation counter.  There are very few uses and only one looks potentially hot (array_join via AutoCycleDetector, but really it's the body of array_join that's hot), so I wonder if we can just remove it.  If we do that, we can remove the uint16.  If we change removedCount to a 24 bit bitfield (we can, I think, because sMaxCapacity is 2^24), then we can shave off another word!
Observations:

- I see three uses: AutoCycleDetector, AutoEntryHolder, and HeapReverser.

- generation() counts table resizings, not mutations. Having an exact multiple of 65536 resizings between calls to generation() -- which is what would have to happen for a bug to manifest -- seems unlikely. But we could increase |gen| to 24 bits to make it less likely.

- Just to check, I instrumented AutoCycleDetector. The largest difference between gen1 and gen2 I saw during some basic browsing was 20.

- There were 121,388 gen1==gen2 checks in AutoCycleDetector during this time. 99.4% of them had matching generations.

Options:

- Increase |gen| to 24 bits.

- Remove |gen|. The current callers would just have to repeat their lookups. 

Which do you prefer?
Flags: needinfo?(luke)
(In reply to Nicholas Nethercote [:njn] from comment #3)
> - generation() counts table resizings, not mutations. Having an exact
> multiple of 65536 resizings between calls to generation() -- which is what
> would have to happen for a bug to manifest -- seems unlikely. But we could
> increase |gen| to 24 bits to make it less likely.

"Less likely" is a pretty uncomfortable place to be in when it comes to correctness.  Recall that, at small table sizes, it doesn't take many add/removes to trigger a resize.

> - There were 121,388 gen1==gen2 checks in AutoCycleDetector during this
> time. 99.4% of them had matching generations.

I think the more relevant measurement: if we redo the lookup, does it measurably hurt performance.

> - Remove |gen|. The current callers would just have to repeat their lookups. 
> 
> Which do you prefer?

Well, the latter :)  We'd win another word.
Flags: needinfo?(luke)
Actually, we can keep the generation optimization, and make it 100% safe (which it currently isn't even with the 32-bit value) while also saving space -- just have a single "hasBeenResized" bit, and let the caller clear it as well as read it.
Good idea!  I think we'd need two bits though to handled nested uses of generation.
Hmm, I hadn't thought about nested uses.... and I can't see how two bits could be enough to handle multiple nestings.
This version keeps |generation| as a uint32_t and shrinks |removedCount|
instead.

Removing |generation| can possibly be done as a follow-up. On 32-bit (which is
the more interesting case) it would reduce the size from 16 to 12. For
heap-allocated tables this wouldn't be any improvement due to jemalloc rounding
up request sizes. It would get the 64-bit size down from 24 to 16, though.
Attachment #8456653 - Flags: review?(luke)
Attachment #8456030 - Attachment is obsolete: true
Attachment #8456030 - Flags: review?(luke)
Comment on attachment 8456653 [details] [diff] [review]
Shrink js::HashTable

Review of attachment 8456653 [details] [diff] [review]:
-----------------------------------------------------------------

Great!  Does jemalloc *need* to round up to 16 or could we introduce a 12-byte allocation class.  It seems like 12-byte allocations would be common.

::: js/public/HashTable.h
@@ -1016,5 @@
>  #endif
>  
> -    friend class mozilla::ReentrancyGuard;
> -    mutable mozilla::DebugOnly<bool> mEntered;
> -    mozilla::DebugOnly<uint64_t>     mutationCount;

Ugh, this is even my fault for r+'ing the switch from #ifdef DEBUG in bug 722946.  I don't think it matters much for AddPtr since they are temporary and on the stack and the SRA optimization should throw away the field anyway, but I should've caught this in HashTable.

@@ +1080,5 @@
>    public:
>      explicit HashTable(AllocPolicy ap)
> +      : AllocPolicy(ap)
> +      , table(nullptr)
> +      , gen(0)

This definitely isn't SM style and I don't think it's mfbt style either, according to mfbt/STYLE.  (I actually prefer this style, though; maybe we could switch.)
Attachment #8456653 - Flags: review?(luke) → review+
> Great!  Does jemalloc *need* to round up to 16 or could we introduce a
> 12-byte allocation class.

Not easily, unfortunately.

> It seems like 12-byte allocations would be common.

In my various slop investigations I found that 24-byte allocations (which are rounded up to 32) are far more common, though admittedly these investigations have all been on 64-bit. Not that we can do much about those, either.

> @@ +1080,5 @@
> >    public:
> >      explicit HashTable(AllocPolicy ap)
> > +      : AllocPolicy(ap)
> > +      , table(nullptr)
> > +      , gen(0)
> 
> This definitely isn't SM style and I don't think it's mfbt style either,
> according to mfbt/STYLE.  (I actually prefer this style, though; maybe we
> could switch.)

But it *is* standard Mozilla/Gecko style (https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Coding_Style#Classes) which MFBT now uses -- see the very first paragraph of mfbt/STYLE. I'm halfway through a final patch to finish the MFBT conversion, the most recent part of which was done in bug 1036789.

Furthermore, this style is already used for the existing constructors in HashTable.h that involve conditional compilation, so I'll argue that I'm just following local style :)
https://hg.mozilla.org/mozilla-central/rev/8cf3f3b925a3
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
This change breaks our build with these errors:

out/target/product/msm8610/obj/objdir-gecko/dist/include/js/HashTable.h: In member function 'void js::detail::HashTable<T, HashPolicy, AllocPolicy>::remove(js::detail::HashTable<T, HashPolicy, AllocPolicy>::Entry&)':
out/target/product/msm8610/obj/objdir-gecko/dist/include/js/HashTable.h:1379:9: error: 'mutationCount' was not declared in this scope

Seems like a bunch of references to mutationCount weren't wrapped with the ifdef check.
> This change breaks our build with these errors:

I just checked the file. Every use of |mutationCount| is wrapped within a |#ifdef DEBUG| (either directly, or indirectly via MOZ_ASSERT) or within |#ifdef JS_DEBUG|. (The one on line 1379 has a |#ifdef DEBUG|.)

And it's been compiling successfully on the main repos for almost a week.

So the question is: what does "our build" refer to and how does it differ from the builds done on the main repos?
Flags: needinfo?(mschwart)
Hi Nicholas, thanks for quick response.

> So the question is: what does "our build" refer to and how does it differ from the builds done on the main repos?

I looked at it in more depth and the problem is because HashTable::mutationCount is declared conditionally by JS_DEBUG [1] but used conditionally by DEBUG [2].  In our builds, DEBUG is defined but JS_DEBUG is not.

That seems like a bug in this patch but perhaps there is a relationship between DEBUG and JS_DEBUG that our build is not preserving?  Please advise.

[1] https://github.com/mozilla/gecko-dev/blob/256a1f168b345758d1ce83f6575d6543f8bec738/js/public/HashTable.h#L1005
[2] https://github.com/mozilla/gecko-dev/blob/256a1f168b345758d1ce83f6575d6543f8bec738/js/public/HashTable.h#L1089
Flags: needinfo?(mschwart) → needinfo?(n.nethercote)
The distinction between DEBUG and JS_DEBUG appears complex. Bug 939505 has some details; it suggests that the JS engine should use JS_DEBUG through, but in practice it only is used in a few places whereas DEBUG appears in many places.

jorendorff, do you know what's supposed to be done here? HashTable.h used a mix of DEBUG and JS_DEBUG prior to this bug, but one that didn't cause problems if they didn't have the same value. Given that it's not used much, eliminating JS_DEBUG entirely in favour of DEBUG seems like a good idea, and I'm happy to do that in a separate bug if you agree.

In the meantime, changing the JS_DEBUG occurrences to DEBUG in HashTable.h should be enough to fix the compile error.
Flags: needinfo?(n.nethercote) → needinfo?(jorendorff)
This replaces every DEBUG in HashTable.h with JS_DEBUG. It also removes some
unnecessary DebugOnly<> usage. This should fix compile problems for embedders.
Attachment #8462113 - Flags: review?(jwalden+bmo)
Waldo suggested that changing all DEBUG occurrences in that file to JS_DEBUG should fix things, so I've done that.

Michael, does this fix the problem for you?
Flags: needinfo?(jorendorff) → needinfo?(mschwart)
Unfortunately it doesn't because the MOZ_ASSERT uses DEBUG.  You can emulate my environment by #undef JS_DEBUG at the top of your file.  A simpler, perhaps naive change is:

diff --git a/js/public/HashTable.h b/js/public/HashTable.h
index 4c27308..888fc34 100644
--- a/js/public/HashTable.h
+++ b/js/public/HashTable.h
@@ -1001,9 +1001,11 @@ class HashTable : private AllocPolicy
     uint32_t    removedCount:CAP_BITS;  // removed entry sentinels in table
     uint32_t    hashShift:8;            // multiplicative hash shift
 
-#ifdef JS_DEBUG
+#if defined(DEBUG) || defined(JS_DEBUG)
     mozilla::DebugOnly<uint64_t>     mutationCount;
     mutable mozilla::DebugOnly<bool> mEntered;
+#endif
+#ifdef JS_DEBUG

Note also that I've filed a new bug if it's needed for the follow-up patch https://bugzilla.mozilla.org/show_bug.cgi?id=1043605
Flags: needinfo?(mschwart) → needinfo?(n.nethercote)
I'll wait for Waldo's review for suggestions.
Flags: needinfo?(n.nethercote)
Comment on attachment 8462113 [details] [diff] [review]
(part 2) - Fix up DEBUG/JS_DEBUG confusion in HashTable.h

Review of attachment 8462113 [details] [diff] [review]:
-----------------------------------------------------------------

Fine enough as things go, but as comment 20 notes not complete.  Shouldn't be too bad to ifdef the remaining assertion-only uses, tho.

::: js/public/HashTable.h
@@ +894,5 @@
>  
>          void popFront() {
>              MOZ_ASSERT(!empty());
>              MOZ_ASSERT(generation == table_->generation());
>              MOZ_ASSERT(mutationCount == table_->mutationCount);

Things like this also need to be inside #ifdef JS_DEBUG.  Not sure how many more there are like this, but this is the sort of thing that would be problematic as in comment 20.  Realistically, you'll have to ifdef all of them.
Attachment #8462113 - Flags: review?(jwalden+bmo) → review+
Michael, does this patch work?
Attachment #8463175 - Flags: feedback?(mschwart)
Attachment #8462113 - Attachment is obsolete: true
Comment on attachment 8463175 [details] [diff] [review]
(part 2) - Fix up DEBUG/JS_DEBUG confusion in HashTable.h

Review of attachment 8463175 [details] [diff] [review]:
-----------------------------------------------------------------

(Carrying over Waldo's r+)
Attachment #8463175 - Flags: review+
Michael: feedback ping!
(In reply to Nicholas Nethercote [:njn] from comment #25)
> Michael: feedback ping!

Applying: Bug 1038601 - Shrink js::HashTable.
error: patch failed: js/public/HashTable.h:202
error: js/public/HashTable.h: patch does not apply
error: patch failed: js/src/jsapi.h:285
error: js/src/jsapi.h: patch does not apply
Patch failed at 0001 Bug 1038601 - Shrink js::HashTable.


bitrot?
(In reply to Michael Vines [:m1] [:evilmachines] from comment #26)
>
> bitrot?

Oh, nm.  That attachment has been landed but the other one has not.
Comment on attachment 8463175 [details] [diff] [review]
(part 2) - Fix up DEBUG/JS_DEBUG confusion in HashTable.h

../../../../../../../../gecko/js/src/asmjs/AsmJSModule.cpp:72: error: undefined reference to 'MozTaggedAnonymousMmap'
../../../../../../../../gecko/js/src/asmjs/AsmJSModule.cpp:72: error: undefined reference to 'MozTaggedAnonymousMmap'
../../../../../../../../gecko/js/src/asmjs/AsmJSModule.cpp:72: error: undefined reference to 'MozTaggedAnonymousMmap'
../../../../../../../../gecko/js/src/gc/Memory.cpp:437: error: undefined reference to 'MozTaggedAnonymousMmap'
collect2: error: ld returned 1 exit status
make[6]: *** [js] Error 1
make[5]: *** [js/src/shell/target] Error 2
Attachment #8463175 - Flags: feedback?(mschwart) → feedback-
Comment on attachment 8463175 [details] [diff] [review]
(part 2) - Fix up DEBUG/JS_DEBUG confusion in HashTable.h

Lets ! that.  Looks like I had other cruft in my workspace.   Land this patch fix bug 1043605?
Attachment #8463175 - Flags: feedback- → feedback+
Attachment #8463175 - Attachment is obsolete: true
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: