Open Bug 1815266 Opened 2 years ago Updated 1 year ago

Experiment: cache string hash values

Status:

NEW

People

(Reporter: sfink, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Whiteboard: [sp3])

Attachments

(4 files)

Bug 1815266 - Allow storing prepared hash values in 64-bit linear strings (extensible and plain) 1 year ago Steve Fink [:sfink] [:s:] 48 bytes, text/x-phabricator-request
Bug 1815266 - Load string hashes in the JIT 1 year ago Steve Fink [:sfink] [:s:] 48 bytes, text/x-phabricator-request
Bug 1815266 - Use stored hashes in string comparisons 1 year ago Steve Fink [:sfink] [:s:] 48 bytes, text/x-phabricator-request
Bug 1815266 - Use stored string hashes when atomizing 1 year ago Steve Fink [:sfink] [:s:] 48 bytes, text/x-phabricator-request

Steve Fink [:sfink] [:s:]

Reporter

Description

2 years ago

We compute a string's hash when atomizing, and when looking it up for various purposes. We could use a bit in the header to say whether it has been hashed, and store the hash after the string data.

It would cost space if we need to bump up the allocation size. Though note that sometimes we'll have enough slop bytes in the allocation to fit it in for free.

Also note that almost every string will get hashed if it survives to be tenured, since the string deduplication code computes its hash to look it up.
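A minimal sketch of the idea described above, not SpiderMonkey's actual string layout: a flag bit records whether a hash has been stored, and the hash lives in the unused bytes after the character data. Everything here is an assumption made for illustration: the names `SketchString`, `HAS_CACHED_HASH`, `roundUpToSizeClass`, and `getHash`, the power-of-two size classes, and FNV-1a as the hash function. This version only caches when slop bytes are available, so it never grows the allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <string_view>

// Hypothetical flag bit; not the real JSString header layout.
constexpr uint32_t HAS_CACHED_HASH = 1u << 0;

struct SketchString {
  uint32_t flags = 0;
  uint32_t length = 0;            // number of characters
  size_t capacity = 0;            // bytes actually allocated
  std::unique_ptr<char[]> chars;  // character data, then (maybe) the hash
};

// Assumed allocator behavior: sizes round up to a power of two, minimum 16.
inline size_t roundUpToSizeClass(size_t n) {
  size_t c = 16;
  while (c < n) c *= 2;
  return c;
}

// The hash slot starts at the first 4-byte-aligned offset after the chars.
inline size_t hashOffset(uint32_t length) {
  return (size_t(length) + 3) & ~size_t(3);
}

inline SketchString makeString(std::string_view s) {
  SketchString str;
  str.length = uint32_t(s.size());
  str.capacity = roundUpToSizeClass(s.size());
  str.chars = std::make_unique<char[]>(str.capacity);
  std::memcpy(str.chars.get(), s.data(), s.size());
  return str;
}

// True when the allocation already has room for a hash "for free".
inline bool hasSlopForHash(const SketchString& s) {
  return hashOffset(s.length) + sizeof(uint32_t) <= s.capacity;
}

// FNV-1a, used here only as a stand-in hash function.
inline uint32_t computeHash(const SketchString& s) {
  uint32_t h = 2166136261u;
  for (uint32_t i = 0; i < s.length; i++) {
    h ^= uint8_t(s.chars[i]);
    h *= 16777619u;
  }
  return h;
}

// Returns the hash, reading the stored copy if present and storing it
// after the first computation when slop bytes allow.
inline uint32_t getHash(SketchString& s) {
  uint32_t h;
  if (s.flags & HAS_CACHED_HASH) {
    std::memcpy(&h, s.chars.get() + hashOffset(s.length), sizeof(h));
    return h;
  }
  h = computeHash(s);
  if (hasSlopForHash(s)) {
    assert(hashOffset(s.length) + sizeof(h) <= s.capacity);
    std::memcpy(s.chars.get() + hashOffset(s.length), &h, sizeof(h));
    s.flags |= HAS_CACHED_HASH;
  }
  return h;
}
```

With these assumed size classes, a 25-character string lands in a 32-byte allocation and gets its hash stored for free, while a 16-character string fills its 16-byte allocation exactly and would need a larger allocation to store one.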

Nicolas B. Pierron [:nbp — off until 29-09]

Comment 1

2 years ago

It would be useful to start doing this from the stencil, since the hashes there are identical but get recomputed instead of being reused.

Jan de Mooij [:jandem]

Comment 2

2 years ago

If a longer string is atomized repeatedly, we avoid rehashing thanks to the StringToAtomCache.

At some point I prototyped storing either a hash code (for atoms) or a pointer to the corresponding atom (for non-atoms) in JSString. This is pretty nice because it removes the branching to get the hash out of a NormalAtom vs FatInlineAtom and is faster than looking it up in the StringToAtomCache (and works better from JIT code). I'm not sure if it's worth the memory overhead though.
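One way to picture the prototype described above is a single word per string that holds the hash code for atoms and a pointer to the corresponding atom for non-atoms. The sketch below is only an illustration under that reading; `SketchStr`, `hashOrAtom`, `linkToAtom`, and `tryGetHash` are hypothetical names and do not reflect the real JSString fields.

```cpp
#include <cassert>
#include <cstdint>

struct SketchStr {
  bool isAtom = false;
  // One word with two meanings (assumed layout):
  //   atom:     the hash code
  //   non-atom: pointer to the corresponding atom, or 0 if not yet known
  uintptr_t hashOrAtom = 0;
};

// Record that `atom` is the atomized form of the non-atom string `s`.
inline void linkToAtom(SketchStr& s, const SketchStr& atom) {
  assert(!s.isAtom && atom.isAtom);
  s.hashOrAtom = reinterpret_cast<uintptr_t>(&atom);
}

// Returns the atom for `s` without any table lookup, or nullptr if
// `s` has not been atomized yet.
inline const SketchStr* atomFor(const SketchStr& s) {
  if (s.isAtom) return &s;
  return reinterpret_cast<const SketchStr*>(s.hashOrAtom);
}

// Fetches the hash if it is reachable from `s`; the hash always sits in
// the same field of the atom, whatever kind of atom it is.
inline bool tryGetHash(const SketchStr& s, uint32_t* out) {
  const SketchStr* atom = atomFor(s);
  if (!atom) return false;
  *out = uint32_t(atom->hashOrAtom);
  return true;
}
```

The cost is the extra word in every string, which is the memory overhead the comment above is unsure about.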

Matthew Gaudet (he/him) [:mgaudet]

Updated

2 years ago

Severity: -- → S3

Priority: -- → P3

Dave Hunt [:davehunt] [he/him] ⌚BST

Updated

2 years ago

Whiteboard: [sp3]

Jira Integration Bot

Updated

2 years ago

See Also: → https://mozilla-hub.atlassian.net/browse/SP3-252

Steve Fink [:sfink] [:s:]

Reporter

Updated

2 years ago

Depends on: 1825675

Steve Fink [:sfink] [:s:]

Reporter

Updated

1 year ago

Depends on: 1848884

Steve Fink [:sfink] [:s:]

Reporter

Comment 3

1 year ago

Attached file Bug 1815266 - Allow storing prepared hash values in 64-bit linear strings (extensible and plain)

Steve Fink [:sfink] [:s:]

Reporter

Comment 4

1 year ago

Attached file Bug 1815266 - Load string hashes in the JIT

Steve Fink [:sfink] [:s:]

Reporter

Comment 5

1 year ago

Attached file Bug 1815266 - Use stored hashes in string comparisons

Steve Fink [:sfink] [:s:]

Reporter

Comment 6

1 year ago

Attached file Bug 1815266 - Use stored string hashes when atomizing

Steve Fink [:sfink] [:s:]

Reporter

Comment 7

1 year ago

Edited

This didn't make much of a difference in Speedometer. I wrote some microbenchmarks and looked at what was going on.

One thing I ran into quickly is that the StringToAtomCache that jandem mentioned in comment 2 is also used to indicate that a string should be atomized during deduplication. So the obvious microbenchmark of creating a bunch of string keys and repeatedly looking them up in an object doesn't benefit from the stored hashes, because when the keys get tenured they all get atomized anyway. If I am following along correctly, that means that anytime you create a nursery string and then use it to look up a property, it'll end up getting atomized, which seems like it'll cover a large majority of cases. It'll probably be fairly rare to create a string, hang onto it long enough for it to survive a minor GC, and only then use it as a key.

A microbenchmark to look at (non-matching) string comparisons shows improvements (24 -> 15ns/comparison) but only if I make the strings pretty long. An example string is "a really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really long key with a number at the end: 13". If I replace all of those "really"s with just one, then the performance difference disappears. (And if the string is even shorter, it'll fit in an inline string, and I am not storing hashes for those.)
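A plausible reading of why only long strings benefit: the example keys share a long prefix and differ only at the very end, so a character-by-character comparison has to scan nearly the whole string, while a stored-hash mismatch rejects the pair immediately. The sketch below illustrates that early-out; it is not the code from the attached patches. `HashedKey`, `sketchHash`, `makeHashedKey`, and `keysEqual` are hypothetical names, and FNV-1a stands in for the engine's hash function.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>

struct HashedKey {
  std::string chars;
  std::optional<uint32_t> storedHash;  // empty if no hash has been stored
};

// FNV-1a, used only as a stand-in hash function.
inline uint32_t sketchHash(const std::string& s) {
  uint32_t h = 2166136261u;
  for (unsigned char c : s) {
    h ^= c;
    h *= 16777619u;
  }
  return h;
}

inline HashedKey makeHashedKey(std::string s) {
  uint32_t h = sketchHash(s);
  return HashedKey{std::move(s), h};
}

inline bool keysEqual(const HashedKey& a, const HashedKey& b) {
  if (a.chars.size() != b.chars.size()) return false;
  // Early out: different stored hashes prove the contents differ,
  // so the long shared prefix never has to be scanned.
  if (a.storedHash && b.storedHash && *a.storedHash != *b.storedHash) {
    assert(a.chars != b.chars);  // debug-only sanity check
    return false;
  }
  // Equal hashes (or a missing hash) prove nothing; compare the characters.
  return a.chars == b.chars;
}
```

Matching strings still pay for the full comparison, and for short strings the scan is cheap enough that the extra hash check buys little, which is consistent with the difference disappearing for shorter keys.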

Categories

(Core :: JavaScript Engine, task, P3)

Security

(public)
