Open Bug 1782493 Opened 2 years ago Updated 2 years ago

Add a test case to check that ParserAtom and JSAtom share the same hashes independently of the encoding.

Categories

(Core :: JavaScript Engine, enhancement, P3)

enhancement

Tracking

()

People

(Reporter: nbp, Unassigned)

References

Details

As of today, it is impossible to compare a JSAtom and TaggedParserAtomIndex, or a TaggedParserAtomIndex with another TaggedParserAtomIndex from a different ParserAtomTable.

The only option is to internInto the name to be compared in the context of the names which are compared against.

Doing so, can be costly, as this implies manipulating strings.
However, when manipulating TaggedParserAtom, as well as JSAtom, we are both hashing the string content, to atomize the representation. The hash is kept around, either by the hash table, or by the JSAtom to reduce the recomputation.

If we were to guarantee that this hash is identical for both TaggedParserAtomIndex and JSAtom, we could reuse this hash as a fast way to lookup / compare strings from different JSContext / ParserAtomTable.

In addition to using the same hashing scheme, we should also ensure that the hash function used is stable across character types: Latin1Char, Utf8Unit and the char16_t. The is necessary as TaggedParserAtomIndex are hashed given the source encoding, while JSAtom are optimized for size and attempt to always use the Latin1Char when possible.

Don't we already rely on this today to speed up instantiation of JSAtoms by (re)using ParserAtom::hash_?

(In reply to Jan de Mooij [:jandem] from comment #1)

Don't we already rely on this today to speed up instantiation of JSAtoms by (re)using ParserAtom::hash_?

You are right, for JSAtom initialized from a TaggedParserAtomIndex (js::frontend::ParserAtom::instantiateAtom), we do initialize the JSAtom with the hash of the ParserAtom, and this is verified with an assertion.

However, JSAtom are using HashString and TaggedParserAtomIndex are using SpecificParserAtomLookup which are inflating the sequence of character before computing the hash.

Technically, this sounds like they should be identical, but this is hard to guarantee at first sight.

This property would be easier to guarantee by having a single entry point into hashing and a proper test case to ensure that the hashing code is independent of the encoding.

Assignee: nicolas.b.pierron → nobody
Severity: -- → N/A
Status: ASSIGNED → NEW
Priority: P1 → P3
Summary: Ensure TaggedParserAtomIndex and JSAtom are using the same hashing scheme. → Add a test case to check that ParserAtom and JSAtom sare the same hashes independently of the encoding.
Summary: Add a test case to check that ParserAtom and JSAtom sare the same hashes independently of the encoding. → Add a test case to check that ParserAtom and JSAtom share the same hashes independently of the encoding.
You need to log in before you can comment on or make changes to this bug.