Open Bug 1849490 Opened 1 year ago Updated 9 months ago

Optimize SpecificParserAtomLookup::equalsEntry

Categories

(Core :: JavaScript Engine, task, P3)

Tracking

()

ASSIGNED

People

(Reporter: arai, Assigned: arai)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

https://searchfox.org/mozilla-central/rev/19500c006ebf8dc4587eef55357ae26b772391e1/js/src/frontend/ParserAtom.h#858,864-869

virtual bool equalsEntry(const WellKnownAtomInfo* info) const override {
...
  InflatedChar16Sequence<CharT> seq = seq_;
  for (uint32_t i = 0; i < info->length; i++) {
    if (!seq.hasMore() || char16_t(info->content[i]) != seq.next()) {
      return false;
    }
  }

Here's the generated code for the Latin1Char specialization on aarch64, built with --enable-optimize and --disable-debug and without PGO (so the output could be completely different with PGO), after removing the hash check (bug 1802568).

It performs many unnecessary or redundant operations, both inside and outside the loop.

js`js::frontend::SpecificParserAtomLookup<unsigned char>::equalsEntry:
    0x100409c9c <+0>:  ldp    x8, x9, [x0, #0x10]
    0x100409ca0 <+4>:  ldr    w10, [x1]
    0x100409ca4 <+8>:  cbz    w10, 0x100409cec          ; <+80> [inlined] js::InflatedChar16Sequence<unsigned char>::hasMore() at Text.h:209:34
    0x100409ca8 <+12>: mov    x11, #0x0
    0x100409cac <+16>: sub    w12, w10, #0x1
    0x100409cb0 <+20>: add    x12, x12, x8
    0x100409cb4 <+24>: add    x12, x12, #0x1
    0x100409cb8 <+28>: subs   x13, x9, x8
    0x100409cbc <+32>: csel   x13, xzr, x13, lo
    0x100409cc0 <+36>: cmp    x13, x11
    0x100409cc4 <+40>: b.eq   0x100409cf8               ; <+92> at ParserAtom.h
    0x100409cc8 <+44>: ldr    x14, [x1, #0x8]
    0x100409ccc <+48>: ldrsb  w14, [x14, x11]
    0x100409cd0 <+52>: ldrb   w15, [x8, x11]
    0x100409cd4 <+56>: cmp    w15, w14
    0x100409cd8 <+60>: b.ne   0x100409cf8               ; <+92> at ParserAtom.h
    0x100409cdc <+64>: add    x11, x11, #0x1
    0x100409ce0 <+68>: cmp    x10, x11
    0x100409ce4 <+72>: b.ne   0x100409cc0               ; <+36> at ParserAtom.h:863:26
    0x100409ce8 <+76>: mov    x8, x12
    0x100409cec <+80>: cmp    x8, x9
    0x100409cf0 <+84>: cset   w0, hs
    0x100409cf4 <+88>: ret
    0x100409cf8 <+92>: mov    w0, #0x0
    0x100409cfc <+96>: ret

Given that well-known atoms contain only printable ASCII characters, this variant could be optimized further, possibly by simply using memcmp after a length comparison.

The same applies to the mozilla::Utf8Unit variant, where we don't have to handle the non-ASCII case.
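A minimal sketch of the byte-wide idea (Latin1 and UTF-8 input), not the actual patch. `AtomInfoSketch` and `equalsAsciiEntry` are hypothetical names. The struct models only the `length` and `content` fields used in the quoted loop, and the sketch assumes `content` is non-null and ASCII-only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for WellKnownAtomInfo; only the two fields used in
// the quoted loop are modeled.
struct AtomInfoSketch {
  uint32_t length;
  const char* content;  // assumed non-null and printable ASCII only
};

// Compare a byte-wide input range [begin, end) against a well-known atom.
// This is valid for both Latin1 and UTF-8 input only because the atom side
// is ASCII: an ASCII byte has the same value in both encodings, so any
// non-ASCII input byte simply makes memcmp report a mismatch.
inline bool equalsAsciiEntry(const unsigned char* begin,
                             const unsigned char* end,
                             const AtomInfoSketch& info) {
  assert(begin <= end);
  // A single length check replaces the per-iteration hasMore() test.
  if (static_cast<size_t>(end - begin) != info.length) {
    return false;
  }
  return std::memcmp(begin, info.content, info.length) == 0;
}
```

With the length compared up front, the final `!seq.hasMore()` style check is no longer needed, and the byte comparison is left to memcmp.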

The char16_t variant still needs a one-by-one comparison, but the number of loads can be reduced.
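For the char16_t case, one way to cut loads is sketched below, again with hypothetical names (`AtomInfoSketch`, `equalsAsciiEntry16`) rather than the real classes. In the Latin1 listing above, `ldr x14, [x1, #0x8]` sits inside the loop, which looks like a reload of `info->content` on every iteration. The sketch assumes the same could happen for char16_t, so it copies the pointer and length into locals and checks the length once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for WellKnownAtomInfo, repeated so this block
// stands alone.
struct AtomInfoSketch {
  uint32_t length;
  const char* content;  // assumed non-null and printable ASCII only
};

// Compare a char16_t input range [begin, end) against a well-known atom.
inline bool equalsAsciiEntry16(const char16_t* begin, const char16_t* end,
                               const AtomInfoSketch& info) {
  assert(begin <= end);
  // Check the length once instead of testing for remaining input on
  // every iteration.
  const uint32_t length = info.length;
  if (static_cast<size_t>(end - begin) != length) {
    return false;
  }
  // Read the content pointer once, outside the loop.
  const char* content = info.content;
  for (uint32_t i = 0; i < length; i++) {
    // Widen through unsigned char; for ASCII content this gives the same
    // value as the char16_t(info->content[i]) cast in the quoted loop.
    if (begin[i] != char16_t(static_cast<unsigned char>(content[i]))) {
      return false;
    }
  }
  return true;
}
```

Each iteration is then left with one load per side plus the compare. Whether the compiler vectorizes the widening comparison is not something this sketch guarantees.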

Assignee: nobody → arai.unmht
Status: NEW → ASSIGNED
Summary: Optimize SpecificParserAtomLookup::equalsEntry for WellKnownAtomInfo → Optimize SpecificParserAtomLookup::equalsEntry
Depends on: 1872226