Closed Bug 1803855 Opened 1 year ago Closed 6 months ago

js::SubstringKernel should avoid making small ropes

Categories

(Core :: JavaScript Engine, defect, P2)

defect

Tracking

()

RESOLVED FIXED
121 Branch
Tracking Status
firefox121 --- fixed

People

(Reporter: tcampbell, Assigned: anba)

References

(Blocks 1 open bug)

Details

(Whiteboard: [sp3])

Attachments

(7 files, 1 obsolete file)

When creating substrings of ropes we don't check if the result fits in a linear string in all cases.

https://searchfox.org/mozilla-central/rev/c33879cb4884c08c11980847cd84ccd76f485894/js/src/builtin/String.cpp#609

If we fix this, we can avoid allocating the two extra JSDependentString. It also would let us enforce that all ropes are a minimum length which can rule out some cases like static strings.

Assignee: nobody → tcampbell

Good find. Would be nice to add an assertion on minimum rope length to catch this in the future.

(The newRope testing function might make this a little more complicated...)

Severity: -- → S4
Priority: -- → P2

Performance is actually quite bad for this in a micro-benchmark. https://jsfiddle.net/1yauhpsm/
https://share.firefox.dev/3F7Jg3z

Whiteboard: [sp3]
Assignee: tcampbell → nobody

So I attached a WIP patch to this for feedback. I don't love the approach, but it works:

Limitations:

  1. Both sides of the rope need to be linear to support using mozilla::PodCopy
  2. I'm only 70% sure the templating makes any sense here.

My local testing shows that on a microbenchmark akin to what Ted posted in Comment #2 we improve performance by 10% on Latin1 strings and improve performance on Char16 strings by 20%.

I'm not sure if that's worth it for the extra complexity here, but it would be a step towards being able to assert that ropes are only used for strings longer than the inline string length.

Assignee: nobody → mgaudet
Status: NEW → ASSIGNED

I'm going to unassign myself here -- I'm not sure when I'll get back to this, and if someone wants to fix my broken patch that'd be cool with me.

Assignee: mgaudet → nobody
Status: ASSIGNED → NEW

NewInlineString function where the amount of characters to be copied can be
determined at compile-time. This allows the compiler to inline the memcpy call
using inline assembly, because the number of bytes is a const.

Part 4 will also call this new function.

Assignee: nobody → andrebargull
Status: NEW → ASSIGNED

This function is called from CharSplitHelper for two-byte strings where the
input string is guaranteed to be a linear string. We could directly call
StaticStrings::getUnitString(), but testing has shown it doesn't improve
performance, probably because the compiler can't inline this on its own.

Depends on D192189

PodCopy is called directly from other functions in "StringType.cpp" and
"StringType-inl.h", so for consistency call PodCopy everywhere instead of
going through FillChars. The FillChars version which converts from
char16 to Latin-1 is only called in AutoStableStringChars::copyAndInflateLatin1Chars,
directly inline it there. And finally move FillFromCompatible closer to its
callers, so it's next to CanStoreCharsAsLatin1.

Depends on D192190

The actual change to avoid creating small ropes in SubstringKernel.

The existing calls to NewDependentString on the left and right child nodes
were already linearising the child nodes. Move this linearisation into
SubstringKernel, so we can copy the characters more easily. Then if the final
length fits into inline strings, call SubstringInlineString to create a new
inline string.

Depends on D192191

Use a lambda function to avoid duplicating the code for Latin-1 and Two-byte
strings. This is in preparation for the next part, which will add new code to
this function.

Depends on D192192

Add assertions to ensure all small strings are created as inline strings. Then
update the testing functions, so they won't trigger the new assertions and fix
tests which created small ropes.

Depends on D192193

Drive-by change noticed while reading through this file. This loop can be
replaced with std::none_of, which Clang is able to optimise more easily to
read multiple characters at once.

Depends on D192194

Pushed by andre.bargull@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/073936638f57
Part 1: Add NewInlineString with array parameter. r=jandem
https://hg.mozilla.org/integration/autoland/rev/12eb211fc288
Part 2: Add StaticStrings::getUnitStringForElement with linear string parameter. r=jandem
https://hg.mozilla.org/integration/autoland/rev/fd0cb08b2b5d
Part 3: Inline calls to PodCopy. r=jandem
https://hg.mozilla.org/integration/autoland/rev/1a6dcbf80a6e
Part 4: Avoid creating small ropes in SubstringKernel. r=jandem
https://hg.mozilla.org/integration/autoland/rev/228e3b2ffba5
Part 5: Avoid duplicate code in NewString testing function. r=jandem
https://hg.mozilla.org/integration/autoland/rev/29cb9f950aab
Part 6: Disallow creating small non-inline or rope strings. r=jandem
https://hg.mozilla.org/integration/autoland/rev/6f44ed5f3e4b
Part 7: Prefer std::none_of to search for characters. r=jandem
Regressions: 1863390
Attachment #9337919 - Attachment is obsolete: true
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: