Latin1 strings: support string.match, string.search and string.replace

RESOLVED FIXED in mozilla33

Status

()

Core
JavaScript Engine
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: jandem, Assigned: jandem)

Tracking

unspecified
mozilla33
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(8 attachments)

(Assignee)

Description

4 years ago
These are the more complicated functions, because they have to work with regular expressions and have all kinds of optimizations for flat strings and rope matching.

I'm going to handle the simple cases first and we'll see what's needed.
(Assignee)

Updated

4 years ago
Blocks: 998392
(Assignee)

Comment 1

4 years ago
Created attachment 8433331 [details] [diff] [review]
Part 1 - StringMatch case

Just enough changes to make the StringMatch case work.
Attachment #8433331 - Flags: review?(luke)
(Assignee)

Comment 2

4 years ago
Created attachment 8433469 [details] [diff] [review]
Part 2 - RopeMatch case

Uses rope matching if all leaf nodes have the same encoding. This made it pretty easy to convert the code to handle Latin1 ropes/patterns.

I verified the test calls RopeMatchImpl (and removing the all-leaf-nodes-have-same-encoding check triggers assertion failures).
Attachment #8433469 - Flags: review?(luke)

Comment 3

4 years ago
Comment on attachment 8433331 [details] [diff] [review]
Part 1 - StringMatch case

Review of attachment 8433331 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsstr.cpp
@@ +1827,5 @@
>  
>  /* Result of a successfully performed flat match. */
>  class FlatMatch
>  {
> +    RootedAtom pat;

Pre-existing, but can you append _ to pat?  Mixed use of _ looks tacky to me.
Attachment #8433331 - Flags: review?(luke) → review+

Comment 4

4 years ago
Comment on attachment 8433469 [details] [diff] [review]
Part 2 - RopeMatch case

Review of attachment 8433469 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsstr.cpp
@@ +1398,4 @@
>              return false;
>          while (!r.empty()) {
> +            if (threshold-- == 0 ||
> +                r.front()->hasLatin1Chars() != text->hasLatin1Chars() ||

Might be worth it to hoist text->hasLatin1Chars(); I expect the compiler can't automatically.
Attachment #8433469 - Flags: review?(luke) → review+
(Assignee)

Comment 5

4 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/b3c9dcbc34c0
https://hg.mozilla.org/integration/mozilla-inbound/rev/f2b79650ac7b

There's more to do here, but it depends on the regexp engine. Yarr will hopefully be gone next week so I'll wait for that to avoid breaking it.
Keywords: leave-open
(Assignee)

Comment 7

4 years ago
Created attachment 8440792 [details] [diff] [review]
Part 3 - Fix JSSubString

This patch changes JSSubString from a (chars, length) struct to a (base, offset, length) struct. JSSubString isn't used across GC calls as far as I can see.
Attachment #8440792 - Flags: review?(terrence)
Comment on attachment 8440792 [details] [diff] [review]
Part 3 - Fix JSSubString

Review of attachment 8440792 [details] [diff] [review]:
-----------------------------------------------------------------

And now that it contains a raw JSString*, we'll hear about it from Hf if we ever do accidentally introduce GC while one is live. Nice!
Attachment #8440792 - Flags: review?(terrence) → review+
(Assignee)

Comment 11

4 years ago
Created attachment 8442100 [details] [diff] [review]
Part 4 - FindDollarIndex
Attachment #8442100 - Flags: review?(terrence)
(Assignee)

Comment 12

4 years ago
Created attachment 8442101 [details] [diff] [review]
Part 5 - BuildDollarReplacement
Attachment #8442101 - Flags: review?(terrence)
(Assignee)

Comment 13

4 years ago
Created attachment 8442102 [details] [diff] [review]
Part 6 - RegExp replacement
Attachment #8442102 - Flags: review?(terrence)
(Assignee)

Comment 14

4 years ago
Created attachment 8442103 [details] [diff] [review]
Part 7 - flattenPattern
(Assignee)

Updated

4 years ago
Attachment #8442103 - Flags: review?(terrence)
(Assignee)

Comment 15

4 years ago
Created attachment 8442104 [details] [diff] [review]
Part 8 - StrReplaceRegexpRemove

And of course there's an optimization for the case where the replacement is the empty string... So many paths str_replace can take, but I think I got all of them.
Attachment #8442104 - Flags: review?(terrence)
Comment on attachment 8442100 [details] [diff] [review]
Part 4 - FindDollarIndex

Review of attachment 8442100 [details] [diff] [review]:
-----------------------------------------------------------------

r=me
Attachment #8442100 - Flags: review?(terrence) → review+
Comment on attachment 8442101 [details] [diff] [review]
Part 5 - BuildDollarReplacement

Review of attachment 8442101 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit-test/tests/latin1/replace.js
@@ +38,5 @@
> +    assertEq(s.replace(pat, "A$`\u1300"), "FooAFoo\u1300baz123");
> +    assertEq(s.replace(pat, "A$&\u1300"), "FooAbar\u1200\u1300baz123");
> +    assertEq(s.replace(pat, "A$'\u1300"), "FooAbaz123\u1300baz123");
> +}
> +testDollarReplacement();

Nice!
Attachment #8442101 - Flags: review?(terrence) → review+
Comment on attachment 8442102 [details] [diff] [review]
Part 6 - RegExp replacement

Review of attachment 8442102 [details] [diff] [review]:
-----------------------------------------------------------------

r=me
Attachment #8442102 - Flags: review?(terrence) → review+
Comment on attachment 8442103 [details] [diff] [review]
Part 7 - flattenPattern

Review of attachment 8442103 [details] [diff] [review]:
-----------------------------------------------------------------

r=me
Attachment #8442103 - Flags: review?(terrence) → review+
Comment on attachment 8442104 [details] [diff] [review]
Part 8 - StrReplaceRegexpRemove

Review of attachment 8442104 [details] [diff] [review]:
-----------------------------------------------------------------

r=me
Attachment #8442104 - Flags: review?(terrence) → review+
You need to log in before you can comment on or make changes to this bug.