Closed Bug 847682 Opened 11 years ago Closed 11 years ago

AppendSubstrings has bad allocating behaviour

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla22

People

(Reporter: h4writer, Assigned: h4writer)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 1 obsolete file)

Added in bug 820124, AppendSubstrings actually has bad behaviour by using RopeBuilder. 

Every append will create a new string and the contents needs to get copied. But we can really easy calculate the length of the new string and pre-allocate and fill it linearly.
Assignee: general → hv1989
Blocks: 806646
When appending using RopeBuilder and length of both strings combined is lower than JSShortString::MAX_SHORT_LENGTH, the strings get copied. This is particular bad for small matches.

This patch improves the following benchmark from 240s to 180s. That is on par with v8.
  var re11 = /\./g;
  var str = "3.5.0.0";
  for (var i = 0; i < 1000000; i++) {
    str.replace(re11, '');
  }
Attachment #720960 - Attachment is obsolete: true
Attachment #721163 - Flags: review?(sstangl)
Comment on attachment 721163 [details] [diff] [review]
AppendSubstrings should fill linearly before using RopeBuilder

Review of attachment 721163 [details] [diff] [review]:
-----------------------------------------------------------------

Nice! Locally this is a 5-7% speedup on v8-regexp. r+ with comments addressed.

The patch might be cleaner if it were restructured so that AppendSubstrings asks FlattenSubstrings whether a flattening can occur, but it's not a big deal.

::: js/src/jsstr.cpp
@@ +2362,5 @@
>  };
>  
> +static inline JSShortString *
> +FlattenSubstrings(JSContext *cx, Handle<JSStableString*> stableStr,
> +                  const StringRange *ranges, size_t rangesLen, size_t len)

"len" isn't a descriptive name. Perhaps "outputLen"?

@@ +2380,5 @@
> +    for (size_t i = 0; i < rangesLen; i++) {
> +        PodCopy(buf + pos, chars + ranges[i].start, ranges[i].length);
> +        pos += ranges[i].length;
> +
> +        JS_ASSERT(pos <= len);

This assert might look nicer as |JS_ASSERT(pos == outputLen)| after the loop.

@@ +2397,5 @@
>      /* For single substrings, construct a dependent string. */
>      if (rangesLen == 1)
>          return js_NewDependentString(cx, stableStr, ranges[0].start, ranges[0].length);
>  
> +    /* Concat Strings by filling JSShortStrings */

This comment should still read "/* Collect substrings into a rope. */", since it's still doing the same action, just with an additional optimization.

@@ +2404,2 @@
>      RopeBuilder rope(cx);
> +    while (end < rangesLen) {

|start| and |end| are always equal at the top of the loop. With that in mind, I would rename |start| to |i|, change the condition to |i < rangesLen|, and move the definition of |end| into the loop body.

It would also be OK to write |for (size_t i = 0; i < rangesLen; )|.

@@ +2404,5 @@
>      RopeBuilder rope(cx);
> +    while (end < rangesLen) {
> +
> +        /* Find maximum range that fits in JSShortString */
> +        size_t len = 0;

Perhaps "substrLen"?

@@ +2411,5 @@
> +                break;
> +            len += ranges[end].length;
> +        }
> +
> +        RootedString part(cx, NULL);

The Rooted should be moved outside of the loop -- there's a small cost to creating a Rooted object, but reassignation is free. This bug was preexisting. It should probably go above the "/* Concat Strings..." comment, with its own comment.

@@ +2414,5 @@
> +
> +        RootedString part(cx, NULL);
> +
> +        if (start == end) {
> +            /* If even one range doesn't fit JSShortString, use DependentString */

The wording in this comment is ambiguous -- better as "If not even one range fits JSShortString...".

@@ +2419,5 @@
> +            const StringRange &sr = ranges[start];
> +            part = js_NewDependentString(cx, stableStr, sr.start, sr.length);
> +            end++;
> +        } else {
> +            /* Create string by linearly copying stableStr into */

Comment appears cut off.

@@ +2429,5 @@
>  
>          /* Appending to the rope permanently roots the substring. */
> +        rope.append(part);
> +
> +        start = end;

With the change to |i < rangesLen|, this chunk moves into the else{} block.
Attachment #721163 - Flags: review?(sstangl) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/e057918ed405

(In reply to Sean Stangl [:sstangl] from comment #3)
> Comment on attachment 721163 [details] [diff] [review]
> Nice! Locally this is a 5-7% speedup on v8-regexp. r+ with comments
> addressed.

That is a bit optimistic. I measured 2%-3% locally and awfy seems to agree.
https://hg.mozilla.org/mozilla-central/rev/e057918ed405
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
(In reply to Hannes Verschore [:h4writer] from comment #4)
> That is a bit optimistic. I measured 2%-3% locally and awfy seems to agree.

It wasn't optimistic -- I actually measured it :)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: