Closed Bug 897305 Opened 11 years ago Closed 11 years ago

String.prototype.repeat() is slower than it could be

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla25

People

(Reporter: e_mayilme, Assigned: e_mayilme)

References

Details

(Whiteboard: [js:t] [js:perf])

Attachments

(2 files, 2 obsolete files)

User Agent: Mozilla/5.0 (Windows NT 6.2; WOW64; rv:24.0) Gecko/20130723 Firefox/24.0 (Nightly/Aurora)
Build ID: 20130723004004

Steps to reproduce:

1. Repeated a string.


Actual results:

2. Found very slow performance.



Expected results:

3. Should have performed at least as good as a pure JavaScript string repetition solution. http://stackoverflow.com/questions/17823876/is-the-upcoming-firefox-24-javascript-string-repetition-algorithm-actually-broke
Here's an algorithm from 13 years ago which is much faster (at least 300 times in the test case), although a native solution (not implemented in JavaScript) would be ideal, of course, much faster than anything based in JavaScript.

function repeat(x, n) {
/* e.g., repeat("abc", 2) === "abcabc" */
    var s = '';
    for (;;) {
        if (n & 1) s += x;
        n >>= 1;
        if (n) x += x;
        else break;
    }
    return s;
}

http://jsperf.com/repeating-strings/2
Assignee: nobody → general
Component: Untriaged → JavaScript Engine
Product: Firefox → Core
If someone points me to the source file where this has been implemented, I would be happy to design a more efficient algorithm using an optimal series of memcpy calls of increasing size after memory allocation, similar to the design of the JavaScript algorithm.
Attached file benchmark
Here is a SpiderMonkey shell benchmark, and the built-in method is significantly slower. Here are the results:

test repeat took 0.010986328125ms
built-in repeat took 52.85888671875ms

CC-ing Till. I would like to take this up.
Assignee: general → sankha93
Status: UNCONFIRMED → NEW
Ever confirmed: true
The implementation is in js/src/builtin/String.js, as String_repeat.  

Don't be too sure a native implementation would be fastest; the more you stay in JS the language, the more the JITs know about and can optimize (because native functions presumptively can modify whole bunches of things the JITs can't make assumptions about).

Your implementation doesn't handle the preliminaries like CheckObjectCoercible or ToString, but I doubt those matter much.  What probably matters more is that your version avoids floating-point division (although it doesn't work correctly on counts >= 2**32).  A little preliminary work to shunt off such bad cases to code along the current lines, and a fast path when the length is < 2**32 (or maybe 2**31, because of the way we represent numbers), is probably the big thing to look at.  Certainly floating-point divisions and comparisons are to be avoided, and we make no effort to do that now.  That seems like the first place to tackle; once that's changed you can evaluate again.
Thanks so much, Jeff!

Sankha, I would be happy to collaborate with you in working on this. Let me know if there is anything you want me to help you do.
Joseph, thanks for reporting this. I did some very quick testing and it indeed looks like using your algorithm in our self-hosted implementation of String.prototype.repeat makes it much faster (~40x). We talked about there probably being optimization potential, but didn't follow up as it didn't seem to be worth it. It clearly is!

I have assigned the bug to you and would welcome a patch. You should be able to just stick your algorithm in the String_repeat function without changing the semantics at all.

I don't know how well you're acquainted with the development process for SpiderMonkey, so here's a couple of links that will help get you started should you not already have a working build environment:

https://developer.mozilla.org/en-US/docs/SpiderMonkey/Build_Documentation

In following that documentation, make sure to opt for the "advanced build". It's not really harder to set up than the "easy build", and is much, much better to work with.

Documentation specifically about the self-hosting infrastructure can be found here:

https://developer.mozilla.org/en-US/docs/SpiderMonkey/Internals/self-hosting

The part about testing without recompilation might be of interest.

If you still need any help, feel free to ask here, or in the #jsapi IRC channel. For more information on that (and I apologize if all of this is old news to you), check the documentation here: https://wiki.mozilla.org/IRC

@sankha: thanks for volunteering to work on this. I don't however think that there's much potential for collaboration here, and it's only fair to let Joseph have a go at this, given that he reported the bug and has the solution in hand.
Assignee: sankha93 → e_mayilme
Status: NEW → ASSIGNED
OS: Windows 8 → All
Hardware: x86_64 → All
Whiteboard: [js:t] [js:perf]
Version: 24 Branch → Trunk
Actually, come to think of it, in our engine the maximum string length is 2**28 - 1, so there's no need for a bigger-than-2**32 slow-path.  Just throwing if the count is 2**28 or greater -- if the string is non-empty -- should handle all the big-number cases.  (And if the string's empty, any non-negative finite count should be fine, will just need special-case handling to return "" without doing O(count) work then.)
Summary: New String.prototype.repeat() method seems to be problematic → String.prototype.repeat() is slower than it could be
Thank you. I'll plan to work on this tonight and get a patch created.
My patch is finished. I'm in the process of reading how to submit my patch, which will probably happen after I'm back from work today or tomorrow. Thanks.
Awesome!

You can, if you want, already attach it to this bug to get feedback. Correct Mercurial setup for patch submission and all of that can come later.
At Till's encouragement I submitting what I have, although I have not had time yet to do the proper steps in preparing a patch. As soon as I have time I hope to learn how to do that as well. Thanks!
Attachment #781500 - Attachment is patch: true
Comment on attachment 781500 [details] [diff] [review]
Incorporate Joseph Myers' public domain algorithm to improve performance of String_repeat

Review of attachment 781500 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, this looks pretty good, already.

For the next version, can you mark the attachment as a patch? Additionally, you can look at https://developer.mozilla.org/en-US/docs/Mercurial_FAQ#How_can_I_generate_a_patch_for_somebody_else_to_check-in_for_me.3F to see how you set author info. Don't worry about that too much, though: I'll gladly fix it up for you.

Can you also throw an error for over-long strings as requested by Waldo in comment 7? One nit about that: you should throw if `S.length * count >= 2**28`, not only if `count >= 2**28`.

I have some additional comments inline, below:

::: String.js
@@ +20,4 @@
>      if (n < 0 || n === std_Number_POSITIVE_INFINITY)
>          ThrowError(JSMSG_REPEAT_RANGE);
>  
> +    // Public domain algorithm: October 2000 / August 2008 by Joseph Myers

We don't really do this type of attribution in our source code. The patch will contain your author info, so the change will be attributed that way.

@@ +23,5 @@
> +    // Public domain algorithm: October 2000 / August 2008 by Joseph Myers
> +    /* Note: do not convert this to a while (n > 0) loop!
> +     * The infinite for loop, the break statement, and the preceding
> +     * if (n) condition are exactly the the way they are in order to
> +     * prevent an extra step of exponential string multiplication.

I'm not convinced. How would `while (n)` with the `else break` removed behave any different at all?

@@ +25,5 @@
> +     * The infinite for loop, the break statement, and the preceding
> +     * if (n) condition are exactly the the way they are in order to
> +     * prevent an extra step of exponential string multiplication.
> +     */
> +    var result = '';

The "// Step .." comment should be left intact.

Is there any particular reason why you removed the "if (n === 0)" early return? From your comment, it sounds like you'd want to have something like it, still.

@@ +27,5 @@
> +     * prevent an extra step of exponential string multiplication.
> +     */
> +    var result = '';
> +    for (;;) {
> +        if (n & 1) result += S;

Nit: body of `if` statement on new line and indented 4 spaces

@@ +29,5 @@
> +    var result = '';
> +    for (;;) {
> +        if (n & 1) result += S;
> +        n >>= 1;
> +        if (n) S += S;

Nit: body of `if` statement on new line and indented 4 spaces

@@ +30,5 @@
> +    for (;;) {
> +        if (n & 1) result += S;
> +        n >>= 1;
> +        if (n) S += S;
> +        else break;

Nit: body of `else` statement on new line and indented 4 spaces
> I'm not convinced. How would `while (n)` with
> the `else break` removed behave any different at all?

If we could take if (n) out of the loop body, then while (n) would be justified. However, I have to check the value of n in the body of the loop because the result S += S is only used in the next loop iteration, and if I don't check the value of n in the body of the previous loop iteration, then it will exponentially double the string every time. The computational cost of this doubling doubles at every iteration, so avoiding this last unnecessary doubling is definitely a contributor to the algorithm's performance.

(I made the comment because so many modified versions of this algorithm are floating around with worse performance, many of them altered to use a while (n > 0) loop (and changing the if (n) / else statements to simply S += S) by people who thought they were simplifying and improving the code. But the result was unnecessarily creating a string almost twice as long as the one actually needed. In case anyone may be using the source code to learn from, I thought that the comment would be helpful. But if you feel I should remove the comment and let the algorithm speak for itself, then I would be happy to do so.)

Yes, you *could* do a while (n > 0) loop and only eliminate the else break statement and leave the if (n) conditional statement there, but that would be redundant. If the value of n is already known to be zero after if (n) has been checked, I might as well break right there with an "else break;" statement rather than move back to the beginning of the loop plus check the value of n again.

> Is there any particular reason why you removed
> the "if (n === 0)" early return?

Yes, my reason is that this algorithm works even if n === 0, and there is no reason to create an extra branch and slow down performance of repeating characters for the purpose of imperceptibly increasing the performance of generating an empty string. Rare examples exist where increasing performance of creating empty strings would be useful, for instance, if String.repeat was used to perform space padding in a terminal emulator and all the text was exactly 80 characters wide already. But even then, the only place it makes sense to optimize that kind of situation is in outside code, when the optimization would also have the benefit of avoiding the higher costs of invoking a method and the overhead of sanity checking within the String_repeat function. Optimizing inside the code can at most saves only two extremely small operations (three operations saved minus one additional operation) at the cost of a few big invocation operations and two other small overhead operations. Optimizing outside the code would only take one extremely small operation, plus it could also cover the case of not calling String.repeat for both an empty string and a non-repeated string in one step. If we wanted to optimize that, we would need to add yet another line, i.e., if (n === 1) return S. You see where I'm going?

I'll work on addressing your other comments and marking it as a patch while following the directions at the link you gave.

Your help is much appreciated!
(In reply to Joseph Myers from comment #13)
> Yes, you *could* do a while (n > 0) loop and only eliminate the else break
> statement and leave the if (n) conditional statement there, but that would
> be redundant. If the value of n is already known to be zero after if (n) has
> been checked, I might as well break right there with an "else break;"
> statement rather than move back to the beginning of the loop plus check the
> value of n again.

Ok, I guess removing the extra check makes sense. I do think that our JIT would remove one of them, but there's no need to make its analysis work harder than it has to be.

> 
> > Is there any particular reason why you removed
> > the "if (n === 0)" early return?
> 
> Yes, my reason is that this algorithm works even if n === 0, and there is no
> reason to create an extra branch and slow down performance of repeating
> characters for the purpose of imperceptibly increasing the performance of
> generating an empty string. Rare examples exist where increasing performance
> of creating empty strings would be useful, for instance, if String.repeat
> was used to perform space padding in a terminal emulator and all the text
> was exactly 80 characters wide already. But even then, the only place it
> makes sense to optimize that kind of situation is in outside code, when the
> optimization would also have the benefit of avoiding the higher costs of
> invoking a method and the overhead of sanity checking within the
> String_repeat function. Optimizing inside the code can at most saves only
> two extremely small operations (three operations saved minus one additional
> operation) at the cost of a few big invocation operations and two other
> small overhead operations. Optimizing outside the code would only take one
> extremely small operation, plus it could also cover the case of not calling
> String.repeat for both an empty string and a non-repeated string in one
> step. If we wanted to optimize that, we would need to add yet another line,
> i.e., if (n === 1) return S. You see where I'm going?

I do, yes, and that makes sense. Checking for empty input strings, however, is something we should definitely add, as repeating those isn't cut short otherwise.

> 
> I'll work on addressing your other comments and marking it as a patch while
> following the directions at the link you gave.
> 
> Your help is much appreciated!

So is yours :)
Just to be clear, I am reading the specification, and I see these 9 steps:

/* begin 9 steps */
The following steps are taken:
1.	Let O be CheckObjectCoercible(this value).
2.	Let S be ToString(O).
3.	ReturnIfAbrupt(S).
4.	Let n be the result of calling ToInteger(count).
5.	ReturnIfAbrupt(n).
6.	If n < 0, then throw a RangeError exception.
7.	If n is  +Infinity, then throw a RangeError exception.
8.	Let T be a String value that is made from n copies of S appended together. If n is 0, T is the empty String.
9.	Return T.

/* end 9 steps*/

Do you want me to

(1). replace Infinity in step 7 with "If n*S.length >= 2**28 -1" and yet still return the error "repeat count must be positive and less than inifinity [sic]" (note there is a typo in js.msg right now)??

Or do you want me to

(2). introduce a step 7' and a new error message defined in js.msg that says something more specific like "the product of N and String.length must be less than 2^28 - 1"

I like simplicity, and simplicity to me means that the most String_repeat should do before beginning its algorithm is to the ensure the premises on which the algorithm works are both valid: i.e.

* to coerce S to a string (this is sufficient in and of itself, because strings always have a length >= 0 and any string length is fine for the algorithm, including 0), and
* ensure n is an integer from 0 to (1<<28)-1.

So I would prefer the first of the two options given above rather than going to the effort of adding a new error message to js.msg and so forth.

However, at the other extreme, one could say that the 9 steps in the algorithm have left out the obvious consideration of what to do if the requested result is much too big to fit in memory. And so perhaps now would be the time to both change the specification and add a new error message (or just use an existing one, like a message having to do with insufficient memory to complete an operation).
(In reply to Joseph Myers from comment #15)
> Do you want me to
> 
> (1). replace Infinity in step 7 with "If n*S.length >= 2**28 -1" and yet
> still return the error "repeat count must be positive and less than
> inifinity [sic]" (note there is a typo in js.msg right now)??
> 
> Or do you want me to
> 
> (2). introduce a step 7' and a new error message defined in js.msg that says
> something more specific like "the product of N and String.length must be
> less than 2^28 - 1"

I agree with you: option 1 makes more sense. In my opinion, we should just throw a RangeError with a different message and be done with it. Maybe changing the spec to something like this makes sense:

7. If n * S.length is larger than the maximum supported string length, then throw a RangeError exception.

Waldo, what do you think?

(Note that the OOM situation is handled during attempts to actually allocate the string, and we can't meaningfully prevent the allocations from happening in the first place.)
Flags: needinfo?(jwalden+bmo)
Here is what I am thinking about making into a patch and submitting--any thoughts?

/* ES6 20121122 draft 15.5.4.21. */
function String_repeat(count) {
    // Steps 1-3.
    CheckObjectCoercible(this);
    var S = ToString(this);

    // Steps 4-5.
    var n = ToInteger(count);

    // Steps 6-7.
	if (!(n >= 0 && n*S.length < (1<<28)))
        ThrowError(JSMSG_REPEAT_RANGE);

    // Step 8-9.
    var T = '';
    for (;;) {
        if (n & 1)
            T += S;
        n >>= 1;
        if (n)
            S += S;
        else
            break;
        }
    return T;
}

There are eight cases to be considered in steps 6-9, and this handles them all correctly:

String length = 0, count < 0: throws error
String length = 0, count = 0: returns empty string
String length = 0, count = finite positive integer: returns empty string
String length = 0, count = Infinity: throws error*
String length = finite positive integer, count < 0: throws error
String length = finite positive integer, count = 0: returns empty string
String length = finite positive integer, count = finite positive integer and (String length) * count < 2^28: runs algorithm and returns count copies of the String concatenated together
String length = finite positive integer, count = Infinity: throws error


*To make this case work correctly is why I had to use the conditional
if (!(n >= 0 && n*S.length < (1<<28)))

rather than
if (n < 0 || n*S.length > (1<<28)-1))
because Infinity*0 = NaN. Note that NaN is not > 2^28 -1 and also NaN is not < 2^28.
(In reply to Joseph Myers from comment #17)
> Here is what I am thinking about making into a patch and submitting--any
> thoughts?

Yes, one: very nice! Please do submit a patch with that code, and if Waldo is ok with the RangeError-throwing like this, we'll land it.
(In reply to Till Schneidereit [:till] from comment #16)
> In my opinion, we should just throw a RangeError with a different message
> and be done with it. Maybe changing the spec to something like this makes
> sense:
> 
> 7. If n * S.length is larger than the maximum supported string length, then
> throw a RangeError exception.
> 
> Waldo, what do you think?

So the behavior change would be from js_ReportAllocationOverflow after a bunch of work (and not even, for embeddings where OOM is fatal), to throwing a RangeError immediately?  Sounds good to me.
Flags: needinfo?(jwalden+bmo)
Regarding error messages, I'd tend to split out the different cases and provide clearer error messages for each.  So then:

  if (n < 0)
      ThrowError(JSMSG_NEGATIVE_REPETITION_COUNT); // a RangeError
  if (n * S.length >= (1 << 28))
      ThrowError(JSMSG_RESULTING_STRING_TOO_LARGE); // a RangeError

This is marginally slower, sure.  But we're talking one or two math ops, which really is about nothing.  The value of better messages to the people that hit them, outweighs the value of a couple less math ops to the minuscule number of uses that can even notice the extra operation.

Also, we should add explicit steps to make clear to range analysis that |n| will be in the range [0, 1 << 28).

  // Communicate |n|'s possible range to the compiler.
  n = n & ((1 << 28) - 1);

  var T = "";
  ...
Joseph, it would be great if you could incorporate the suggestions Waldo made in comment 20, and attach a proper patch.

The next uplift to Aurora will happen on monday, so if we want to get this into Firefox 25, we need a patch before then. I'm happy to land it on the weekend. And I'm also happy to create a patch with proper author info for you, if the email address you're using here is ok for that.
...and of course ignore the aspect of my suggestions that doesn't handle "".repeat(Infinity), natch.  ;-)  But this is why we'll have tests for all these edge cases in our test suite, of course.  \o/
Yes, I can do this. I followed the instructions for making a proper patch, except that I am having the worst time actually trying to build SpiderMonkey and test my patch. The configure scripts didn't detect GCC properly even though it is installed (I'm using FreeBSD 9.1), and many other issues.

I am attaching my patch in a couple of minutes, but it is NOT tested in the context of the actual build. There may be typos I didn't notice in my updates to js.msg, for example, or other places. The patch looks good to me, and the algorithm has been working fine throughout the Internet for more than a decade, but I hate to submit anything in a form that I haven't fully tested myself.

I also installed the Mozilla tools on Windows 8, but it's really frustrating working in a MINGW32 window, and I ran into some issues there as well. (Plus this is finals week and I'm rather busy or else I would enjoy the chance to play with this!)
Please test in an actual build and verify that all the surrounding code and error message definitions are properly defined and used. Also test the algorithm and the return values for the main eight cases of arguments that may be passed to it, as described in my recent comment.
Attachment #781500 - Attachment is obsolete: true
(In reply to Joseph Myers from comment #23)
> Plus this is finals week and I'm rather busy or else I would enjoy the
> chance to play with this!

Slacker, that never stopped me.  Though, come to think of it, maybe it should... ;-)  Good luck with finals, catch you when you're done.
Attachment #784793 - Flags: review?(till)
(In reply to Joseph Myers from comment #23)
> Yes, I can do this. I followed the instructions for making a proper patch,
> except that I am having the worst time actually trying to build SpiderMonkey
> and test my patch. The configure scripts didn't detect GCC properly even
> though it is installed (I'm using FreeBSD 9.1), and many other issues.

My guess is that building on FreeBSD works for the entire browser, but that it's been quite a while since anybody tried building just the shell. Don't know, though. Sorry it's giving you trouble.
> 
> I am attaching my patch in a couple of minutes, but it is NOT tested in the
> context of the actual build. There may be typos I didn't notice in my
> updates to js.msg, for example, or other places. The patch looks good to me,
> and the algorithm has been working fine throughout the Internet for more
> than a decade, but I hate to submit anything in a form that I haven't fully
> tested myself.

Don't worry about it at all. It's looking pretty good, and I'll add some test cases and push this thing.

> 
> I also installed the Mozilla tools on Windows 8, but it's really frustrating
> working in a MINGW32 window, and I ran into some issues there as well. (Plus
> this is finals week and I'm rather busy or else I would enjoy the chance to
> play with this!)

Oh, you should definitely not spend any time on this, then! :)

Good luck!
Thank you, Jeff and Till and all!
Attached patch completed patchSplinter Review
I have completed Joseph's patch and added a test also.
Attachment #784793 - Attachment is obsolete: true
Attachment #784793 - Flags: review?(till)
Attachment #785335 - Flags: review?(till)
Comment on attachment 785335 [details] [diff] [review]
completed patch

https://hg.mozilla.org/integration/mozilla-inbound/rev/7c3cb4883157

I very slightly juggled things around for the final version, but it's substantially still the same patch.


Thanks again for working on this, and doing it so thoroughly!
Attachment #785335 - Flags: review?(till) → review+
https://hg.mozilla.org/mozilla-central/rev/7c3cb4883157
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla25
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: