Last Comment Bug 715265 - don't allocate a JSString per element in array_sort when there is no comparator
: don't allocate a JSString per element in array_sort when there is no comparator
Status: RESOLVED FIXED
[good first bug][mentor=luke@mozila.c...
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: x86_64 Linux
: -- normal (vote)
: mozilla12
Assigned To: general
:
Mentors:
Depends on: 720695
Blocks: 715422
  Show dependency treegraph
 
Reported: 2012-01-04 12:07 PST by Luke Wagner [:luke]
Modified: 2012-01-24 06:56 PST (History)
7 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Patch addressing all int32s case (5.70 KB, patch)
2012-01-08 15:44 PST, Santiago Gimeno
no flags Details | Diff | Review
Address Luke Wagner comments (5.79 KB, patch)
2012-01-09 15:09 PST, Santiago Gimeno
no flags Details | Diff | Review
Handle INT32_MIN (6.01 KB, patch)
2012-01-10 10:12 PST, Santiago Gimeno
luke: review+
Details | Diff | Review
Address stype nits and initial version avoiding tmp JSStrings (10.29 KB, patch)
2012-01-16 07:38 PST, Santiago Gimeno
no flags Details | Diff | Review
Address style nits and initial version avoiding tmp JSStrings (10.29 KB, patch)
2012-01-16 07:40 PST, Santiago Gimeno
no flags Details | Diff | Review
Address last Luke's comments (13.51 KB, patch)
2012-01-17 06:47 PST, Santiago Gimeno
no flags Details | Diff | Review
Fix typo from previous patch (patch #589181: Address last Luke's comments) (13.53 KB, patch)
2012-01-17 06:51 PST, Santiago Gimeno
no flags Details | Diff | Review
More fixes (13.57 KB, patch)
2012-01-19 02:25 PST, Santiago Gimeno
luke: review+
Details | Diff | Review

Description Luke Wagner [:luke] 2012-01-04 12:07:18 PST
When there is no comparator, there is no reason to call ToString to produce string values as this does a GC-thing allocation per array element that becomes garbage as soon as the function returns.  The fix should be pretty simple and have an enormous speedup (see bug 715181 comment 2): build all the jschar data in a single StringBuffer (calling ValueToStringBuffer) and have a second vector of indexes into this buffer.  Lastly, we should consider adding a special case (analogous to allStrings) for when all the values are int32s in which case we can just do a lexicographic compare directly on the ints (v8 does this).

Here's a test-case:

  var arr = [];
  for (var i = 0; i < 200000; ++i)
    arr.push(Math.floor(Math.random() * 200000));

  var bef = new Date();
  arr.sort();
  var aft = new Date();
  print(aft - bef);

a rough experiement takes us from 111ms to 13ms on my local machine.
Comment 1 Santiago Gimeno 2012-01-08 15:44:19 PST
Created attachment 586857 [details] [diff] [review]
Patch addressing all int32s case

This is my first patch for the JS engine so apologies for the mistakes :) and all comments are welcomed!.
I've tried to address first the all int32s case though probably in not the most optimal way. What do you think?
I also have question: does the case of negative integers have to be addressed? In case it has, how are the negative integers lexicographically ordered?

Thanks!
Comment 2 Luke Wagner [:luke] 2012-01-08 16:19:32 PST
Comment on attachment 586857 [details] [diff] [review]
Patch addressing all int32s case

Hi Santiago!  Thanks for the patch, it looks like a great start.

I'll point out some special cases below, but most of my nits would be wrt our style guideline:
https://wiki.mozilla.org/JavaScript:SpiderMonkey:C%2B%2B_Coding_Style

With respect to the core CompareLexicographicInt32: indeed we must handle negative ints; '-' is ordered less than '0'.  It is very important to make sure we nail the semantics exactly so I recommend checking out v8's implementation Runtime_SmiLexicographicCompare (in v8/src/runtime.cc).
 
>+inline int GetDigitsInt32(int32_t n) {
>+
>+    // taken from http://stackoverflow.com/a/1069088
>+    if (n < 10) return 1;
>+    if (n < 100) return 2;

You'll also see this in the v8 source, but consider JS_CEILING_LOG2 (or JS_FLOOR_LOG2) since these can be implemented efficiently with very few ops (using clz).

>+    if (!JS_CHECK_OPERATION_LIMIT(cx))
>+        return false;

This check is what lets the engine kill scripts that are running too long.  In our case I don't think it is necessary since, unlike the string case, we can know that even the biggest integer array sort will finish relatively quickly.

>+    if (aint == bint)
>+        *lessOrEqualp = true;
>+    else {

Once a single branch gets { braces }, all the other branches need braces too, so that means the then branch here.

>+        // Get number of digits of both integers.
>+        // If they have the same number of digits --> arithmetic comparison.
>+        // If digits_a > digits_b: a < b*10e(digits_a - digits_b).
>+        // If digits_b > digits_a: a*10e(digits_b - digits_a) <= b.

see multi-line comment style in the guideline

Please flag me for review when you have a new patch with these issues addressed, thanks!
Comment 3 Andreas Gal :gal 2012-01-08 16:23:34 PST
> This check is what lets the engine kill scripts that are running too long. 
> In our case I don't think it is necessary since, unlike the string case, we
> can know that even the biggest integer array sort will finish relatively
> quickly.

How do we know that? Are there checks along getter invocations for example? Native ones maybe?
Comment 4 Luke Wagner [:luke] 2012-01-08 16:24:50 PST
(In reply to Andreas Gal :gal from comment #3)
Read patch.
Comment 5 Andreas Gal :gal 2012-01-08 16:28:13 PST
I couldn't figure it out from the patch, hence the question :) I will check out the surrounding code tonight.
Comment 6 :Ms2ger 2012-01-09 01:22:36 PST
Because we only use this code if each value in the array isInt32(), no?
Comment 7 Luke Wagner [:luke] 2012-01-09 09:20:34 PST
You are correct, Sir!
Comment 8 Santiago Gimeno 2012-01-09 15:09:00 PST
Created attachment 587172 [details] [diff] [review]
Address Luke Wagner comments

Comparing the performance of both the logarithm implementation and the "trivial" implementation(1) of GetDigitsInt32, I've not been able to notice a relevant difference. Which should be chosen?

(1)
inline int GetDigitsInt32(int32_t n) {

    if (n >= 1000000000) return 10;
    if (n >= 100000000) return 9;
    if (n >= 10000000) return 8;
    if (n >= 1000000) return 7;
    if (n >= 100000) return 6;
    if (n >= 10000) return 5;
    if (n >= 1000) return 4;
    if (n >= 100) return 3;
    if (n >= 10) return 2;
    return 1;
}
Comment 9 :Ms2ger 2012-01-10 01:01:29 PST
Did you consider INT32_MIN?
Comment 10 Santiago Gimeno 2012-01-10 01:34:10 PST
(In reply to Ms2ger from comment #9)
> Did you consider INT32_MIN?

I'm not sure I understand. Where should I use it?
Comment 11 :Ms2ger 2012-01-10 07:19:47 PST
You should check if an array that contains it sorts correctly; there may be a bug where you negate it.
Comment 12 Santiago Gimeno 2012-01-10 07:25:23 PST
(In reply to Ms2ger from comment #11)
> You should check if an array that contains it sorts correctly; there may be
> a bug where you negate it.

Oh yes, I get it. I'll update the patch. Thanks!
Comment 13 Santiago Gimeno 2012-01-10 10:12:27 PST
Created attachment 587360 [details] [diff] [review]
Handle INT32_MIN
Comment 14 Luke Wagner [:luke] 2012-01-11 15:57:57 PST
Comment on attachment 587360 [details] [diff] [review]
Handle INT32_MIN

Review of attachment 587360 [details] [diff] [review]:
-----------------------------------------------------------------

Great job! I see a 3x speedup on the test case in comment 0 (clearly my quick and dirty estimate was too dirty...).  Also, of interest, the log2-based GetDigitsInt32 on my machine is 38% faster compared to the simple one in comment 8.

r+ with a few style nits (spacing, comments).  I assume you need someone to land the patch, so I'll just fix that directly if you'd like?  One question: is it ok with you if I rename GetDigitsInt32 to NumDigitsBase10?  ("Get" usually means "fallible operation" in SM.)

Santiago: would you be interested in continuing work on the other idea suggested in comment 0 (avoiding the creation of temporary GC strings that immediately become garbage) or should I split off a separate bug for later?
Comment 15 Santiago Gimeno 2012-01-11 16:35:36 PST
> 
> r+ with a few style nits (spacing, comments).  I assume you need someone to
> land the patch, so I'll just fix that directly if you'd like?  One question:
> is it ok with you if I rename GetDigitsInt32 to NumDigitsBase10?  ("Get"
> usually means "fallible operation" in SM.)
> 
> Santiago: would you be interested in continuing work on the other idea
> suggested in comment 0 (avoiding the creation of temporary GC strings that
> immediately become garbage) or should I split off a separate bug for later?

Thanks! Yes, I'd like to continue working on the bug so if you tell me about the style issues I can make the needed changes. Are you ok with this?

Regarding the issue with the temp strings, I started working on it and have one question: what class should I use to build the vector of indexes? Thanks in advance.
Comment 16 Luke Wagner [:luke] 2012-01-11 17:03:02 PST
Comment on attachment 587360 [details] [diff] [review]
Handle INT32_MIN

(In reply to Santiago Gimeno from comment #15)
> Thanks! Yes, I'd like to continue working on the bug so if you tell me about
> the style issues I can make the needed changes. Are you ok with this?

You bet, style nits below.

> Regarding the issue with the temp strings, I started working on it and have
> one question: what class should I use to build the vector of indexes? Thanks
> in advance.

You can use the StringBuffer class (jsstrinlines.h) + ValueToStringBuffer (jsstr.h).  You can see an example use in array_toString_sub.

>+static int32_t const PowersOf10Int32[] =
>+        {1, 10, 100, 1000, 10000, 100000,
>+         1000000, 10000000, 100000000, 1000000000};

static int32_t const powersOf10Int32[] = {
    1, 10, ...
};

>+inline int GetDigitsInt32(int32_t n) {

static inline int
NumDigitsBase10(int32_t n)
{
...
}

>+    int32_t bint = b.toInt32();
>+    /*
>+     * If both numbers are equal ... trivial

need newline before comments

>+        else if (digitsb > digitsa)
>+            *lessOrEqualp = (aint*PowersOf10Int32[digitsb - digitsa] <= bint);

just use 'else' not 'else if'
Comment 17 Santiago Gimeno 2012-01-16 07:38:13 PST
Created attachment 588879 [details] [diff] [review]
Address stype nits and initial version avoiding tmp JSStrings

Getting a segmentation fault with arrays around 60000 elements
Comment 18 Santiago Gimeno 2012-01-16 07:40:38 PST
Created attachment 588880 [details] [diff] [review]
Address style nits and initial version avoiding tmp JSStrings

Getting a segmentation fault with arrays around 60000 elements.
Comment 19 Santiago Gimeno 2012-01-16 09:40:41 PST
This script causes the segfault:

var arr =[];                                                                                                                                                                                                                                                        
for (var i = 0; i < 20000; ++i) {                                                                                                                                                                                                                                         
  arr.push(Math.floor(Math.random() * 20));                                                                                                                                                                                                                               
  arr.push(-(Math.floor(Math.random() * 20)));                                                                                                                                                                                                                            
  arr.push("H_" + i);                                                                                                                                                                                                                                                     
}                                                                                                                                                                                                                                                                         
var bef = new Date();                                                                                                                                                                                                                                                     
arr.sort();                                                                                                                                                                                                                                                               
var aft = new Date();                                                                                                                                                                                                                                                     
print("Total time:", aft - bef);
Comment 20 Luke Wagner [:luke] 2012-01-16 12:00:57 PST
Comment on attachment 588880 [details] [diff] [review]
Address style nits and initial version avoiding tmp JSStrings

One thing that I believe would cause a crash is that StringBuffer is backed by a Vector which resizes via realloc() which means the char buffer can move.  This means that the pointers stored in JSSubString can become invalid and indices should be used instead.  Hope this helps!

A few pre-emptory nits below:

>+        }
>+        /*
>+         *  ... get number of digits of both integers.

need extra \n between } and /*

>+CompareSubStrings(JSContext *cx, const JSSubString &a, const JSSubString &b, bool *lessOrEqualp)

Could you refactor this function to share more with CompareStrings?

>+                // order on vec[n:2n-1] by using strElements.index
>+                for (i = 0; i < n; i ++) {
>+                    vec[n + i] = vec[strElements[i].index];
>+                }

/* ... */ for single-line comments.  Also don't need { } around a single-line for-body.

I'll clear the review flag until the next patch is ready.
Comment 21 Santiago Gimeno 2012-01-17 06:47:42 PST
Created attachment 589181 [details] [diff] [review]
Address last Luke's comments
Comment 22 Santiago Gimeno 2012-01-17 06:51:12 PST
Created attachment 589182 [details] [diff] [review]
Fix typo from previous patch (patch #589181: Address last Luke's comments)
Comment 23 Luke Wagner [:luke] 2012-01-17 11:00:46 PST
Comment on attachment 589182 [details] [diff] [review]
Fix typo from previous patch (patch #589181: Address last Luke's comments)

Review of attachment 589182 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for addressing my comments, the patch is looking good!  I have a few more substantial comments this time and some more style nits.  I'll clear r? until the next patch.

::: js/src/jsarray.cpp
@@ +2084,1 @@
>      SortComparatorStringValuePairs(JSContext *cx)

It looks like SortComparatorStringValuePairs and StringValuePair can be removed.

@@ +2103,5 @@
> +struct SubStringIndexPair {
> +    size_t      subIndex;
> +    size_t      subLength;
> +    size_t      index;
> +};

Would you consider the names:

struct StringifedElement
{
  size_t charsBegin;
  size_t charsEnd;
  size_t elementIndex;
}

?  This would match the name 'strElements' nicely.

@@ +2114,5 @@
> +      : cx(cx), sb(sb) {}
> +
> +    bool operator()(const SubStringIndexPair &a, const SubStringIndexPair &b, bool *lessOrEqualp) {
> +        return CompareSubStringValues(cx, sb.begin() + a.subIndex, a.subLength,
> +                                sb.begin() + b.subIndex, b.subLength, lessOrEqualp);

When a call wraps lines, the style is to line up the args on the second line with the first arg (so 'c' in 'cx).  (There are a few more sites to fix in the patch.)

@@ +2278,2 @@
>              } else {
> +                /**

/*

@@ +2289,5 @@
> +                    return false;
> +
> +                SubStringIndexPair pair;
> +                jsint cursor = 0;
> +                size_t i = 0;

Could you push the decls for 'i' into the for's initializers and push the decl for 'pair' into the else branch?

@@ +2299,2 @@
>                          return false;
> +                    } else {

https://wiki.mozilla.org/JavaScript:SpiderMonkey:C_Coding_Style#Control_Flow wants there to be no 'else' when the 'if' branch contains a 'return'.

@@ +2301,5 @@
> +                        pair.subLength = sb.length() - cursor;
> +                        pair.subIndex = cursor;
> +                        pair.index = i;
> +                        strElements.infallibleAppend(pair);
> +                        cursor += pair.subLength;

I think it would be a bit more clear/efficient to replace 'cursor += pair.subLength' with 'cursor = sb.length()'.

@@ +2315,5 @@
>                      return false;
> +
> +                /* order on vec[n:2n-1] by using strElements.index */
> +                for (i = 0; i < n; i ++)
> +                    vec[n + i] = vec[strElements[i].index];

s/order/Order/

@@ +2319,5 @@
> +                    vec[n + i] = vec[strElements[i].index];
> +
> +                /* swap second part of the vector to the first */
> +                for (i = 0; i < n; i++)
> +                    vec[i] = vec[n + i];

I was initially torn about the need for this two-copy fixup.  I considered the possibility of merging sortElements with vec or copying from vec directly into the array, but both hit problems because InvokeArrayElements wants a linear array of values.  In theory, InitArrayElements should be templated to take an iterator so that we can relax this restriction, but that is for another day.  Instead, I think this second copy can be avoided by passing InitArrayElements (vec.begin() + n) instead of (vec.begin()).  Beware the 'vec.resize' call below.

@@ -2214,2 @@
>                  return false;
> -            }

While I agree it looks weird, if a condition is more than one line, the statement is always in braces.

::: js/src/jsstr.cpp
@@ +3420,5 @@
>  namespace js {
>  
>  static bool
> +CompareSubStringsImpl(const jschar *s1, size_t l1,
> +                        const jschar *s2, size_t l2, int32_t *result)

How about 'CompareChars' ?  Also, lines can be up to 99 chars so (with the new name) you don't need to wrap.  Lastly, can you put this in jsstrinlines.h (so it can be inlined into the array sort comparator)?  This will also avoid the need for the externed function that calls the static Impl function.

@@ +3449,5 @@
> +        return true;
> +    }
> +
> +    return CompareSubStringsImpl(str1->getChars(cx), str1->length(),
> +                                str2->getChars(cx), str2->length(), result);

Good factoring, but there is an annoying problem in the way: if str1->getChars(cx) fails (OOM), then we need to return before executing str2->getChars(cx).  Thus, I think you need to keep it in the longer form it was in before (this also allows you to hoist the null checks out of CompareSubStrings).
Comment 24 Santiago Gimeno 2012-01-19 02:25:36 PST
Created attachment 589808 [details] [diff] [review]
More fixes
Comment 25 Luke Wagner [:luke] 2012-01-23 10:45:28 PST
Comment on attachment 589808 [details] [diff] [review]
More fixes

This looks great, r+.  In addition to the 3.64x improvement for arrays of ints, I see a 13% speedup sorting 100 doubles and 10% speedup sorting 2M doubles which is pretty good for not changing the core sorting algorithm.  On the subject of changing the algorithm, I've been hearing good things about timsort (http://en.wikipedia.org/wiki/Timsort)... perhaps you'd be interested if you are not sick of array_sort ;-) ?

Three minor comments below.  If you agree with the comments, I can just make the changes directly before pushing the patch.  Does this sound good to you?

>+                Vector<StringifiedElement, 0, ContextAllocPolicy> strElements(cx);
>+                if (!strElements.reserve(2 * n))

The TempAllocPolicy would be better here since the malloc() allocations are temporary and thus we don't want to count them in gcMallocBytes.

>+                /* resize strElements so we can perform the sorting */
>+                if (!strElements.resize(2 * n))
>                     return false;

Because of the reserve, this can't fail.  Also, I don't think the new elements are read before they are written, so we can write:

  JS_ALWAYS_TRUE(strElements.resizeUninitialized(2 * n));

>+                /* Perform InitArrayElements starting on vec[n] */
>+                if (!InitArrayElements(cx, obj, 0, jsuint(n), vec.begin() + n, DontUpdateTypes))
>+                    return false;
>+
>+
>+                /* We are done! */
>+                done = true;

Instead of this 'done' flag, could we instead have a local above the branch 'Value *result = vec.begin();' and have 'result = vec.begin() + n;' on this path and then have only one call to InitArrayElements (the original) that uses 'results'?  But what about the 'vec.resize(n)' that precedes it?  I think it is silly to resize 'vec' just to "make the job of a potential GC under InitArrayElements easier", so we can just remove that line and the comment.
Comment 26 Santiago Gimeno 2012-01-23 13:23:19 PST
(In reply to Luke Wagner [:luke] from comment #25)
> Comment on attachment 589808 [details] [diff] [review]
> More fixes
> 
> This looks great, r+.  In addition to the 3.64x improvement for arrays of
> ints, I see a 13% speedup sorting 100 doubles and 10% speedup sorting 2M
> doubles which is pretty good for not changing the core sorting algorithm. 
> On the subject of changing the algorithm, I've been hearing good things
> about timsort (http://en.wikipedia.org/wiki/Timsort)... perhaps you'd be
> interested if you are not sick of array_sort ;-) ?

Yes, I would like to give it a try :D. Is there a bug for that?

> 
> Three minor comments below.  If you agree with the comments, I can just make
> the changes directly before pushing the patch.  Does this sound good to you?

Yes, please, make those changes.

Thanks!
Comment 27 Luke Wagner [:luke] 2012-01-23 13:50:48 PST
(In reply to Santiago Gimeno from comment #26)
> Yes, I would like to give it a try :D. Is there a bug for that?

No, but please feel free to file one.  (Core -> Components -> JS engine).

Yet another sort-algorithm optimization idea is: when allInts is true, use a radix sort.
Comment 29 Marco Bonardo [::mak] 2012-01-24 05:00:49 PST
https://hg.mozilla.org/mozilla-central/rev/0ec0b8f17be9

Note You need to log in before you can comment on or make changes to this bug.