don't allocate a JSString per element in array_sort when there is no comparator

RESOLVED FIXED in mozilla12

Status

()

Core
JavaScript Engine
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: luke, Unassigned)

Tracking

unspecified
mozilla12
x86_64
Linux
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

([good first bug][mentor=luke@mozila.com][lang=c++])

Attachments

(1 attachment, 7 obsolete attachments)

(Reporter)

Description

5 years ago
When there is no comparator, there is no reason to call ToString to produce string values as this does a GC-thing allocation per array element that becomes garbage as soon as the function returns.  The fix should be pretty simple and have an enormous speedup (see bug 715181 comment 2): build all the jschar data in a single StringBuffer (calling ValueToStringBuffer) and have a second vector of indexes into this buffer.  Lastly, we should consider adding a special case (analogous to allStrings) for when all the values are int32s in which case we can just do a lexicographic compare directly on the ints (v8 does this).

Here's a test-case:

  var arr = [];
  for (var i = 0; i < 200000; ++i)
    arr.push(Math.floor(Math.random() * 200000));

  var bef = new Date();
  arr.sort();
  var aft = new Date();
  print(aft - bef);

a rough experiement takes us from 111ms to 13ms on my local machine.

Updated

5 years ago
Whiteboard: [good first bug][mentor=luke@mozila.com] → [good first bug][mentor=luke@mozila.com][lang=c++]

Comment 1

5 years ago
Created attachment 586857 [details] [diff] [review]
Patch addressing all int32s case

This is my first patch for the JS engine so apologies for the mistakes :) and all comments are welcomed!.
I've tried to address first the all int32s case though probably in not the most optimal way. What do you think?
I also have question: does the case of negative integers have to be addressed? In case it has, how are the negative integers lexicographically ordered?

Thanks!
Attachment #586857 - Flags: review?(luke)
(Reporter)

Comment 2

5 years ago
Comment on attachment 586857 [details] [diff] [review]
Patch addressing all int32s case

Hi Santiago!  Thanks for the patch, it looks like a great start.

I'll point out some special cases below, but most of my nits would be wrt our style guideline:
https://wiki.mozilla.org/JavaScript:SpiderMonkey:C%2B%2B_Coding_Style

With respect to the core CompareLexicographicInt32: indeed we must handle negative ints; '-' is ordered less than '0'.  It is very important to make sure we nail the semantics exactly so I recommend checking out v8's implementation Runtime_SmiLexicographicCompare (in v8/src/runtime.cc).
 
>+inline int GetDigitsInt32(int32_t n) {
>+
>+    // taken from http://stackoverflow.com/a/1069088
>+    if (n < 10) return 1;
>+    if (n < 100) return 2;

You'll also see this in the v8 source, but consider JS_CEILING_LOG2 (or JS_FLOOR_LOG2) since these can be implemented efficiently with very few ops (using clz).

>+    if (!JS_CHECK_OPERATION_LIMIT(cx))
>+        return false;

This check is what lets the engine kill scripts that are running too long.  In our case I don't think it is necessary since, unlike the string case, we can know that even the biggest integer array sort will finish relatively quickly.

>+    if (aint == bint)
>+        *lessOrEqualp = true;
>+    else {

Once a single branch gets { braces }, all the other branches need braces too, so that means the then branch here.

>+        // Get number of digits of both integers.
>+        // If they have the same number of digits --> arithmetic comparison.
>+        // If digits_a > digits_b: a < b*10e(digits_a - digits_b).
>+        // If digits_b > digits_a: a*10e(digits_b - digits_a) <= b.

see multi-line comment style in the guideline

Please flag me for review when you have a new patch with these issues addressed, thanks!
Attachment #586857 - Flags: review?(luke)

Comment 3

5 years ago
> This check is what lets the engine kill scripts that are running too long. 
> In our case I don't think it is necessary since, unlike the string case, we
> can know that even the biggest integer array sort will finish relatively
> quickly.

How do we know that? Are there checks along getter invocations for example? Native ones maybe?
(Reporter)

Comment 4

5 years ago
(In reply to Andreas Gal :gal from comment #3)
Read patch.

Comment 5

5 years ago
I couldn't figure it out from the patch, hence the question :) I will check out the surrounding code tonight.
Because we only use this code if each value in the array isInt32(), no?
(Reporter)

Comment 7

5 years ago
You are correct, Sir!

Comment 8

5 years ago
Created attachment 587172 [details] [diff] [review]
Address Luke Wagner comments

Comparing the performance of both the logarithm implementation and the "trivial" implementation(1) of GetDigitsInt32, I've not been able to notice a relevant difference. Which should be chosen?

(1)
inline int GetDigitsInt32(int32_t n) {

    if (n >= 1000000000) return 10;
    if (n >= 100000000) return 9;
    if (n >= 10000000) return 8;
    if (n >= 1000000) return 7;
    if (n >= 100000) return 6;
    if (n >= 10000) return 5;
    if (n >= 1000) return 4;
    if (n >= 100) return 3;
    if (n >= 10) return 2;
    return 1;
}
Attachment #586857 - Attachment is obsolete: true
Attachment #587172 - Flags: review?(luke)
Did you consider INT32_MIN?

Comment 10

5 years ago
(In reply to Ms2ger from comment #9)
> Did you consider INT32_MIN?

I'm not sure I understand. Where should I use it?
You should check if an array that contains it sorts correctly; there may be a bug where you negate it.

Comment 12

5 years ago
(In reply to Ms2ger from comment #11)
> You should check if an array that contains it sorts correctly; there may be
> a bug where you negate it.

Oh yes, I get it. I'll update the patch. Thanks!

Comment 13

5 years ago
Created attachment 587360 [details] [diff] [review]
Handle INT32_MIN
Attachment #587172 - Attachment is obsolete: true
Attachment #587172 - Flags: review?(luke)
Attachment #587360 - Flags: review?(luke)
(Reporter)

Comment 14

5 years ago
Comment on attachment 587360 [details] [diff] [review]
Handle INT32_MIN

Review of attachment 587360 [details] [diff] [review]:
-----------------------------------------------------------------

Great job! I see a 3x speedup on the test case in comment 0 (clearly my quick and dirty estimate was too dirty...).  Also, of interest, the log2-based GetDigitsInt32 on my machine is 38% faster compared to the simple one in comment 8.

r+ with a few style nits (spacing, comments).  I assume you need someone to land the patch, so I'll just fix that directly if you'd like?  One question: is it ok with you if I rename GetDigitsInt32 to NumDigitsBase10?  ("Get" usually means "fallible operation" in SM.)

Santiago: would you be interested in continuing work on the other idea suggested in comment 0 (avoiding the creation of temporary GC strings that immediately become garbage) or should I split off a separate bug for later?
Attachment #587360 - Flags: review?(luke) → review+

Comment 15

5 years ago
> 
> r+ with a few style nits (spacing, comments).  I assume you need someone to
> land the patch, so I'll just fix that directly if you'd like?  One question:
> is it ok with you if I rename GetDigitsInt32 to NumDigitsBase10?  ("Get"
> usually means "fallible operation" in SM.)
> 
> Santiago: would you be interested in continuing work on the other idea
> suggested in comment 0 (avoiding the creation of temporary GC strings that
> immediately become garbage) or should I split off a separate bug for later?

Thanks! Yes, I'd like to continue working on the bug so if you tell me about the style issues I can make the needed changes. Are you ok with this?

Regarding the issue with the temp strings, I started working on it and have one question: what class should I use to build the vector of indexes? Thanks in advance.
(Reporter)

Updated

5 years ago
Blocks: 715422
(Reporter)

Comment 16

5 years ago
Comment on attachment 587360 [details] [diff] [review]
Handle INT32_MIN

(In reply to Santiago Gimeno from comment #15)
> Thanks! Yes, I'd like to continue working on the bug so if you tell me about
> the style issues I can make the needed changes. Are you ok with this?

You bet, style nits below.

> Regarding the issue with the temp strings, I started working on it and have
> one question: what class should I use to build the vector of indexes? Thanks
> in advance.

You can use the StringBuffer class (jsstrinlines.h) + ValueToStringBuffer (jsstr.h).  You can see an example use in array_toString_sub.

>+static int32_t const PowersOf10Int32[] =
>+        {1, 10, 100, 1000, 10000, 100000,
>+         1000000, 10000000, 100000000, 1000000000};

static int32_t const powersOf10Int32[] = {
    1, 10, ...
};

>+inline int GetDigitsInt32(int32_t n) {

static inline int
NumDigitsBase10(int32_t n)
{
...
}

>+    int32_t bint = b.toInt32();
>+    /*
>+     * If both numbers are equal ... trivial

need newline before comments

>+        else if (digitsb > digitsa)
>+            *lessOrEqualp = (aint*PowersOf10Int32[digitsb - digitsa] <= bint);

just use 'else' not 'else if'

Comment 17

5 years ago
Created attachment 588879 [details] [diff] [review]
Address stype nits and initial version avoiding tmp JSStrings

Getting a segmentation fault with arrays around 60000 elements
Attachment #588879 - Attachment is patch: true

Comment 18

5 years ago
Created attachment 588880 [details] [diff] [review]
Address style nits and initial version avoiding tmp JSStrings

Getting a segmentation fault with arrays around 60000 elements.
Attachment #588879 - Attachment is obsolete: true

Comment 19

5 years ago
This script causes the segfault:

var arr =[];                                                                                                                                                                                                                                                        
for (var i = 0; i < 20000; ++i) {                                                                                                                                                                                                                                         
  arr.push(Math.floor(Math.random() * 20));                                                                                                                                                                                                                               
  arr.push(-(Math.floor(Math.random() * 20)));                                                                                                                                                                                                                            
  arr.push("H_" + i);                                                                                                                                                                                                                                                     
}                                                                                                                                                                                                                                                                         
var bef = new Date();                                                                                                                                                                                                                                                     
arr.sort();                                                                                                                                                                                                                                                               
var aft = new Date();                                                                                                                                                                                                                                                     
print("Total time:", aft - bef);

Updated

5 years ago
Attachment #588880 - Flags: review?(luke)
(Reporter)

Comment 20

5 years ago
Comment on attachment 588880 [details] [diff] [review]
Address style nits and initial version avoiding tmp JSStrings

One thing that I believe would cause a crash is that StringBuffer is backed by a Vector which resizes via realloc() which means the char buffer can move.  This means that the pointers stored in JSSubString can become invalid and indices should be used instead.  Hope this helps!

A few pre-emptory nits below:

>+        }
>+        /*
>+         *  ... get number of digits of both integers.

need extra \n between } and /*

>+CompareSubStrings(JSContext *cx, const JSSubString &a, const JSSubString &b, bool *lessOrEqualp)

Could you refactor this function to share more with CompareStrings?

>+                // order on vec[n:2n-1] by using strElements.index
>+                for (i = 0; i < n; i ++) {
>+                    vec[n + i] = vec[strElements[i].index];
>+                }

/* ... */ for single-line comments.  Also don't need { } around a single-line for-body.

I'll clear the review flag until the next patch is ready.
Attachment #588880 - Flags: review?(luke)

Comment 21

5 years ago
Created attachment 589181 [details] [diff] [review]
Address last Luke's comments
Attachment #587360 - Attachment is obsolete: true
Attachment #588880 - Attachment is obsolete: true
Attachment #589181 - Flags: review?(luke)

Comment 22

5 years ago
Created attachment 589182 [details] [diff] [review]
Fix typo from previous patch (patch #589181: Address last Luke's comments)
Attachment #589181 - Attachment is obsolete: true
Attachment #589181 - Flags: review?(luke)
Attachment #589182 - Flags: review?(luke)
(Reporter)

Comment 23

5 years ago
Comment on attachment 589182 [details] [diff] [review]
Fix typo from previous patch (patch #589181: Address last Luke's comments)

Review of attachment 589182 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for addressing my comments, the patch is looking good!  I have a few more substantial comments this time and some more style nits.  I'll clear r? until the next patch.

::: js/src/jsarray.cpp
@@ +2084,1 @@
>      SortComparatorStringValuePairs(JSContext *cx)

It looks like SortComparatorStringValuePairs and StringValuePair can be removed.

@@ +2103,5 @@
> +struct SubStringIndexPair {
> +    size_t      subIndex;
> +    size_t      subLength;
> +    size_t      index;
> +};

Would you consider the names:

struct StringifedElement
{
  size_t charsBegin;
  size_t charsEnd;
  size_t elementIndex;
}

?  This would match the name 'strElements' nicely.

@@ +2114,5 @@
> +      : cx(cx), sb(sb) {}
> +
> +    bool operator()(const SubStringIndexPair &a, const SubStringIndexPair &b, bool *lessOrEqualp) {
> +        return CompareSubStringValues(cx, sb.begin() + a.subIndex, a.subLength,
> +                                sb.begin() + b.subIndex, b.subLength, lessOrEqualp);

When a call wraps lines, the style is to line up the args on the second line with the first arg (so 'c' in 'cx).  (There are a few more sites to fix in the patch.)

@@ +2278,2 @@
>              } else {
> +                /**

/*

@@ +2289,5 @@
> +                    return false;
> +
> +                SubStringIndexPair pair;
> +                jsint cursor = 0;
> +                size_t i = 0;

Could you push the decls for 'i' into the for's initializers and push the decl for 'pair' into the else branch?

@@ +2299,2 @@
>                          return false;
> +                    } else {

https://wiki.mozilla.org/JavaScript:SpiderMonkey:C_Coding_Style#Control_Flow wants there to be no 'else' when the 'if' branch contains a 'return'.

@@ +2301,5 @@
> +                        pair.subLength = sb.length() - cursor;
> +                        pair.subIndex = cursor;
> +                        pair.index = i;
> +                        strElements.infallibleAppend(pair);
> +                        cursor += pair.subLength;

I think it would be a bit more clear/efficient to replace 'cursor += pair.subLength' with 'cursor = sb.length()'.

@@ +2315,5 @@
>                      return false;
> +
> +                /* order on vec[n:2n-1] by using strElements.index */
> +                for (i = 0; i < n; i ++)
> +                    vec[n + i] = vec[strElements[i].index];

s/order/Order/

@@ +2319,5 @@
> +                    vec[n + i] = vec[strElements[i].index];
> +
> +                /* swap second part of the vector to the first */
> +                for (i = 0; i < n; i++)
> +                    vec[i] = vec[n + i];

I was initially torn about the need for this two-copy fixup.  I considered the possibility of merging sortElements with vec or copying from vec directly into the array, but both hit problems because InvokeArrayElements wants a linear array of values.  In theory, InitArrayElements should be templated to take an iterator so that we can relax this restriction, but that is for another day.  Instead, I think this second copy can be avoided by passing InitArrayElements (vec.begin() + n) instead of (vec.begin()).  Beware the 'vec.resize' call below.

@@ -2214,2 @@
>                  return false;
> -            }

While I agree it looks weird, if a condition is more than one line, the statement is always in braces.

::: js/src/jsstr.cpp
@@ +3420,5 @@
>  namespace js {
>  
>  static bool
> +CompareSubStringsImpl(const jschar *s1, size_t l1,
> +                        const jschar *s2, size_t l2, int32_t *result)

How about 'CompareChars' ?  Also, lines can be up to 99 chars so (with the new name) you don't need to wrap.  Lastly, can you put this in jsstrinlines.h (so it can be inlined into the array sort comparator)?  This will also avoid the need for the externed function that calls the static Impl function.

@@ +3449,5 @@
> +        return true;
> +    }
> +
> +    return CompareSubStringsImpl(str1->getChars(cx), str1->length(),
> +                                str2->getChars(cx), str2->length(), result);

Good factoring, but there is an annoying problem in the way: if str1->getChars(cx) fails (OOM), then we need to return before executing str2->getChars(cx).  Thus, I think you need to keep it in the longer form it was in before (this also allows you to hoist the null checks out of CompareSubStrings).
Attachment #589182 - Flags: review?(luke)

Comment 24

5 years ago
Created attachment 589808 [details] [diff] [review]
More fixes
Attachment #589182 - Attachment is obsolete: true
Attachment #589808 - Flags: review?(luke)
(Reporter)

Comment 25

5 years ago
Comment on attachment 589808 [details] [diff] [review]
More fixes

This looks great, r+.  In addition to the 3.64x improvement for arrays of ints, I see a 13% speedup sorting 100 doubles and 10% speedup sorting 2M doubles which is pretty good for not changing the core sorting algorithm.  On the subject of changing the algorithm, I've been hearing good things about timsort (http://en.wikipedia.org/wiki/Timsort)... perhaps you'd be interested if you are not sick of array_sort ;-) ?

Three minor comments below.  If you agree with the comments, I can just make the changes directly before pushing the patch.  Does this sound good to you?

>+                Vector<StringifiedElement, 0, ContextAllocPolicy> strElements(cx);
>+                if (!strElements.reserve(2 * n))

The TempAllocPolicy would be better here since the malloc() allocations are temporary and thus we don't want to count them in gcMallocBytes.

>+                /* resize strElements so we can perform the sorting */
>+                if (!strElements.resize(2 * n))
>                     return false;

Because of the reserve, this can't fail.  Also, I don't think the new elements are read before they are written, so we can write:

  JS_ALWAYS_TRUE(strElements.resizeUninitialized(2 * n));

>+                /* Perform InitArrayElements starting on vec[n] */
>+                if (!InitArrayElements(cx, obj, 0, jsuint(n), vec.begin() + n, DontUpdateTypes))
>+                    return false;
>+
>+
>+                /* We are done! */
>+                done = true;

Instead of this 'done' flag, could we instead have a local above the branch 'Value *result = vec.begin();' and have 'result = vec.begin() + n;' on this path and then have only one call to InitArrayElements (the original) that uses 'results'?  But what about the 'vec.resize(n)' that precedes it?  I think it is silly to resize 'vec' just to "make the job of a potential GC under InitArrayElements easier", so we can just remove that line and the comment.
Attachment #589808 - Flags: review?(luke) → review+

Comment 26

5 years ago
(In reply to Luke Wagner [:luke] from comment #25)
> Comment on attachment 589808 [details] [diff] [review]
> More fixes
> 
> This looks great, r+.  In addition to the 3.64x improvement for arrays of
> ints, I see a 13% speedup sorting 100 doubles and 10% speedup sorting 2M
> doubles which is pretty good for not changing the core sorting algorithm. 
> On the subject of changing the algorithm, I've been hearing good things
> about timsort (http://en.wikipedia.org/wiki/Timsort)... perhaps you'd be
> interested if you are not sick of array_sort ;-) ?

Yes, I would like to give it a try :D. Is there a bug for that?

> 
> Three minor comments below.  If you agree with the comments, I can just make
> the changes directly before pushing the patch.  Does this sound good to you?

Yes, please, make those changes.

Thanks!
(Reporter)

Comment 27

5 years ago
(In reply to Santiago Gimeno from comment #26)
> Yes, I would like to give it a try :D. Is there a bug for that?

No, but please feel free to file one.  (Core -> Components -> JS engine).

Yet another sort-algorithm optimization idea is: when allInts is true, use a radix sort.
(Reporter)

Comment 28

5 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/0ec0b8f17be9
Target Milestone: --- → mozilla12
https://hg.mozilla.org/mozilla-central/rev/0ec0b8f17be9
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED

Updated

5 years ago
Depends on: 720695
You need to log in before you can comment on or make changes to this bug.