IonMonkey: Optimize linear initialization of empty arrays

NEW
Unassigned

Status

()

Core
JavaScript Engine: JIT
P5
normal
5 years ago
a year ago

People

(Reporter: nbp, Unassigned)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

PdfJS octane benchmark use a pattern which seems to be common:

function (str) {
  var arr = [];
  for (var i = 0; i < src.length; i++) {
    …
    arr[i] = …;
    …
  }
  return … arr …;
}

This pattern is already recognized by the range analysis to remove bound check on src reads and writes when manipulated with the loop variable.  On such loop we have 2 major problems.

1/ “arr[i] =” is encoded as a SetElementHole, which put the initialization on the OOL path. This should be changed when such pattern is recognized, to be equivalent to an Array.push instruction.

2/ “arr = []” is reallocated every-time we hit the capacity limit. This should use the loop boundary to reserve the space and do only one allocations.

3/ Combining 1 & 2 should remove the 2 guards from emitArrayPush and get rid of the out-of-line path used to reallocate the array, thus making this setElement infallible.

This problem occur in PdfJS under IonMonkey at least:
 - 564889 under pdfjs.js:16635, alloc size:
   - 1547 at 6
   - 1406 at 12
   - 1034 at 24
   - 850 at 48
   - 752 at 96
   - 264 at 192
   - 41 at 384, 768, 1536
   - 42 at 3072 (Bailout / OSR effect ?)
   - 33 at 6144
   - 15 at 12288
 - 416007 under pdfjs.js:14938, alloc size:
   - 330 at 6
   - 322 at 12
   - 317 at 24
   - 224 at 48
   - 120 at 96
   - 93 at 192
   - 84 at 384
   - 42 at 768, 1536
   - 28 at 3072
   - 19 at 6144
   - 15 at 12288

Manually doing 1/ gives a 0.5% boost on PdfJS. Doing 2/ should at best save 7700 (at least 5900) re-allocations.
2/ Looks interesting. But note that in case of removing bound checks we predict the size of i. In that case we need to predict the right number or a higher less precise number. For arr[i] to initialize the array length, we probably want it the other way around. Use the right number or a lower less precise number. Because if our estimations is off and we predict higher number, we need more space ...

FYI: MNewArray takes initLength and allocating policy. In this case we want to beef up the initLength and set the allocating policy to NewArray_Allocating.
(In reply to Hannes Verschore [:h4writer] from comment #1)
> 2/ Looks interesting. But note that in case of removing bound checks we
> predict the size of i. In that case we need to predict the right number or a
> higher less precise number. For arr[i] to initialize the array length, we
> probably want it the other way around. Use the right number or a lower less
> precise number. Because if our estimations is off and we predict higher
> number, we need more space ...

Define lower less-precise ?!  I don't see what you are trying to refer too.  I think we can purely match bodies without break or any form of escaping where the loop counter is only incremented in the step-part of the for loop to guarantee the linear initialization.
I was referring to what to to "arr[i]", where the upper range of i is known to be between 4 and 10, but you cannot know for sure. Would you take a lower upper range, to not initialize the array to much, or the upper upper range and possible over-allocate the array?
(In reply to Hannes Verschore [:h4writer] from comment #3)
> I was referring to what to to "arr[i]", where the upper range of i is known
> to be between 4 and 10, but you cannot know for sure. Would you take a lower
> upper range, to not initialize the array to much, or the upper upper range
> and possible over-allocate the array?

I am taking the symbolic upper bound.  So the array would be reserved/grow for the use case of the initialization loop.
Created attachment 720235 [details] [diff] [review]
(WIP) Reserve array space and use infallible array push.

Using a micro benchmark and a first version of the patch to do copies of array does not reproduce the 0.5% boost seen on PdfJS.

On the other hand, I found more interesting to see performance differences to be caused, not by the copy of the elements, but mostly by the late scheduling of GC cycles (to be verified?).
Assignee: nicolas.b.pierron → nobody
Component: JavaScript Engine → JavaScript Engine: JIT
Priority: -- → P5
You need to log in before you can comment on or make changes to this bug.