Vector.prototype.push is slow due to use of unoptimized rest arguments and 'apply'

RESOLVED FIXED in Q3 11 - Serrano

Status

P3
normal
RESOLVED FIXED
10 years ago
8 years ago

People

(Reporter: jodyer, Assigned: lhansen)

Tracking

(Blocks: 3 bugs)

unspecified
Q3 11 - Serrano
x86
All
Dependency tree / graph
Bug Flags:
in-testsuite ?
flashplayer-qrb +
flashplayer-bug +
flashplayer-triage +

Details

(Whiteboard: PACMAN, has-patch)

Attachments

(2 attachments, 1 obsolete attachment)

(Reporter)

Description

10 years ago
The issue is that the AS version of push():

private function _push(items:Array) : uint

calls the length getter, which is exact memory allocation and very slow. Every push that goes through this method reallocates and copies the internal
array. The C++ version of push() does the correct thing, and operates much
faster.

Reproducible: Always

Steps to Reproduce:
1. This bug can be reproduced by calling the lineTo/moveTo utility funcitons
for the GraphicsPath object, which use the AS versions of Vector.push(), see
the bugfile for source code.

Actual Results:  
The moveTo/lineTo utility functions take much longer to complete the task than
the normal graphics.lineTo functions     

Expected Results:  
The moveTo/lineTo utility functions should be just as fast as graphics.lineTo

Originally a Player bug found here:
http://frpbugapp.macromedia.com/bugapp/detail.asp?ID=227805
Transferred Comments:

Trevor Baker - Tue Jan 27 14:51:57 CST 2009
back to internal review for reprioritization


This bug transferred from: http://bugs.adobe.com/jira/browse/ASC-3425
(Assignee)

Updated

10 years ago
Blocks: 479769
Component: Build Config → Virtual Machine
QA Contact: build-config → vm
(Assignee)

Updated

9 years ago
Priority: -- → P3
Target Milestone: --- → flash10.1

Updated

9 years ago
Flags: in-testsuite?
Flags: flashplayer-triage+
Flags: flashplayer-qrb?

Updated

9 years ago
Flags: flashplayer-qrb? → flashplayer-qrb+
(Assignee)

Updated

9 years ago
Priority: P3 → P4

Updated

9 years ago
Target Milestone: flash10.1 → flash10.2
(Assignee)

Updated

9 years ago
Whiteboard: PACMAN

Comment 1

9 years ago
Is this bug still accurate? I don't see the abovementioned variety of _push in TR, only

        AS3 native function push(...args):uint
        prototype.push = function(...args):uint
        {
            var n:uint = uint(this.length)
            for (var i:uint=0, argc:uint=args.length; i < argc; i++)
                this[n++] = args[i]
            this.length = n
            return n
        }
(Assignee)

Comment 2

9 years ago
The problem is the same for plain push(), of course - the rest args force a relatively expensive allocation where none should be required in the typical case of one argument to push.  (There may be another bug open on that problem as well.)
(Assignee)

Comment 3

9 years ago
The other one is bug #420368.

Updated

8 years ago
Depends on: 569321
(Assignee)

Comment 4

8 years ago
Though somewhat confusing, the original bug report pertained to Vector.push.

Even with the optimization in bug #569321, which optimizes straightforward uses of rest arguments in AS3 code and fixes Array.prototype.push, the problem with Vector.prototype.push remains.

That code currently looks like this:

prototype.push = function (...items) {
    return castToThisType(this).AS3::push.AS3::apply(castToThisType(this), items);
}

Here it is plain that 'items' is captured and cannot be optimized in the manner of the optimization implemented in the previously mentioned bug.  However, if this code were to be rewritten to be similar to the AS3 version used in Array (quoted in an earlier comment) it would be optimized by the VM.

The code quoted above is actually doubly nasty in that it delegates to the method AS3::push, which is native and takes a rest parameter itself, hence the need above to use apply to distribute the arguments.

Finally, VectorBaseObject::AS3_push (on the C++ level) is efficient enough, so early-bound calls to AS3::push on Vector objects should be plenty fast, but this remains to be verified by testing.
Assignee: nobody → lhansen
Blocks: 570944
Status: NEW → ASSIGNED
Summary: AS version of _push performance slow due to use of memory allocation → Vector.prototype.push is slow due to use of unoptimized rest arguments and 'apply'
(Assignee)

Comment 5

8 years ago
Bug 571475 addresses the problem generally.
Blocks: 571475
Component: Virtual Machine → Library
Priority: P4 → P3
QA Contact: vm → library
(Assignee)

Comment 6

8 years ago
Created attachment 452266 [details]
Baseline benchmark

 Scores on a MacPro with TR 4838 Release:

Compiled without -AS3: score=24
Compiled with -AS3: score=228

The code is untyped so the JIT should not currently be able to early-bind to AS3::push (though it will find it more quickly during class lookup).
(Assignee)

Comment 7

8 years ago
Created attachment 452270 [details] [diff] [review]
Tentative patch

Simple patch.  I'm interested in comments on my use of castToThisType as a type check, especially, but whatever else you can think of.  I've tested this, but only lightly.

With this patch the score goes up to 56 on my machine (ie 2x) - roughly as expected.

The still-lagging score (almost 5x) suggests that the -AS3 version of the benchmark gets help from an inline cache while the call via the prototype does not; another reason why we need to implement hidden classes if we want to be serious about supporting programs compiled without -AS3.
Attachment #452270 - Flags: feedback?(edwsmith)

Comment 8

8 years ago
Would it make sense to set the length first to preallocate the internal array so that we don't have to call grow() a bunch of times and potentially reallocate?

Comment 9

8 years ago
Comment on attachment 452270 [details] [diff] [review]
Tentative patch

surface comments:

* looks correct, but i haven't thought about corner cases of as3::push() that might not be handled right here..  (holes? calling these push() functions on other objects?)

* probably don't need "var dummy = ...", since any as3 call is considered to have side effects, the freestanding "castToThisType(this)" wont be optimized out.

* a hand-inlined cast will be optimized into an OP_coerce<T> -- would have to cut+paste the body of prototype.push to each unique vector type, but might be worth it.

* var n:uint = uint(this.length) ==>  the call to uint() is not necessary.  assigning to a uint var does the same thing.  JIT will optimize it out, but its wasted bytecodes.

* this[n++]    =>    this[uint(n++)]    since array-index context is not typed.  this will give us uint addition

In retrospect:  Array.prototype.push is carefully designed to be copyable to other objects, but Vector.prototype.push isn't.  (hence the typecheck).  so why didn't we just make the Vector.push methods public?

Updated

8 years ago
Attachment #452270 - Flags: feedback?(edwsmith) → feedback+

Comment 10

8 years ago
Also, the expression "this" throughout the body is untyped.  if we hand-wrote this method for each vector type, you'd profit from early binding:

var v:Vector.<int> = this
var n:uint = v.length
for (var i:uint=0, argc:uint=args.length; i < argc; i++, n++)
   v[n] = args[i]
v.length = n

(times N for <int>, <uint>, <Number>, ...)

Comment 11

8 years ago
any correctness issues with n wrapping around from 2^32-1 to 0?  as long as vector.length's legal maximum is 2^32-1, then an exception should occur before wraparound of n occurs, i think.  (not a real problem yet, given 4GB heap limit)
(Assignee)

Comment 12

8 years ago
(In reply to comment #8)
> Would it make sense to set the length first to preallocate the internal array
> so that we don't have to call grow() a bunch of times and potentially
> reallocate?

Could be, although the common case is probably one argument.  I'll think about it.
(Assignee)

Comment 13

8 years ago
(In reply to comment #9)
> * looks correct, but i haven't thought about corner cases of as3::push() that
> might not be handled right here..  (holes? calling these push() functions on
> other objects?)

No holes in vector, and vector methods are not generic, hence the type check.

> * probably don't need "var dummy = ...", since any as3 call is considered to
> have side effects, the freestanding "castToThisType(this)" wont be optimized
> out.

Fixed.

> * a hand-inlined cast will be optimized into an OP_coerce<T> -- would have to
> cut+paste the body of prototype.push to each unique vector type, but might be
> worth it.

I will probably note this for future work along with the suggestion in comment #10 about specializing 'this', mainly because (a) lots of methods might benefit from that optimization and (b) the common case is one argument, and little or no work is saved if there's only one iteration of the loop.

> * var n:uint = uint(this.length) ==>  the call to uint() is not necessary. 
> assigning to a uint var does the same thing.  JIT will optimize it out, but its
> wasted bytecodes.

Willfix.  I copied the code from Array.prototype.push, maybe fix the same thing there?

> * this[n++]    =>    this[uint(n++)]    since array-index context is not
>  typed.  this will give us uint addition

That's a little surprising, n is typed as uint, so n++ really ought to yield an uint reliably (post-increment).

> In retrospect:  Array.prototype.push is carefully designed to be copyable to
> other objects, but Vector.prototype.push isn't.  (hence the typecheck).  so
> why didn't we just make the Vector.push methods public?

Search me :-)

(In reply to comment #11)
> any correctness issues with n wrapping around from 2^32-1 to 0?  as long as
> vector.length's legal maximum is 2^32-1, then an exception should occur before
> wraparound of n occurs, i think.  (not a real problem yet, given 4GB heap
> limit)

The AS3::push code is this:

uint32_t VectorBaseObject::AS3_push(Atom *argv, int argc)
{
    if( m_fixed )
        toplevel()->throwRangeError(kVectorFixedError);
    grow(m_length + argc);
    for (int i=0; i < argc; i++) {
        setUintProperty(m_length, argv[i]);
    }
    return m_length;
}

The use of 'int' should be fixed obviously, but it's a minor point.  For the moment, with a 4GB object limit, we can have at most 1G elements, and grow() will abort the process (or perhaps in the future throw an exception) if we try to allocate beyond 4GB.  So this code is correct and indices do not wrap around.  Thus the AS3 code could use this[uint(n++)] as you suggest.
(Assignee)

Comment 14

8 years ago
Created attachment 452688 [details] [diff] [review]
Patch

Some of the suggested changes, plus one bug fix (need to check for fixedness explicitly in order to throw the correct numbered exception - arguably a misfeature to specify errors beyond "RangeError", but there you have it).  Also added some comments.

I will log type specialization as a separate bug.
Attachment #452270 - Attachment is obsolete: true
Attachment #452688 - Flags: review?(edwsmith)
(Assignee)

Comment 15

8 years ago
> I will log type specialization as a separate bug.

Bug #573452; it should cover both the early binding and the inlining of the cast as OP_coerce.
(Assignee)

Updated

8 years ago
Whiteboard: PACMAN → PACMAN, has-patch

Updated

8 years ago
Attachment #452688 - Flags: review?(edwsmith) → review+
(Assignee)

Comment 16

8 years ago
tamarin-redux changeset:   4892:4000d3e45311
Status: ASSIGNED → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED

Updated

8 years ago
Flags: flashplayer-bug+
You need to log in before you can comment on or make changes to this bug.