Closed Bug 532690 Opened 15 years ago Closed 6 years ago

Vector initialization slower than Array

Categories

(Tamarin Graveyard :: Virtual Machine, defect, P2)

Tracking

(Not tracked)

RESOLVED WONTFIX
Q2 12 - Cyril

People

(Reporter: tharwood, Unassigned)

References

Details

(Whiteboard: PACMAN, Tracking)

Attachments

(2 files, 2 obsolete files)

The codegen for a Vector literal calls a constructor, vs. Array's bespoke newarray instruction.  A similar instruction for Vector would be beneficial.
Actually it's probably crucial unless the implementation is extremely careful about how it looks for the Vector binding.  Since OP_getouterscope is not supported by the JIT I bet it gets it wrong.

Consider this program:

function f() {
  var Vector = function (...args) { attack(args); }
  function g() {
    var t = new <int>[1,2,3]
  }
}

If the implementation of new<int> just looks up Vector in the local scope then it is sure to get the wrong one.  Sometimes this can be a security / integrity issue, as the code demonstrates.  (This is much worse with eval, but it can also result from code mixing, the 'with' statement, and other ickiness.)

We should fix this ASAP if it turns out to be the case.
Blocks: 532440
(Maybe the binding to be trying is Vector$int or something, but the gist is clear.  The same problem occurs in ES3 where 'Array' and 'Object' and 'RegExp' are subject to similar problems in some cases.)
Group: tamarin-security
Have not managed to spoof the lookup sequence so far, but the lookup uses a simple findpropstrict so it's probably just a matter of my relative lack of experience.  Since this is a Vector, we're preparing args for an applytype instruction.  op_applytype is armored against non-Vector objects, so we look safe from a security perspective.
Bug is not a security concern - downgrading.
Group: tamarin-security
Flags: flashplayer-qrb+
Target Milestone: --- → flash10.2
Probably the same as bug #532440.
Whiteboard: PACMAN
Assignee: nobody → lhansen
Priority: -- → P2
Related to bug #561140.
The code that's emitted for the vector literal:

  3         findpropstrict	__AS3__.vec::Vector
  5         getproperty   	{, private, __AS3__.vec, }::Vector
  7         findpropstrict	{, private, }::int
  9         getproperty   	{, private, }::int
  11        applytype     	(1)
  13        pushbyte      	3
  15        construct     	(1)
  17        dup           	
  18        pushbyte      	0
  20        pushbyte      	1
  22        setproperty   	null
  24        dup           	
  25        pushbyte      	1
  27        pushbyte      	2
  29        setproperty   	null
  31        dup           	
  32        pushbyte      	2
  34        pushbyte      	3
  36        setproperty   	null

Observations:

- This nails the fact that it's not a security concern, since 'Vector' is a QName and points into the internal __AS3__ namespace.  I believe this is not available to user code.  (That in addition to the check in op_applytype.)  It could still be a correctness concern if it is possible for a program to create __AS3__ bindings inside a class, say.

- The code compares unfavorably with the corresponding Array code:

  0         pushbyte      	1
  2         pushbyte      	2
  4         pushbyte      	3
  6         newarray      	[3]

but whether the main cost is in op_applytype, in the constructor call, or in the individual setproperty calls is unknown as of yet.
Attached file Comparative benchmark (obsolete) —
Results (iterations per second, higher is better):

Array     Vector

  29        25
  29        22
  29        22
  29        22
  29        22
Hm, not what we had in mind when the Vector facility was specified:

    finddef_cache1 = calli. #finddef_cache ( env 5917744 0 )
    ldi4 = ldi.o finddef_cache1[116]
    finddef_cache2 = calli. #finddef_cache ( env 5917776 1 )
    ldi5 = ldi.o finddef_cache2[96]
    ori1 = ori ldi4, 1
    alloci1 = alloci 4
    ori2 = ori ldi5, 1
    sti.o alloci1[0] = ori2
    op_applytype1 = calli.sro #op_applytype ( env ori1 1 alloci1 )
    alloci2 = alloci 8
    sti.o alloci2[0] = 1
    sti.o alloci2[4] = 30
    op_construct1 = calli.sro #op_construct ( env op_applytype1 1 alloci2 )
Somewhat surprisingly the ABC is the same in strict mode.
In type contexts we resolve early as __AS3__.vec::Vector.<int>, no surprise there.

It's surprising we don't do that in strict mode in these cases:

  new Vector.<int>
  <int>[...]

since some recent investigation suggests that we do resolve to QNames for other type names at least in strict mode, ie,

  new Array

is really

  new ""::Array

I need to verify that, but I need to understand the abcdump output properly first, it just prints 'Array'.
The situation looks worse still for Vector.<T> where T is not a blessed type like int, uint, double, or Object: in this case op_applytype does a fairly elaborate song and dance that involves multiple allocations and long logic chains.  There appears to be no attempts at caching any of those values except incidentally (in internString and so on).
Attached file Another benchmark (obsolete) —
This test is similar to the previous.  Here we construct Vector.<C> and Array of C.  The score for Array is still 29, the score for Vector drops to 18.  (I compiled in strict mode, which I did not do with the previous test.)
See also bug #575848.
Lots of things wrong here.

- the applytype is not specialized - we want to early-bind.
- that gives us a known type so we can specialize construct.
- that means the type from construct is known so the setproperty calls
  definitely do not need to use the write barrier, so the store should
  be much faster - open coded maybe.
- range check hoisting?
- methodology: create a specialized helper for the setproperty, then inline
  it manually for now.

This kind of optimization will have further impacts, eg in the 'as' and 'is' operators, where we presumably now have the applytype song & dance.

Making sure ASC knows something about vector types - somehow - may also be important because it will prevent it from inserting coercions that should not be there.

Jasper St Pierre observed that

   Vector.<int>([1,2,3])

is faster (by a whisker) than

   new <int>[1,2,3]

and that's just insane.  Part of the same problem.
Severity: normal → major
Status: NEW → ASSIGNED
Assignee: lhansen → nobody
Status: ASSIGNED → NEW
Flags: flashplayer-bug+
Whiteboard: PACMAN → PACMAN, must-fix-candidate
Depends on: 645018
No longer depends on: 645018
Blocks: 645018
Flags: flashplayer-injection-
Target Milestone: Q3 11 - Serrano → Q1 12 - Brannan
Depends on: 683187
There are numerous things to fix here, and most of them are fixable independently.

- APPLYTYPE should have a non-NULL result type (bug 683187)
- CONSTRUCT should specialize if presented with a known vector type
  (may already do that, need to investigate)
- If CONSTRUCT specializes then the APPLYTYPE call should become redundant and
  should be optimized away
- If the APPLYTYPE call becomes redundant then the FINDPROP/GETPROP pairs that
  feed it should similarly become redundant
- If the APPLYTYPE call does not become redundant (class initialization issues?)
  then we should make sure that op_applytype is as fast as possible

The result might not be a vector construction that's as fast as NEWARRAY but at least one that's much faster than constructing an Array and passing it to the Vector constructor.
Simple benchmarks testing the construction of Array and Vector data.
Assignee: nobody → lhansen
Attachment #441059 - Attachment is obsolete: true
Attachment #441291 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #557108 - Flags: review?(fklockii)
(In reply to Lars T Hansen from comment #16)

> Jasper St Pierre observed that
> 
>    Vector.<int>([1,2,3])
> 
> is faster (by a whisker) than
> 
>    new <int>[1,2,3]
> 
> and that's just insane.  Part of the same problem.

That is no longer observable, at least not on Mac (32-bit or 64-bit), not even before the Vector optimizations started landing  As of TR 6531 the former phrase is significantly slower than the latter, 32-bit results (Mac Pro 2x2.93GHz Xeon) are 194 it/sec vs 354 for integer data, 165 it/sec vs 282 for pointer data.
FWIW, after the fix in bug #683187 that prevents type laundering, a simple one-element cache in VectorClass::applyTypeArgs that caches the vector class for a given type key within the toplevel, sees a 1.14 speedup on the benchmark vector-init-1, which repeatedly constructs a short Vector.<C>.  This suggests that for sufficiently short vector initializers, op_applytype contributes substantially to the running time (ref my last point in comment #17).
Don't forget to regenerate the tracers!
Comment on attachment 557108 [details] [diff] [review]
Benchmarks extracted for asmicro/

Feedback: Perhaps there should also be a micro-benchmark for Vector.<int>([1,2,3]) and likewise for Vector.<C>([c,c,c])?  (Its a throw-away thought; I just thought it might be good to track how the performance of each evolves.)

Nits:

* Patch has tabs and trailing whitespace.  (Might I recommend utils/hooks/tamarin-commit-hook.py ...?  And/or emacs whitespace minor mode?  :)

* Copyright notices should say 2011, not 2010.
Attachment #557108 - Flags: review?(fklockii) → review+
(In reply to Felix S Klock II from comment #22)
> Feedback: Perhaps there should also be a micro-benchmark for
> Vector.<int>([1,2,3]) and likewise for Vector.<C>([c,c,c])?  (Its a
> throw-away thought; I just thought it might be good to track how the
> performance of each evolves.)

Ugh! I'm sorry, you clearly included that as benchmarks 3 and 4.  Somehow I got through reading 1+2 and thought I was done.
(In reply to Felix S Klock II from comment #22)

> * Patch has tabs and trailing whitespace.  (Might I recommend
> utils/hooks/tamarin-commit-hook.py ...?  And/or emacs whitespace minor mode?
> :)
> 
> * Copyright notices should say 2011, not 2010.

Both fixed.
changeset: 6557:6587cf2acf96
user:      Lars T Hansen <lhansen@adobe.com>
summary:   For 532690 - Vector initialization slower than Array: benchmarks (r=fklockii)

http://hg.mozilla.org/tamarin-redux/rev/6587cf2acf96
Whiteboard: PACMAN, must-fix-candidate → PACMAN
As Lars assigned it to PACMAN, removing the Performance tracker.
No longer blocks: 645018
As this is a tracking bug with specific fixes being landed under other bugs (eg bug #683187, with more coming), I am retargeting this to Cyril - some specific fixes will land in Brannan but not all.
Whiteboard: PACMAN → PACMAN, Tracking
Target Milestone: Q1 12 - Brannan → Q2 12 - Cyril
Assignee: lhansen → nobody
No assignee, updating the status.
Status: ASSIGNED → NEW
No assignee, updating the status.
No assignee, updating the status.
No assignee, updating the status.
Tamarin isn't maintained anymore. WONTFIX remaining bugs.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: