Closed Bug 571469 Opened 14 years ago Closed 14 years ago

VM should optimize the 'arguments' argument

Categories

(Tamarin Graveyard :: Baseline JIT (CodegenLIR), defect, P3)

defect

Tracking

(Not tracked)

RESOLVED FIXED
Q3 11 - Serrano

People

(Reporter: lhansen, Assigned: lhansen)

References

Details

(Whiteboard: PACMAN, has-patch)

Attachments

(4 files, 6 obsolete files)

In bug #569321 we introduced an optimization that avoids allocating the rest array if it does not escape, by means of a conservative escape analysis that is run on the ABC code.

That analysis can almost certainly applied to optimize the 'arguments' argument as well, thus providing this speedup to plain JavaScript code run on Tamarin.
There are many occurences of 'arguments' in the builtins' Date code - all the setter functions use it.  Those are not performance critical typically, and in that case the arguments object escapes (via apply), so no analysis would help.
To my relative astonishment, 'arguments' use is everywhere in the earley-boyer benchmark in v8.5, even in the 'optimized' code.
(In reply to comment #2)
> To my relative astonishment, 'arguments' use is everywhere in the earley-boyer
> benchmark in v8.5, even in the 'optimized' code.

On the other hand we're not running earley-boyer because of bug #464134 - an ASC bug that appears not to have been fixed.
The ASC bug has actually been fixed.  However the v8 and v8.5 sources for earley-boyer are not quite in a runnable state.  Those problems are tracked by bug #585586 and bug #585564, respectively.
Depends on: 585564
We can optimize the arguments object under the same conditions that we can optimize the rest array (this is all on the ABC level):

- NEED_ARGUMENTS is set
- NEED_REST is not set (because it takes precedence over NEED_ARGUMENTS)
- NEED_ACTIVATION is not set (see below)
- The debugger is not present
- the arguments object is produced by GETLOCAL and only consumed by GETPROPERTY
  of one of these forms:
    arguments[{"",...}::"length"]
    arguments[{"",...}::x]

Those conditions are sufficient because the AS3 arguments object does not have ECMAScript semantics - it does not share storage with the initial formal parameters.

The reason NEED_ACTIVATION should not be set is subtle.  Consider the following code, which is stylized ABC code (not AS3 code):

  function f() {
     function g() {
        ... arguments ...
     }
  }

Since this is ABC code it is possible for NEED_ARGUMENTS to be set on the outer function but not on the inner function.  In that case the reference to 'arguments' in the inner function should reference the arguments object in the outer function.  As a cheap guard against this we simply disable the analysis on methods that have nested functions.  (It is possible to do better.)

Anyway once the analysis says that we can optimize then we optimize as for ...rest arguments, except that for the arguments array we need to make the entire array of arguments available, not just the ones that weren't mapped to fixed and optional parameters.
There is actually a complication: the layout of the argument area.  If there are  fixed or optional arguments, then if those arguments are typed then their values will be laid out in native form, not as atoms.  In that case we can't optimize quite as easily.

So three steps:

- if there are no fixed or optional args then it's easy
- if none of the fixed or optional args are typed (JS case) then it's also easy
- otherwise we must decode the argument area at run time.

In the third case it is exceedingly likely that the code uses ...rest anyway, so there's no really urgent need to handle that.
Attached patch Preliminary patch (obsolete) — Splinter Review
Passes initial testing but needs further vetting, and some of the comments are off by a hair and need to be cleaned up.  But more or less the right thing I think.
Turns out the check on NEED_ACTIVATION used for arguments and ...rest is not necessary; if an activation record is created and the rest array escapes into that record then that will be handled by the normal logic for detecting whether the rest array escapes.  Thus it is possible to generate code that has an activation record and which also a non-escaping rest array that will be optimized properly.
Attached patch PatchSplinter Review
Attachment #464069 - Attachment is obsolete: true
Attachment #464115 - Flags: review?(edwsmith)
Whiteboard: PACMAN → PACMAN, has-patch
Comment on attachment 464115 [details] [diff] [review]
Patch

wicked cool
Attachment #464115 - Flags: review?(edwsmith) → review+
Attached patch Acceptance testSplinter Review
Copied from the corresponding acceptance test for rest array optimization.
Attachment #464348 - Flags: review?(brbaker)
Attached patch Performance tests (obsolete) — Splinter Review
The asmicro cases are just copied from the corresponding restarg performance tests.  Scores are 914, 504, 23, and 31 respectively for no arguments object, optimized arguments object, unoptimized arguments object, and optimized arguments object that falls back to allocating lazily.  (It's curious that the last score is better than the next-to-last one, for rest arguments the order is the opposite.)

For jsmicro there are three cases: baseline (no arguments object), optimizable, and non-optimizable, which on my system break down with scores 148, 46, and 14 respectively - ie in untyped code the optimization provides a factor of 3 improvement when it kicks in, but avoiding the arguments object still gives you an edge by a factor of 3.  That last factor seems high and will want further investigation but that should not block this patch set.
Attachment #464042 - Attachment is obsolete: true
Attachment #464043 - Attachment is obsolete: true
Attachment #464044 - Attachment is obsolete: true
Attachment #464045 - Attachment is obsolete: true
Attachment #464353 - Flags: review?(fklockii)
(In reply to comment #17)
> 
> For jsmicro there are three cases: ... which on my system break down with
> scores 148, 46, and 14 ... avoiding the arguments object still gives you
> an edge by a factor of 3.  That last factor seems high and will want further
> investigation but that should not block this patch set.

It could be because the first benchmark just adds up its five parameters whereas the other two pay the overhead of a loop across the arguments object - so we're not really comparing apples and apples.  A more comparable test would add arguments[0] through arguments[4] without using a loop, or would have a baseline that accepts an array whose elements are added.
Comment on attachment 464353 [details] [diff] [review]
Performance tests

(reviewing now)
(In reply to comment #19)
> (In reply to comment #17)
> > 
> > For jsmicro there are three cases: ... which on my system break down with
> > scores 148, 46, and 14 ... avoiding the arguments object still gives you
> > an edge by a factor of 3.  That last factor seems high and will want further
> > investigation but that should not block this patch set.
> 
> It could be because the first benchmark just adds up its five parameters
> whereas the other two pay the overhead of a loop across the arguments object -
> so we're not really comparing apples and apples.  A more comparable test would
> add arguments[0] through arguments[4] without using a loop, or would have a
> baseline that accepts an array whose elements are added.

Yes this is definitely a factor.

The first proposal seems sound.  I do not think the second proposal is a good idea; it would add the overhead of modifying and accessing an array to the baseline benchmark, but we are not interested in measuring those overheads in this case.

Anyway, I just tried both of the approaches outlined above.  Here's the results:

Context: on my host (Thinkpad T60p), I get a score of 83 for the original baseline arguments-1.js (add five formals), a score of 23 for the original optimizable arguments-2.js, and a score of 8 for the original unoptimizable arguments-3.js (where arguments escapes).  This roughly jibes with Lars' results of a 3x speedup between each of the three cases.

With the first proposal of revising arguments-2.js to directly add arguments[i] for 0 <= i < 5, I get a score of 60 for the optimizable case and a score of 11 for the unoptimizable one.  That's a pretty significant difference.

With the second proposal (of revising arguments-1.js to add the elements of an array, the score is either 14 (if you re-initialize all five elements of the array on every iteration) or 17 (if you only re-initialize the first element since that is the only one that is changing).  Either way that is a huge drop from 83; its artificially hindering the baseline, which is not what we want to do for measuring this optimization's effectiveness.

So I recommend we adopt the first proposal (revise arguments-2.js to directly add the elements without a loop.)

I'll post a patch with the change shortly.
See previous comment motivating this change, which was of course originally suggested by Lars.
Attachment #464353 - Attachment is obsolete: true
Attachment #465645 - Flags: review?(lhansen)
Attachment #464353 - Flags: review?(fklockii)
Attachment #465645 - Flags: review?(lhansen) → review+
Comment on attachment 465645 [details] [diff] [review]
Performance tests v2, kill loop over arg array

tamarin-redux changeset:   5085:185bd5e2d123
Comment on attachment 464348 [details] [diff] [review]
Acceptance test

r+

Chris what is the best way to setup a section in testconfig that will ONLY run a couple of specific testcases? I know that you are working on another bug (#587872) that is trying to remove the "include" directive in the testconfig so not sure how to accomplish this. I want to have an acceptance run "*-performance" that will ONLY run a couple of tests.
Attachment #464348 - Flags: review?(brbaker)
Attachment #464348 - Flags: review+
Attachment #464348 - Flags: feedback?(cpeyer)
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Attachment #464348 - Flags: feedback?(cpeyer) → feedback+
Flags: flashplayer-bug+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: