Last Comment Bug 656813 - Optimize Array.prototype.slice for arguments objects?
: Optimize Array.prototype.slice for arguments objects?
Status: NEW
: perf
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: All All
: -- normal with 4 votes (vote)
: ---
Assigned To: :Ehsan Akhgari
:
Mentors:
http://code.google.com/p/v8/source/br...
Depends on: 657665
Blocks:
  Show dependency treegraph
 
Reported: 2011-05-12 17:10 PDT by Jeff Walden [:Waldo] (remove +bmo to email)
Modified: 2013-11-21 06:14 PST (History)
11 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
WIP 1 (4.87 KB, patch)
2011-05-14 10:57 PDT, :Ehsan Akhgari
no flags Details | Diff | Splinter Review

Description Jeff Walden [:Waldo] (remove +bmo to email) 2011-05-12 17:10:52 PDT
From v8 (cf. URL):

    // Array.slice(arguments, ...) is quite a common idiom (notably more
    // than 50% of invocations in Web apps).  Treat it in C++ as well.

We seem not to optimize this.  Assuming that comment is correct, we should.

The usual traps to be wary of here: arguments with deleted properties, where the value on Object.prototype (if any) should show through, slicing past the end of provided arguments, slicing when arguments.length has been modified, slicing when any argument has been modified, getting order of operations correct in case of such modifications when the modification is a getter that mutates later elements of arguments or Object.prototype, etc.  Keeping the optimization narrow enough should make avoiding these pitfalls straightforward.
Comment 1 Luke Wagner [:luke] 2011-05-12 17:20:41 PDT
GetElement/GetElements may be a useful guide to the pitfalls of arguments.
Comment 2 :Ehsan Akhgari 2011-05-13 15:11:34 PDT
I'm gonna take this as my first js-engine bug, but if it's urgent, somebody else should steal this from me, since I will most likely try to work on it on a weekend...
Comment 3 Jeff Walden [:Waldo] (remove +bmo to email) 2011-05-13 15:21:40 PDT
As with most standard methods, Array.prototype.slice is predictably defined in the js*.cpp file for that kind of object in js/src (jsarray.cpp, here).  Also as is roughly typical, the method implementing it is named <kind>_<methodname>, so array_slice in this case.

The kind of optimization to be done here -- implementing a function specially when the arguments fit some particular pattern -- occurs throughout jsarray.cpp, in most of the array_* methods.  Array methods are technically "generic" -- they can be called with any object as |this| -- but they do special work to be fast when |this| is a dense array (roughly: one where very few of the indexed properties from [0, length) are deleted, where those properties all are non-enumerable/configurable/writable and haven't been changed to be otherwise, and where the array has no named-but-not-indexed properties other than length), when the properties are stored in sequential addresses in memory.  The concerns mostly happen when the array is missing an element, because it's been deleted:

  var arr = [1, 2, 3];
  delete arr[1];
  assertEq(arr[1], undefined);
  Object.prototype[1] = 17;
  assertEq(arr[1], 17);

That sort of trap is replete in such optimizations, and we use js_PrototypeHasIndexedProperties to handle it.

If that's not enough information to make a start on this, feel free to comment here or ping me on IRC.
Comment 4 :Ehsan Akhgari 2011-05-14 10:57:37 PDT
Created attachment 532451 [details] [diff] [review]
WIP 1

This was a lot of fun!

I think I've got the basics working.  At least things don't break for the very simple case that I'm testing in the unit test, and everything seems to be ok in gdb...

I have no idea what things is missing from this patch, and also how I should test this.  The current test is definitely not enough.  Also, the code I've written may make no sense at all, I grepped my way through the code to try to figure out how arguments objects work, but I might have made embarrassing mistakes.  I also tried to stick with the existing coding style, but I admit that I didn't read the coding style document, so I might have messed some stuff up!  Hopefully not too many style mistakes...  :-)

I'm not sure what sort of feedback I'm requesting here!
Comment 5 :Ehsan Akhgari 2011-05-14 10:58:48 PDT
Also, how do you run the js tests only in a single directory?  The obvious way below doesn't work:

../tests/jstests.py ./js -m ../tests/js1_8_5/regress/jstests.list
Comment 6 :Ehsan Akhgari 2011-05-14 11:00:39 PDT
Not sure if this is the right venue to give feedback, but I managed to find build/test docs etc using very simple Google searches.  I tried to do the things the right way (i.e., building only js/, and testing stuff in the js shell).
Comment 7 :Ehsan Akhgari 2011-05-14 11:03:53 PDT
I promise this is going to be the last comment before Jeff's feedback!

I should say that reading the V8 code, I have no idea how the optimization they're doing works, so I tried to think about this for my self.  Here is my thinking (which led to this patch): we have an arguments object, instead of treating it using the usual array interfaces, just cherry-pick the correct actual args from the function call context (which I realized later is called StackFrame) and put them in the sliced array.

Feel free to correct my thinking here if needed.
Comment 8 Tom Schuster [:evilpie] 2011-05-14 12:23:20 PDT
I have the feeling this could fail if fp (private) is NULL. This can happen if you have the arguments object of a function that isn't on the stack anymore.

eg.
var args = (function () { return arguments; })(1, 2, 3);
Comment 9 :Ehsan Akhgari 2011-05-14 13:22:39 PDT
(In reply to comment #8)
> I have the feeling this could fail if fp (private) is NULL. This can happen if
> you have the arguments object of a function that isn't on the stack anymore.
> 
> eg.
> var args = (function () { return arguments; })(1, 2, 3);

Hmm, how would args[0] work in that case?  Are the actual arguments copied somewhere?  I stole the idea of getting the fp off of the arguments object from somewhere else in the code...
Comment 10 Jason Orendorff [:jorendorff] 2011-05-16 13:26:12 PDT
Thank you Ehsan!

But our actual arguments are in another castle!

See the code in ArgGetter:

            if (StackFrame *fp = (StackFrame *) obj->getPrivate())
                *vp = fp->canonicalActualArg(arg);
            else
                *vp = obj->getArgsElement(arg);

I think WIP 1 would give the wrong answer in strict mode, too, since in strict mode the arguments are actually copied eagerly into the arguments object and are never shared with fp. (obj->isArguments() returns true for both ordinary and strict-mode arguments objects.) The test for this would be something like:

    function f(a, b, c) {
        "use strict";
        a = "bad";
        var s = Array.prototype.slice.call(arguments, 0, 3);
        assertEq(s.length, 3);
        assertEq(s.join(","), "1,2,3");
    }
    f(1, 2, 3);
Comment 11 Jeff Walden [:Waldo] (remove +bmo to email) 2011-05-23 14:41:59 PDT
Comment on attachment 532451 [details] [diff] [review]
WIP 1

Review of attachment 532451 [details] [diff] [review]:
-----------------------------------------------------------------

Hopefully these comments will be a reasonable start on what to do next.  I apologize for the under-documentation of arguments objects (historically), but hopefully that should be fixed now, in vm/ArgumentsObject.h -- feel free to offer your opinion on it if you see anything confusing.

::: js/src/jsarray.cpp
@@ +2685,5 @@
> +    // Special case arguments objects
> +    if (obj->isArguments()) {
> +        ArgumentsObject *argsobj = obj->asArguments();
> +        if (!argsobj->hasOverriddenLength() &&
> +            argsobj->initialLength() >= end &&

For an in-range check I suspect most people would rather see |i < length| or similar.  Although this code mostly gets obsoleted by the new ArgumentsObject::getElements() I added in bug 567665.

@@ +2687,5 @@
> +        ArgumentsObject *argsobj = obj->asArguments();
> +        if (!argsobj->hasOverriddenLength() &&
> +            argsobj->initialLength() >= end &&
> +            !js_PrototypeHasIndexedProperties(cx, obj)) {
> +            nobj = NewDenseCopiedArrayFromArguments(cx, begin, end, argsobj);

The New*Array* methods are to handle various special ways arrays can be created, then to avoid work as much as possible in modifying them to what they're supposed to be for whatever algorithm's using them.  Here, you'd want to use NewDenseAllocatedArray (?), then copy from arguments into it.  The point of that method is to avoid an unnecessary copy from a temporary vector into the array, when you might as well just initialize directly in the array's elements, of course.  Generally such code is specific to the use case, so the special behavior should just be inline code here, not a separate method.

@@ +3204,5 @@
> +                                 ArgumentsObject *argsobj, JSObject *proto)
> +{
> +    struct STATIC_SKIP_INFERENCE CopyArg {
> +        CopyArg(JSObject *arr, uintN begin, uintN end)
> +            : arr(arr), begin(begin), end(end) {}

I'm not sure what platform you're developing on, but I'm somewhat surprised this works and/or expect it wouldn't on others.  Per ISO C++ of now (c++0x fixes it, as I recall) templatized things can't depend on local types, so I think this (via the forEachCanonicalBurgersAndFries) is not actually kosher C++.

(I searched in vain for a Bill Nye "now you know" sound clip to link right here.  :-( )

@@ +3212,5 @@
> +            if (index >= begin && index < end) {
> +                arr->setDenseArrayElement(index - begin, *src);
> +            }
> +            return true;
> +        }

You have some of the right idea with this comparator, but you're missing some of the complexities of arguments objects: that their elements, when redefined, become MagicValue(JS_ARGS_HOLE), and at that point to get the actual value you have to get the property the generic (slow) way with getProperty.  This probably wasn't obvious because of the lack of comments by ArgumentsObject explaining exactly how it's used.  But now there are such comments -- and even more importantly, a getElements method that encapsulates those details -- so it shouldn't be difficult to use that to avoid these issues.

A note on getElements: the interface is a little arcane because the idea is that either you can optimize every element, or you can't, so you should just fall back to doing it the slow way.  In this code, that might be as simple as |if (getElements()) { return true; }|, roughly, then falling through to the slow path.

Note You need to log in before you can comment on or make changes to this bug.