Closed Bug 1232266 Opened 9 years ago Closed 8 years ago

Allow iterable in TypedArray constructor

Categories

(Core :: JavaScript: Standard Library, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla52
Tracking Status
firefox45 --- affected
firefox52 --- fixed

People

(Reporter: anba, Assigned: gweng)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: dev-doc-complete)

Attachments

(1 file, 3 obsolete files)

Test case:
---
new Int8Array(function*(){ yield* [1,2,3]; }());
---

Expected: Creates typed array with three elements ({0:1, 1:2, 2:3})
Actual: New typed array is empty (byte length = 0)


ES2015, 22.2.1.4 %TypedArray% ( object ), step 3 [1].

[1] http://ecma-international.org/ecma-262/6.0/#sec-%typedarray%-object
If no one is working on this bug, I would like to take a look.
Assignee: nobody → gweng
In the 'jsfriendapi.h', it looks that we defined `JS_NewInt*Array` could be created with

1. uint32_t nelements             (a  Number)
2. JS::HandleObject array         (an Array)
3. JS::HandleObject arrayBuffer   (an ArrayBuffer with some other arguments)

https://dxr.mozilla.org/mozilla-central/source/js/src/jsfriendapi.h#1574-1659

Others are not in the definition. Then, with these types the creation actions is mapped to the macro definition in TypedArrayObject.cpp:

https://dxr.mozilla.org/mozilla-central/source/js/src/vm/TypedArrayObject.cpp#1832

which will invoke static methods in `TypedArrayObjectTemplate` to create the instance.

Therefore, if Iterable needs to be an acceptable type to create TypedArray, the header file, macro and static method should be extended. I will try to complete these tasks one-by-one, and add a test case.
Update the current situation:

I have figured out the path for constructing TypedArray will make it invokes

    TypedArrayObjectTemplate<signed char>::fromArray

via

    TypedArrayObjectTemplate<signed char>::create

and it is possible to make a

    TypedArrayObjectTemplate<signed char>::fromGenerator

to construct a TypedArray with a Generator as the STR describes. However, I now have no clue about where the value yielded by the generator would be. I've traced the flow and found it will follow the path:

1. Interpreter.cpp: JSOP_INITIALYIELD
2. GeneratorObject.h: initialSuspend
3. GeneratorObject.cpp: suspend (with the nvalues = 0 and pc = nullptr)

represents (I guess) the expression of `yield* [1,2,3];`. However, in the implementation of `suspend`, and the definition of GeneratorObject, I cannot find any slot for such 'value ([1,2,3])'. Because in another example I tried to make it more clear:

    var f  = function*(){ yield* [1,2,3]; };
    var g  = f();
    var n  = g.next();

the GeneratorObject seems to get the value from stack, when the GeneratorObject resumes itself when someone call `next()` to retrieve the value. The path looks like:

1. JSFunctionSpec star_generator_methods
2. StarGeneratorNext in Generator.js
   --> return resumeGenerator(), which looks like a operator that I cannot find out in codebase except:
3. BytecodeEmitter::emitSelfHostedResumeGenerator
4. Intepreter.cpp:3741 (CASE(JSOP_RESUME))
   --> It extracts `gen` and `val` from the stack
5. bool ok = GeneratorObject::resume(cx, activation, gen, val, resumeKind)

And in the `GeneratorObject::resume` function, it manipulates the value and frame/stack. I guess the in the `resume` there should be the value, because it is the last possible line of the case. However, when I inspect that with gdb, the `arg.isUndefined()` returns true, and the `gen` is the `genObj` but not the value. This is what I've stuck, because I don't see other possible places to get the value, and don't have any other statements to put the array on the stack after call the `next`.
The intention to have such value of the generator is because in my WIP patch I now can make my customized 

    TypedArrayObjectTemplate<signed char>::fromGenerator

to get the Generator as its argument. However, I don't know the proper way to get the value "inside" it to do the following construction.
(In reply to Greg Weng [:snowmantw][:gweng][:λ] from comment #3)
> and it is possible to make a
> 
>     TypedArrayObjectTemplate<signed char>::fromGenerator
> 
> to construct a TypedArray with a Generator as the STR describes.

There are many kinds of iterables and hard-coding all of them like this won't work.

Maybe you could use JS::ForOfIterator, but the better and faster way is probably to self-host these typed array constructors (write them in JS, see js/src/builtin/*.js) and then call an intrinsic to do the actual object allocation etc.
Here's an example of ForOfIterator, for |new Map(iterable)|:

https://dxr.mozilla.org/mozilla-central/rev/5cfc10c14aaea2a449f74dcbb366179a45442dd6/js/src/builtin/MapObject.cpp#468

I'd try either that or self-hosting :)
Thanks! The main reason I want to mimic the path of `fromArray` is because I'm not sure about the intention in the current constructing flow (especially the mysterious macros in the jsfriendapi.h). I would follow your suggestion to patch the issue.
Update: I now try to figure out the possible path to implement the constructor in self-hosting code. However, what I now have is that to new a TypedArray like `new Int8Array()` wouldn't trigger the `ConstructorForTypedArray`, which is the most likely and similar function for doing that in the self-hosting code. So I'm still trying to find out a possible path to route the creation expression to the self-hosting code.
Update: After wandering in the forest a while, I finally be able to identify the Generator argument and loop it with the help from :jandem. However, I'm still trying to get HandleValue from RootedObject, because ForOfIterator accepts the former. After that, I will get all the value and create the instance as the `fromArray` does in the same file.
Update: Since I need to get the length and values of the Generator in one iteration, I now try to use AutoValueVector to store these information inside the iteration.
Attached patch Patch ver.1 (obsolete) — Splinter Review
Hi Jan, thanks for your help. I finally made the test case work, but I'm not sure if it follows the conventions what SpiderMonkey crews usually follow, like the data structure I used (AutoValueVector), and the way I put and write the functions. So before I set the review, I would like to see if you have any suggestion for that. Thanks.
Attachment #8734325 - Flags: feedback?(jdemooij)
Oh sorry I left one debugging header in the file. I will modify it now.
Attachment #8734325 - Attachment is obsolete: true
Attachment #8734325 - Flags: feedback?(jdemooij)
Attached patch Patch ver.1 (obsolete) — Splinter Review
Comment on attachment 8734328 [details] [diff] [review]
Patch ver.1

Okay I updated the patch to remove the header.
Attachment #8734328 - Flags: feedback?(jdemooij)
Comment on attachment 8734328 [details] [diff] [review]
Patch ver.1

Thanks again for working on this! My main concerns are:

(1) This patch only handles generators. As I mentioned in comment 5, there are many kinds of iterables and the code should handle all of them. In other words, the patch should not contain the word "generator" (other than comments or tests maybe).

(2) Doing (1) may let us remove the code in TypedArrayObjectTemplate<T>::fromArray, setFromNonTypedArray, etc.

(3) Please add 'step' comments referring to the steps in the spec. See js::ArraySetLength for an example of this. IIUC the relevant code is this:

http://www.ecma-international.org/ecma-262/6.0/#sec-typedarrayfrom

(4) Looking at the spec, I think self-hosting these constructors is probably the way to go. See TypedArrayFrom in js/src/builtin/TypedArray.js, I think that's more-or-less the code corresponding to the spec algorithm.
Attachment #8734328 - Flags: feedback?(jdemooij)
(In reply to Jan de Mooij [:jandem] from comment #15)
> IIUC the relevant code is this:

Er, s/code/part of the spec/
Hi Jan,

Thanks for you inputs. I have noticed that your suggestion about to make a more general version of the constructor, but since I'm new, I'm not sure if to remove and refactor so many existing code would be too risky for the engine or not. And I did try to make a self-hosting version in JS, but again, it looks the whole TypedArray system currently are all constructing in C++ level, to create a whole new path to construct them will make lots of change. However, I would like to give it a try. And I will keep updating my status here.
(In reply to Greg Weng [:snowmantw][:gweng][:λ] from comment #17)
> And I did try to make a self-hosting version in JS, but
> again, it looks the whole TypedArray system currently are all constructing
> in C++ level, to create a whole new path to construct them will make lots of
> change.

The best way to do this is probably to add a self-hosting intrinsic corresponding to the current code, and then the self-hosted code can call that.

> However, I would like to give it a try. And I will keep updating my
> status here.

Awesome! It's not an easy bug, also because this code is performance sensitive and we don't want to regress that. Feel free to ask questions on IRC (#jsapi channel).
Okay. I'm back to this bug, and firstly I noticed that we already have |TypedArrayFrom| in |TypedArray.js|. Maybe this means I only need to make this case match the code deal with iteratable part in the spec. And to see what in our implementation need to be modified.
What I discovered now is we actually can accept generator as a initializer if it's for |TypedArrayFrom|. For example, if I change the test case to:

    Int8Array.from(function*(){ yield* [1,2,3]; }());

This will return 

    ({0:1, 1:2, 2:3})

And it's what we expect. That means maybe I can refactor the constructing path according to that, although the description of constructing and 'from' are different in the spec.
Currently the thing bothers me is I cannot find any data type put its constructor in our self-hosting JS files (ex: Array.js, String.js, Set.js, etc). So I'm not sure if this is do-able with existing methods I can take a look at, or I need to invent such path to achieve that.
After tracing the possible path of constructing an TypedArray, I may not start from to rewrite the constructing part from JS, but (if it's possible) to keep the constructing entrypoint ('class_constructor') as the interface, and then modify the following part as Comment 18 suggested. However, I'm not sure if it's possible to switch a path already in C++ (from macros in 'jsprototype.h', to the 'class_constructor' in TypedArrayObjectTemplate) to a C++-JS-C++ one.

For example, if somehow I make a self-hosted constructing implementation successfully, because of the current weaving method in 'jsprototype.h' and its following definitions, it looks that the path before reaching the 'class_constructor' would keep the same. And then in the 'class_constructor', it would need a way to invoke the JS implementation. It looks quite anti-intuitive and so far I haven't found a valid way to do that.
But maybe I can find another way to implement that.
Hi Jan,

Sorry for bothering, but may you help me to figure out a possible way to re-wire the constructing path from native C++ to JS hosted code, as you suggest in Comment 18? Because I could only find people add instance methods in the self-hosted code (ex: TypedArrayFrom), but no constructors. Also, for constructors it looks that there are lots of macros and C++ code to bind and expose real C++ native constructors to JS client, like, but I cannot find a way to do that for the Self-hosted "constructors".

The maybe most related idea so far in my mind is those pure JS functions exposed via JS_SELF_HOSTED_FN or other similar macros. But I'm not sure if for a constructor function written in JS, these macro could work well, too. For example, a most naive idea is to define Int8Array constructing API like:

    JS_SELF_HOSTED_FN("Int8Array", "NewInt8Array"...)

But I guess it's not simple as that.
Flags: needinfo?(jdemooij)
Hi Greg, I'll forward your question to Till :)
Flags: needinfo?(jdemooij) → needinfo?(till)
(In reply to Greg Weng [:snowmantw][:gweng][:λ] from comment #24)
> Hi Jan,
> 
> Sorry for bothering, but may you help me to figure out a possible way to
> re-wire the constructing path from native C++ to JS hosted code, as you
> suggest in Comment 18? Because I could only find people add instance methods
> in the self-hosted code (ex: TypedArrayFrom), but no constructors. Also, for
> constructors it looks that there are lots of macros and C++ code to bind and
> expose real C++ native constructors to JS client, like, but I cannot find a
> way to do that for the Self-hosted "constructors".
> 
> The maybe most related idea so far in my mind is those pure JS functions
> exposed via JS_SELF_HOSTED_FN or other similar macros. But I'm not sure if
> for a constructor function written in JS, these macro could work well, too.
> For example, a most naive idea is to define Int8Array constructing API like:
> 
>     JS_SELF_HOSTED_FN("Int8Array", "NewInt8Array"...)
> 
> But I guess it's not simple as that.

Hi Greg, I have a keen interest in this bug so I've been playing around with self-hosting [[Construct]]. I've started by adding something like `TypedArrayObjectTemplate::createConstructor` 

        JSFunctionSpec fs = JS_SELF_HOSTED_FN("MyConstructor", "MyConstructor", 1, 0);
        RootedId id(cx);
        JSAtom* atom = Atomize(cx, fs.name, strlen(fs.name), js::DoNotPinAtom);
        if (!atom)
            return nullptr;
        id.set(AtomToId(atom));


        JSFunction* fun = NewFunctionFromSpec(cx, &fs, id);
        if (!fun)
            return nullptr;

        return fun

Which compiles but predictably crashes at runtime. I'm trying to think of a better way today.

It's also worth nothing that to create a constructable self-hosted function you need to use the pattern:

    function MyConstructor() {}
    MakeConstructible(Int8Array2, {__proto__: {}})

It seems like till is away on PTO so I'm flagging arai for more info on the right way to go about this.
Hello arai, based on your knowledge of our self-hosting infrastructure do you have any advice on how to self-host a constructor which is exposed publicly (unlike List or Record)?
Flags: needinfo?(arai.unmht)
MakeDefaultConstructor will be a good example how to use self-hosted function as a constructor.

https://dxr.mozilla.org/mozilla-central/rev/23dc78b7b57e9f91798ea44c242a04e112c37db0/js/src/vm/Interpreter.cpp#261


Then, at the point we're calling TypedArrayObjectTemplate::createConstructor first time, we haven't yet initialized self-hosting global, as it's called while initializing self-hosting global.

https://dxr.mozilla.org/mozilla-central/rev/23dc78b7b57e9f91798ea44c242a04e112c37db0/js/src/vm/GlobalObject.cpp#531
>            InitBareBuiltinCtor(cx, global, JSProto_Uint8Array) &&
>            InitBareBuiltinCtor(cx, global, JSProto_Int32Array) &&


I guess they're done because we use Uint8Array and Int32Array in self-hosted code.

https://dxr.mozilla.org/mozilla-central/rev/23dc78b7b57e9f91798ea44c242a04e112c37db0/js/src/builtin/TypedObject.js#1395
>  var flags = new Uint8Array(NUM_BYTES(array.length));

https://dxr.mozilla.org/mozilla-central/rev/23dc78b7b57e9f91798ea44c242a04e112c37db0/js/src/builtin/Sorting.js#114
>        view = new Int32Array(array.buffer);


maybe we could replace those usage with a direct call to `MyConstructor` or what ever, and remove InitBareBuiltinCtor for JSProto_Uint8Array and JSProto_Int32Array from the dependency from self-hosting global.
So that TypedArrayObjectTemplate::createConstructor will be always called after self-hosting global is initialized.
Flags: needinfo?(arai.unmht)
Ah, removing InitBareBuiltinCtor Uint8Array and Int32Array did the trick. 

Morgans-MacBook-Pro-2:_DBG.OBJ mrrrgn$ dist/bin/js
js> Int8Array
function MyConstructor() {
    [native code]
}
js>

I'll see if I can't get a better working example going later.

Thanks!
Ugh, sorry for not seeing this earlier. Self-hosted constructors need some more infrastructure. I added that in bug 1226261. It bounced and I didn't need it anymore, so didn't re-land it then. I can either do that at the end of the week when I'm back from PTO, or someone else could rebase and land it.
Flags: needinfo?(till)
Thanks for all inspiring comments here. So I think the constructor in JS patch should better base on Bug 1226261 as Till mentioned. And What arai commented in Comment 28 will be done in the patch as well. However, if Morgan makes some progress before that it will be fine to me to have that patch.

By the way, maybe it will be a good time to start rewrite other constructors in JS like this one ? If so I can work on other bugs, too.
No longer blocks: es6
Depends on: 1226261
Attached patch bug1232266.patch (obsolete) — Splinter Review
This is the last issue preventing us from passing all kangax TypedArray tests. So I vote for implementing this now instead of waiting for bug 1226261 to happen.

I've tested a few implementation approaches and this alternative provided the best performance when passing an array or a non-iterable object to the TypedArray constructor.
Attachment #8734328 - Attachment is obsolete: true
Attachment #8803422 - Flags: review?(jdemooij)
Comment on attachment 8803422 [details] [diff] [review]
bug1232266.patch

Review of attachment 8803422 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/TypedArrayCommon.h
@@ +480,5 @@
> +            MOZ_ASSERT(offset + j < target->length());
> +#ifdef DEBUG
> +            SharedMem<T*> newDest =
> +                target->template as<TypedArrayObject>().viewDataEither().template cast<T*>();
> +            MOZ_ASSERT(dest == newDest);

@jandem:
Is this assumption correct when valueToNative can trigger gc? Or do I need to call viewDataEither() for each iteration to be safe?
Comment on attachment 8803422 [details] [diff] [review]
bug1232266.patch

Review of attachment 8803422 [details] [diff] [review]:
-----------------------------------------------------------------

Excellent, thanks!

::: js/src/builtin/TypedArray.js
@@ +1527,5 @@
> +    // Step 1.
> +    var iterator = GetIterator(items, method);
> +
> +    // Step 2.
> +    var values = [];

What if we use |var values = new List();| and then |values[i++] = next.value;| below? Is that slower or faster? Using List is kind of nice considering it's called IterableToList, but I don't care too much :)

::: js/src/vm/TypedArrayCommon.h
@@ +461,5 @@
> +        AutoValueVector values(cx);
> +        if (!values.reserve(len - i))
> +            return false;
> +
> +        for (; i < len; i++) {

Nit: no {}

@@ +462,5 @@
> +        if (!values.reserve(len - i))
> +            return false;
> +
> +        for (; i < len; i++) {
> +            MOZ_ALWAYS_TRUE(values.append(srcValues[i]));

values.infallibleAppend(..);

@@ +480,5 @@
> +            MOZ_ASSERT(offset + j < target->length());
> +#ifdef DEBUG
> +            SharedMem<T*> newDest =
> +                target->template as<TypedArrayObject>().viewDataEither().template cast<T*>();
> +            MOZ_ASSERT(dest == newDest);

Yeah, we might have allocated the data inline or in the nursery, and a GC could move this data, so we should reload it on each iteration.

::: js/src/vm/TypedArrayObject.cpp
@@ +1245,5 @@
> +static MOZ_ALWAYS_INLINE bool
> +IsOptimizableInit(JSContext* cx, HandleObject iterable, bool* optimized)
> +{
> +    if (!IsPackedArray(iterable))
> +        return true;

It wasn't immediately clear to me why we returned true here. At the start of this function, please either do |*optimized = false;| or add MOZ_ASSERT(!*optimized), to let readers know about this assumption.

@@ +1314,5 @@
> +            UniqueChars bytes = DecompileValueGenerator(cx, JSDVG_SEARCH_STACK, otherVal, nullptr);
> +            if (!bytes)
> +                return nullptr;
> +            JS_ReportErrorNumberLatin1(cx, GetErrorMessage, nullptr, JSMSG_NOT_ITERABLE,
> +                                       bytes.get());

Add a test for this case, if we don't have one yet?

@@ +1324,5 @@
> +        args2[1].set(callee);
> +
> +        // Step 6.a.
> +        RootedValue thisvOrRval(cx, UndefinedValue());
> +        if (!CallSelfHostedFunction(cx, cx->names().IterableToList, thisvOrRval, args2,

Nit: I'd use:

  RootedValue rval(cx);

And for thisv pass UndefinedHandleValue.
Attachment #8803422 - Flags: review?(jdemooij) → review+
(In reply to Jan de Mooij [:jandem] from comment #34)
> > +        if (!values.reserve(len - i))
> > +            return false;
> > +
> > +        for (; i < len; i++) {
> > +            MOZ_ALWAYS_TRUE(values.append(srcValues[i]));
> 
> values.infallibleAppend(..);

Instead of reserve + append, we could also do:

  if (!values.append(srcValues + i, len - i))
      return false;
(In reply to Jan de Mooij [:jandem] from comment #34)
> ::: js/src/builtin/TypedArray.js
> @@ +1527,5 @@
> > +    // Step 1.
> > +    var iterator = GetIterator(items, method);
> > +
> > +    // Step 2.
> > +    var values = [];
> 
> What if we use |var values = new List();| and then |values[i++] =
> next.value;| below? Is that slower or faster? Using List is kind of nice
> considering it's called IterableToList, but I don't care too much :)
> 

I've tested both alternatives and there was no real difference performance-wise. The only benefit of using an array is that it avoids the JIT (?) issue when List contains only a single element (bug 1291463's description contains a test case for the errant behaviour).
(In reply to André Bargull from comment #36)
> The only benefit of using an array is that it avoids the
> JIT (?) issue when List contains only a single element (bug 1291463's
> description contains a test case for the errant behaviour).

NI'ing myself to look into the List issue.

Using an array here is fine with me, I don't want to block this bug :)
Flags: needinfo?(jdemooij)
Attached patch bug1232266.patchSplinter Review
Updated patch per review comments, carrying r+ from jandem.

Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=91d7485902e6a5ccd2a0b2a86059a56eb58b0516
Attachment #8803422 - Attachment is obsolete: true
Attachment #8803917 - Flags: review+
Keywords: checkin-needed
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4b670be24e6d
Support iterables in TypedArray constructors. r=jandem
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/4b670be24e6d
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
(In reply to Jan de Mooij [:jandem] from comment #37)
> (In reply to André Bargull from comment #36)
> > The only benefit of using an array is that it avoids the
> > JIT (?) issue when List contains only a single element (bug 1291463's
> > description contains a test case for the errant behaviour).
> 
> NI'ing myself to look into the List issue.

Filed bug 1313064.
Flags: needinfo?(jdemooij)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: