Closed Bug 784293 Opened 12 years ago Closed 12 years ago

Support creating and lazily cloning arbitrary objects in self-hosted code

Categories

(Core :: JavaScript Engine, defect)

Other Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla20

People

(Reporter: till, Assigned: till)

References

(Blocks 1 open bug)

Details

(Whiteboard: [js:t])

Attachments

(1 file, 7 obsolete files)

Currently, only functions are allowed to be top-level members of self-hosted code. This should change to enable stateful operations in self-hosted builtins.

All top-level members should be cloned into the current global on first access, making all state unique to each global instead of shared across all of them.
Blocks: 769872
Blocks: 784400
Whiteboard: [js:t]
Attached patch wip (obsolete) — Splinter Review
This compiles and seems like the right approach to me. Unfortunately, it throws with
Assertion failure: bce->stackDepth >= 0, at [...]/js/src/frontend/BytecodeEmitter.cpp:225

I'm not sure what, but clearly, I'm doing something wrong during bytecode emission.

Note that this patch only adds interpreter support for JSOP_SETINTRINSIC, as that should be an operation that only happens during the first evaluation of self-hosted code. We might want to assert that, though.
Attached patch wip, v2 (obsolete) — Splinter Review
Attachment #670132 - Attachment is obsolete: true
Attached patch v3 (obsolete) — Splinter Review
This patch support assigning functions to vars during initial interpretation of self-hosted code and then referencing them at any point in time.

Specifically, this enables storing stable references to native builtins, such as Object.defineProperty, that can then be used with the guarantee of them not being overwritten by client code.

I renamed intrinsicName almost everywhere to getintrinsic and added setintrinsic.

However, I wonder if it shouldn't be renamed to something like shname, in analogy to gname: it refers to both the intrinsics installed via js_DefineProperties, as well as to self-hosted builtins within the intrinsics (or: self-hosting) holder.
Assignee: general → tschneidereit
Attachment #684196 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #684522 - Flags: review?(luke)
When using attachment 684522 [details] [diff] [review] on top of current mozilla-central and adding statements such as

var util_ListIndexOf = Array.prototype.indexOf;

to my self-hosted code, the js shell crashes with "TypeError: undefined has no properties".
Attached patch v4 (obsolete) — Splinter Review
(In reply to Norbert Lindenberg from comment #4)
> When using attachment 684522 [details] [diff] [review] on top of current
> mozilla-central and adding statements such as
> 
> var util_ListIndexOf = Array.prototype.indexOf;
> 
> to my self-hosted code, the js shell crashes with "TypeError: undefined has
> no properties".

Ugh, I'm not entirely sure how, but I lost the initialization of standard classes in the last patch.

Although that is fixed in the new version, self-hosted functions still won't work: even though they get cloned lazily, they still aren't available during intialization of the self-hosting environment. I hope to be able to rectify this later, but for now, you'll have to refer to them by their original name.

Instead of your example, you'd write
var util_ListIndexOf = ArrayIndexOf;

And make sure that your .js file is embedded after array.js.


Additionally, this version contains support for cloning acyclic objects and these types:
- Number (both in32 and double)
- Boolean
- String
- RegExp
- Date

Arrays don't yet work and I think that the way I iterate over the properties might not be ideal, so this is a somewhat-usable wip for now.
Attachment #684522 - Attachment is obsolete: true
Attachment #684522 - Flags: review?(luke)
It seems we have a little chicken-and-egg problem: The patch moves the call to GlobalObject::initStandardClasses up, presumably so that the self-hosted code can find the standard built-in objects it wants to reference, but the initialization code for the Internationalization constructors also tries to reference self-hosted functions. See, for example, the function InitCollatorClass in attachment 685044 [details] [diff] [review].
(In reply to Norbert Lindenberg from comment #6)
> It seems we have a little chicken-and-egg problem: The patch moves the call
> to GlobalObject::initStandardClasses up, presumably so that the self-hosted
> code can find the standard built-in objects it wants to reference, but the
> initialization code for the Internationalization constructors also tries to
> reference self-hosted functions. See, for example, the function
> InitCollatorClass in attachment 685044 [details] [diff] [review].

I see, mmh. One option to work around that is to only do that initialization if we're not currently initializing the self-hosting global. You can check that using JSRuntime::isSelfHostedGlobal(cx->global())

Another option would be to only initialize a subset of the standard classes for the self-hosting global.

I'm not entirely sure which of these options is better. I kinda like that the self-hosting global is very similar to a normal one, but at the same time it's nice to not have to work around too many things during initialization of the standard classes.
Attached patch v5 (obsolete) — Splinter Review
This version should work for pretty much everything except Arrays, which I don't yet clone correctly.

Cyclic dependencies get handled correctly by storing every cloned object in an original->clone HashMap and preferring the stored clones over re-cloning.

The HashMap isn't yet properly rooted, I'll have to talk to Terrence about how to do that.
Attachment #684971 - Attachment is obsolete: true
Attached patch v6 (obsolete) — Splinter Review
Turns out that some builds don't automatically include the various "*Object-inl.h" headers.

Like, well, almost all try builds.

I just hacked the headers into jscntxt.cpp for now, but a final patch will move all the self-hosting stuff into it's own vm/SelfHosting{.cpp,.h} pair.
Attachment #687473 - Attachment is obsolete: true
Attached patch v7 (obsolete) — Splinter Review
This version is rebased on top of the patches in bug 819700.
It fully supports cloning almost all object graphs, with two known exceptions:
- scripted getters/setters are turned into the getter-returned value
- typed arrays aren't cloned at all, currently

Asking for review anyway, as even with these restrictions, the provided cloning-support should be sufficient for what Norbert needs in bug 769872.
Attachment #690160 - Flags: review?(jwalden+bmo)
Attachment #687524 - Attachment is obsolete: true
(In reply to Till Schneidereit [:till] from comment #10)
> Asking for review anyway, as even with these restrictions, the provided
> cloning-support should be sufficient for what Norbert needs in bug 769872.

Yes, please. It would be great to see this functionality, even with the stated restrictions, checked in. Can't test v7 unfortunately because it isn't up to date with m-c.
Depends on: 820390
It seems there's a problem with how objects get cloned: The copying of properties appears to be visible to client software.

I have a test that checks that property access within the implementation of the internationalization API is not visible:
http://hg.ecmascript.org/tests/test262/file/53c4ade82d14/test/suite/intl402/ch10/10.1/10.1.1_10.js
This test relies on the taintProperties and taintDataProperty functions in
http://hg.ecmascript.org/tests/test262/file/53c4ade82d14/test/harness/testIntl.js
to install setters on Object.prototype for three property names: "localeMatcher", "kn", "kf".

This test passed until I updated this section of the self-hosted Intl.js file:

-function intl_collatorKeyMappings() {
+var intl_collatorKeyMappings = (function () {
     return {
         kn: {property: "numeric", type: "boolean"},
         kf: {property: "caseFirst", type: "string", values: ["upper", "lower", "false"]}
     };
-}
+}());
i.e., instead of creating a new object with "kn" and "kf" properties each time it's needed, I now create this object once during initialization of the self-hosting global, and let the framework clone it.

The error message indicates that the setter for the "kn" property is being called. I checked my code multiple times to make sure that no property "kn" is set anywhere. I suspect therefore that the setter is called while cloning the object. That shouldn't happen - in spec terminology, cloning should "define" properties, not "assign" them.
http://www.2ality.com/2012/08/property-definition-assignment.html
Oh, good catch! I'll update the patch to use putProperty or something to that effect.
Comment on attachment 690160 [details] [diff] [review]
v7

Review of attachment 690160 [details] [diff] [review]:
-----------------------------------------------------------------

Mostly just missing failure checks, looks good assuming we accept the clone-already-made-values premise.

Which, well...I'm a bit leery of.  It just has a Smell associated with it, to me.  Let's run with this for now, because it's forward progress, but there's gotta be a better way than this.  I'm flagging jorendorff with an r? here so if he has any good ideas here, he can throw them in the mix, but consider my r+ good enough for now.  (I'd have made his a feedback? or something, to make clearer his response isn't necessary to move on here, but reusing the r? lets the review-mail keep going to you.)

::: js/src/frontend/BytecodeEmitter.cpp
@@ +1172,5 @@
> +        switch (*op) {
> +          case JSOP_NAME:     *op = JSOP_GETINTRINSIC; break;
> +          case JSOP_SETNAME:  *op = JSOP_SETINTRINSIC; break;
> +          case JSOP_SETCONST:
> +            /* Not supported. */

This worries me.  This should be an assertion, or something, so that people don't try to define consts in self-hosted code.  Dunno what would happen if you *did*, but we shouldn't tempt fate that way.

::: js/src/jsfun.cpp
@@ +1481,5 @@
>      clone->flags = fun->flags & ~JSFunction::EXTENDED;
>      if (fun->isInterpreted()) {
> +        if (fun->isInterpretedLazy()) {
> +            AutoCompartment ac(cx, fun);
> +            fun->getOrCreateScript(cx);

Don't you need to check for failure somehow here?

::: js/src/jsinterpinlines.h
@@ +413,5 @@
> +SetIntrinsicOperation(JSContext *cx, JSScript *script, jsbytecode *pc, HandleValue val)
> +{
> +    RootedPropertyName name(cx, script->getName(pc));
> +    RootedValue valCopy(cx, val);
> +    return cx->global()->setIntrinsicValue(cx, name, valCopy);

Why do you need valCopy here?  setIntrinsicValue takes a HandleValue.  (The current set* botch of taking a MutableHandle should be pushed into that method, not exposed in interfaces.)

::: js/src/vm/GlobalObject.cpp
@@ +379,5 @@
> +    RootedObject intrinsicsHolder(cx);
> +    if (cx->runtime->isSelfHostingGlobal(self)) {
> +        intrinsicsHolder = this;
> +    } else {
> +        intrinsicsHolder = JS_NewObject(cx, NULL, NULL, self);

Hmm, we should be able to create an object faster than via an API call.  NewBuiltinInstance(cx, &ObjectClass) maybe?  I don't remember the exact incantation offhand.  Fine by me if you want to put this in a followup or something, doesn't really matter here.

::: js/src/vm/GlobalObject.h
@@ +401,5 @@
> +#endif
> +        RootedObject holder(cx, intrinsicsHolder());
> +        RootedId id(cx, NameToId(name));
> +        RootedValue valCopy(cx, value);
> +        mozilla::DebugOnly<bool> ok = JSObject::setGeneric(cx, holder, holder, id, &valCopy, false);

Use setProperty here, please.  And, um, I guess this flat-out prevents strict mode semantics from ever applying to assignments to globals in self-hosted code.  I guess we can fix that when we fix strict mode enough to use it in self-hosted code.

@@ +402,5 @@
> +        RootedObject holder(cx, intrinsicsHolder());
> +        RootedId id(cx, NameToId(name));
> +        RootedValue valCopy(cx, value);
> +        mozilla::DebugOnly<bool> ok = JSObject::setGeneric(cx, holder, holder, id, &valCopy, false);
> +        JS_ASSERT(ok);

You can't assert ok here -- that depends on all sorts of failure conditions not happening.  Just return that result?

::: js/src/vm/SelfHosting.cpp
@@ +14,5 @@
>  #include "gc/Marking.h"
>  
>  #include "jsfuninlines.h"
> +#include "jstypedarrayinlines.h"
> +#include "vm/BooleanObject-inl.h"

Blank line between js*-inl.h and */*-inl.h includes.

@@ +132,5 @@
>  {
>      CallArgs args = CallArgsFromVp(argc, vp);
> +    if (!args.length() >= 1 || !args[0].isObject() || !args[0].toObject().isFunction())
> +        return false;
> +    args[0].toObject().toFunction()->setIsSelfHostedConstructor();

Erm.  Why wouldn't we assert if someone passes in a bad value?  I'm not understanding this change.

And, um, |!args.length() >= 1| does not do what you meant it to do.  Or well, um, actually it does, but this is IOCCC-land at best.  :-)

@@ +155,5 @@
>          return false;
>      JS_SetGlobalObject(cx, selfHostingGlobal_);
>      JSAutoCompartment ac(cx, cx->global());
> +    Rooted<GlobalObject*> shg(cx, &selfHostingGlobal_->asGlobal());
> +    GlobalObject::initStandardClasses(cx, shg);

You need to check for failure of this.

And, um, this is a bit odd, but are we assuming standard classes aren't going to use self-hosting stuff here?  'cause that's what it reads like, but obviously that's not quite the case.  There needs to be some sort of comment here explaining what's up.

@@ +194,5 @@
>  {
>      MarkObjectRoot(trc, &selfHostingGlobal_, "self-hosting global");
>  }
>  
> +typedef js::HashMap<JSObject *, JSObject *> CloneMemory;

You're in a |using namespace js|, so remove the js::.

@@ +209,5 @@
> +CloneProperties(JSContext *cx, HandleObject obj, HandleObject clone, CloneMemory &clonedObjects)
> +{
> +    RootedId id(cx);
> +    RootedValue val(cx);
> +    js::AutoIdVector ids(cx);

No js::.

@@ +251,5 @@
> +
> +static bool
> +CloneValue(JSContext *cx, MutableHandleValue vp, CloneMemory &clonedObjects)
> +{
> +    if (vp.isObject()) {

It would be nice to see the short cases handled first, then have object be the final one.  This would help cut down on indentation.  Although, come to think of it, probably it would be better to put the object case into a separate CloneObject method.  That's even more readable than un-indenting it.

@@ +273,5 @@
> +            clone = BooleanObject::create(cx, srcObj->asBoolean().unbox());
> +        } else if (srcObj->isNumber()) {
> +            clone = NumberObject::create(cx, srcObj->asNumber().unbox());
> +        } else if (srcObj->isString()) {
> +            Rooted<JSStableString*> str(cx, srcObj->asString().unbox()->ensureStable(cx));

if (!str)
    return false;

@@ +274,5 @@
> +        } else if (srcObj->isNumber()) {
> +            clone = NumberObject::create(cx, srcObj->asNumber().unbox());
> +        } else if (srcObj->isString()) {
> +            Rooted<JSStableString*> str(cx, srcObj->asString().unbox()->ensureStable(cx));
> +            str = js_NewStringCopyN(cx, str->chars().get(), str->length())->ensureStable(cx);

if (!str)
    return false;

(This is separate from the previous one; ensuring stable and copying are each separately fallible.)

@@ +279,5 @@
> +            clone = StringObject::create(cx, str);
> +        } else if (srcObj->isDenseArray()) {
> +            clone = CloneDenseArray(cx, srcObj, clonedObjects);
> +            vp.setObject(*clone);
> +            return true;

The vp.setObject/return true lines here should be removed to share the common ones at end, right?

@@ +306,5 @@
> +        vp.setString(clone);
> +    } else {
> +        JSString *valSrc = JS_ValueToSource(cx, vp);
> +        printf("Error: Can't yet clone value: %s\n", JS_EncodeString(cx, valSrc));
> +        return false;

if (JSString *valSrc = ...)
    printf(...);
return false;

@@ +350,5 @@
>       * initializing the runtime (see JSRuntime::initSelfHosting).
>       */
> +    if (cx->global() != selfHostingGlobal_) {
> +        CloneMemory clonedObjects(cx);
> +        if (!(clonedObjects.init() && CloneValue(cx, &val, clonedObjects)))

!... || !... surely?
Attachment #690160 - Flags: review?(jwalden+bmo)
Attachment #690160 - Flags: review?(jorendorff)
Attachment #690160 - Flags: review+
Contains all requested changes except for one:

>@@ +279,5 @@
>> +            clone = StringObject::create(cx, str);
>> +        } else if (srcObj->isDenseArray()) {
>> +            clone = CloneDenseArray(cx, srcObj, clonedObjects);
>> +            vp.setObject(*clone);
>> +            return true;

> The vp.setObject/return true lines here should be removed to share the common ones at end, right?

This is required because I ran into the problem that invoking CloneProperties on a dense array caused an error because it attempted to clone the "length" property. To be honest, I don't know why that happens, but seeing as how dense arrays for now can't have named properties we'd want to clone, it seems fine to me.


Try-servering once more for good measure here: http://tbpl.mozilla.org/?tree=Try&rev=622e03f19225
Attachment #690160 - Attachment is obsolete: true
Attachment #690160 - Flags: review?(jorendorff)
Comment on attachment 692606 [details] [diff] [review]
Support creating and lazily cloning arbitrary objects in self-hosted code.

Sadly, bzexport doesn't support setting review flags or requesting feedback. So carrying Waldo's r+ and f?'ing jorendorff here.
Attachment #692606 - Flags: review+
Attachment #692606 - Flags: feedback?(jorendorff)
Try run for 622e03f19225 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=622e03f19225
Results (out of 360 total builds):
    exception: 17
    success: 300
    warnings: 38
    failure: 5
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/tschneidereit@gmail.com-622e03f19225
Green on try, let's land this. I really do hope that it works as expected when actually used by Intl.js, but things are looking pretty good.

https://hg.mozilla.org/integration/mozilla-inbound/rev/7e44aec095e3
Summary: self-hosting: Support top-level vars and lets → Support creating and lazily cloning arbitrary objects in self-hosted code
The issue described in comment 12 still exists; I've filed bug 822080 for it. Otherwise this patch does what I need for Intl.js.
https://hg.mozilla.org/mozilla-central/rev/7e44aec095e3
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla20
Comment on attachment 692606 [details] [diff] [review]
Support creating and lazily cloning arbitrary objects in self-hosted code.

Review of attachment 692606 [details] [diff] [review]:
-----------------------------------------------------------------

I don't get a Smell from this approach. Looks fine.

Apart from the approach taken in the patch, which I'll call run-then-clone, and the opposite approach, clone-then-run, Waldo and I discussed a third option:

<Waldo> jorendorff: or, another thing, have a compilation mode
  where you can have a script in one compartment create values from
  another
<jorendorff> Waldo: now *that* sounds error-prone to me ;)
<Waldo> jorendorff: with native implementations we only have one
  implementation, and it doesn't have to be compiled once per
  global, or have values cloned
<Waldo> jorendorff: we should be able to have something similar
  for self-hosted code
<Waldo> there's no reason why we can't
<jorendorff> Agreed.
<jorendorff> All I'm saying is, it's tricky to get truly
  right. You'd have to weaken some assertions. I think we would end
  up with bugs that would only be discovered if you fuzz-tested
  self-hosting [...]
<Waldo> maybe
<Waldo> or maybe not
<Waldo> I don't think we can actually say this in advance of implementing anything

The assertions I had in mind are compartment assertions. They are already weakened in random places because of the atomsCompartment, so I think we already know this is tolerable.
Attachment #692606 - Flags: feedback?(jorendorff) → feedback+
Try run for 622e03f19225 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=622e03f19225
Results (out of 361 total builds):
    exception: 18
    success: 300
    warnings: 38
    failure: 5
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/tschneidereit@gmail.com-622e03f19225
No longer depends on: 825705
Thanks for the Feedback, Jason. (And sorry for the belated response.)

Using only one instance for all self-hosted scripts sounds promising, indeed. I wonder, however, if the effort for that wouldn't be better spent on getting bug 679940 working. From some measurements I did, just sharing the bytecode across all compartments (and not only for self-hosted, but for all code), would save considerable amounts of memory. The more we could share there, the less important it should become not to clone scripts, right?
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: