Closed Bug 827449 Opened 11 years ago Closed 11 years ago

.length is broken on slow array wrappers

Categories

(Core :: JavaScript Engine, defect)

Other Branch
x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: bhackett1024, Assigned: jorendorff)

References

Details

Attachments

(1 file)

For wrappers of Array objects, the behavior of .length accesses differs between the cases where the wrapped array is dense or slow.

array = [1];
wrapper = wrap(array);
print(wrapper.length);
array.what = 0;
print(wrapper.length);

> js test.js
1
0

The issue seems to be:

- In the dense case the wrapped object is non-native, its getGeneric hook gets called which is passed both the Array object (obj) and Proxy object (receiver).  The length is fetched from the Array and all is well.

- In the slow case the wrapped object is native, it has an explicit 'length' property with a getter (array_length_getter) that is passed only the Proxy.  The proxy is not unwrapped and array_length_getter ends up getting the length off Array.prototype.

I don't know what the right fix is here.

This blocks bug 586842, which makes all Arrays into native objects using array_length_getter.
Blocks: 827490
I bet this is a regression. Taking. I'll try and auto-bisect it.
Assignee: general → jorendorff
This is not a recent regression. The test fails at least back to July 2011. Crazy.
With cross-compartment wrappers, it works fine. So maybe this can be fixed in wrapper handler code. However it's possible that things are like this on purpose; I'm going to ask around.

var gw = newGlobal('new-compartment');
var aw = gw.eval('var a = [1]; a');
gw.eval("a.p = 0;");
assertEq(aw.length, 1);
Attached patch v1Splinter Review
Things are not this way on purpose, according to bholley. He suggested this hack.
Attachment #699443 - Flags: review?(ejpbruel)
Comment on attachment 699443 [details] [diff] [review]
v1

Review of attachment 699443 [details] [diff] [review]:
-----------------------------------------------------------------

I can't think of a better fix, unfortunately.

I really don't like that comment though. You should never comment *what* code does, but *why* it does it. That's totally not clear here from looking at the code. At the very least refer to this bug, please?
Attachment #699443 - Flags: review?(ejpbruel) → review+
In writing an improved comment for this method, I think I convinced myself that this isn't the right fix. I need to think about it. I'll comment tomorrow.
Why do I ever say "I'll comment tomorrow"?

The problem with this is that it causes ordinary Proxy code to do the wrong thing when passing a [[Get]] through to an ordinary getter. The latest spec does not permit unwrapping here.

  var p = new Proxy({get x() { return this; }}, {});
  assertEq(p.x, p);  // passes without patch, fails with patch

So this needs a new approach. I'll round up the usual suspects and try again.

(I'm increasingly convinced that the spec for Proxy is nonsensical and TC39 should either try a different tack or punt it to after ES6.)
So, I've been giving some thought to this.

It seems like wrappers aren't really sound without some clear definition of an associated membrane. I don't know too much about how ES6 Direct Proxies work, so this is mostly coming from the perspective of the wrappers that we use in Gecko. Maybe I should go read that spec at some point.

With js::CrossCompartmentWrapper, we have a clear membrane (the compartment boundary) that we can apply that everything that moves across the border. But with js::Wrapper, we just kind of wing it. This kinda works, but leads to ambiguity as to what to do with |desc->obj| and |receiver|.

Indeed, we want to do different things in different cases. For the outer window, for example, we more or less just want the membrane to be around the inner window only, i.e. a bijection for a single object pair. In the case of waivers, we want a bijection for all objects in the compartment (each underlying object can have a unique waiver associated with it). We currently implement the waiver stuff with WaiveXrayWrapper.cpp, but it seems like a special case of a more general problem.

So I'm wondering if we could generalize the notion of wrapping to arbitrary membranes, defined by a virtual trap on the proxy handler. We would hoist all the PIERCE stuff from CrossCompartmentWrapper into DirectProxyHandler, and then have two virtual traps, one to wrap objects on each side of the membrane. CrossCompartmentWrapper could then override these to do cross-compartment wrapping, and things like outer windows, XrayWaivers, and SCSWs could define their own semantics.

I just came up with this, so there might be glaring problems that I don't see. What say you, jorendorff?
Flags: needinfo?(jorendorff)
So, I've spoken about this with jorendorff, luke, Waldo, billm, and mrbkap. Everyone seems positive about the idea, so I'm going to implement it in Gecko.

The larger question is whether we want something like this for ES6 proxies. The basic issue is that Direct Proxies seem kind of broken from the perspective of implementing a membrane, because there's no way to specify what that membrane is. For example, in the current spec, we have the following trap for |get|:

get: function(target,name,receiver) -> any

If I understand correctly, |target| here will be the target object, and |receiver| will be the proxy. But the trap has no way of determining the relationship between |receiver| and |target|. The fact that |receiver| remains a proxy here seems pretty broken from the perspective of membrane semantics, because |target| and |receiver| are on opposite sides of the membrane. If this were a compartment membrane, this behavior would cause a compartment mismatch.

This manifests itself in other places as well. For example:

apply: function(target,thisArg,args) -> any

Which side of the membrane is |thisArg| on?

So I'm suggesting that proxies should be able to specify the semantics of the membrane they implement (if any). This has the added advantage of making secure membranes _much_ simpler to implement, because you don't have to worry about overriding every single trap in just the right way. The implementation will do all the necessary work to marshal all objects and arguments across the boundary.

Looping in Mark to get his feedback here.
Flags: needinfo?(jorendorff) → needinfo?(erights)
Looping in dherman as well.
Looped in Tom as well. Hi Tom.
Flags: needinfo?(erights)
So, one question.  Is this visible without wrap() at all?  Because that's a shell-only feature/extension, and it doesn't exist on the web.  What's the in-browser way to produce an equivalent failure to wrap() here?
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #12)
> So, one question.  Is this visible without wrap() at all?  Because that's a
> shell-only feature/extension, and it doesn't exist on the web.  What's the
> in-browser way to produce an equivalent failure to wrap() here?

Well, there are two sides of this issue. One of them is that our current notion of same-compartment wrappers (as implemented in SpiderMonkey today) doesn't do the right thing with respect to the various same-compartment wrappers we want to create. This is a gecko-only issue. However, it drew our attention fundamental ambiguities with Direct Proxies, as mentioned in comment 9.
Related to Bobby's comment about the apply hook, how are membranes supposed to unwrap the arguments? From what I've seen so far, I can't find a way for a proxy to tell which side of the membrane each argument is on so that it can unwrap it properly. Does anyone know how this is supposed to work?

Bobby's idea of having special support for membranes makes sense since it makes them less error-prone. However, it seems like there's a more fundamental problem here. Proxies are supposed to allow you to implement a certain abstraction, but there doesn't seem to be a way for them to break through their own abstraction. To make an analogy to C++ or Java, consider this code:

void Class::method(Class *arg) { ... }

It would be as if |method| were not allowed to access the private fields of arg. That would make even basic OOP pretty much impossible.

I suspect I'm just missing something about the spec that would allow this sort of detection and unwrapping. I couldn't find anything about it online though.
(In reply to Bobby Holley (:bholley) from comment #9)
> The larger question is whether we want something like this for ES6 proxies.
> The basic issue is that Direct Proxies seem kind of broken from the
> perspective of implementing a membrane, because there's no way to specify
> what that membrane is. For example, in the current spec, we have the
> following trap for |get|:
> 
> get: function(target,name,receiver) -> any
> 
> If I understand correctly, |target| here will be the target object, and
> |receiver| will be the proxy.

Indeed, except |receiver| is not necessarily the proxy: if the proxy is used as a prototype of some other object, |receiver| may be an object inheriting from the proxy. Consider:

var p = new Proxy({ get foo() { return this; } }, {});
var c = Object.create(p);
assert(c.foo === c); // get trap on p called with receiver = c

> But the trap has no way of determining the
> relationship between |receiver| and |target|. The fact that |receiver|
> remains a proxy here seems pretty broken from the perspective of membrane
> semantics, because |target| and |receiver| are on opposite sides of the
> membrane. If this were a compartment membrane, this behavior would cause a
> compartment mismatch.
> 
> This manifests itself in other places as well. For example:
> 
> apply: function(target,thisArg,args) -> any
> 
> Which side of the membrane is |thisArg| on?

Since the caller can pass arbitrary values as the |thisArg| via Function.prototype.{call,apply}, a membrane proxy must treat the |thisArg| as a value coming from the outside. If the |thisArg| is the proxy itself, the proxy can substitute itself for its |target|.

> So I'm suggesting that proxies should be able to specify the semantics of
> the membrane they implement (if any). This has the added advantage of making
> secure membranes _much_ simpler to implement, because you don't have to
> worry about overriding every single trap in just the right way. The
> implementation will do all the necessary work to marshal all objects and
> arguments across the boundary.

I think a first step would be defining the membrane wholly in Javascript itself as a library (proxies + weakmaps should be sufficiently powerful to express this). Then we can see how that implementation would benefit from being natively provided. So far, Mark and I have written some generic membrane code, but this hasn't been thoroughly tested. See:
http://wiki.ecmascript.org/doku.php?id=harmony:proxies#an_identity-preserving_membrane
https://code.google.com/p/es-lab/source/browse/trunk/src/proxies/simpleMembrane.js
https://github.com/tvcutsem/harmony-reflect/blob/master/examples/simple_membrane.js
(In reply to Bill McCloskey (:billm) from comment #14)
> Related to Bobby's comment about the apply hook, how are membranes supposed
> to unwrap the arguments? From what I've seen so far, I can't find a way for
> a proxy to tell which side of the membrane each argument is on so that it
> can unwrap it properly. Does anyone know how this is supposed to work?

Yes: use two WeakMaps to maintain a cache of inside->outside and outside->inside wrappers, as illustrated by this code:
http://wiki.ecmascript.org/doku.php?id=harmony:proxies#an_identity-preserving_membrane

Unwrapping boils down to performing a lookup in the cache.

AFAICT, membranes were one of the motivating use cases for WeakMaps, as it's easy to construct a cyclic data structure that crosses a membrane.
(In reply to Tom Van Cutsem from comment #15)
> > get: function(target,name,receiver) -> any
> > 
> > If I understand correctly, |target| here will be the target object, and
> > |receiver| will be the proxy.
> 
> Indeed, except |receiver| is not necessarily the proxy: if the proxy is used
> as a prototype of some other object, |receiver| may be an object inheriting
> from the proxy.

Sure, sorry. I meant to specify that I was referring to the simple own-property case.  

Consider:
> 
> var p = new Proxy({ get foo() { return this; } }, {});
> var c = Object.create(p);
> assert(c.foo === c); // get trap on p called with receiver = c
> 
> > But the trap has no way of determining the
> > relationship between |receiver| and |target|. The fact that |receiver|
> > remains a proxy here seems pretty broken from the perspective of membrane
> > semantics, because |target| and |receiver| are on opposite sides of the
> > membrane. If this were a compartment membrane, this behavior would cause a
> > compartment mismatch.
> > 
> > This manifests itself in other places as well. For example:
> > 
> > apply: function(target,thisArg,args) -> any
> > 
> > Which side of the membrane is |thisArg| on?
> 
> Since the caller can pass arbitrary values as the |thisArg| via
> Function.prototype.{call,apply}, a membrane proxy must treat the |thisArg|
> as a value coming from the outside. If the |thisArg| is the proxy itself,
> the proxy can substitute itself for its |target|.
> 
> > So I'm suggesting that proxies should be able to specify the semantics of
> > the membrane they implement (if any). This has the added advantage of making
> > secure membranes _much_ simpler to implement, because you don't have to
> > worry about overriding every single trap in just the right way. The
> > implementation will do all the necessary work to marshal all objects and
> > arguments across the boundary.
> 
> I think a first step would be defining the membrane wholly in Javascript
> itself as a library (proxies + weakmaps should be sufficiently powerful to
> express this). Then we can see how that implementation would benefit from
> being natively provided. So far, Mark and I have written some generic
> membrane code, but this hasn't been thoroughly tested. See:
> http://wiki.ecmascript.org/doku.php?id=harmony:proxies#an_identity-
> preserving_membrane
> https://code.google.com/p/es-lab/source/browse/trunk/src/proxies/
> simpleMembrane.js
> https://github.com/tvcutsem/harmony-reflect/blob/master/examples/
> simple_membrane.js
Doh, sorry. Accidentally hit enter before I was finished typing. Let me finish and repost that comment.
(In reply to Tom Van Cutsem from comment #15)
> Indeed, except |receiver| is not necessarily the proxy: if the proxy is used
> as a prototype of some other object, |receiver| may be an object inheriting
> from the proxy.

Sure, sorry. I meant to specify that I was referring to the simple own-property case.

The receiver case is interesting in general because, supposing the receiver does in fact inherit the property from its prototype, the receiver may or may not have a corresponding representation on the other side of the membrane. This depends on the semantics of the membrane, which for isn't well-defined for our C++ wrapper proxies, which is why we've been having trouble with this.

> 
> I think a first step would be defining the membrane wholly in Javascript
> itself as a library (proxies + weakmaps should be sufficiently powerful to
> express this).

Ah, ok. The WeakMaps things clears this up a lot and answers a lot of questions I had about how this stuff is supposed to work. If the membranes are being tracked by closures and WeakMaps, I can imagine that they can implement membranes properly, albeit clumsily. We don't have the weakmap stuff on the C++ side though, which is why I'm going to experiment with this.

> Then we can see how that implementation would benefit from
> being natively provided. 

In general, I think it would. If we can find a way to express it succinctly, it seems much more robust to describe the semantics of the membrane with one or two traps and have the implementation apply it to all the traps. But we'll see how it works out.
I've filed bug 857861 to track the relevant gecko work here.
Forget about the links I posted earlier: I just got around to implementing identity-preserving revocable membranes in JS using direct proxies. It passes some rudimentary tests, but probably still contains holes.

Nevertheless, this prototype might help provide some insight in what a generalized membrane implementation would look like:

https://github.com/tvcutsem/harmony-reflect/blob/master/examples/membrane.js
So as explained in bug 857861, I've halted the generalized membrane work. That leaves the question of what to do with this bug.

I think one of the big problems here is that we don't currently have any kind of same-compartment wrapper that isn't a super-class of CrossCompartmentWrapper. Really, the use cases are different: CrossCompartmentWrapper wants to implement full membrane semantics, whereas our same-compartment wrappers generally just want to be a separate object identity that transparently impersonates a separate object. So jorendorff's |receiver| hack in this bug should really go in a separate SameCompartmentWrapper subclass that remaps all references incoming and outgoing references between the target and the doppelganger.

A tricky point with same-compartment wrappers has always been the question of unwrapping. We have this nasty /* stopAtOuter */ parameter to the unwrapping functions to determine whether we should unwrap outer window proxies or not. But that has always seemed like a special case of a general problem that I couldn't quite put my finger on. And now I'm pretty sure of the issue: for things concerning object identity, we don't want to wrap, because it's a separate object. But when passing into C++ or dealing with anything that depends on object layout, we do want to unwrap.

So I think we should create a SameCompartmentWrapper class that inherits from Wrapper. It would do spot remapping for things like |receiver| and |desc->obj|, and we could then replace all those outer-window checks with SameCompartmentWrapper check.

Thoughts, jorendorff?
Jorendorff and I talked about this. We decided that we shouldn't do the receiver remapping hack, because that would cause us to pass unwrapped outer window references to script, which would be bad. The proper solution here would be to implement some kind of nativeCall for PropertyOps, but I don't think we really care. So we just decided to WONTFIX this bug.

As for the stopAtOuter stuff, I think I've convinced myself that we actually _do_ want to unwrap Xray waivers (and then reapply them) in the XPConnect wrapping jungle. So I think for now we should just leave stopAtOuter as-is until we have a second use-case for something that needs to be conditionally unwrapped in the same way.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: