objects created by or on behalf of fast natives are parented to the wrong proto and global

RESOLVED DUPLICATE of bug 650353

Status

()

defect
RESOLVED DUPLICATE of bug 650353
9 years ago
8 years ago

People

(Reporter: gal, Assigned: Waldo)

Tracking

({regression})

Trunk
mozilla2.0b12
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(blocking2.0 .x+)

Details

(Whiteboard: [softblocker][sg:low])

Attachments

(2 attachments, 5 obsolete attachments)

No description provided.
Assignee: nobody → general
Component: XPConnect → JavaScript Engine
QA Contact: xpconnect → general
Blocks: 579471
No longer depends on: 618827
blocking2.0: betaN+ → final+
Whiteboard: [softblocker]
Posted file test case
Assignee: general → gal
This is actually a nasty fallout from making constructors fast natives. The constructor no longer pushes a stack frame, so when invoking a constructor from a different scope cx->scopeChain() is wrong. We really want to make objects in the constructor's scope.

This is somewhat easy to fix for constructors (just use the callee's global instead of cx->scopeChain()), but it also affects other fast native code. I.e. if Object.getOwnPropertyDescriptor from scope A is called from a function in scope B, we should make the object it makes a couple layers down for scope A, not cx->scopeChain() (which is B).

This is a regression for constructors, and a bug for everything else.
Whiteboard: [softblocker] → [softblocker][sg:low]
Assignee: gal → general
dmandelin, I really think we should fix this for 4 but we survived for a couple months with this on trunk so its probably not catastrophic. There is probably a mild risk that this is abused as a vector for privilege escalation.
Unless someone else really wants this, I'll work on it tomorrow.
Assignee: general → jwalden+bmo
++waldo
Summary: new builtin objects are parented to the wrong proto and global → objects created by or on behalf of fast natives are parented to the wrong proto and global
Brendan and I discussed this bug. We might want to split this into 2 bugs:

1. Go through all constructors in the engine and also xpconnect and grab the global from JS_CALLEE instead of the scope chain. This will fix the regression.

2. In a separate bug we should basically eliminate the use of fp->scopeChain since it can be wrong if a native is on top of the stack (which we can't tell). callee should be passed explicitly everywhere. Such is the price of eliminating the stack frame for fast natives. This is not a regression, or well its a very long standing regression ever since we have fast natives (and quick stubs!).
(In reply to comment #6)
> 2. In a separate bug we should basically eliminate the use of fp->scopeChain
> since it can be wrong if a native is on top of the stack (which we can't tell).
> callee should be passed explicitly everywhere. Such is the price of eliminating
> the stack frame for fast natives. This is not a regression, or well its a very
> long standing regression ever since we have fast natives (and quick stubs!).

Fast natives came in with bug 385393, the CVS commit was 2007/08/02. They came with restrictions including not calling anything on cx->fp from the native.

The slow native removal was bug 581263, just over two years later, 2010/08/16. So I do not think "long standing regression" is accurate given the timelines.

We need to make our internal and public APIs safe-by-design in light of this bug.

/be
Can't do date math without coffee... "just over three years later", I mean. And we had bugs with fast natives looking at cx->fp, it was no picnic.

Still this bug seems like a pretty hard softblocker. We've lived with the main effects only for six months. We could patch in a .x but better to kill it dead now. Even safer APIs seem warranted right now.

/be
Gross.  I think this issue was actually discussed and what made it seem OK was the fuzzy fact that fast natives were treated like part of the scripted caller, hence it was expected that calling across globals would go through some wrappers that would set things straight.  In theory, this cold still be done but I gather that we don't want to go this direction.

One bit of good news is that most code outside js uses nsContentUtils::GetDocumentFromCaller which uses nsXPConnect::GetCaller which is JS_CALLEE(cx, vp) of the innermost native call to xpconnect.

This is pretty gross for xpconnect/js though: we almost had this really simple story of "just use the current scope chain", but now it becomes: "if all you want is the compartment modulo scope chain or you know you are in an interpreted frame, use the current scope chain; otherwise, find JS_CALLEE(cx, vp) for the innermost native call... somehow."

One attractive option to me is to fix JS_GetScopeChain/JS_GetGlobalForScopeChain/cx->scopeChain() so that they always return the right scope chain.  I think this could be achieved efficiently (without slowing down the common hot call paths) by:

- Add a field "Value *innermostNativeCallee" to regs.  It points to the most recent JS_CALLEE of a fast native call.

- Define

JSContext::globalForScopeChain() {
  return regs
         ? (Value *)regs->fp > regs->innermostNativeCallee
           ? regs->fp->scopeChain().getGlobal()
           : regs->innermostNativeCallee->toObject()->getGlobal()
         : innerized this->globalObject
}

- Set/restore innermostNativeCallee in CallJSNative.
- tjit calls never change global, so can avoid updating innermostNativeCallee
- The fastest/most comment mjit call paths are specialized to the callee, so can also avoid the update

With this, cx->scopeChain() is always what you want unless, as an optimization, you are obviously in an interpreted frame and you choose to use fp->scopeChain().
Sounds gross. I would rather hand the current scope around in the engine explicitly. There is no wrapper here because it's same origin.
(In reply to comment #10)
> Sounds gross. I would rather hand the current scope around in the engine
> explicitly.

Its a strictly simpler mental model.  It means that we can actually say 'the current scope chain' and be done with it and have to create a new distinction between "the current scope chain" and "the parent for new objects" (which would be that explicit parameter to NewBuiltinClassInstance).

I also get the feeling that having the engine always know the innermost callee (native or scripted) will only grow in utility.  This avoids hacks that I have seen working around the fact that the calleeValue of the innermost stack frame is usually not good enough and the information of the actual innermost callee is not available.

> There is no wrapper here because it's same origin.

Yes, and I wasn't proposing adding one.
(In reply to comment #11)
oops.  ...and be done with it _rather_ than have to create a new ...
Well, I guess one way this explicit parameter could be simpler than what I'm proposing is if having the wrong global (but right compartment) is *only* an issue for a small fixed number of places in the JS engine.  Is this the case?
(In reply to comment #13)

Talking to Waldo and considering this question another example came to mind: regexp statics are found through the current global.  Non-obvious examples like this support having an "always right" cx->globalForScopeChain().

Also, as Waldo pointed out, vp[0] gets clobbered by JS_SET_RVAL.  This can be fixed centrally by a tweak to comment 9 (dup the callee on top of the native's args (again, not for mjit/tjit hot paths)).  Doing this by having callers manually pass around vp's invites bugs (since the JS_CALLEE will be far from the aliased JS_SET_RVAL).

Also, as Waldo pointed out, the impl of JSContext::globalForScopeChain should replace
  innerized this->globalObject
with
  innermostNativeCallee
  ? innermostNativeCallee->toObject()->getGlobal()
  : innerized this->globalObject
meaning that innermostNativeCallee would be in cx, not regs.
For now I'm forging ahead with comment 9's hack in case it might be simpler for the short term.  I'll find out more as I continue implementing it.

But in the longer run I tend to agree with Andreas.  Luke argues this is a feature, to simplify life for extremely attenuated native-method to scope-chain-use progressions.  This seems to compound the error of attenuation with the error of yet more avoidable implementation complexity.

Comment 11's "will only grow in utility" worries me.  ES5's algorithms almost universally depend on explicitly provided information.  (The sole exception that comes to mind is some strict mode behavior, but this merely avoids duplication: it could easily be made explicit.)  I cannot believe ES6 will do otherwise.

If we implement this behavior as an implicit property of the system, we alone will be responsible for maintaining and understanding that logic.  Sometimes (compartments, GC, the many string kinds, strings versus ids, slow and dense arrays) we can, should, and sometimes must do this, even if it means a world of hurt (multiple worlds for compartments).  I don't see imperatives here when a simpler solution is possible.
> Comment 11's "will only grow in utility" worries me.  ES5's algorithms almost
> universally depend on explicitly provided information.

... well except for the implicit "single global object", and that is exactly the issue at hand!

As I see it, "current global (via the current scope chain)" is the same type of containment scheme as compartments: there is a tree and being "inside" a leaf implies being inside its ancestors.  (And, e.g., zones would add another ancestor to compartments.)

Now, I generally much prefer explicit parameterization to implicit state.  But it seems like if we were to do that, we should do it all the way: completely eradicate any notion of "the current scope chain".  Otherwise, you get this unpleasant middle ground where your explicit parameter can diverge from your implicit one (which is exactly whats happening in bug 629822).
(In reply to comment #15)
> Comment 11's "will only grow in utility" worries me.

I said that wrong.  What I should have said was "will avoid bugs that would otherwise continue to bite us".
(In reply to comment #16)
> But it seems like if we were to do that, we should do it all the way:
> completely eradicate any notion of "the current scope chain".  Otherwise, you
> get this unpleasant middle ground where your explicit parameter can diverge
> from your implicit one (which is exactly whats happening in bug 629822).

I think this would be a good change, although again, possibly not in the short term.
Posted patch WIP (obsolete) — Splinter Review
This passes all shell and JIT tests.  But the browser is the acid test, of course, and to be honest I don't yet even know that the patch does what it's posited to do.  I'll do a build and see, possibly tonight but no promises.
(In reply to comment #16)
> > Comment 11's "will only grow in utility" worries me.  ES5's algorithms almost
> > universally depend on explicitly provided information.
> 
> ... well except for the implicit "single global object", and that is exactly
> the issue at hand!

This is important. ES5 is great and all, but it's not the source of all truth. Don't put it on a pedestal.

We should definitely work to make our code (new C++ API, internal MOP, etc.) look more like the spec where we can. Some real wins there (see the proxy and ES5 implementations). But with our extensions and optimizations, and obviously for things the spec leaves out (and therefore leaves implicit) like multiple globals, we should be willing to differ from ECMA-262 Edition-latest.

Thus endeth this drive-by sermon.

/be
Posted patch Possible patch (obsolete) — Splinter Review
The browser starts with this, all JS tests and JIT tests pass, and these tests pass where before they failed:

http://whereswalden.com/files/mozilla/testcases/parent/

There might be more to fix, to be sure.  But this is a fair start, and it can't hurt to start looking at it now, while I do further testing (and automated-test-writing, at some point).
Attachment #509678 - Attachment is obsolete: true
Attachment #509937 - Flags: review?(lw)
Try reports two failures with the patch:

The test for bug 603753 relies on dynamic scoping of global object for function calls, seemingly to map where an error happens to the right window for Web Console purposes.  I think I understand the test that much, but I have only passing familiarity with the web console and what it's trying to do here (I get the gist from that bug, but not much more), so I could be wrong.

The xpcshell test_jsctypes.js test checks that some ctypes operations throw particular errors (Error and TypeError, at least in the first half of the test).  ctypes is derived, of course, from an imported module.  Guess what constructors are used to create exception objects with this patch?

What to do about these is an open question.  I'll give it some thought over the weekend.
Comment on attachment 509937 [details] [diff] [review]
Possible patch

Cool, and the regExpStatics bug you found I think was causing a crash in crash-stats cdleary was looking at (null global).

>+    js::Value *lastNativeCallee;
>+    js::Value *lastNativeCalleePtr;

Why not just have lastNativeCallee be a JSObject* ?

Also, to highlight the fact that the lastNativeCalleePtr is just good for its address and should not be used as a Value, can you change its type to 'void *'.

>+            /* Consult the frame if it's newer than any last native call. */
>+            if (regs->fp > reinterpret_cast<JSStackFrame *>(lastNativeCalleePtr)) // XXX direction
>+                return regs->fp->scopeChain().getGlobal();

VM stack growth direction is fixed to ++, can drop the comment.
With abovementioned 'void *', can also drop the cast.

>+            return lastNativeCallee->toObject().getGlobal();

Could add:

  JS_ASSERT(lastNativeCallee < stack().firstUnused())

(Making firstUnused() a public member; I'd rather not friend JSContext.  Ooh, or a DEBUG-only public member... or is that dastardly?)

>+        /* If we have only native calls, consult the last one. */
>+        if (lastNativeCallee)
>+            return lastNativeCallee->toObject().getGlobal();

Ditto on assert.  Or you could hoist it,  your choice.

>     explicit CompartmentChecker(JSContext *cx) : context(cx), compartment(cx->compartment) {
>-        check(cx->hasfp() ? JS_GetGlobalForScopeChain(cx) : cx->globalObject);
>+        check(cx->hasfp() ? cx->getGlobalFromScopeChain() : cx->globalObject);

You can drop the condition and else branch.

>+class AutoScopeChainSetter {
>+    JSContext * const cx;
>+    Value *oldLastNativeCallee;
>+    Value *oldLastNativeCalleePtr;
>+    Value callee;

If you are going to 'const' one of them, might as well 'const' 'em all.

>+     * Last native callee function, or null if none, used for scope chain
>+     * computation.  This value is not kept accurate by the trace JIT when
>+     * traced code calls into native code.  It will be an object whose global
>+     * is the proper object, but it may not be the exact last native callee at
>+     * every moment.

Mention mjit too.

On that subject, the patch needs to tweak methodjit/MonoIC.cpp to change this code:

        /* Right now, take slow-path for IC misses or multiple stubs. */
        if (ic.fastGuardedNative || ic.hasJsFunCheck)
            return true;

to:

        /*
         * Right now, take slow-path for IC misses, multiple stubs, or
         * change of global that would require a parent-test to possibly
         * update cx->lastNativeCallee.
         */
        if (ic.fastGuardedNative ||
            ic.hasJsFunCheck ||
            (f.regs.fp()->isGlobalFrame() && !f.regs.script()->compileAndGo) ||
            obj->callee().getGlobal() != f.regs.fp()->scopeChain()->getGlobal()) {
            return true;
        }

Yeah, the 'compileAndGo' check is sad, but it shouldn't affect perf (chrome mjit is off by default and, IIUC, all content is compileAndGo).

I think you can also un-beta-reduce the open-coded cx->globalForScopeChain() in JSCompartment::wrap.  (And fix that goofy comment formatting error above it :)

One last thing: I was about to say "you missed a case" for js::GetScopeChain{,Fast,Full}, but, of course, for these functions, the caller probably wants the calling scripted frame's scope chain, so such a change would be bad.  Now, for most of these, there is an 'fp' parameter, so there is a clear semantic directive to use fp's scope chain and ignore the lastNativeCallee.  Cool.  The only one that is ambiguous is js::GetScopeChain(JSContext *cx).  Fortunately, there are only two uses:
 (1) JS_GetScopeChain, which, coincidentally, I just filed bug 632064 about removing/deprecating
 (2) two uses in jsxml  (your favorite!!1)
So I propose we remove this potentially-misleading general utility and make an equivalent jsxml-static helper with a more specific name.
Attachment #509937 - Flags: review?(lw)
(In reply to comment #23)
> Why not just have lastNativeCallee be a JSObject* ?
> Also, to highlight the fact that the lastNativeCalleePtr is just good for its
> address and should not be used as a Value, can you change its type to 'void *'.

Good ideas both.


> Could add:
> 
>   JS_ASSERT(lastNativeCallee < stack().firstUnused())
> 
> (Making firstUnused() a public member; I'd rather not friend JSContext.  Ooh,
> or a DEBUG-only public member... or is that dastardly?)

More good ideas, except comparing to Ptr of course.  I do not think the latter suggestion is in any way dastardly.  :-)


> On that subject, the patch needs to tweak methodjit/MonoIC.cpp to change this
> code:

What testcase does this change fix?  Something like this?

  var cs = [Array, otherWin.Array];
  var fs = [Array.prototype.slice, otherWin.Array.prototype.slice];
  var rs = [];
  for (var i = 0, sz = fs.length; i < sz; i++)
     assertEq(fs[i].call([1, 2, 3], 0, 1) instanceof cs[i], true);

I don't immediately understand why this is necessary.
(In reply to comment #24)
> What testcase does this change fix?  Something like this?

Simpler still:

  function f() { assertEq((new otherWin.Array()) instanceof otherWin.Array); }
  f(); f(); f();

Without the callee.globalObject() != fp->scopeChain()->globalObject() check, we would compile an ic that *always* changes global and never sets lastNativeCallee.

The "compileAndGo" test is a result of the fact that, while the callee may remain fixed, fp->scopeChain() is variable (by definition of !compileAndGo) hence the inequality is not known at ic-compile-time.
(In reply to comment #22)
> The test for bug 603753 relies on dynamic scoping of global object for function
> calls, seemingly to map where an error happens to the right window for Web
> Console purposes.

Oops, this was bug 603750.

Patch is nearly done, building and testing it a final time now...
I fixed the two failures I mentioned earlier, pushed to try (as a series of patches -- this patch is the entirety of them qfo'd together):

http://tbpl.mozilla.org/?tree=MozillaTry&rev=3566d6803c6c

I feel reasonably good about this, but I won't be surprised if there's more to do.

Note that about half this patch consists of changes to a single ctypes test, so it's not so huge as it appears.  (It is more or less as fragile/intricate as it appears, however.)
Attachment #509937 - Attachment is obsolete: true
Attachment #511614 - Flags: review?(lw)
Comment on attachment 511614 [details] [diff] [review]
Patch with some tests, running past try now

The patch as pushed to try still has issues, most of which are reasonably obvious on debugging.  But one isn't quite so obvious:

  try
  {
    otherWindow.location = "chrome://global/content/mozilla.xhtml";
    throw "didn't throw";
  }
  catch (e)
  {
    var Error = ???; // otherWindow.Error or window.Error?
    assertEq(e instanceof Error, true);
  }

caps/tests/mochitest/test_bug246699.html tests this and currently expects that to be window.Error.  After this patch that becomes otherWindow.Error.  But only (ish, haven't reverse-engineered details) if otherWindow is the result of a not-fully-loaded iframe!

http://hg.mozilla.org/mozilla-central/annotate/199cb6282554/caps/tests/mochitest/test_bug246699.html

If I make the test do its thing after window onload, it suddenly passes again, throwing instanceof window.Error.  So for now I'm going to change the test to do that and forge ahead with that change until I can get some better idea of what should really be happening here, and whether I can easily do anything about it now.
Attachment #511614 - Flags: review?(lw)
Posted patch Patch, passes Linux tests (obsolete) — Splinter Review
Okay, this is actually good now, everything green in a scoped-to-Linux try run:

http://tbpl.mozilla.org/?tree=MozillaTry&rev=9e656a4a63a7

bz, can you look at the caps-test changes and the previous comment and explain how worried I should be about any of that, and whether just changing the test that way is acceptable?

The patch currently has JS_GetScopeChain removed due to whichever followup bug Luke filed.  I suppose it could be readded if necessary, but ideally we wouldn't have a footgun like that.
Attachment #511614 - Attachment is obsolete: true
Attachment #511781 - Flags: review?(lw)
Attachment #511781 - Flags: review?(bzbarsky)
Just changing the test is fine for the moment, I think, if we file a followup bug to figure out why the timing of the location set matters.  Or is what really matters whether otherWindow is same-origin with this one?

In any case, we need a bug on sorting that out and seeing what other browsers do.
Comment on attachment 511781 [details] [diff] [review]
Patch, passes Linux tests

r=me on the caps bit, though the lack of documentation around all this stuff makes it hard for me to tell why the changes to nsJSUtils::GetCurrentlyRunningCodeWindow are needed, say.  In general, having jst or sicking review the content and dom bits might be a good idea.
Attachment #511781 - Flags: review?(bzbarsky) → review+
I talked to mrbkap about it (also kinda sicking for a bit), and he was fine with the change as well.  The reason for it working is that <iframe/> contains about:blank, which is same-origin to the original window.  So the exception thrown is other-origin in that case.  The timing isn't actually relevant, that was just my own red herring.  With the change I made to the test, the thrown nsresult passes into the cross-compartment wrapper, which then creates the exception for the calling side, in the window from the calling side.  Blake did say there was some dispute about exactly how the case should work, but 

The nsJSUtils changes are needed for the websocket change.  They were looking to the scope chain to determine the identity of the calling window.  When we didn't do any awful magic to make calls across windows seemingly modify the scope chain, that gave you the calling window.  With the changes here, the seemingly-modified scope chain now gives back the window containing the called method.  That's certainly not the window "currently running code" in the sense that method wanted (i.e. whoever's to blame for the executing script), so I changed it to do what it really wanted, the way it really should have done it -- look at the stack, find the last interpreted frame, and use information gleaned from that.  Yes, I agree this is all not really documented.  :-(

Is there another case where setting location can throw like this, or some sort of analogous situation?  "what other browsers do" may well be nothing for a chrome: URL.  Chrome at least seems to just ignore the set.  IE9 dutifully loads the seemingly-valid URL.  It's not clear to me the question can be phrased in a way that's not intrinsically browser-centric.
Status: NEW → ASSIGNED
Comment on attachment 511781 [details] [diff] [review]
Patch, passes Linux tests

I'm cool getting more eyes on this, it's still a pretty massive hack.  :-)  Feel free to concentrate mostly on the non-JS-engine-internal changes, although the whole thing (minus the repetitive ctypes-test-of-horror changes) could always use scrutiny.
Attachment #511781 - Flags: review?(jst)
Comment on attachment 511781 [details] [diff] [review]
Patch, passes Linux tests

Looks good to me.
Attachment #511781 - Flags: review?(jst) → review+
Comment on attachment 511781 [details] [diff] [review]
Patch, passes Linux tests

>--- a/content/base/src/nsContentUtils.cpp
>+++ b/content/base/src/nsContentUtils.cpp
>@@ -5799,9 +5799,16 @@ CloneSimpleValues(JSContext* cx,
>+    JSObject* global = JS_GetGlobalForScopeChain(cx);
>+    if (!global) {
>+      return NS_ERROR_FAILURE;
>+    }
>+
>+    JSAutoEnterCompartment ac;
>+    if (!ac.enter(cx, global))
>+      return NS_ERROR_FAILURE;

As a preexisting invariant, we should have cx->globalForScopeChain()->compartment == cx->compartment.  Recall that calls across compartments should conjure up a proxy call which will push a dummy frame with the correct scope chain.  If this is not true, then there is a bug elsewhere.

>+  JSAutoEnterCompartment ac;
>+  if (!ac.enter(aCx, global))
>+    return JS_FALSE;

Ditto

>diff --git a/dom/src/threads/nsDOMWorker.cpp b/dom/src/threads/nsDOMWorker.cpp
>+  JSObject* global = JS_GetGlobalForScopeChain(aCx);
...
>+  JSAutoEnterCompartment ac;
>+  if (!ac.enter(aCx, global))
>+    return JS_FALSE;

Ditto

>@@ -2533,8 +2537,10 @@ nsDOMWorker::ReadStructuredClone(JSConte
>+      if (JSObject* global = JS_GetGlobalForScopeChain(aCx)) {
>+        JSAutoEnterCompartment ac;
>+        if (!ac.enter(aCx, global))
>+          return nsnull;

Ditto

>diff --git a/toolkit/components/perf/PerfMeasurement.cpp b/toolkit/components/perf/PerfMeasurement.cpp
>+  JSObject* global = JS_GetGlobalForScopeChain(cx);
>+  if (!global)
>     return NS_ERROR_NOT_AVAILABLE;
> 
>+  JSAutoEnterCompartment ac;
>+  if (!ac.enter(cx, global))
>+    return NS_ERROR_FAILURE;

Ditto

>diff --git a/js/src/jscntxt.h b/js/src/jscntxt.h
>--- a/js/src/jscntxt.h
>+++ b/js/src/jscntxt.h
>@@ -245,6 +245,9 @@ class StackSegment
>     /* Whether this segment was suspended by JS_SaveFrameChain. */
>     bool                saved;
> 
>+    JSObject *savedLastNativeCalleeScope_;
>+    void *savedLastNativeCalleeScopePtr_;

Alignment

>+    /* The first stack location available for a new frame. */
>+    inline js::Value *firstUnused() const;

Could you s/for a new frame/to push new values and frames/ ?

>+    inline void *constFirstUnused() const;

Can you put it next to 'firstUnused' ?

>+     * An object created for the global of the currently executing native
>+     * method, native getter, or native setter (or null if none is executing),
>+     * used for scope chain computation.  The exact identity of this value is
>+     * unspecified.
>+     */
>+    JSObject *lastNativeCalleeScope;

The last statement would be less frightening by giving the primary exception:
", viz., calls to natives that don't change global will not change the lastNativeCalleeScope".

>+    AutoScopeChainSetter(JSContext *cx, JSObject *scope
>+                         JS_GUARD_OBJECT_NOTIFIER_PARAM)

I would prefer AutoScopeChainSetter to take lastNativeCalleeScopePtr as a third parameter.  For the two non-obvious cases (CallJSPropertyOp*), you could then write a comment saying why firstUnused() is good enough.  For CallJSNative, you could pass vp which is both faster than firstUnused() and is what the reader would expect, given the name lastNativeCalleeScopePtr.

>diff --git a/js/src/methodjit/MonoIC.cpp b/js/src/methodjit/MonoIC.cpp
>         /* Right now, take slow-path for IC misses or multiple stubs. */
>-        if (ic.fastGuardedNative || ic.hasJsFunCheck)
>+        if (ic.fastGuardedNative ||
>+            ic.hasJsFunCheck ||
>+            !f.regs.fp->script()->compileAndGo ||
>+            obj->getGlobal() != f.regs.fp->scopeChain().getGlobal())

Could you add the analogous disjunct to the comment?

>diff --git a/js/src/xpconnect/wrappers/AccessCheck.cpp b/js/src/xpconnect/wrappers/AccessCheck.cpp
>+    if (scope) {
>+        JSAutoEnterCompartment ac;
>+        if (!ac.enter(cx, scope))
>+            return false;
>+        scope = JS_GetGlobalForObject(cx, scope);
>+    } else {
>+        scope = JS_GetGlobalForScopeChain(cx);
>+    }

Can skip JSAutoEnterCompartment (includes heap allocation, frame pushing...) and just scope->getGlobal().

diff --git a/js/src/tests/ecma_5/extensions/cross-global-call.js
>+    var otherGlobal = newGlobal("same-compartment");

nice

r+ with nits
Attachment #511781 - Flags: review?(lw)
Attachment #511781 - Flags: review?(jst)
Attachment #511781 - Flags: review+
Comment on attachment 511781 [details] [diff] [review]
Patch, passes Linux tests

IIUC, I just un-r+'ed jst's r+.  Re-r+'ing.
Attachment #511781 - Flags: review?(jst) → review+
Duplicate of this bug: 632064
For what it's worth, I talked to waldo about this earlier, and he answered my question but raised a new one: whether the new world does the right thing for the various nsDOMClassInfo callsites.  I'm sort of hoping jst reviewed that part, because I don't know what those callers are looking for offhand.
Working on a new patch with a few more fixes, should be done in the next .5h or so I hope.
bz's point was that the changes to the semantics of JS_GetGlobalForScopeChain might not be what the callers wanted.  (Foolish me, thinking users were using it intentionally and on purpose.)  So I went through the JS_GetGlobalForScopeChain callers and double-checked them.  Turns out some of them don't want the global object for purposes of creating an object, they want it because they want to know who called them, roughly.  So I added new JSAPI specifically for that purpose and used it the places I saw that wanted it.  It's possible there are more, but I got all the ones specifically using GGFSC for that purpose.

The claim jst makes for the DOM WrapNative calls and all is that while we could pass in a wrong scope, it wouldn't affect the prototype chain because PreCreate (or PostCreate?) sets the prototype by walking up the DOM tree to find the right window.  So a bad proto chain would only be an issue for things that aren't in the DOM tree.  I'm unsure if there are such things in nsDCI.cpp (my eyes started glazing over when stepping through a plausible candidate).  But in any case, stepping through this page demonstrated parenting (not prototyping) is correct with this patch, and wrong without:

http://whereswalden.com/files/mozilla/testcases/parent/gcr.html

This looks fully green on try, so I think it's good to go, if we agree all the GGFSC callers are happy with the new behavior with this patch:

http://tbpl.mozilla.org/?tree=MozillaTry&rev=bf9dd383d2eb
Attachment #511781 - Attachment is obsolete: true
Attachment #512629 - Flags: review?(jst)
(In reply to comment #40)
> stepping through this page demonstrated parenting (not prototyping)
> is correct with this patch

Er, I meant both are correct with the patch!  And it looks like prototype is wrong without, but I am not 100% certain this is precisely because this is a DOM object that's not a node which would inherit its window's Object.prototype along its prototype chain -- I'm just accepting jst's assurances that this is the case, and that this is likely the reason for the observational change.
> it wouldn't affect the prototype chain because PreCreate (or PostCreate?) sets
> the prototype by walking up the DOM tree to find the right window.

Only for some objects....

Looks like we use JS_GetGlobalForScopeChain for wrapping the following ("safe" means the PreCreate hook looks up the right parent anyway):

  window.document: safe
  node.baseURIObject: will likely not wrap in the chrome scope anymore?  Depends
     on how this interacts with wrappers....
  node.nodePrincipal: same thing as baseURIObject
  Various nsArraySH stuff.  Plugin array and plugin stuff are the behavior
     changes here.  And the MIME type array.  And stylesheets, CSS rules,
     client rects, paint requests, tree columns, DOM Files.
  nsNamedArraySH.  plugin stuff, mimetypes, treecolumns, DOM storage.
  document.location: safe
  document.all stuff: safe
  form name getter: safe, I think
  select index getter: safe
I think that the change proposed is definitely fine for all of the above except maybe the baseURIObject/documentURIObject/nodePrincipal.  And even those should be ok as long as they get wrappered.
Comment on attachment 512629 [details] [diff] [review]
Updated, fixes JS_GetGlobalForScopeChain callers that need it

AFAICT this looks good, but I'd like to hear what mrbkap says about bz's list, in particular the baseURIObject, documentURIObject, and nodePrincipal code.
Attachment #512629 - Flags: review?(mrbkap)
Attachment #512629 - Flags: review?(jst)
Attachment #512629 - Flags: review+
I don't understand the argument here. Boris, are you taking into account the fact that we never enter the content compartment when going through Xray wrappers when asking about these objects? So, we should still be getting chrome's scope and using that to wrap them. Was there another part of this?
There isn't an "argument".  The items listed in comment 46 are just the ones where I'm not 100% sure what happens.

> are you taking into account the fact that we never enter the content
> compartment

1)  No.
2)  I didn't know whether we do or not.
3)  I don't know what the impact of entering or not entering it is.... (we
    really need documentation of this stuff).

> So, we should still be getting chrome's scope and using that to wrap them.

Are we entering the classinfo code with the Xray wrapper as the object the JS engine knows about (in terms of what it thinks the property get happed on)?  What happens if someone gets these off what used to be called a SafeJSObjectWrapper?  Same thing?
We stepped through things in a debugger, and mrbkap is satisfied with the state of the world.
Attachment #512629 - Attachment is obsolete: true
Attachment #513637 - Flags: approval2.0?
Attachment #512629 - Flags: review?(mrbkap)
Comment on attachment 513637 [details] [diff] [review]
Final patch with last mrbkap-nits fixed

This is a web visible regression, I'd much rather take this now than introduce more inconsistencies to web developers. This is well tested stuff, and we can get this into the last beta!

a=jst
Attachment #513637 - Flags: approval2.0? → approval2.0+
Comment on attachment 513637 [details] [diff] [review]
Final patch with last mrbkap-nits fixed

1689	     * s_InitFunctionAndObjectClasses, and which represents the default scope

Missing 'j' at start of line.

Nice work, hard patch.

/be
TM:

http://hg.mozilla.org/tracemonkey/rev/58eebd67ae57
http://hg.mozilla.org/tracemonkey/rev/3da12edf735e

m-c:

http://hg.mozilla.org/mozilla-central/rev/b0bf06306261
http://hg.mozilla.org/mozilla-central/rev/9d6f665f7f84
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
OS: Windows 7 → All
Hardware: x86 → All
Resolution: --- → FIXED
Whiteboard: [softblocker][sg:low] → [softblocker][sg:low][fixed-in-tracemonkey]
Target Milestone: --- → mozilla2.0b12
The patch does fine generally, but it seems to have produced an intermittent orange I can't reproduce locally.  Still, it's often enough a debugging change seems the most expedient way to figure things out:

http://hg.mozilla.org/tracemonkey/rev/4d86e63ff60d

I'll investigate results in the morning, then proceed as seems sensible.

On another note, philor sez this regressed dromaeo perf.  I wonder if this isn't just the price paid for calls to natives having to do so much extra work, suggesting that a pass-in-your-callee track really is the way to go but for the huge size of such a patch.  If it is I'd probably just eat it for this release, then promptly fix it the right way for the next release.  The bug fixed here worries me; it's neither cleanly right nor cleanly wrong, and it'll have a cost if we leave it in place.  But who knows: could have been cosmic rays or gremlins for all the investigation I've done.  I'll take a real look in the morning.
(In reply to comment #51)
> I wonder if this
> isn't just the price paid for calls to natives having to do so much extra work,

Since only natives-called-from-the-interpreter incur this overhead, I would be more suspicious of the property ops.  If you dig into the dromaeo perf, you might consider independently commenting out the AutoScopeChainSetter in CallNative vs. CallJSPropertyOp (and of course taking out both to see if that is even the source of he regression at tall).
Depends on: 635462
Flags: in-testsuite+
Version: unspecified → Trunk
(In reply to comment #53)
> Since only natives-called-from-the-interpreter incur this overhead, I would be
> more suspicious of the property ops.
In that case, would the fact that bug 631951 landed have made things worse for this bug?
This also looks like it might be causing the assertion failures in bug 635462.  Jeff: looks like it should get some backout.
This may have regressed kraken too.  Hard to tell, since the awfy regress view hasn't updated in two days.  :(
If you can't reproduce it, it seems like most tinderbox machines can.  Please revert at least on m-c, where we're now getting 2-3 per run, and then see what tm tells you.
http://tbpl.mozilla.org/?tree=MozillaTry&rev=37eaedb050dd is a backout on try, with trivial merge resolution for luke's telemetry.  If it's green I'll push it to m-c, because we're about to hit zero beta blockers, and this softblocker is now very much in the way.  Will reopen at that point.
Welp, it needs more work to merge semantically with Luke's stuff.  Sure don't want it failing all over the place like this this weekend, so I might try it later. :-/
I have an idea I'm testing (since managed to get the assert to hit, but not consistently).  Back with more shortly.
My backout passed try.  I think we should defer this softblocker until 5, rather than risk further destabilization not found until it's present on m-c.
backed out of TM by waldo and m-c by me:

http://hg.mozilla.org/tracemonkey/rev/8c14f73ca5ae
http://hg.mozilla.org/mozilla-central/rev/a633a0030ac6
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [softblocker][sg:low][fixed-in-tracemonkey] → [softblocker][sg:low]
The intermittent fail is probably bug 619565.  I probably made it more frequent through the cx->fp->scopeChain() in JSContext::gGFO in the patch, which would frob the scope chain more often.

Luke said he looked and the regression on Dromaeo was from one test, potentially because arrays have a getProperty hook.  If that's the case we could probably remove the parenting hack for property ops, because most DOM stuff gets it right via some XPConnect semi-awful hacks.  And in the rare cases where we don't get it right then, we could probably make them individually do it the right way if necessary, because there shouldn't be too many that aren't DOM operations.  It'd be somewhat inconsistent, but that might not be the end of the world.  I'll keep investigating.
Jeff, did you have a chance to look into whether this was the cause of the Kraken issues?  That might have been arrays too, of course.
I have not yet, but I'll do that too.
I ran the Dromaeo JS tests before/after a revised version of the patch, against latest TM as of a couple hours ago.

Before: http://dromaeo.com/?id=131004
After: http://dromaeo.com/?id=131006

Strings *improved* with the patch, probably a lie.  3d cube is in the noise.  Regexes is in the noise.  Code evaluation is in the noise.  Arrays lost a little (1455 before, 1382 after, higher better).  And base64 lost a lot (982 before, 680 after).

Shark now.
Comment on attachment 513637 [details] [diff] [review]
Final patch with last mrbkap-nits fixed

(removing approval for backed-out patch)
Attachment #513637 - Flags: approval2.0+ → approval2.0-
The reason Dromaeo sucked it up with the patch was that Dromaeo tests perform cross-global calls.  (Stupid me forgetting that.)  So lots of the testing, then, would be across globals, and the patch changed that so it wouldn't methodjit.  (You'd still get some trace-jit effect, but not enough to make up the difference.)  Luke believes it shouldn't be hard to do something slightly different in the methodjit ICs for the cross-global-call case.  I'll see how that pans out now, just in case this could still make 4.
** PRODUCT DRIVERS PLEASE NOTE **

This bug is one of 7 automatically changed from blocking2.0:final+ to blocking2.0:.x during the endgame of Firefox 4 for the following reasons:

 - it was marked as a soft blocking issue without a requirement for beta coverage
blocking2.0: final+ → .x+
Duplicate of this bug: 645130
Hi.

Is there any chance that you'll fix this bug any soon?

Thanks!
The initial landing had a perf regression.  Bug 650353 (which we want for many reasons) should either fix this implicitly or make this easy to fix without affecting perf.  The downside is that bug 650353 is non-trivial and has non-trivial dependencies so it will probably take a month or few to get fixed.
Status: REOPENED → RESOLVED
Closed: 9 years ago8 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: cpg
You need to log in before you can comment on or make changes to this bug.