Clearing a global's scope wrecks a whole host of assumptions which the JITs and TI make about scripts compiled against that global. There are some guards against this when executing scripts, but these are not enough, and compileAndGo scripts can (and do) still run against the global after clearing it. This is causing bug 679247 (intermittent TI tinderbox failure); I added some instrumentation, and found that a particular script is getting analyzed by inference, has its global cleared, and later on starts/resumes execution in the interpreter. When the script executes a JSOP_LAMBDA it pushes a function with a different (reconstructed) Function.prototype, which TI assertions catch. From cursory examination of the code, it looks like we only check for a cleared global in RunScript, and not when making inline calls from the interpreter. It seems like interpreter inline calls could cross from an uncleared global to a cleared global, with no isCleared check occurring. I don't know if fixing this would be enough, however. Under what constraints are scopes cleared? Can scripts associated with the global have frames on the stack? Can such scripts still have mjit code? If the latter holds, this would be a problem for the full call stubs which the mjit generates, which allow mjit frames to call arbitrary scripts which have jitcode, without a cleared global check. The real fix here is to remove scope clearing entirely (bug 637099).
Heh, so it seems the answer to bug 630072 comment 113 is 'yes'. I wonder if this is causing other random crashes. peterv: I heard the major ClearScope call is going away soon, do you have a time estimate? jst indicated this call site was the major offender for scripts-run-on-cleared-globals; do you know if this is the *sole* source of clear-scoped globals which have script run on them?
JS_ClearScope is also causing the other intermittent TI tinderbox orange, bug 679599. This case is somewhat different: JS_ClearScope is called on Array.prototype (I have no idea why), and when code later runs in the interpreter it can't find Array.prototype.push. I'm not sure why this works on m-c, but one possibility is that the script runs in jitcode on m-c, and some assumption the jitcode makes is being subtly broken.
> do you know if this is the *sole* source of clear-scoped globals Gecko currently makes ClearScope calls in 3 places: 1) Outer windows are cleared when a new inner is being set up. Now that outer windows are proxies that presumably have no slots, this may not be needed, but the comments claim this was at once point needed for correctness. 2) When nulling out the docshell of an outer window. This is presumably just leak-prevention. 3) When calling nsGlobalWindow::FreeInnerObjects. I assume this is also meant as leak-prevention. > JS_ClearScope is called on Array.prototype That's pretty odd. It should, as things stand, be called up the proto chain of the window, but not on Object.prototype.... and Array.prototype should not be there, unless someome manually munged the proto chain.
There's also 4) When unloading a component. Also for leak prevention.
Is this only on the TI branch or is any shipping version of Firefox affected?
The JS_ClearScope() removal bug is really close to done (right, peterv?), and that removes the clearing of the global window objects completely. But there's some cases left, like storage objects and what bz and mrbkap pointed out that's still left, but they're all non-global objects AFAIK.
Do we still want to track this for 9? AFAICT, the plan is to remote ClearScope, that will obviate this bug and another one.
(In reply to David Mandelin from comment #7) > Do we still want to track this for 9? AFAICT, the plan is to remote > ClearScope, that will obviate this bug and another one. No, I think we can wait for ClearScope to be removed entirely.
Seems unlikely we'd remove ClearScope during beta, should we not expect this in Fx10? Is Firefox 11 realistic?
Created attachment 592130 [details] [diff] [review] partial fix This fixes the cases where the global is cleared when scripts have compiled code against it, and where the interpreter tries to push an inline frame for a script with a cleared global. It is still possible to run code against the global in the interpreter (no security problem there), but only for frames which were on the stack when the global was cleared. It's not clear what to do should the latter case come up, as JS_ClearScope is infallible and I don't want to change its callers.
JS_ClearScope should simply be eliminated. The browser is pretty much ready to drop it I think. peterv probably knows the exact status.
(In reply to Andreas Gal :gal from comment #11) > JS_ClearScope should simply be eliminated. The browser is pretty much ready > to drop it I think. peterv probably knows the exact status. Yeah, that was my opinion in comment 8. That was 3 months ago though, I don't think there has been progress on bug 637099 since then, and this is a small fix which removes an existing security risk and cuts off this steady trickle of issues like bug 715149.
Comment on attachment 592130 [details] [diff] [review] partial fix Review of attachment 592130 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/vm/GlobalObject.cpp @@ +365,5 @@ > */ > cx->compartment->newObjectCache.reset(); > + > +#ifdef JS_METHODJIT > + /* Destroy compiled code for any scripts parented to this global. */ r+ with comment revised to something like "If the global was cleared, for safety ..."
https://hg.mozilla.org/integration/mozilla-inbound/rev/4d8f729aa1aa There was a test (js/xpconnect/tests/mochitest/test_bug605167.html) which consistently tried to run a script against a cleared global, and which was failing because of this change --- a different exception was being thrown than the one it was testing for (it expected a permission-denied exception, and got the compile-and-go one instead). I just relaxed the test itself.
This broke Thunderbird's account provisioner - see bug 723910.
Unless there is a bug in the patch, that would mean thunderbird was running script on a cleared global, which is unsafe and can cause crashes.
See bug 730703. Alice's fix range includes this bug.
Comment on attachment 592130 [details] [diff] [review] partial fix From the error in bug 730703 comment 2 it seems likely that crash was fixed by this bug. I'm guessing it's too late for beta but asking anyways. [Approval Request Comment] User impact if declined: Potential repeatable crashes, e.g. bug 730703. Testing completed (on m-c, etc.): Has been on m-c for ~1 month Risk to taking this patch (and alternatives if risky): Some risk for changes in website behavior, can cause websites to behave differently if they run scripts on globals which have been cleared.
I have yet another case (a commercial 3d model viewer) that is fixed by this patch. I think this patch could really help stability.
Comment on attachment 592130 [details] [diff] [review] partial fix Please land the patch on Aurora. We can jump one train here for a crash fix and a security hole, but it's just too late to try to get something into Beta which might break web sites.
[Triage Comment] Don't land this until Firefox 12 is on mozilla-beta please.
Comment on attachment 592130 [details] [diff] [review] partial fix [Triage Comment] Please go ahead and land this on ESR branch (see https://wiki.mozilla.org/Release_Management/ESR_Landing_Process).
qa- until testcase-wanted is satisfied
Brian, could you rebase this for ESR10? Most of the bustage is trivial, but the patch landed to aurora uses Recompiler::clearStackReferences, which ESR10 doesn't have (and it seems to lack some of the machinery used by that function too).
Is there any direct way to verify this fix?
(In reply to Al Billings [:abillings] from comment #27) > Is there any direct way to verify this fix? Don't worry about it--this bug is a partial fix. The real fix is bug 637099.
Heh. That bug isn't frightening at all. :-)
Backed out for orange. https://hg.mozilla.org/releases/mozilla-esr10/rev/57aed64e01f3