Closed Bug 869740 Opened 6 years ago Closed 6 years ago

Non-LIFO use of Rooted in XPCConvert::NativeInterface2JSObject()

Categories

(Core :: XPConnect, defect)

defect
Not set

Tracking

()

RESOLVED INVALID

People

(Reporter: terrence, Assigned: jonco)

References

Details

Attachments

(2 files, 2 obsolete files)

:cpeterson reported this over IRC. We are hitting a ~Rooted assertion in a --enable-root-analysis browser build on MacOS X.
Which use, exactly?

Is it coming from the lazy construction of the XPCCallContext in the XPCLazyCallContext?  If we added a Rooted member to XPCCallContext, that would definitely do it.
(In reply to Boris Zbarsky (:bz) from comment #1)
> Which use, exactly?
> 
> Is it coming from the lazy construction of the XPCCallContext in the
> XPCLazyCallContext?  If we added a Rooted member to XPCCallContext, that
> would definitely do it.

We did. I forgot about LazyCallContexts. :-(

We'll probably need to go back to the custom rooter strategy.
Attached file stack-trace.txt
Here is the stack trace. I'm attaching it as a text file because it is 81 frames deep! :)
Assignee: general → nobody
Component: JavaScript Engine → XPConnect
Assignee: nobody → jcoppeard
Attached patch Patch (obsolete) — Splinter Review
Here's a patch to split the initialization of the contained XPCCallContext into two parts.  The roots are initialized when the XPCLazyCallContext is constructed, but the rest happens only if GetXPCCallContext() is called, as before.

I checked that this fixes the crash for rooting analysis builds (there is a later one however).
Attachment #747008 - Flags: review?(bobbyholley+bmo)
Blocks: 868302
Comment on attachment 747008 [details] [diff] [review]
Patch

Review of attachment 747008 [details] [diff] [review]:
-----------------------------------------------------------------

Ugh. This stuff feels like it's about to collapse under its own weight. It really needs a rewrite, but I don't want to make you do that. r=bholley contingent on copious commenting.

::: js/xpconnect/src/xpcprivate.h
@@ +1213,5 @@
>      // no copy ctor or assignment allowed
>      XPCCallContext(const XPCCallContext& r); // not implemented
>      XPCCallContext& operator= (const XPCCallContext& r); // not implemented
>  
> +    // The following are for construction from XPCLazyCallContext.

Please add copious commenting explaining all the factors at play here, including:
* Why we still need placement new (because we sometimes initialize lccx from an existing XPCCallContext, in which case we still don't want to bother with creating a stack-allocated ccx).

* The rooting issues at play.

* Why we need this whole LazyInit thing.
Attachment #747008 - Flags: review?(bobbyholley+bmo) → review+
So the next failure is closely related. The stack is basically:

#0  JS::Rooted<JSObject*>::~Rooted
#1  XPCLazyCallContext::~XPCLazyCallContext at js/xpconnect/src/xpcprivate.h:1344
#2  nsIDOMXPathResult_IterateNext at obj/js/xpconnect/src/dom_quickstubs.cpp:1683
#3  js::CallJSNative

The Rooted being destroyed is XPCLazyCallContext.mFlattenedJSObject. The top of the Rooted stack is the mFlattenedJSObject in the lazily constructed XPCCallContext stored in mData.

Line 1344 is the opening brace of ~XPCLazyCallContext, so it's the automatic destruction of mFlattenedJSObject from ~XPCLazyCallContext. mCcxToDestroy and mCcx are reported as NULL at the site of the crash, which makes no sense to me whatsoever -- how can the address of something inside mData be on the Rooted stack if it was never constructed?

Anyway, I haven't tracked it down yet, but going through it makes me wonder how this lazy construction stuff can work. If any Rooted is constructed in between the outer object's construction and the lazy construction, doesn't it mess up LIFO? Or does that not happen for some reason? Specifically:

  LazyOuter O; // O is { Rooted R; SpaceFor<Inner> inner }
  Rooted R;
  O.constructInner();


So O.R gets pushed on the stack, followed by R, followed by O.inner.R. At destruction time, R is destroyed first, and mismatches with O.inner.R. Are we saying that you shouldn't declare any Rooteds between LazyOuter and constructInner()? (This doesn't seem to be the case for the crash I'm talking about above, but I'm asking the general question.)
Comment on attachment 747008 [details] [diff] [review]
Patch

Oh! Sorry, my question was lame. The idea is that all of the Rooteds get pushed unconditionally, huh? Sadly, the patch only conditionally destroys the mData's XPCCallContext, so they never get popped if LazyInit() never gets called.

So the fix would be to destroy it unconditionally now that it's being constructed unconditionally. In which case, it isn't all that lazy, and certainly shouldn't be hidden in a mozilla::AlignedStorage2. But perhaps Init() is still doing some avoidable work?

Oh. Ugh. Except XPCLazyCallContext wants to be able to accept an existing XPCCallContext in its constructor, and use it instead.

This is craziness.

I see three paths forward: (1) plow ahead, keeping an unconstructed mData and mCcx and mCcxToDestroy. If a ccx is passed in, use it, otherwise construct one. Still do initialization lazily, guarded by IsValid(), for the case where no ccx was passed in and the ccx was never needed. Or (2) ditch mData, make mCcx be an actual XPCCallContext, and give XPCCallContext a copy constructor. Or (3) make a Rooted subclass that maintains a doubly-linked list, and therefore allows any ordering of destruction. Use that in things that might be lazily constructed.
Attachment #747008 - Flags: review-
Attachment #747008 - Flags: review+
Attachment #747008 - Flags: feedback?(bobbyholley+bmo)
Well, the update to jonco's patch to do #1 is a single line change, so let's see what bholley thinks of it. Comments not yet added; I'll wait to see what you think of the patch.
Attachment #747274 - Flags: review?(bobbyholley+bmo)
Assignee: jcoppeard → sphink
Argh, I'm not sure if I wanted bzexport to automatically reassign this to me. I suppose it doesn't matter that much. Anyway, the results of the attached patch are better, but we still have some LIFO failures: https://tbpl.mozilla.org/?tree=Try&rev=0f4dc522eb1d (these may be unrelated to LazyXPCCallContext).
Comment on attachment 747274 [details] [diff] [review]
Non-LIFO use of Rooted in XPCConvert::NativeInterface2JSObject()

Yes, let's do this (I think). But please address review comments from the previous patch.
Attachment #747274 - Flags: review?(bobbyholley+bmo) → review+
Ok. Pushed the commented version to try at https://tbpl.mozilla.org/?tree=Try&rev=03f128810edc

Added comments are:

    // The following are for construction from XPCLazyCallContext. Delay doing
    // the initialization work until we know that the XPCCallContext will
    // actually be needed, but still eagerly construct the overall
    // XPCCallContext because it contains Rooted<T> fields that must be
    // destroyed in LIFO order. An XPCCallContext will always push its
    // Rooted<T>'s onto the stack upon construction, and then pop them off in
    // LIFO order during its destruction, regardless of whether LazyInit ever
    // gets called.

and before the placement new call:

        // Construct with placement new to allow for the other constructor,
        // which accepts an existing XPCCallContext to use. This constructed
        // XPCCallContext may never be needed, in which case its LazyInit will
        // not be invoked, but it must be eagerly constructed here in order to
        // maintain LIFO order of its Rooted<T> fields and the ones contained
        // in XPCLazyCallContext. (If its construction were delayed, then any
        // Rooted<T> created in between the construction of XPCLazyCallContext
        // and the LazyInit call would not get destroyed during
        // ~XPCLazyCallContext and therefore would not follow LIFO ordering.)

Hopefully I got that all right. Complain before I land if they seem wrong!
(In reply to Steve Fink [:sfink] from comment #9)

Thanks for finishing this off - I didn't realise you didn't mean to reassign it to yourself.
Attachment #747274 - Flags: checkin+
(In reply to Jon Coppeard (:jonco) from comment #12)
> Thanks for finishing this off - I didn't realise you didn't mean to reassign
> it to yourself.

I can blame that on a a tool failure, except that I wrote that part of the tool.

I just stumbled across this because I was looking at what was blocking bug 868302. It shouldn't have taken me so long figure out what was going on, especially since it's already well described in the comments on this bug. :(
Comments look great. Thanks sfink!
(In reply to Steve Fink [:sfink] from comment #13)
> http://hg.mozilla.org/integration/mozilla-inbound/rev/0d90de935ba3

Backed out for mochitest-1 shutdown crashes.
https://hg.mozilla.org/integration/mozilla-inbound/rev/82ab6b5d0c03

https://tbpl.mozilla.org/php/getParsedLog.php?id=22826257&tree=Mozilla-Inbound

Note that bug 867857 is also on file for a recent instance of this happening on B2G.
Comment on attachment 747008 [details] [diff] [review]
Patch

IIUC this feedback request is obsolete. Correct me if I'm wrong.
Attachment #747008 - Flags: feedback?(bobbyholley+bmo)
Attachment #747274 - Flags: checkin+
Assignee: sphink → nobody
Assignee: nobody → jcoppeard
Depends on: 877261
Attached patch Updated patchSplinter Review
Fix for mochitest-1 failures - since we now construct the embedded XPCCallContext object in more situations than we used to, we cannot use its presence as an indicator that we don't need to call JS_EndRequest().
Attachment #747008 - Attachment is obsolete: true
Attachment #747274 - Attachment is obsolete: true
bholley, do you think the work in 877261 will make this unnecessary, or I should I go ahead and try and land this?
Flags: needinfo?(bobbyholley+bmo)
(In reply to Jon Coppeard (:jonco) from comment #19)
> bholley, do you think the work in 877261 will make this unnecessary, or I
> should I go ahead and try and land this?

Yes. Let's focus on that.
Flags: needinfo?(bobbyholley+bmo)
This is no longer an issue now XPCLazyCallContext has been removed by bug 877261.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → INVALID
You need to log in before you can comment on or make changes to this bug.