Closed
Bug 947555
Opened 11 years ago
Closed 11 years ago
Rooting hazard failures (false positives) from script settings stack
Categories
(Core :: XPConnect, defect)
Tracking
()
RESOLVED
DUPLICATE
of bug 937317
People
(Reporter: bzbarsky, Assigned: sfink)
References
Details
Attachments
(1 file, 1 obsolete file)
13.14 KB,
patch
|
bzbarsky
:
review+
|
Details | Diff | Splinter Review |
Long story short, the analysis thinks GetIncumbentGlobal can GC, so this sort of code: new Function(&val.toObject(), mozilla::dom::GetIncumbentGlobal()); is now a GC hazard. Or so the analysis thinks. Its reason for thinking GetIncumbentGlobal can GC is: GC Function: nsIGlobalObject* mozilla::dom::GetIncumbentGlobal() JSContext* nsContentUtils::GetCurrentJSContextForThread() JSContext* nsContentUtils::GetCurrentJSContext() FieldCall: nsIXPConnect.GetCurrentJSContext As in, it sees that nsIXPConnect.GetCurrentJSContext is virtual and assumes it can GC. However: > mrgiggles, can nsXPConnect::GetCurrentJSContext GC? <mrgiggles> bz: No, |JSContext* nsXPConnect::GetCurrentJSContext()| cannot GC so this is in fact a false positive. That's good. ;) Can we just annotate nsIXPConnect.GetCurrentJSContext as not gcing and force the analysis to fail if nsXPConnect::GetCurrentJSContext ever starts being able to GC? If not, can we rejigger nsContentUtils::GetCurrentJSContext() to not do virtual calls (e.g. store a reference to the XPCJSRuntime or the XPCJSContextStack in nsContentUtils directly)?
Assignee | ||
Comment 1•11 years ago
|
||
I hope to fix this with bug 947400, though unfortunately, I don't think it's adequate yet. The analysis has had a limitation where virtual calls are treated as always GCing for the purpose of determining whether there's a hazard in the calling function. They are *not* assumed to GC for the purpose of constructing the call graph. So if you created a dummy function with a virtual call in it, and called the dummy function instead of the direct virtual call, the analysis would do the right thing. Bug 947400 fixes this to properly handle virtual function calls. However, it looks like that's not enough to detect that GetCurrentJSContext() cannot GC. I am still looking into why. But yes, I'll either fix it directly in the analysis, or add an annotation.
Assignee | ||
Comment 2•11 years ago
|
||
The fix in bug 947400 failed to resolve this bug because the relevant method is on a class that inherits from nsISupports, and the analysis special-cases nsISupports stuff because they're overridable via script (or extensions, I guess?). If I undo the special-casing for nsISupports.GetCurrentJSContext (only), it does the virtual function resolution properly. I haven't finished a complete analysis with that change yet, but it's looking good so far. With this setup, if nsXPConnect::GetCurrentJSContext were modified to GC, the analysis would not fail. But it would start reporting all these hazards, which seems fine since they would be real. We'd have to figure out what to do about it. I'll attach a patch after verifying that it really does fix this problem.
Assignee | ||
Updated•11 years ago
|
Assignee: nobody → sphink
Reporter | ||
Comment 3•11 years ago
|
||
> the analysis would not fail. But it would start reporting all these hazards
That would turn the tree orange, yes? That's all I really want: that we don't silently introduce hazards. ;)
We have no plans to make GetCurrentJSContext GC. We do have plans to try to get rid of it entirely. ;)
Assignee | ||
Comment 4•11 years ago
|
||
(In reply to Boris Zbarsky [:bz] from comment #3) > > the analysis would not fail. But it would start reporting all these hazards > > That would turn the tree orange, yes? That's all I really want: that we > don't silently introduce hazards. ;) Yes. Well, ok, it would after I unbreak the result checking code. Which I have a patch for in bug 944158, but I can't land it yet because of the 678 new hazards from this bug. ;-) And my fix didn't work. Haven't looked into why, yet. Argh.
Assignee | ||
Comment 5•11 years ago
|
||
Bug 947400 + special-casing GetCurrentJSContext revealed the next problem: nsIEventTarget.IsOnCurrentThread. That's pretty much the same story. Special-casing it gets me to: GC Function: nsIGlobalObject* mozilla::dom::GetIncumbentGlobal() nsIGlobalObject* xpc::GetNativeForGlobal(JSObject*) nsCOMPtr<T>::nsCOMPtr(nsQueryInterface) [with T = nsIXPConnectWrappedNative] void nsCOMPtr<T>::assign_from_qi(nsQueryInterface, const nsIID&) [with T = nsIXPConnectWrappedNative; nsIID = nsID] uint32 nsQueryInterface::operator(nsID*, void**)(const nsIID&, void**) const FieldCall: nsISupports.QueryInterface Somebody please tell me that can't GC, and why, so I can tell the analysis. needinfo? bholley.
Flags: needinfo?(bobbyholley+bmo)
Comment 6•11 years ago
|
||
(In reply to Steve Fink [:sfink] from comment #5) > Somebody please tell me that can't GC, and why, so I can tell the analysis. > needinfo? bholley. So, we're QIing the global to nsIXPCWrappedNative. This is generally ok, I think, but it's sketchy there are all kinds of funny things that can happen during QI. What would be the impact of just deciding that this function can GC?
Flags: needinfo?(bobbyholley+bmo) → needinfo?(bzbarsky)
Assignee | ||
Comment 7•11 years ago
|
||
(In reply to Bobby Holley (:bholley) from comment #6) > > What would be the impact of just deciding that this function can GC? Seems pretty bad. I mean, it triggers about 670 new hazards. Maybe there's an easy fix, but I'll let bz chime in. Talking to bholley on irc: <sfink> bholley: when you say QIing the global to nsIXPCWrappedNative is sketchy since all kinds of funny things can happen during QI (bug 947555), do you mean QI in general or QI to nsIXPCWrappedNative in particular? bholley sfink: QI in general sfink: I think it's probably fine sfink: but QI in general _can_ run script sfink: and there's all sorts of machinery that can run sfink bholley: right, I know that. I was wondering if we could declare this particular QI to be safe. sfink and if so, why this one feels safer than others, so I can minimize the holes I poke in the analysis bholley sfink: well, we know the object is a global object sfink: which means that it isn't an XPCWrappedJS sfink: and it isn't an aggregated-JS-with-native sfink bholley: is this stuff easily assertable? bholley sfink: well, in the common cases sure sfink: my concern is that maybe there's some case I hadn't thought of where we might hit some QI stuff that runs script bholley sfink: for WNs, the QI implementation is in XPCWrappedNative.cpp: NS_INTERFACE_MAP_BEGIN_CYCLE_COLLECTION(XPCWrappedNative) sfink bholley: right, so I was hoping that I could annotate + add a believed-to-be-accurate assertion to back it up, and watch for assert failures I mean, I could always just do an AutoAssertNoGC but that'll only fire if it happens to GC, not if it's potentially possible bholley sfink: hm sfink: oh! sfink: I might have an idea sfink: instead of QIing the private, we can just check the isWrappedNative bit on the JSClass sfink: and then static_cast youpie sfink ok. I mean, I'd be fine with continuing to QI, but asserting the isWrappedNative bit. but if that's faster/cleaner, sounds great! bholley sfink: yes, and then we don't have to special-case the analysis sfink right yay, thanks! So I tried that out. It seems to work fine for elminating nsCOMPtr<nsIXPConnectWrappedNative> wn = do_QueryInterface(native) but sadly, the return value of the whole function is nsCOMPtr<nsIGlobalObject> global = do_QueryInterface(native); and that one isn't a plain static_cast<>. There's an offset of 0x8 for a SandboxPrivate, for example. I don't even know if doing a QI to nsIGlobalObject here is gc-free or not. bholley?
Flags: needinfo?(bobbyholley+bmo)
Reporter | ||
Comment 8•11 years ago
|
||
> What would be the impact of just deciding that this function can GC?
You'd need to change the ctor for CallbackObject and all subclasses to take a Handle instead of a raw JSObject* and then fix all callers to actually pass in a useful Handle (probably by sticking the value into a Rooted on the stack).
It's not insane, but it's a bit of work. And a small (very small, so I'd be ok with it) perf hit.
Flags: needinfo?(bzbarsky)
Reporter | ||
Comment 9•11 years ago
|
||
Oh, and the vast majority of those 670 callsites are codegenned, so the change would not be _that_ big.
Comment 10•11 years ago
|
||
(In reply to Steve Fink [:sfink] from comment #7) > (In reply to Bobby Holley (:bholley) from comment #6) > > > > What would be the impact of just deciding that this function can GC? > > Seems pretty bad. I mean, it triggers about 670 new hazards. Maybe there's > an easy fix, but I'll let bz chime in. > > Talking to bholley on irc: > > <sfink> bholley: when you say QIing the global to nsIXPCWrappedNative is > sketchy since all kinds of funny things can happen during QI (bug 947555), > do you mean QI in general or QI to nsIXPCWrappedNative in particular? > bholley sfink: QI in general > sfink: I think it's probably fine > sfink: but QI in general _can_ run script > sfink: and there's all sorts of machinery that can run > sfink bholley: right, I know that. I was wondering if we could declare this > particular QI to be safe. > sfink and if so, why this one feels safer than others, so I can minimize > the holes I poke in the analysis > bholley sfink: well, we know the object is a global object > sfink: which means that it isn't an XPCWrappedJS > sfink: and it isn't an aggregated-JS-with-native > sfink bholley: is this stuff easily assertable? > bholley sfink: well, in the common cases sure > sfink: my concern is that maybe there's some case I hadn't thought of where > we might hit some QI stuff that runs script > bholley sfink: for WNs, the QI implementation is in XPCWrappedNative.cpp: > NS_INTERFACE_MAP_BEGIN_CYCLE_COLLECTION(XPCWrappedNative) > sfink bholley: right, so I was hoping that I could annotate + add a > believed-to-be-accurate assertion to back it up, and watch for assert > failures > I mean, I could always just do an AutoAssertNoGC > but that'll only fire if it happens to GC, not if it's potentially possible > bholley sfink: hm > sfink: oh! > sfink: I might have an idea > sfink: instead of QIing the private, we can just check the isWrappedNative > bit on the JSClass > sfink: and then static_cast > youpie > sfink ok. I mean, I'd be fine with continuing to QI, but asserting the > isWrappedNative bit. > but if that's faster/cleaner, sounds great! > bholley sfink: yes, and then we don't have to special-case the analysis > sfink right > yay, thanks! > > So I tried that out. It seems to work fine for elminating > > nsCOMPtr<nsIXPConnectWrappedNative> wn = do_QueryInterface(native) > > but sadly, the return value of the whole function is > > nsCOMPtr<nsIGlobalObject> global = do_QueryInterface(native); > > and that one isn't a plain static_cast<>. There's an offset of 0x8 for a > SandboxPrivate, for example. I don't even know if doing a QI to > nsIGlobalObject here is gc-free or not. bholley? Oh right. Duh. So, here are the objects we might be dealing with: * an XPCWN (detectable in the case above). That, in turn, will contain an nsGlobalWindow (inheriting nsIScriptGlobalObject). * nsInProcessTabChildGlobal * WorkerScope * SandboxPrivate * BackstagePass (this is all from [1]) I can verify that these QI implementations will not GC when QIed to nsIGlobalObject. But I'm kind of concerned about leaving this footgun around. I think we should just Handle-ify this API. [1] http://mxr.mozilla.org/mozilla-central/search?string=public%2BnsIGlobalObject
Flags: needinfo?(bobbyholley+bmo)
Reporter | ||
Comment 11•11 years ago
|
||
> I think we should just Handle-ify this API.
I think I agree, as I said on irc.
Assignee | ||
Comment 12•11 years ago
|
||
It's straightforward to do. Is this what you were thinking of? My naming is kind of lame. I'll fix up the patch if it's roughly right.
Attachment #8345002 -
Flags: feedback?(bzbarsky)
Reporter | ||
Comment 13•11 years ago
|
||
Comment on attachment 8345002 [details] [diff] [review] Root CallbackObject params Yes, just like that. But wrap a scope around the tempRoot usage in codegen, please, to prevent name collisions and whatnot.
Attachment #8345002 -
Flags: feedback?(bzbarsky) → feedback+
Assignee | ||
Comment 14•11 years ago
|
||
(In reply to Boris Zbarsky [:bz] from comment #13) > Comment on attachment 8345002 [details] [diff] [review] > Root CallbackObject params > > Yes, just like that. But wrap a scope around the tempRoot usage in codegen, > please, to prevent name collisions and whatnot. Ok, but that'll always result in an extra layer of scoping. The 1st tempRoot goes into a parameter passed to wrapObjectTemplate, which always puts it inside an if block. The 2nd tempRoot puts it into an if block just a little further down. I looked into adding a 'wrapScope' parameter to wrapObjectTemplate, but... well, it already always does, so the parameter would be ignored. But your wish is my command! ;-) Updated patch attached. I tried and failed to come up with a better name for tempRoot.
Attachment #8345409 -
Flags: review?(bzbarsky)
Assignee | ||
Updated•11 years ago
|
Attachment #8345002 -
Attachment is obsolete: true
Reporter | ||
Comment 15•11 years ago
|
||
Comment on attachment 8345409 [details] [diff] [review] Root around GC call GetIncumbentGlobal >+ "{\n" + Drop the + sign here and on the next two lines. Python will automatically concatenate adjacent literal strings for you. And just put the "% name" at the very end. Similar at the other spot. And add a comment after the '{' that it's a scope for tempRoot. r=me
Attachment #8345409 -
Flags: review?(bzbarsky) → review+
Assignee | ||
Comment 16•11 years ago
|
||
Since this may end up blocking bug 937317 from re-landing, moving the patch over there and resolving this bug.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → DUPLICATE
You need to log in
before you can comment on or make changes to this bug.
Description
•