Closed
Bug 883450
Opened 11 years ago
Closed 11 years ago
Conservative Stack Scanner doesn't work when not in a request
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
mozilla24
People
(Reporter: bholley, Assigned: bholley)
References
Details
(Keywords: sec-critical, Whiteboard: [Embargo until ESR17 EOL][adv-main24+])
Attachments
(6 files)
3.19 KB,
patch
|
gkrizsanits
:
review+
|
Details | Diff | Splinter Review |
2.65 KB,
patch
|
gkrizsanits
:
review+
|
Details | Diff | Splinter Review |
8.84 KB,
patch
|
gkrizsanits
:
review+
|
Details | Diff | Splinter Review |
3.76 KB,
patch
|
bzbarsky
:
review+
|
Details | Diff | Splinter Review |
10.41 KB,
patch
|
gkrizsanits
:
review+
|
Details | Diff | Splinter Review |
2.18 KB,
patch
|
terrence
:
review+
|
Details | Diff | Splinter Review |
I've been debugging bug 879649 for a good day and a half now, and finally figured out what's going on. In bug MarkConservativeStackRoots, we bail out of stack scanning if !cgcd->hasStackToScan(): http://mxr.mozilla.org/mozilla-central/source/js/src/gc/RootMarking.cpp#316 This predicate just evaluates to !!nativeStackTop - that is to say, whether we have a usable pointer to the top of the stack. This pointer is generated in RecordNativeStackTopForGC, but is then _cleared_ in updateForRequestEnd(), which gets called when our request count drops to zero: http://mxr.mozilla.org/mozilla-central/source/js/src/jsapi.cpp#1230 And though RecordNativeStackTopForGC is called at the beginning of Collect(), it no-ops because the request depth is zero: http://mxr.mozilla.org/mozilla-central/source/js/src/jsgc.cpp#963 This means that _anything_ on the stack outside of a request isn't rooted at all. I wouldn't be surprised if there were dozens of such cases in the browser - especially from consumers of nsIScriptContext functions, who don't necessarily have a need to enter a request themselves. Marking sec-critical, because the surface is wide enough that there's almost certainly an exploitable case somewhere.
I think we do sort of depend on being able to skip stack scanning during timer GCs. The stack scanner has a pretty high false positive rate on 32-bit platforms, and I'm afraid we'd leak a lot if we always scanned. However, we could probably find a better way to decide whether to skip stack scanning. Rather than "are we in a request", maybe we could skip it if we're invoked from a timer GC when we're not in a nested event loop. We also have to watch out for schedulePreciseGC. One question I have is how this is happening. My assumption was always that: 1. We have to be doing some JSAPI stuff in order to GC. 2. Almost all JSAPI calls assert that you're in a request. Which one of these is broken in this case?
Comment 2•11 years ago
|
||
Since when do we require a stack scan when we are not in a request? Only <20% of all our GCs do a stack scan last time I measured.
(In reply to Gregor Wagner [:gwagner] from comment #2) > Since when do we require a stack scan when we are not in a request? > Only <20% of all our GCs do a stack scan last time I measured. We don't scan the stack when we're not in a request. Bobby filed the bug because he found a case where that caused us to fail to scan something that needed to be scanned. We could just put a request there, but he's worried that it's a widespread problem.
Comment 4•11 years ago
|
||
I think the particular case was this: http://mxr.mozilla.org/mozilla-central/source/content/xul/content/src/nsXULElement.cpp#2575 This method has a JSScript pointer that is null, which calls into nsJSContext::CompileScript, which does an AutoRequest, sets the script, then exits the request, so we're in danger, returns the script, then roots the script on the heap. The particular problem he was seeing was that a Pop(), which happened in between the end of the AutoRequest and the return from CompileScript, was triggering a GC.
Comment 5•11 years ago
|
||
I guess we could dynamically detect this by adding an assertion to JS::Rooted that we're in a request? Then, if we're properly precisely rooted, we should be properly dynamically rooted, for paths we test. (Of course, there is a certain irony here.)
Bobby, do you have a stack or anything? I don't see where Pop() can GC outside of a request, but maybe I got lost following all the indirections.
Assignee | ||
Comment 7•11 years ago
|
||
nsCxPusher embeds a JSAutoRequest, but destroys it (manually) before calling xpc::PopContext: http://mxr.mozilla.org/mozilla-central/source/js/xpconnect/src/nsCxPusher.cpp#168 xpc::PopContext invokes JS_RestoreFrameChain, which can GC (both according to evidence and to mrgiggles). I also don't think that adding more JSAutoRequests is the right solution to this problem, especially since the Request API is on its way out anyway.
Assignee | ||
Comment 8•11 years ago
|
||
(note that the issue in bug 879649 was that _adding_ an AutoPushJSContext (which wraps an nsCxPusher) to nsJSContext::CompileScript caused a ton of crashes in the wild, even though we're incorrect without it).
> xpc::PopContext invokes JS_RestoreFrameChain, which can GC (both according to evidence > and to mrgiggles). Are you talking about this call? http://mxr.mozilla.org/mozilla-central/source/js/xpconnect/src/XPCJSContextStack.cpp#52 What's confusing is that the line before it opens a request. So the GC should happen inside a request and we should scan the stack, right?
Also, JS_RestoreFrameChain does assert that we're in a request.
Assignee | ||
Comment 11•11 years ago
|
||
(In reply to Bill McCloskey (:billm) from comment #9) > > xpc::PopContext invokes JS_RestoreFrameChain, which can GC (both according to evidence > > and to mrgiggles). > > Are you talking about this call? > > http://mxr.mozilla.org/mozilla-central/source/js/xpconnect/src/ > XPCJSContextStack.cpp#52 > What's confusing is that the line before it opens a request. So the GC > should happen inside a request and we should scan the stack, right? (In reply to Bill McCloskey (:billm) from comment #10) > Also, JS_RestoreFrameChain does assert that we're in a request. Oh, good point. However, shortly after calling PopRequest, we call ScriptEvaluted: http://mxr.mozilla.org/mozilla-central/source/js/xpconnect/src/nsCxPusher.cpp#186 Whose first action in office is to call JS_MaybeGC(): http://mxr.mozilla.org/mozilla-central/source/dom/base/nsJSEnvironment.cpp#2333 So that's probably the cause of the GC we're seeing here.
OK. Direct calls to GC are one of the cases where we don't assert that you're in a request, so that makes sense. I'm kind of tempted to say that we should just spot-fix this. This problem will go away when exact rooting lands. My sense is that that will be ready roughly at the same time as getting rid of requests. The alternative is to do something like comment 1 suggests.
Assignee | ||
Comment 13•11 years ago
|
||
(In reply to Bill McCloskey (:billm) from comment #12) > I'm kind of tempted to say that we should just spot-fix this. This problem > will go away when exact rooting lands. Do you mean spot fix this one case we found? If so, I'm not really comfortable with that. If you mean "sprinkle in JSAutoRequests in the right places," I think that's the correct solution, assuming we can implement strong dynamic enforcement. I think the right approach is to assert that we're in a request every time we create a Rooted<T>. The static analysis has pointed us to all the places where we want to keep a GCThing alive across GC potential GC calls. For most of those, we should already be in a request. In theory, the only failures that crop up should be the hazards we're trying to fix here. I'll throw in some assertions and see what happens.
Assignee | ||
Updated•11 years ago
|
Assignee: general → bobbyholley+bmo
Assignee | ||
Comment 14•11 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=1f1e1b2f3e9c
Assignee | ||
Comment 15•11 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=4c42110868c2
Assignee | ||
Comment 16•11 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=9befe60d26e6
Comment 17•11 years ago
|
||
Ah, so... CallbackObject _used_ to have a request there until it got removed in bug 868130. Note that the comments were not updated in the process; please update them here?
Assignee | ||
Comment 18•11 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=05c86f1e1982
Assignee | ||
Comment 19•11 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=44175e0899ba
Assignee | ||
Comment 20•11 years ago
|
||
Green! Uploading patches and flagging for review.
Assignee | ||
Comment 21•11 years ago
|
||
There's not a lot we can do without it in the browser these days.
Attachment #765108 -
Flags: review?(gkrizsanits)
Assignee | ||
Comment 22•11 years ago
|
||
Attachment #765109 -
Flags: review?(gkrizsanits)
Assignee | ||
Comment 23•11 years ago
|
||
Attachment #765110 -
Flags: review?(gkrizsanits)
Assignee | ||
Comment 24•11 years ago
|
||
Attachment #765111 -
Flags: review?(bzbarsky)
Assignee | ||
Comment 25•11 years ago
|
||
Attachment #765112 -
Flags: review?(gkrizsanits)
Assignee | ||
Comment 26•11 years ago
|
||
Attachment #765113 -
Flags: review?(gkrizsanits)
Assignee | ||
Comment 27•11 years ago
|
||
Comment on attachment 765113 [details] [diff] [review] Part 6 - Assert that we're in a request whenever we create a Rooted<T>. v1 Er, this.
Attachment #765113 -
Flags: review?(gkrizsanits) → review?(terrence)
Comment 28•11 years ago
|
||
Comment on attachment 765111 [details] [diff] [review] Part 4 - Re-order stuff in CallSetup so that we construct the RootedObject after the Push. v1 > // We construct our JS::Rooted right after our JSAutoRequest; let's just Please do fix the comment; see comment 17. r=me with that.
Attachment #765111 -
Flags: review?(bzbarsky) → review+
Comment 29•11 years ago
|
||
(In reply to Bobby Holley (:bholley) from comment #0) > especially from consumers of nsIScriptContext functions Or mCx, or GetSafeJSContext... Based on some mxr search we have quite some cases to worry about. So the idea here is we assert and rely on our test coverage, and fix only the cases that pops up on try with this assert, right? I have to say that it's not a very relaxing approach. Anyway, I'll finish up the review of the patches soon, just was wondering if it makes sense to try and do some major cleaning up on this department (by manually cherry-picking the suspicious looking cx usages) or this whole thing is going away soon, and this is a good enough solution for the time being?
Comment 30•11 years ago
|
||
Comment on attachment 765108 [details] [diff] [review] Part 1 - MOZ_CRASH if we fail to spin up the SafeJSContext, and remove error handling. v1 Review of attachment 765108 [details] [diff] [review]: ----------------------------------------------------------------- Shouldn't we rename it to SafeJSContext now that it cannot fail? (at some point in the future...)
Attachment #765108 -
Flags: review?(gkrizsanits) → review+
Updated•11 years ago
|
Attachment #765109 -
Flags: review?(gkrizsanits) → review+
Assignee | ||
Comment 31•11 years ago
|
||
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #29) > Or mCx, or GetSafeJSContext... Based on some mxr search we have quite some > cases to worry about. So the idea here is we assert and rely on our test > coverage, and fix only the cases that pops up on try with this assert right? Well, it's better than that. We're (more or less) green on the sixgill exact rooting static analysis. The means we can statically prove that we use Rooted/Handle for every stack value that we need to keep alive across calls that can GC. So as long as we at least call through these functions once, the assertions will tell us the right answer.
Updated•11 years ago
|
Attachment #765110 -
Flags: review?(gkrizsanits) → review+
Comment 32•11 years ago
|
||
(In reply to Bobby Holley (:bholley) from comment #31) > Well, it's better than that. We're (more or less) green on the sixgill exact > rooting static analysis. The means we can statically prove that we use > Rooted/Handle for every stack value that we need to keep alive across calls > that can GC. So as long as we at least call through these functions once, > the assertions will tell us the right answer. What ever the sixgill exact rooting static analysis exactly does under the hood, I already feel a bit more relaxed :)
Comment 33•11 years ago
|
||
Comment on attachment 765112 [details] [diff] [review] Part 5 - Miscellaneous requests. v1 Review of attachment 765112 [details] [diff] [review]: ----------------------------------------------------------------- AutoJSContext cx; nsresult rv; + JSAutoRequest ar(cx); why?
Attachment #765112 -
Flags: review?(gkrizsanits) → review+
Comment 34•11 years ago
|
||
Comment on attachment 765113 [details] [diff] [review] Part 6 - Assert that we're in a request whenever we create a Rooted<T>. v1 Review of attachment 765113 [details] [diff] [review]: ----------------------------------------------------------------- r=me
Attachment #765113 -
Flags: review?(terrence) → review+
Assignee | ||
Comment 35•11 years ago
|
||
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/be022529d704 remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/809ce50459c1 remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/9e6fa71b9f72 remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/aae5d429b8d4 remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/dd4b059747b8 remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/ce7085b63b2b
Updated•11 years ago
|
status-firefox24:
--- → affected
tracking-firefox24:
--- → +
Comment 36•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/be022529d704 https://hg.mozilla.org/mozilla-central/rev/809ce50459c1 https://hg.mozilla.org/mozilla-central/rev/9e6fa71b9f72 https://hg.mozilla.org/mozilla-central/rev/aae5d429b8d4 https://hg.mozilla.org/mozilla-central/rev/dd4b059747b8 https://hg.mozilla.org/mozilla-central/rev/ce7085b63b2b
Status: NEW → RESOLVED
Closed: 11 years ago
Flags: in-testsuite?
Resolution: --- → FIXED
Target Milestone: --- → mozilla24
Comment 37•11 years ago
|
||
Bobby, what's the backporting story this these patches?
Assignee | ||
Comment 38•11 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #37) > Bobby, what's the backporting story this these patches? We're unlikely to be able to fix this solidly on all branches, because the assertions aren't backportable (they're only meaningful on branches where we're more or less exactly rooted, which is just trunk at the moment). There are also a lot of other things that have changed in this department (bug 834732, bug 868130, etc), that will make any kind of meaningful backport hard. I recommend backporting Part 5 and embargoing this bug until ESR17 EOL.
Updated•11 years ago
|
Whiteboard: [Embargo until ESR17 EOL]
Updated•11 years ago
|
Whiteboard: [Embargo until ESR17 EOL] → [Embargo until ESR17 EOL][adv-main24+]
Updated•11 years ago
|
status-firefox-esr17:
--- → wontfix
Updated•11 years ago
|
Comment 39•10 years ago
|
||
(In reply to Bobby Holley (:bholley) from comment #38) > I recommend backporting Part 5 and embargoing this bug until ESR17 EOL. Still recommended for b2g18? We still want to fix sec-crits there if possible.
Flags: needinfo?(bobbyholley+bmo)
Assignee | ||
Comment 40•10 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #39) > (In reply to Bobby Holley (:bholley) from comment #38) > > I recommend backporting Part 5 and embargoing this bug until ESR17 EOL. > > Still recommended for b2g18? We still want to fix sec-crits there if > possible. Yeah, we should probably at least try to backport Part 5.
Flags: needinfo?(bobbyholley+bmo)
Assignee | ||
Comment 41•10 years ago
|
||
Actually, I realize that backporting this pixie dust is totally useless without bug 834732.
Assignee | ||
Updated•10 years ago
|
Flags: in-testsuite?
Updated•10 years ago
|
Group: core-security
You need to log in
before you can comment on or make changes to this bug.
Description
•