Closed Bug 863454 Opened 7 years ago Closed 6 years ago

crash in doInvoke @ XPCCallContext::Init

Categories

(Core :: Plug-ins, defect, P2)

21 Branch
x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla23
Tracking Status
firefox20 --- wontfix
firefox21 --- wontfix
firefox22 --- fixed
firefox23 --- fixed
firefox24 --- fixed
firefox25 --- fixed
firefox-esr17 - wontfix
b2g18 --- unaffected
b2g18-v1.0.0 --- unaffected
b2g18-v1.0.1 --- unaffected

People

(Reporter: johns, Assigned: mccr8)

References

Details

(Keywords: csectype-uaf, sec-critical, Whiteboard: [adv-main22+])

Attachments

(4 files)

This is a dupe of bug 858900, but filing as security sensitive because of the possible use-after-free badness.

While investigating mostly-unrelated plugin bug 859099, I found this test case to reliably trigger 858900.

- Create an iframe
- On the iframe's load, create a plugin. Set a setter on the plugin's <embed> tag.
- In the plugin instantiate (NPP_New) set that property on embed, invoking setter
- Setter removes iframe from document
- js context is destroyed despite us currently running a setter on it

https://crash-stats.mozilla.com/report/index/bp-240c2eeb-bee2-478b-902e-ba01b2130418

Note that this test case might also trigger bug 854082 on firefox 22+ or bug 859099 if OOP plugins is disabled, but seems to reliably trigger this on 21.

On debug builds we hit a JS_ASSERT before crashing, stack attached
The assertion we're hitting is Assertion failure: cx->outstandingRequests == 0, at /home/johns/moz/moz-git/js/src/jscntxt.cpp:386, in js::DestroyContext.  I think this means we're destroying a context that we're currently running code on.  It seems like we should strongly consider making this fatal in release builds.

Marking sec-critical because it looks like we're running code on a destroyed context.
> - In the plugin instantiate (NPP_New) set that property on embed, invoking setter

To clarify, the plugin instantiates later (via nsAsyncInstantiateEvent) after the original script has finished. It then touches this property and triggers the setter inside NPP_New
Priority: -- → P2
I suspect this is actually fixed on trunk now, due to bholley's bug 860438, which ended up on m-c yesterday evening.  Interesting timing there.  A day later and John's test case wouldn't have triggered.

Anyways, before that patch, NP_SetProperty uses AutoCXPusher.  This in turn uses XPCJSContextStack to push the JSContext on a stack.  As far as I can tell, this doesn't AddRef() the nsJSContext that holds the JSContext.  Later, we destroy a docshell, which releases the nsJSContext, which assumes it uniquely owns the JSContext, causing a JSContext that is running code to be destroyed.

With bug 860438, NP_SetProperty uses nsCxPusher, which does this when you push:
  // Hold a strong ref to the nsIScriptContext, just in case
  // XXXbz do we really need to?  If we don't get one of these in Pop(), is
  // that really a problem?  Or do we need to do this to effectively root |cx|?
  mScx = GetScriptContextFromJSContext(cx);

mScx a COMPtr, so I think we'll keep the context alive.  Somebody should confirm that.

Anyways, bug 860438 is pretty huge, though part 5 doesn't look too bad.  Maybe we could just backport that to beta?
(In reply to Andrew McCreight [:mccr8] from comment #4)
> I suspect this is actually fixed on trunk now, due to bholley's bug 860438,
> which ended up on m-c yesterday evening.  Interesting timing there.  A day
> later and John's test case wouldn't have triggered.

Well, sorta. That bug was a dependent bug for bug 860085, which is the real heart of the problem. While it's true that using an nsCxPusher should root the cx in the common case, there are a few edge cases that need to be worked out (a few places remaining where we don't use nsCxPusher, the fact that the CC unlink stuff in nsJSEnvironment also destroys JSContexts, and the call to ReleaseJSContext in nsFrameMessageManager).

So while that bug got most of the way there, the proper dependency here is bug 860085.
Depends on: 860085
Andrew is going to use this bug to look at branch fixage based on bug 858900. \o/
The patch is actually in bug 863766, and is just a figleaf, in that we will probably just do a clean crash in this particular case.
Is this affecting esr17 and if so, shouldn't we do the approval in this bug instead of in the public bug 863766
Yeah, we should just backport the figleaf. Backporting bug 860085 isn't really feasible.
(In reply to lsblakk@mozilla.com from comment #8)
> Is this affecting esr17 and if so, shouldn't we do the approval in this bug
> instead of in the public bug 863766

I put the approval in the public bug and referred to this one in a private comment on the theory that I didn't really want to draw attention to the fact that we had a known issue.  Though it is probably hard to get from that to knowing what the problem is.  But maybe I was being silly.

Bug 863766 has been landed on Aurora.
Approving in here then, to land the 'figleaf' in esr17 now (please consider this comment to be the esr17 approval +) since I believe that marking an approval in bug 863766 would tip our hand on that issue, it's completely out of scope for ESR and would raise eyebrows of anyone watching the esr landings closely.
Oops - wrong version - this needs to land _after_ we ship FF21.
Whiteboard: [land to esr17 AFTER 5/21]
Marking this as closed as it is "fixed" on trunk.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Patch is in bug 863766.
Keywords: checkin-needed
Needs checkin to ESR17 only.
Keywords: checkin-needed
Whiteboard: [land to esr17 AFTER 5/21]
Target Milestone: --- → mozilla23
Matt: the test case for bug 863766 is here.
Unfortunately, I'm still getting some crashes and permanent hangs with the test case. All builds below are Mac.

The two ASan stack traces for FF23 and FF24 are attached.

Hang:

FF17esr 2013-06-18, release
FF22 2013-06-18, debug
FF23 2013-06-18, debug
FF24 2013-06-18, debug

Crash: 

FF17esr 2013-06-18, ASan debug
FF22 2013-06-18, ASan debug

No crash or hang:

FF23 2013-06-14, ASan debug
FF24 2013-06-14, ASan debug
Sorry, error above:
"The two ASan stack traces for FF23 and FF24 are attached."

Should be:
"The two ASan stack traces for FF17esr and FF22 are attached."
The FF22 crash looks like the one we're expecting, so that's okay...
Whiteboard: [adv-main22+][adv-esr1707+]
Hey guys - from the looks of this, it still appears to be crashing 17.0.7esr in a bad way. If that's the case, and we can't fix it for release (unlikely at this point) I should reopen it. And then we should not put out an advisory.

John - have you had a chance to look at this? Thanks.
Flags: needinfo?(jschoenick)
Reopening as affected for 17esr until we get some idea of what's going on.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [adv-main22+][adv-esr1707+] → [adv-main22+]
The debug+ASAN builds crash on a JS_ASSERT before the MOZ_CRASH, it's still the expected abort. The plugin container is killed by an external SIGSEGV:

>    0x00007ffff3a15a4e <+14>:    shr    $0x3,%r14
> => 0x00007ffff3a15a52 <+18>:    cmpb   $0x0,0x7fff8000(%r14)
>    0x00007ffff3a15a5a <+26>:    jne    0x7ffff3a15aac <mozilla::plugins::PluginScriptableObjectChild::GetObject(bool)+108>
>
> (gdb) print *(unsigned char *)($r14+0x7fff8000)
> $2 = 0 '\000'

So this is coming from either ASAN or parent code, I don't know enough about our child management, to say, but the child getting killed is expected either way.

I wasn't able to reproduce a hang in FF17ESR release or FF24. Is the plugin-container still running at this point? Is the build's associated nptest plugin available?
Flags: needinfo?(jschoenick)
I gather this is not going to be fixed for the current ESR releasing tomorrow.
Well, we're not really sure if it is fixed or not. :)  But yeah, we'll probably not be sure by tomorrow... The basic problem is that we're trying to turn a bad crash into a safe crash, so if something goes haywire, it isn't immediately clear whether it is working or not.
The solid "right" fix for this is the dependency chain of this bug, which will land soon, but probably isn't backportable.
Dependency landed! Thanks Bobby. Marking status flag fixed for FF 25 - please revert if I've misunderstood.
If this requires bug 860085 to be fixed, is this too complex/big to backport to ESR?
(In reply to Al Billings [:abillings] from comment #30)
> If this requires bug 860085 to be fixed, is this too complex/big to backport
> to ESR?

I would very much like to backport bug 860085 to aurora, so that it'll be in esr24. I'll flag it soon. It certainly isn't backportable to esr17 though.
If this was reopened because of ESR17 (only) then we can go ahead and close it as "fixed" for trunk. We're not going to be able to back-port the fix to the 17 branch.
Status: REOPENED → RESOLVED
Closed: 7 years ago6 years ago
Resolution: --- → FIXED
Group: core-security
You need to log in before you can comment on or make changes to this bug.