Closed Bug 684938 Opened 13 years ago Closed 13 years ago

Proxy nsXPCWrappedJS::Release to the main thread

Categories

(Core :: XPConnect, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla10
Tracking Status
firefox7 - wontfix
firefox8 - wontfix
firefox9 - wontfix
firefox10 + fixed
firefox11 + fixed
status1.9.2 --- unaffected

People

(Reporter: peterv, Assigned: peterv)

References

Details

(Whiteboard: [sg:critical?][qa?])

Attachments

(1 file)

Running the release on a thread other than the main thread can crash because we add nsXPCWrappedJS objects to the cycle collector, which assumes objects can't be refcounted while it runs. Both AddRef and Release are problematic, but Release will lead to crashes for sure whereas AddRef can lead to premature unlinking which shouldn't cause crashes, but might. I'll add a patch here to fix the crashes due to Release, bug 684711 will deal with the whole issue.
Whiteboard: [sg:critical?]
Attached patch v1Splinter Review
Peter, is this patch ready to be reviewed? If so, please request review asap. Thanks!
Not going to take this fix this late for 8. And I don't know if we can write a testcase for this, but if we can that would be awesome.
This doesn't really count, but I'd wager that some of the intermittent oranges I linked to in Bug 684711 will go away when this lands, so presumably if they come back that is kind of like a test...
Comment on attachment 561860 [details] [diff] [review]
v1

This seems to pass on try.
Attachment #561860 - Flags: review?(mrbkap)
Comment on attachment 561860 [details] [diff] [review]
v1

Review of attachment 561860 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/xpconnect/src/xpcwrappedjs.cpp
@@ +206,5 @@
> +    {
> +        // We'd like to abort here, but this can happen if someone uses a proxy
> +        // for the nsXPCWrappedJS.
> +        nsCOMPtr<nsIThread> mainThread = do_GetMainThread();
> +        NS_ProxyRelease(mainThread, static_cast<nsIXPConnectWrappedJS*>(this));

Hm, I'm not 100% sure that this is safe... I guess we should never be able to get here after the thread manager is shutdown, so do_GetMainThread should always succeed, but you should probably assert mainThread here. NS_ProxyReleae will just go ahead and release the object it if you pass null for target (which seems pretty dumb to me).
I guess we can also just return early if the mainThread has been shut down. Leaking isn't the end of the world in that case.
Comment on attachment 561860 [details] [diff] [review]
v1

r=me with bent's observation fixed.
Attachment #561860 - Flags: review?(mrbkap) → review+
https://hg.mozilla.org/mozilla-central/rev/e8e8396802f2
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Attachment #561860 - Flags: approval-mozilla-beta?
Attachment #561860 - Flags: approval-mozilla-aurora?
This landed the day before the switchover, so it should already be on Aurora.

I looked over the blocked bugs in bug 684711, which is a list of bugs with stacks involving the cycle collector calling into nonsense stacks.  All but 2 haven't been seen recently.  I looked more closely at the remaining two, which have been seen in the last few months with fairly high frequency.

Bug 675605: This involves nsXPCWrappedJS::cycleCollection::Traverse, which is something this patch should fix.  It looks like bug 675605 went away on trunk on 10/07.  On Aurora, it went away on 11/07, which is the day before the crossover.  Since then it has only shown up once, on beta.

Bug 635480: This involve nsXPCWrappedJS::cycleCollection::Traverse calling into CheckForRightISupports calling into a random SVG code, which sounds like an instance of something this bug would fix.  The last instance of this on m-c is 9/29, which is vaguely around the date the previous one went away on trunk.  Like the previous bug, the last instance of this on Aurora was on 11/07, right before the branch.

Both of these went away too early on trunk to have been fixed by this particular patch.  Perhaps some of the ongoing work on removing XPCOM proxies (bug 675221) removed the particular off-thread activity that was causing these crashes.  Looking at that bug, the patch "Implement Runnable that can be waited on synchronously, for removing XPCOM proxies" landed on 10/07, so maybe that fixed 675605?  It doesn't really sound like it, but who knows.
Comment on attachment 561860 [details] [diff] [review]
v1

Thanks Andrew!
Attachment #561860 - Flags: approval-mozilla-aurora?
And also, given the fact that this is a race condition that's likely very hard to reliably trigger from a webpage I don't think we need to push to get this in on beta from a security point of view, but I would like to know if the collective number of cycle collector crashes that are likely fixed by this is large enough to warrant the risk of taking this for beta.
It isn't clear to me.  The crashes in those bugs I linked were fairly common, but I can't see anything on crash-stats that looks like it.  But as 635480 shows, the top frame could be anything.  I think the most noticeable thing to find would be a crash with nsXPCWrappedJS::cycleCollection::Traverse anywhere on the stack, or probably even just in the second frame.  I'm not sure how one would go about doing that.
Comment on attachment 561860 [details] [diff] [review]
v1

I looked into the volume of the relevant crashes and they're fairly low volume, so we won't be pushing for this in beta.
Attachment #561860 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Marking fixed for 10 and 11, and wontfix for 9.
Is there something QA can do to verify this fix?
Whiteboard: [sg:critical?] → [sg:critical?][qa?]
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #17)
> Is there something QA can do to verify this fix?

bump
Group: core-security
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: