Closed Bug 860085 Opened 7 years ago Closed 6 years ago

Use the JSContext stack rather than the XPCCallContext stack to track live JSContexts

Categories

(Core :: XPConnect, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla25
Tracking Status
firefox24 + fixed
firefox25 --- fixed

People

(Reporter: bholley, Assigned: bholley)

References

Details

(Whiteboard: [leave uplift for bholley][qa-])

Attachments

(7 files)

Right now we consult the XPCCallContext stack in order to determine when we can safely destroy a JSContext. This is totally wrong, at least since QuickStubs, because there are many cases where we don't create an XPCCallContext at all.

I recently did some work to make sure that we pretty much always push active cxes onto the JSContext stack, so we should be able to use that instead.

This is probably what's causing the crashes in bug 858900.
Depends on: 860438
Blocks: 863454
Depends on: 865729
Depends on: 865745
Blocks: 886704
Depends on: 888104
Attachment #770596 - Flags: review?(gkrizsanits)
With this change, we can be very, very sure that we never push an nsJSContext
without instantiating an AutoCxPusher on the stack.
Attachment #770597 - Flags: review?(gkrizsanits)
We now have the invariant that any in-use cx must be pushed onto the JSContext
stack with one of our stack-scoped automatic nsCxPusher classes. These classes
hold a strong ref to the nsIScriptContext associated with the JSContext they
push (if any). This means that, if this cx is in use, we will always have at
least one strong reference to the nsJSContext coming from the stack, meaning
that neither the destructor nor the Unlink() implementation will be called.
So we don't need to do any deferred destruction of the cx anymore.
Attachment #770599 - Flags: review?(gkrizsanits)
Attachment #770599 - Flags: review?(continuation)
We only use XPCCallContext for reflector calls now, at which point an AddRef
is totally insignificant. Using an auto pointer here lets us clean up some
code, and makes the XPCCallContext destructor start to look pretty sane. :-)
Attachment #770601 - Flags: review?(gkrizsanits)
Comment on attachment 770599 [details] [diff] [review]
Part 5 - Stop using XPConnect::ReleaseJSContext in nsJSEnvironment::DestroyJSContext. v1

Review of attachment 770599 [details] [diff] [review]:
-----------------------------------------------------------------

Sounds reasonable to me, assuming your comment is correct. ;)
Attachment #770599 - Flags: review?(continuation) → review+
Comment on attachment 770594 [details] [diff] [review]
Part 2 - Rename xpc::{Push,Pop}JSContext and make them assert against DOM JSContexts. v1

Review of attachment 770594 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/xpconnect/src/nsXPConnect.cpp
@@ +1288,2 @@
>  {
> +    MOZ_ASSERT(!aCx || !GetScriptContextFromJSContext(aCx));

MOZ_ASSERT_IF(aCx, !GetScriptContextFromJSContext(aCx));?
Comment on attachment 770599 [details] [diff] [review]
Part 5 - Stop using XPConnect::ReleaseJSContext in nsJSEnvironment::DestroyJSContext. v1

Review of attachment 770599 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/nsJSEnvironment.cpp
@@ +1168,5 @@
> +// This function is called either by the destructor or unlink, which means that
> +// it can never be called when there is an outstanding ref to the
> +// nsIScriptContext on the stack. Our stack-scoped cx pushers hold such a ref,
> +// so we can assume here that mContext is not on the stack (and therefore not
> +// in use).

Is this assertable?
Attachment #770592 - Flags: review?(gkrizsanits) → review+
Attachment #770594 - Flags: review?(gkrizsanits) → review+
Attachment #770596 - Flags: review?(gkrizsanits) → review+
Comment on attachment 770599 [details] [diff] [review]
Part 5 - Stop using XPConnect::ReleaseJSContext in nsJSEnvironment::DestroyJSContext. v1

Review of attachment 770599 [details] [diff] [review]:
-----------------------------------------------------------------

This makes sense. However I think we could keep the assert previously ReleaseJSContext did, and matching the comment: that the cx is not on the stack.
Attachment #770599 - Flags: review?(gkrizsanits) → review+
Attachment #770597 - Flags: review?(gkrizsanits) → review+
Comment on attachment 770600 [details] [diff] [review]
Part 6 - Remove nsIXPConnect::ReleaseJSContext. v1

Review of attachment 770600 [details] [diff] [review]:
-----------------------------------------------------------------

what a loss... ;)
Attachment #770600 - Flags: review?(gkrizsanits) → review+
Comment on attachment 770601 [details] [diff] [review]
Part 7 - Remove XPCCallContext refcounting optimization. v1

Review of attachment 770601 [details] [diff] [review]:
-----------------------------------------------------------------

Hmmm... what's the story of this optimization? In the current world I seriously cannot imagine a case where we would suffer from removing it, just would like to double check it at some point why it was invented on the first place. In a multi-threaded world I can imagine that addref can cause some performance hit, or in some extremely fast call path... I'm really curious. I r+ it for now anyway, as I highly doubt that you are wrong with removing it.
Attachment #770601 - Flags: review?(gkrizsanits) → review+
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #11)
> This makes sense. However I think we could keep the assert previously
> ReleaseJSContext did, and matching the comment: that the cx is not on the
> stack.

(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #13)
> Hmmm... what's the story of this optimization? In the current world I
> seriously cannot imagine a case where we would suffer from removing it, just
> would like to double check it at some point why it was invented on the first
> place. In a multi-threaded world I can imagine that addref can cause some
> performance hit, or in some extremely fast call path... I'm really curious.
> I r+ it for now anyway, as I highly doubt that you are wrong with removing
> it.

XPConnect is full of micro-optimizations, many of which were actually never measured. Moreover, the performance story here is much different than it used to be, since (a) we're not using XPConnect for the DOM anymore, and (b) we now only construct XPCCallContexts for reflector calls (instead of everywhere). And if you look at XPCWrapped{Native,JS}::CallMethod, an AddRef is totally insignificant there. :-)
(In reply to Bobby Holley (:bholley) from comment #14)
> And if you look at XPCWrapped{Native,JS}::CallMethod, an AddRef
> is totally insignificant there. :-)

I have no doubt about that part :)
(In reply to Bobby Holley (:bholley) from comment #14)
> (In reply to Gabor Krizsanits [:krizsa :gabor] from comment #11)
> > This makes sense. However I think we could keep the assert previously
> > ReleaseJSContext did, and matching the comment: that the cx is not on the
> > stack.

Er, I meant to reply here that we no longer have a public API exposed to do this.
Release drivers - What do you think about uplifting this to 24, given that 24 will be the next ESR?

This bug, and its dependency chain (bug 865745 and bug 880917), is rather large. It was all slated to land for FF24, but we ran into some infra issues near the merge date and it ended up landing afterwards.

This fixes some longstanding stability issues that we knew about, where we were destroying JSContexts while they were still in use. In general, it puts us in a much cleaner and safer position on some pretty important stuff, and may make it easier to backport certain fixes to 24. Given that this stuff landed just after the 24 merge, and given that we haven't seen any fallout on trunk since then, we can have a fair amount of confidence in landing it on aurora. On the other hand, it's a pretty large landing. Thoughts?
Thanks for the awesome summary :bholley, a few follow up Questions :

(In reply to Bobby Holley (:bholley) from comment #19)
> Release drivers - What do you think about uplifting this to 24, given that
> 24 will be the next ESR?
> 
> This bug, and its dependency chain (bug 865745 and bug 880917), is rather
> large. It was all slated to land for FF24, but we ran into some infra issues
> near the merge date and it ended up landing afterwards.
> 
> This fixes some longstanding stability issues that we knew about, where we
> were destroying JSContexts while they were still in use. 
Do we know what stability issue's are being fixed here ? Was curious to see if they are high volume patches.
>In general, it puts
> us in a much cleaner and safer position on some pretty important stuff, and
> may make it easier to backport certain fixes to 24. Given that this stuff
> landed just after the 24 merge, and given that we haven't seen any fallout
> on trunk since then, we can have a fair amount of confidence in landing it
> on aurora. 
It's great that this is stable on nightly.But in case fallout's are expected would those be easily identifiable ? How is the test coverage here ?
>On the other hand, it's a pretty large landing. Thoughts?

If all goes well and we land it and found late breaking issues in mid-beta how easy is this to backout ?
(In reply to bhavana bajaj [:bajaj] from comment #20)
> Do we know what stability issue's are being fixed here ? Was curious to see
> if they are high volume patches.

The dependent bugs are the ones I know about.

> It's great that this is stable on nightly.But in case fallout's are expected
> would those be easily identifiable ? How is the test coverage here ?

These are all pretty deep internal changes to core stuff. So regressions might not be easily identifiable. But on the flip side, pretty much the entire test suite serves as test coverage. :-)

> If all goes well and we land it and found late breaking issues in mid-beta
> how easy is this to backout ?

It's a lot of patches, but really it depends of we land anything on top of it that depends on it. If we don't, backout is straightforward. If we do, less so. Backing it out would be on the invasive side for a mid-beta change. On the other hand, given that this seems to be just fine on Nightly, it's not clear to me that we gain much by letting it wait another 6 weeks to trickle up to beta. In fact, this might even be _safer_, because stuff landed on top of the Nightly version will make it pretty unbackoutable.
Whiteboard: [leave uplift for bholley]
Comment on attachment 770601 [details] [diff] [review]
Part 7 - Remove XPCCallContext refcounting optimization. v1

Flagging for aurora approval for the patches in the following bugs:

* This bug
* bug 865745
* bug 880917
* bug 880917
* bug 888104
Attachment #770601 - Flags: approval-mozilla-aurora?
Comment on attachment 770601 [details] [diff] [review]
Part 7 - Remove XPCCallContext refcounting optimization. v1

Given the test coverage, bake time on nightly and comment 22, approving this on aurora.
Attachment #770601 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
(In reply to Bobby Holley (:bholley) from comment #0)
> This is probably what's causing the crashes in bug 858900.

Current crashes for the signature in bug 858900 are zero for Firefox 23+. Is that sufficient to call this verified fixed? If not is there another test we can use?
(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #26)
> (In reply to Bobby Holley (:bholley) from comment #0)
> > This is probably what's causing the crashes in bug 858900.
> 
> Current crashes for the signature in bug 858900 are zero for Firefox 23+. Is
> that sufficient to call this verified fixed? If not is there another test we
> can use?

The bug as-filed is an internal refactoring, for which verification is nonsensical.
[qa-] as per comment 27.
Whiteboard: [leave uplift for bholley] → [leave uplift for bholley][qa-]
You need to log in before you can comment on or make changes to this bug.