Closed Bug 1103036 Opened 11 years ago Closed 11 years ago

Allow ContentChild to perform tasks on shutdown

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla37

Tracking Flags:

Tracking

Status

firefox37

---

wontfix

firefox38

---

wontfix

People

(Reporter: jchen, Assigned: jchen)

References

Details

Attachments

(7 files, 8 obsolete files)

WIP Patch (v1) 11 years ago (inactive) Jim Chen [:jchen] [:darchons] 6.19 KB, patch	billm : feedback+	Details \| Diff \| Splinter Review
jchen.patch 11 years ago Bill McCloskey [inactive unless it's an emergency] (:billm) 16.19 KB, patch		Details \| Diff \| Splinter Review
Allow ContentChild to perform tasks during shutdown (v1) 11 years ago (inactive) Jim Chen [:jchen] [:darchons] 15.32 KB, patch		Details \| Diff \| Splinter Review
Allow ContentChild to perform tasks during shutdown (v2) 11 years ago (inactive) Jim Chen [:jchen] [:darchons] 15.89 KB, patch		Details \| Diff \| Splinter Review
Allow ContentChild to perform tasks during shutdown (v3) 11 years ago (inactive) Jim Chen [:jchen] [:darchons] 16.11 KB, patch		Details \| Diff \| Splinter Review
Allow ContentChild to perform tasks during shutdown (v4) 11 years ago (inactive) Jim Chen [:jchen] [:darchons] 15.96 KB, patch	billm : review+	Details \| Diff \| Splinter Review
use the new fancy shutdown at xpcom-shutdown 11 years ago Andrew McCreight [:mccr8] 1.64 KB, patch		Details \| Diff \| Splinter Review
review-changes 11 years ago Bill McCloskey [inactive unless it's an emergency] (:billm) 5.23 KB, patch	billm : review+	Details \| Diff \| Splinter Review
tabchild-protection 11 years ago Bill McCloskey [inactive unless it's an emergency] (:billm) 3.08 KB, patch	bent.mozilla : review+	Details \| Diff \| Splinter Review
timer-changes 11 years ago Bill McCloskey [inactive unless it's an emergency] (:billm) 6.46 KB, patch	bent.mozilla : review+	Details \| Diff \| Splinter Review
review-changes v2 11 years ago Bill McCloskey [inactive unless it's an emergency] (:billm) 5.22 KB, patch	jchen : review+	Details \| Diff \| Splinter Review
Allow ContentChild to perform tasks during shutdown (v4.1) 11 years ago (inactive) Jim Chen [:jchen] [:darchons] 16.16 KB, patch	jchen : review+	Details \| Diff \| Splinter Review
Wait for ContentChild shutdown during profile-before-change (v1) 11 years ago (inactive) Jim Chen [:jchen] [:darchons] 2.75 KB, patch	billm : review+ fabrice : feedback+	Details \| Diff \| Splinter Review
Use kill timer when shutting down ContentParent in shutdown observer (v1) 11 years ago (inactive) Jim Chen [:jchen] [:darchons] 2.85 KB, patch	billm : review+	Details \| Diff \| Splinter Review
Don't send shutdown message to Nuwa processes (v1) 11 years ago (inactive) Jim Chen [:jchen] [:darchons] 1.24 KB, patch	billm : review+	Details \| Diff \| Splinter Review

(inactive) Jim Chen [:jchen] [:darchons]

Assignee

Description

•

11 years ago

We want ContentChild to send over telemetry data to ContentParent on shutdown, because ContentChild can't save the telemetry data itself. Right now each TabChild is notified of destruction through the Destroy call, but I don't see a way for ContentChild as a whole to be notified of process shutdown -- ContentProcess shuts down by closing the IPC channel, so ContentChild only knows about it after the channel is closed. At that point, it's too late to send telemetry data.

(inactive) Jim Chen [:jchen] [:darchons]

Assignee

Comment 1

•

11 years ago

Attached patch WIP Patch (v1) (obsolete) — Details — Splinter Review

(inactive) Jim Chen [:jchen] [:darchons]

Assignee

Comment 2

•

11 years ago

Comment on attachment 8526926 [details] [diff] [review] WIP Patch (v1) Is this a good approach? This patch adds a Shutdown call from ContentParent to ContentChild, and now ContentChild is responsible for closing the channel on shutdown instead of ContentParent. This way, ContentChild can send over telemetry data before it closes the channel.

Attachment #8526926 - Flags: feedback?(wmccloskey)

Attachment #8526926 - Flags: feedback?(bent.mozilla)

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 3

•

11 years ago

Comment on attachment 8526926 [details] [diff] [review] WIP Patch (v1) Review of attachment 8526926 [details] [diff] [review]: ----------------------------------------------------------------- This looks good to me. There is one part that worries me. I was curious why the parent waits for the child to die. The shutdown sequence is pretty synchronous, so it seems like the parent should just blaze ahead, not waiting for the child. The following code explains why it waits: http://dxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/CompositorParent.cpp#329 This code is called from here: http://mxr.mozilla.org/mozilla-central/source/xpcom/build/XPCOMInit.cpp#848 It runs a nested event loop that waits for the compositor thread to shut down. The compositor thread shuts down when the CrossProcessCompositorParent dies, which happens when the child closes the IPC channel and the parent cleans up all remaining actors. Mainly I'm just concerned that we're relying on this graphics code in order for the telemetry data to be sent back properly. As long as we're running a nested event loop in XPCOM shutdown, it seems like we might as well as a separate nested event loop to wait for the channel to be closed. At least that way we have an explicit place in the code where we know we're waiting for the child process to die. ::: dom/ipc/ContentChild.cpp @@ +2431,5 @@ > +bool > +ContentChild::RecvShutdown() > +{ > + // Close the IPC channel and let both the parent and child shut down. > + Close(); When we talked about this in person, I said that Close() won't work here because any data sent before it will be dropped. I was wrong; the data will be sent and received correctly. So I think this part of the patch is fine.

Attachment #8526926 - Flags: feedback?(wmccloskey) → feedback+

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 4

•

11 years ago

Attached patch jchen.patch (obsolete) — Details — Splinter Review

Hi Jim. I hope you don't mind but I went ahead and made the changes I asked for. I think we need something like this for bug 1103279 and I wanted to test it out. Andrew is going to push this to try with some of his own patches to see if it fixes some issues with leak checking. If that works, we can proceed with review (probably from bent or bsmedberg).

Andrew McCreight [:mccr8]

Comment 5

•

11 years ago

try run with the patch from comment 4, plus a few test harness changes that enabled e10s content process leak checking: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=2944bf4484c6

(inactive) Jim Chen [:jchen] [:darchons]

Assignee

Comment 6

•

11 years ago

Attached patch Allow ContentChild to perform tasks during shutdown (v1) (obsolete) — Details — Splinter Review

Sorry, I was just about to post a new patch. Turns out letting the child close the IPC channel doesn't really work. The problem is that after we send the shutdown message, the child can close the IPC channel at any time, even while the parent is still trying to use the IPC channel for things like the message manager. That resulted in some test failures. We can't shutdown the message manager early either, because the child depends on it to send over telemetry data. I think the best way is to let the child call the parent back. The parent still first calls Shutdown() in the child. After the child finishes its tasks, it then calls FinishShutdown() in the parent. The parent can then close the channel like before. While the parent has an outstanding Shutdown() call, it waits for the FinishShutdown() call to arrive in a nested event loop. Also, we already have a timeout mechanism for shutdown [1], so I think we can just rely on that. What do you think? [1] http://mxr.mozilla.org/mozilla-central/source/dom/ipc/ContentParent.cpp?rev=551c3cd74dbd#1837

Attachment #8526926 - Attachment is obsolete: true

Attachment #8526926 - Flags: feedback?(bent.mozilla)

(inactive) Jim Chen [:jchen] [:darchons]

Assignee

Updated

•

11 years ago

Blocks: 1109931

(inactive) Jim Chen [:jchen] [:darchons]

Assignee

Comment 7

•

11 years ago

Try build together with some e10s telemetry patches, https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=575fd593921a

Andrew McCreight [:mccr8]

Comment 8

•

11 years ago

There's a use-after-free crash in nsFrameMessageManager::DispatchAsyncMessageInternal in B2G ICS Emulator opt M5, during test_NuwaProcessCreation.html

Andrew McCreight [:mccr8]

Updated

•

11 years ago

Blocks: 1083897

(inactive) Jim Chen [:jchen] [:darchons]

Assignee

Comment 9

•

11 years ago

Attached patch Allow ContentChild to perform tasks during shutdown (v2) (obsolete) — Details — Splinter Review

This patch moves the nested event loop to the xpcom-shutdown observer. It also delays closing the channel in ContentParent::RecvFinishShutdown to see if it fixes the B2G crash. I'll push to try once it reopens.

Attachment #8534649 - Attachment is obsolete: true

Tim Taubert [:ttaubert] (inactive)

Updated

•

11 years ago

Blocks: 1109875

Andrew McCreight [:mccr8]

Comment 10

•

11 years ago

try push along with a few of my leak patches: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=bbd5a38d69c2

Andrew McCreight [:mccr8]

Comment 11

•

11 years ago

It looks like the use after free is still happening. There are a bunch of IsNuwaProcess() sorts of checks in ContentParent. Maybe one is needed here, too? A Nuwa person might have some ideas about what is going wrong.

(inactive) Jim Chen [:jchen] [:darchons]

Assignee

Comment 12

•

11 years ago

Attached patch Allow ContentChild to perform tasks during shutdown (v3) (obsolete) — Details — Splinter Review

Turns out Nuwa was calling ContentParent::Close directly, which I didn't account for. This patch passes the failing test on try, https://tbpl.mozilla.org/?tree=Try&rev=ab7d253fbde1

Attachment #8535251 - Attachment is obsolete: true

Andrew McCreight [:mccr8]

Comment 13

•

11 years ago

The Linux debug bc3 failure in my try push looks like it could also be related to your patch. The M1 failures shouldn't show up without my leak harness changes. It looks like there are a few cases when running browser element parent tests out of process where we're still attempting to send messages to the parent too late. I did a separate try push with my patch in bug 1091766 that makes us not exit(0) immediately when a fail to send a message to the parent, but then I'm seeing the same failure I saw before where there's a constructor for actor failed abort in some networking test, when we have some half-loaded process running after the parent shuts down. I think billm said he'd look at that failure a bit. https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=590d42efa2d1

Andrew McCreight [:mccr8]

Comment 14

•

11 years ago

Comment on attachment 8534585 [details] [diff] [review] jchen.patch I think billm's modified version of the patch isn't needed now.

Attachment #8534585 - Attachment is obsolete: true

(inactive) Jim Chen [:jchen] [:darchons]

Assignee

Comment 15

•

11 years ago

(In reply to Andrew McCreight [:mccr8] (away-ish Dec 17-26) from comment #13) > The Linux debug bc3 failure in my try push looks like it could also be > related to your patch. > > <snip> Did you try using my new patch? I think my try run with the new patch didn't show those failures, https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=65b77af93f9e

Andrew McCreight [:mccr8]

Comment 16

•

11 years ago

Oh, great. I just wanted to point those out in case you'd only done the B2G run.

(inactive) Jim Chen [:jchen] [:darchons]

Assignee

Comment 17

•

11 years ago

Attached patch Allow ContentChild to perform tasks during shutdown (v4) (obsolete) — Details — Splinter Review

Hey Bill, would you be able to review this patch? I tested that it works locally, and try run looked okay.

Attachment #8537279 - Attachment is obsolete: true

Attachment #8537489 - Flags: review?(wmccloskey)

Andrew McCreight [:mccr8]

Comment 18

•

11 years ago

I'm wondering why you do CLOSE_CHANNEL instead of SEND_SHUTDOWN_MESSAGE in the xpcom-shutdown observer. I think this is what was causing some problems for leak checking. We create some processes late in shutdown, and they end up doing the old bad way of shutting down, which causes problems. Though that in turn causes another assertion at shutdown. Let me see if I can reproduce the problem with the newest version of your patch.

Andrew McCreight [:mccr8]

Comment 19

•

11 years ago

Attached patch use the new fancy shutdown at xpcom-shutdown (obsolete) — Details — Splinter Review

This is what I was doing, though it seems to just hang now when I'm testing it with the new version of the patch. :( I'm not sure when I'll have time to look into it next, but I can file a followup bug as needed to address my shutdown needs. The weird tab child change is needed because DelayedDeleteRunnable was getting posted, then ContentChild went through the new message back and forth which caused the subtree destroy to be called on tabchild, then the runnable was trying to do a send delete on the tabchild.

(inactive) Jim Chen [:jchen] [:darchons]

Assignee

Comment 20

•

11 years ago

I see. I used CLOSE_CHANNEL to preserve current behavior for xpcom-shutdown. Normally, we've already started shutting down by the time we get to xpcom-shutdown, but I can see how that might not be the case when we're running tests.

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 21

•

11 years ago

Sorry again this took me so long. I ended up playing around with this patch, so I'll just post my review comments as an additional patch.

Bill McCloskey [inactive unless it's an emergency] (:billm)

Updated

•

11 years ago

Attachment #8537489 - Flags: review?(wmccloskey) → review+

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 22

•

11 years ago

Attached patch review-changes (obsolete) — Details — Splinter Review

Here's my "review". The only substantial change is to the xpcom-shutdown behavior. I think we should always try to send the Shutdown message and then wait for the child to die. I pushed that change to try and it seems to work. This will help Andrew's leak checking work and it also makes things more consistent I think.

Attachment #8537543 - Attachment is obsolete: true

Attachment #8543419 - Flags: review+

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 23

•

11 years ago

Attached patch tabchild-protection — Details — Splinter Review

This fixes an existing bug I noticed while messing around with the patch here. Namely, we create a runnable that sends the __delete__ message for TabChild. However, this runnable can happen to run after TabChild::ActorDestroy, in which case bad things will happen. (It won't happen after ~TabChild since it keeps a ref to the TabChild.)

Attachment #8543421 - Flags: review?(bent.mozilla)

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 24

•

11 years ago

Attached patch timer-changes — Details — Splinter Review

I also noticed a pretty disturbing issue while testing this patch. I wanted to make sure that a hung child process won't cause shutdown to hang forever. So I stuck a sleep(30) call inside TabChild::RecvShutdown. In theory, NotifyTabDestroying sets a timer that should kill the child after 5 seconds. That doesn't work though. We get to the nested event loop that this patch adds to wait for the child to shutdown. However, for some reason the timer doesn't trigger while we're in that nested event loop. If I switch it to an XPCOM timer instead of a chromium timer, then it runs. I think there may be a bug in MessagePump that's causing this behavior since afaik know the chromium timer should work. Please correct me if that's wrong, Ben. Anyway, I'd like to work around this for now by switching to an XPCOM timer. I'm going to investigate the MessagePump issue now though. It seems pretty serious.

Attachment #8543426 - Flags: review?(bent.mozilla)

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 25

•

11 years ago

I looked at the MessagePump issue in comment 24 and it seems to be sort of a corner case. We're creating a timer that's supposed to expire after 5 seconds, but then we cancel it here: http://mxr.mozilla.org/mozilla-central/source/ipc/glue/MessagePump.cpp#146 The reason is that we're exiting the main Gecko event loop. Besides this particular timeout, I'm not sure how likely it is that we have chromium delayed tasks that should still run when we're processing events during shutdown. It doesn't seem very likely to me, so I think the fix in comment 24 may be the right way to go.

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 26

•

11 years ago

Attached patch review-changes v2 — Details — Splinter Review

The original version of this had a really stupid error, so flagging Jim to make sure he looks this over too :-).

Attachment #8543419 - Attachment is obsolete: true

Attachment #8543448 - Flags: review?(nchen)

(inactive) Jim Chen [:jchen] [:darchons]

Assignee

Comment 27

•

11 years ago

Comment on attachment 8543448 [details] [diff] [review] review-changes v2 Review of attachment 8543448 [details] [diff] [review]: ----------------------------------------------------------------- LGTM! ::: dom/ipc/ContentChild.cpp @@ +2498,5 @@ > } > + > + // Ignore errors here. If this fails, the parent will kill us after a > + // timeout. > + unused << SendFinishShutdown(); Is there a reason you removed the PostTask call? I don't think we actually need it, but I added it originally just in case some shutdown task posts something to the event loop.

Attachment #8543448 - Flags: review?(nchen) → review+

Ben Turner (not reading bugmail, use the needinfo flag!)

Comment 28

•

11 years ago

Comment on attachment 8543426 [details] [diff] [review] timer-changes Review of attachment 8543426 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/ipc/ContentParent.cpp @@ -1908,4 @@ > int32_t timeoutSecs = > Preferences::GetInt("dom.ipc.tabs.shutdownTimeoutSecs", 5); > if (timeoutSecs > 0) { > - MessageLoop::current()->PostDelayedTask( Ugh, we should make this method crash somehow if called from non-chromium code... We've had so many bugs because of it :( @@ +1909,5 @@ > Preferences::GetInt("dom.ipc.tabs.shutdownTimeoutSecs", 5); > if (timeoutSecs > 0) { > + mForceKillTimer = do_CreateInstance("@mozilla.org/timer;1"); > + if (!mForceKillTimer) > + return; Returning seems not very future-proof (someone could add code below and not notice it?). Can't we assert that mForceKillTimer always succeeds? I don;t think this can be called after XPCOM has shut down, right? @@ +1973,5 @@ > { > mSubprocess = nullptr; > mChildID = gContentChildID++; > mGeolocationWatchID = -1; > + mForceKillTimer = nullptr; This isn't needed, nsCOMPtr default-initializes to null. @@ +3146,5 @@ > > +/* static */ void > +ContentParent::ForceKillTimerCallback(nsITimer* aTimer, void* aClosure) > +{ > + ContentParent* self = static_cast<ContentParent*>(aClosure); Nit: Use |auto*| here :) ::: dom/ipc/ContentParent.h @@ +20,5 @@ > #include "nsFrameMessageManager.h" > #include "nsHashKeys.h" > #include "nsIObserver.h" > #include "nsIThreadInternal.h" > +#include "nsITimer.h" Nit: I'd just forward-declare nsITimer, then #include it in the cpp.

Attachment #8543426 - Flags: review?(bent.mozilla) → review+

Ben Turner (not reading bugmail, use the needinfo flag!)

Comment 29

•

11 years ago

Comment on attachment 8543421 [details] [diff] [review] tabchild-protection Review of attachment 8543421 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/ipc/TabChild.h @@ +498,5 @@ > already_AddRefed<nsIWidget> CreatePluginWidget(nsIWidget* aParent); > > nsIntPoint GetChromeDisplacement() { return mChromeDisp; }; > > + bool IPCOpen() { return mIPCOpen; } Hm, ContentParent does this too, but calls everything IsAlive()... Kind of an unfortunate name, and I'm not sure I like it, but maybe copy it? Dunno, up to you.

Attachment #8543421 - Flags: review?(bent.mozilla) → review+

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 30

•

11 years ago

(In reply to Jim Chen [:jchen] from comment #27) > Is there a reason you removed the PostTask call? I don't think we actually > need it, but I added it originally just in case some shutdown task posts > something to the event loop. There wasn't really any reason. You can keep it if it makes sense.

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 31

•

11 years ago

> > - MessageLoop::current()->PostDelayedTask( > > Ugh, we should make this method crash somehow if called from non-chromium code... We've had > so many bugs because of it :( What other problems does it cause besides the issue in comment 25? > > + bool IPCOpen() { return mIPCOpen; } > > Hm, ContentParent does this too, but calls everything IsAlive()... Kind of an unfortunate > name, and I'm not sure I like it, but maybe copy it? Dunno, up to you. I'd rather keep it IPCOpen since we use that name a lot in the networking and graphics code for the same purpose. IsAlive is used in a few places in DOM code for different things.

(inactive) Jim Chen [:jchen] [:darchons]

Assignee

Comment 32

•

11 years ago

Attached patch Allow ContentChild to perform tasks during shutdown (v4.1) — Details — Splinter Review

Rebased version of my patch. Bill, can you land this along with your additional patches once they're ready? Thanks!

Attachment #8537489 - Attachment is obsolete: true

Attachment #8544839 - Flags: review+

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 33

•

11 years ago

I did a try push for the patches in this bug: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=563ecf55ca0e For some reason my previous try push hit an xpcshell issue on Mac. That push had some other patches, though, so hopefully this one will be green.

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 34

•

11 years ago

remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/bc53d85bc1b2 remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/900bb5fcdd36 remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/e6ab63938473

Phil Ringnalda (:philor)

Comment 35

•

11 years ago

Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/0be44e8bf22a for the b2g xpcshell timeout in https://treeherder.mozilla.org/logviewer.html#?job_id=5144638&repo=mozilla-inbound (no idea whether the total failure to deal with a timeout after the timeout is just the normal way that tests on b2g suck, or also fallout from this).

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 36

•

11 years ago

That's too bad. I'd assumed it was some sort of infra bustage because of the download failure. Jim suggested a try push without the timer changes. https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=d42f21a51f33

(inactive) Jim Chen [:jchen] [:darchons]

Assignee

Comment 37

•

11 years ago

Attached patch Wait for ContentChild shutdown during profile-before-change (v1) — Details — Splinter Review

I found out that waiting for shutdown in xpcom-shutdown is too late for receiving telemetry data because telemetry data collection happens during profile-before-change, before xpcom-shutdown. This patch adds a profile-before-change observer to ContentParent, in addition to xpcom-shutdown. For either observer, we initiate shutdown if needed and wait for any pending shutdown to complete. I'm not sure if xpcom-shutdown is really needed, but I guess it's a last resort just in case we don't have a profile for some reason. To simplify things, I made ShutDownProcess so that we can call it multiple times and only have it send the shutdown message once.

Attachment #8545507 - Flags: review?(wmccloskey)

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 38

•

11 years ago

Comment on attachment 8545507 [details] [diff] [review] Wait for ContentChild shutdown during profile-before-change (v1) Review of attachment 8545507 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/ipc/ContentParent.cpp @@ +2728,5 @@ > const char* aTopic, > const char16_t* aData) > { > + if (mSubprocess && (!strcmp(aTopic, "xpcom-shutdown") || > + !strcmp(aTopic, "profile-before-change"))) { I think it would be much better if we just do this based on profile-before-change (and not xpcom-shutdown). I asked Fabrice whether profile-before-change fires on b2g in the parent and he said he thinks it does. I'm asking him for feedback because he wanted to check to make sure. If that's true, let's remove the xpcom-shutdown portion.

Attachment #8545507 - Flags: review?(wmccloskey)

Attachment #8545507 - Flags: review+

Attachment #8545507 - Flags: feedback?(fabrice)

(inactive) Jim Chen [:jchen] [:darchons]

Assignee

Comment 39

•

11 years ago

Attached patch Use kill timer when shutting down ContentParent in shutdown observer (v1) — Details — Splinter Review

We used to just close the channel in the shutdown observer, but now that we always send the shudown message and wait for shutdown, we should start the kill timer in the shutdown observer too if needed. I will push to try when the tree reopens. I hope this will fix the b2g failures. If it does, that means there's something going on in those b2g tests that causes the child to hang; these hangs didn't show up before because the parent didn't wait on the child. Relying on the kill timer simply hides the underlying issue, but actually fixing the issue should be in a separate bug.

Attachment #8546134 - Flags: review?(wmccloskey)

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 40

•

11 years ago

Comment on attachment 8546134 [details] [diff] [review] Use kill timer when shutting down ContentParent in shutdown observer (v1) Review of attachment 8546134 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/ipc/ContentParent.cpp @@ +1504,5 @@ > if (aMethod == SEND_SHUTDOWN_MESSAGE) { > if (mIPCOpen && !mShutdownPending && SendShutdown()) { > mShutdownPending = true; > + // Start the force-kill timer if we haven't already. > + StartForceKillTimer(); It seems like the timer ought to be running anyway since we've probably closed a tab very recently during shutdown. But it's worth a try.

Attachment #8546134 - Flags: review?(wmccloskey) → review+

(inactive) Jim Chen [:jchen] [:darchons]

Assignee

Comment 41

•

11 years ago

Almost there! Only b2g M11 failure left. BTW I had to keep the xpcom-shutdown observer, otherwise there were failures in Linux debug tests; go figure. https://treeherder.mozilla.org/#/jobs?repo=try&revision=b1cefc2787d5

[:fabrice] Fabrice Desré

Updated

•

11 years ago

Attachment #8545507 - Flags: feedback?(fabrice) → feedback+

(inactive) Jim Chen [:jchen] [:darchons]

Assignee

Comment 42

•

11 years ago

Attached patch Don't send shutdown message to Nuwa processes (v1) — Details — Splinter Review

This fixes the B2G M11 failure. So in summary, to fix the test failures, we need: * observers for both xpcom-shutdown and profile-before-change ** xpcom-shutdown observer prevents linux test failures (I suspect some tests are run without a profile / with the profile unloaded early somehow) ** profile-before-change is needed for telemetry * start kill timer during xpcom-shutdown/profile-before-change ** needed to fix b2g xpcshell failure * don't send shutdown message for nuwa processes in xpcom-shutdown/profile-before-change ** needed to fix b2g m11 failure

Attachment #8547082 - Flags: review?(wmccloskey)

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 43

•

11 years ago

Comment on attachment 8547082 [details] [diff] [review] Don't send shutdown message to Nuwa processes (v1) OK. I'm still curious about the profile-before-change issue, but that shouldn't stop you from landing.

Attachment #8547082 - Flags: review?(wmccloskey) → review+

(inactive) Jim Chen [:jchen] [:darchons]

Assignee

Comment 44

•

11 years ago

Fingers crossed... https://hg.mozilla.org/integration/mozilla-inbound/rev/818164d97da8 https://hg.mozilla.org/integration/mozilla-inbound/rev/9d50f9865ed6 https://hg.mozilla.org/integration/mozilla-inbound/rev/5c4c5685381f https://hg.mozilla.org/integration/mozilla-inbound/rev/ebb500eb92d8 https://hg.mozilla.org/integration/mozilla-inbound/rev/66083ac0f3c0 https://hg.mozilla.org/integration/mozilla-inbound/rev/19017193859a https://hg.mozilla.org/integration/mozilla-inbound/rev/a55f093defe1

Carsten Book [:Tomcat]

Comment 45

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/818164d97da8 https://hg.mozilla.org/mozilla-central/rev/9d50f9865ed6 https://hg.mozilla.org/mozilla-central/rev/5c4c5685381f https://hg.mozilla.org/mozilla-central/rev/ebb500eb92d8 https://hg.mozilla.org/mozilla-central/rev/66083ac0f3c0 https://hg.mozilla.org/mozilla-central/rev/19017193859a https://hg.mozilla.org/mozilla-central/rev/a55f093defe1

Status: ASSIGNED → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla37

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Updated

•

11 years ago

Depends on: 1121430

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Updated

•

11 years ago

No longer depends on: 1121430

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 46

•

11 years ago

Backed out on beta for bug 1138520: https://hg.mozilla.org/releases/mozilla-beta/rev/2cc99febbda0

status-firefox37: --- → wontfix

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 47

•

11 years ago

Backed out again on beta: https://hg.mozilla.org/releases/mozilla-beta/rev/8a5486269821

status-firefox38: --- → wontfix

You need to log in before you can comment on or make changes to this bug.