Closed
Bug 951908
Opened 11 years ago
Closed 10 years ago
Content processes never shutdown in some cases
Categories
(Core :: Widget: Cocoa, defect)
Tracking
()
RESOLVED
FIXED
mozilla29
People
(Reporter: billm, Assigned: billm)
References
Details
Attachments
(1 file)
696 bytes,
patch
|
smichaud
:
review+
bent.mozilla
:
review+
lsblakk
:
approval-mozilla-aurora+
praghunath
:
approval-mozilla-b2g26+
|
Details | Diff | Splinter Review |
I've been running into the following problem. When we start a content process, we get the following call stack: XRE_InitChildProcess http://mxr.mozilla.org/mozilla-central/source/toolkit/xre/nsEmbedFunctions.cpp#516 MessagePumpForChildProcess::Run http://mxr.mozilla.org/mozilla-central/source/ipc/glue/MessagePump.cpp#251 XRE_RunAppShell http://mxr.mozilla.org/mozilla-central/source/toolkit/xre/nsEmbedFunctions.cpp#635 When XRE_RunAppShell runs, it does some initial processing of messages. This initial processing was added in bug 625060. Afterwards it enters the normal event loop via appShell->Run(). The problem I'm seeing is that we're shutting down the content process (via XRE_ShutdownChildProcess) during the initial message processing, before nsAppShell::Run happens. XRE_ShutdownChildProcess prematurely exits the message loop and also calls nsAppShell::Exit. However, since we haven't called Run yet, the Exit has no effect. When we call Run, we run forever and the child process never exits. I'm seeing this as a shutdown timeout on Tinderbox because the main process waits for all child processes to exit on shutdown. I suspect this bug may be causing some of the Mac shutdown timeouts in bug 918759, but that's just a guess. The simplest fix seems to be to exit early from nsAppShell::Run if nsAppShell::Exit has already been called. However, this code seems pretty tricky, so I'm asking for two reviews.
Attachment #8349737 -
Flags: review?(smichaud)
Attachment #8349737 -
Flags: review?(bent.mozilla)
Comment 1•11 years ago
|
||
Comment on attachment 8349737 [details] [diff] [review] mac-fix This looks fine to me. But, as you say, appshell code is complicated -- not just in itself but because of accretions elsewhere (like the patch for bug 625060). So we'll need to watch out for problems, and be prepared to back this patch out if it causes trouble.
Attachment #8349737 -
Flags: review?(smichaud) → review+
Hrm, what's the stack when we call XRE_ShutdownChildProcess before XRE_RunAppShell? It sounds fishy to me that we're trying to shut down so quickly...
Flags: needinfo?(wmccloskey)
Assignee | ||
Comment 3•10 years ago
|
||
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #2) > Hrm, what's the stack when we call XRE_ShutdownChildProcess before > XRE_RunAppShell? It sounds fishy to me that we're trying to shut down so > quickly... I'm afraid I don't have a copy of the stack, and it's hard to reproduce. However, XRE_ShutdownChildProcess is not called before XRE_RunAppShell. It's call during the initial preprocessing stage of XRE_RunAppShell, before appShell->Run(). To put it another way, the stack trace for XRE_ShutdownChildProcess would contain this line: http://mxr.mozilla.org/mozilla-central/source/toolkit/xre/nsEmbedFunctions.cpp#674
Flags: needinfo?(wmccloskey)
Comment on attachment 8349737 [details] [diff] [review] mac-fix Review of attachment 8349737 [details] [diff] [review]: ----------------------------------------------------------------- Ok, let's give this a try.
Attachment #8349737 -
Flags: review?(bent.mozilla) → review+
Assignee | ||
Comment 5•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/0d955b377e77 Thanks!
Comment 6•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/0d955b377e77
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
Assignee | ||
Comment 7•10 years ago
|
||
Comment on attachment 8349737 [details] [diff] [review] mac-fix It looks like this fixed bug 960605, so I think we should backport it to reduce intermittent failures on aurora. [Approval Request Comment] Bug caused by (feature/regressing bug #): ? User impact if declined: none Testing completed (on m-c, etc.): On m-c. Risk to taking this patch (and alternatives if risky): Very low, since the code deals with a feature (content processes) that is rarely used in practice. String or IDL/UUID changes made by this patch: None
Attachment #8349737 -
Flags: approval-mozilla-aurora?
Updated•10 years ago
|
Attachment #8349737 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 8•10 years ago
|
||
https://hg.mozilla.org/releases/mozilla-aurora/rev/6b861c14640b
status-firefox28:
--- → fixed
status-firefox29:
--- → fixed
Comment 10•10 years ago
|
||
Bill, would this be reasonably safe to backport to b2g26 too? These hangs bite us there too and I would love to get this landed there if the risk is low. Not sure what the risk to non-desktop builds is.
status-b2g-v1.2:
--- → affected
status-b2g-v1.3:
--- → fixed
status-b2g-v1.4:
--- → fixed
status-firefox27:
--- → wontfix
Flags: needinfo?(wmccloskey)
Assignee | ||
Comment 11•10 years ago
|
||
I think it would be fine to backport it. It's a Mac-specific fix, so it couldn't possibly affect anything on b2g itself. It would just be helping us to improve testing.
Flags: needinfo?(wmccloskey)
Comment 12•10 years ago
|
||
Comment on attachment 8349737 [details] [diff] [review] mac-fix NOTE: This flag is now for security issues only. Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings. [Approval Request Comment] Bug caused by (feature/regressing bug #): ? User impact if declined: sheriff pain Testing completed: on m-c and aurora without any known issues Risk to taking this patch (and alternatives if risky): None, this is NPOTB for B2G as it only affects OSX (the patch is to a Cocoa file, not shared platform code). Only the OSX builds we run in automation on the branch will see any effect from this. String or UUID changes made by this patch: None
Attachment #8349737 -
Flags: approval-mozilla-b2g26?
Comment 13•10 years ago
|
||
Comment on attachment 8349737 [details] [diff] [review] mac-fix bug 951908 for b2g26 2:16 PM <RyanVM|sheriffduty> the reasoning for that is that it's OSX-only code 2:16 PM <RyanVM|sheriffduty> it's completely NPOTB for b2g 2:16 PM <RyanVM|sheriffduty> but fixes a really annoying orange on tbpl 2:16 PM <RyanVM|sheriffduty> but that code isn't even compiled on b2g, so there's zero risk to taking it there
Attachment #8349737 -
Flags: approval-mozilla-b2g26? → approval-mozilla-b2g26+
Comment 14•10 years ago
|
||
Thanks :) https://hg.mozilla.org/releases/mozilla-b2g26_v1_2/rev/049ab1f88b13
You need to log in
before you can comment on or make changes to this bug.
Description
•