Closed
Bug 535077
Opened 15 years ago
Closed 14 years ago
Failing to exec() mozilla-runtime (plugin-runtime) for a sync launch causes Gecko to hang
Categories
(Core :: IPC, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: cjones, Assigned: benedict)
References
Details
(Keywords: hang)
Attachments
(1 file, 5 obsolete files)
11.93 KB,
patch
|
Details | Diff | Splinter Review |
This is because we wait on a monitor being notified, and it never happens. Quick fix is to time out after k seconds, better fix is to asynchronously launch all our subprocesses. AFAIK we already have bugs for the latter. I'm not very happy with this patch; it's just a stopgap.
Attachment #417869 -
Flags: review?(bent.mozilla)
Comment 1•15 years ago
|
||
Note: bug 535090 describes one reason that exec() can fail.
See Also: → 535090
Version: unspecified → Trunk
Comment on attachment 417869 [details] [diff] [review] time out if process isn't ready after 10 seconds Sigh.
Attachment #417869 -
Flags: review?(bent.mozilla) → review+
Reporter | ||
Comment 3•15 years ago
|
||
I've been debating whether or not to push this for the last few days. I've decided not to; it would just add too much suck. If we don't get async plugin launch done before this change becomes necessary, I think instead we could do something like the JS slow-script timeout. We'll want that anyway for generic NPAPI operations.
Comment 5•14 years ago
|
||
Bug 558146 is on Windows so if it is a duplicate of this bug, the Platform of this bug shouldn't be set to Linux, but All. Besides if I didn't miss something this bug is about fixing the hang, but there should also be some notification to the user why mozilla-runtime couldn't be run and what to do to fix the problem.
Updated•14 years ago
|
Summary: Failing to exec() mozilla-runtime for a sync launch causes Gecko to hang → Failing to exec() mozilla-runtime (plugin-runtime) for a sync launch causes Gecko to hang
Comment 7•14 years ago
|
||
Probably don't need to get this into 3.6.6, but sounds worth doing for the cases where virus scanners etc block plugin-runtime, or silly users remove it, or the other cases we've been discovering.
Assignee: nobody → bhsieh
blocking1.9.2: --- → ?
Comment 8•14 years ago
|
||
I can speak somehow speak for the silly or whatever classified users, having a full .exe process running you can just flip with taskmanager in mswin it's no surprise that users do it - to add some feedback from a users perspective. As message I suggest to just inform the user that "There was a problem to start the %s-Plugin." - and that's it. If it crashes there are no specifics as well. Could be made in the same design like the message you get as if a plugin (more precise: the plugin-container.exe) was removed from memory for whatever reason.
Comment 9•14 years ago
|
||
Not going to "block" a branch release on it but would consider approving a patch if you've solved cjones concern that "it would just add too much suck" (comment 3).
blocking1.9.2: ? → -
Comment 10•14 years ago
|
||
To be clear, what we should implement here is a timeout the same way we timeout RPC calls in general, so that by default it will time out after 10, or 45, or whatever seconds.
Assignee | ||
Comment 11•14 years ago
|
||
Uses the dom.ipc.plugins.timeSecs pref.
Attachment #459981 -
Flags: review?(jones.chris.g)
Reporter | ||
Comment 12•14 years ago
|
||
Comment on attachment 459981 [details] [diff] [review] time out if process isn't ready after X (pref or default of 10) seconds, plus test One large-ish change: we don't want to conflate the reply timeout with process launching. They're very separate beasts, and ironically we'll probably eventually want the process launch timeout to be shorter than the reply timeout, since all the code in that path is under our control (dllmains excepted). Something like dom.ipc.plugins.processLaunchTimeoutSecs with the same semantics and values as the reply timeout (45 secs for release builds, 0=infinity for debug) should be OK for now. We can fiddle with the value later. >diff --git a/dom/plugins/PluginModuleParent.cpp b/dom/plugins/PluginModuleParent.cpp >--- a/dom/plugins/PluginModuleParent.cpp >+++ b/dom/plugins/PluginModuleParent.cpp >@@ -77,23 +77,33 @@ > }; > > // static > PluginLibrary* > PluginModuleParent::LoadModule(const char* aFilePath) > { > PLUGIN_LOG_DEBUG_FUNCTION; > >+ PRInt32 prefSecs = nsContentUtils::GetIntPref(kTimeoutPref, 0); >+ > // Block on the child process being launched and initialized. > PluginModuleParent* parent = new PluginModuleParent(aFilePath); Since this function is now branchy, please make this an |nsAutoPtr<PluginModuleParent>| for which we |return parent.forget();| in the success case below. Really should have been an auto pointer from the start, that's my fault, sorry. >- parent->mSubprocess->Launch(); >+ bool retval = parent->mSubprocess->Launch(prefSecs * 1000); >+ if (!retval) { >+ // XXX Is this okay? >+ parent->mShutdown = true; >+ delete parent; >+ return nsnull; >+ } > parent->Open(parent->mSubprocess->GetChannel(), > parent->mSubprocess->GetChildProcessHandle()); > >- TimeoutChanged(kTimeoutPref, parent); >+ int32 syncTimeoutMs = (prefSecs > 0) ? (1000 * prefSecs) : >+ SyncChannel::kNoTimeout; >+ parent->SetReplyTimeoutMs(syncTimeoutMs); > I don't like the copy and paste of this code, but with a separate process-launch timeout this change will go away. >diff --git a/dom/plugins/PluginProcessParent.h b/dom/plugins/PluginProcessParent.h >--- a/dom/plugins/PluginProcessParent.h >+++ b/dom/plugins/PluginProcessParent.h >@@ -58,17 +58,17 @@ > { > public: > PluginProcessParent(const std::string& aPluginFilePath); > ~PluginProcessParent(); > > /** > * Synchronously launch the plugin process. > */ >- bool Launch(); >+ bool Launch(PRInt32 timeoutSecs); > Er, |timeoutMs|? Please describe the param in the comment here. >diff --git a/ipc/glue/GeckoChildProcessHost.cpp b/ipc/glue/GeckoChildProcessHost.cpp >--- a/ipc/glue/GeckoChildProcessHost.cpp >+++ b/ipc/glue/GeckoChildProcessHost.cpp >@@ -62,16 +62,18 @@ > #ifdef XP_WIN > #include "nsIWinTaskbar.h" > #define NS_TASKBAR_CONTRACTID "@mozilla.org/windows-taskbar;1" > #endif > > using mozilla::MonitorAutoEnter; > using mozilla::ipc::GeckoChildProcessHost; > >+static const int32 kExecTimeoutMs = 10000; >+ We should default to infinite timeout, see below. This change can go away. > template<> > struct RunnableMethodTraits<GeckoChildProcessHost> > { > static void RetainCallee(GeckoChildProcessHost* obj) { } > static void ReleaseCallee(GeckoChildProcessHost* obj) { } > }; > > GeckoChildProcessHost::GeckoChildProcessHost(GeckoProcessType aProcessType, >@@ -126,38 +128,41 @@ > } else { > mGroupId.AssignLiteral("-"); > } > } > } > #endif > > bool >-GeckoChildProcessHost::SyncLaunch(std::vector<std::string> aExtraOpts) >+GeckoChildProcessHost::SyncLaunch(std::vector<std::string> aExtraOpts, int aTimeoutMs) > { > #ifdef XP_WIN > InitWindowsGroupID(); > #endif > >+ int timeoutMs = (aTimeoutMs > 0) ? aTimeoutMs : kExecTimeoutMs; Please make this PRIntervalTime timeoutTicks = (aTimeoutMs > 0) ? PR_MillisecondsToInterval(aTimeoutMs) : PR_INTERVAL_NO_TIMEOUT; > MessageLoop* ioLoop = XRE_GetIOMessageLoop(); > NS_ASSERTION(MessageLoop::current() != ioLoop, "sync launch from the IO thread NYI"); > > ioLoop->PostTask(FROM_HERE, > NewRunnableMethod(this, > &GeckoChildProcessHost::PerformAsyncLaunch, > aExtraOpts)); >- > // NB: this uses a different mechanism than the chromium parent > // class. > MonitorAutoEnter mon(mMonitor); >- while (!mLaunched) { >+ while (!mChannelInitialized) { Um, no. This change makes SyncLaunch() have the same semantics as AsyncLaunch(). Instead, you want "has this timed out?" logic like here: http://mxr.mozilla.org/mozilla-central/source/ipc/glue/SyncChannel.cpp#244 . > mon.Wait(); > } >- >- return true; >+ if (!mLaunched) { >+ printf ("waiting %i ms\n, timeoutMs\n", timeoutMs); Kill this printf, or make it |if (LoggingEnabled()) ...|. >diff --git a/ipc/ipdl/test/cxx/PTestSyncHang.ipdl b/ipc/ipdl/test/cxx/PTestSyncHang.ipdl Holding off on reviewing the test until v2. This is r- for the change in SyncLaunch() semantics.
Attachment #459981 -
Flags: review?(jones.chris.g) → review-
Assignee | ||
Comment 13•14 years ago
|
||
Passes on tryserver, by the way!
Attachment #417869 -
Attachment is obsolete: true
Attachment #459981 -
Attachment is obsolete: true
Attachment #460787 -
Flags: review?(jones.chris.g)
Reporter | ||
Comment 14•14 years ago
|
||
Comment on attachment 460787 [details] [diff] [review] fixes review comments >diff --git a/dom/plugins/PluginModuleParent.cpp b/dom/plugins/PluginModuleParent.cpp >--- a/dom/plugins/PluginModuleParent.cpp >+++ b/dom/plugins/PluginModuleParent.cpp > // static > PluginLibrary* > PluginModuleParent::LoadModule(const char* aFilePath) > { > PLUGIN_LOG_DEBUG_FUNCTION; > >+ PRInt32 prefSecs = nsContentUtils::GetIntPref(kLaunchTimeoutPref, 0); >+ > // Block on the child process being launched and initialized. >- PluginModuleParent* parent = new PluginModuleParent(aFilePath); >- parent->mSubprocess->Launch(); >+ nsAutoPtr<PluginModuleParent> parent(new PluginModuleParent(aFilePath)); >+ bool retval = parent->mSubprocess->Launch(prefSecs * 1000); s/retval/launched/? >diff --git a/ipc/ipdl/test/cxx/TestSyncHang.cpp b/ipc/ipdl/test/cxx/TestSyncHang.cpp >new file mode 100644 >--- /dev/null >+++ b/ipc/ipdl/test/cxx/TestSyncHang.cpp >+void >+TestSyncHangParent::Main() >+{ >+ vector<string> args; >+ args.push_back("fake/path"); >+ gSyncHangSubprocess = new mozilla::ipc::GeckoChildProcessHost(GeckoProcessType_Plugin); >+ bool retval = gSyncHangSubprocess->SyncLaunch(args, 2); >+ if (retval) >+ fail("Calling SyncLaunch with an invalid path should return false"); >+ // XXX Is there a race with the IPDLUnitTest shutdown here? Nope. >diff --git a/modules/libpref/src/init/all.js b/modules/libpref/src/init/all.js >--- a/modules/libpref/src/init/all.js >+++ b/modules/libpref/src/init/all.js >@@ -1253,23 +1253,27 @@ > //pref("editor.encode_entity", "html"); > > pref("editor.resizing.preserve_ratio", true); > pref("editor.positioning.offset", 0); > > pref("dom.max_chrome_script_run_time", 20); > pref("dom.max_script_run_time", 10); > >+#ifndef DEBUG > // How long a plugin is allowed to process a synchronous IPC message > // before we consider it "hung". >-#ifndef DEBUG > pref("dom.ipc.plugins.timeoutSecs", 45); >+// How long a plugin launch is allowed to take before >+// we consider it failed. >+pref("dom.ipc.plugins.processLaunchTimeoutSecs", 45); > #else > // No timeout in DEBUG builds > pref("dom.ipc.plugins.timeoutSecs", 0); >+pref("dom.ipc.plugins.processLaunchTimeoutSecs", 0); > #endif Looks good, thanks. Might need something like sr for the new pref, pinging bsmedberg.
Attachment #460787 -
Flags: superreview?(benjamin)
Attachment #460787 -
Flags: review?(jones.chris.g)
Attachment #460787 -
Flags: review+
Comment 15•14 years ago
|
||
I'm not sure why we need the new pref, could we just reuse dom.ipc.plugins.timeoutSecs? But really, not a big deal either way.
Assignee | ||
Comment 16•14 years ago
|
||
From bsmedberg's comments in IRC, I think this is ready to go.
Attachment #460787 -
Attachment is obsolete: true
Attachment #460787 -
Flags: superreview?(benjamin)
Assignee | ||
Updated•14 years ago
|
Keywords: checkin-needed
Comment 17•14 years ago
|
||
This can't be checked in without approval2.0+ or blocking2.0+, right? This shouldn't be marked checkin-needed until it's got one of those & can actually be checked in. (/me removes checkin-needed keyword; feel free to re-add if I'm missing something)
Keywords: checkin-needed
Updated•14 years ago
|
blocking2.0: --- → betaN+
Keywords: checkin-needed
Assignee | ||
Comment 18•14 years ago
|
||
fixing up the header
Attachment #461731 -
Attachment is obsolete: true
Assignee | ||
Comment 19•14 years ago
|
||
Attachment #464567 -
Attachment is obsolete: true
Comment 20•14 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/89ef5b7a1f82
You need to log in
before you can comment on or make changes to this bug.
Description
•