Closed Bug 957883 Opened 6 years ago Closed 6 years ago

don't hang browser when PluginModuleChild::Init returns false


(Core :: Plug-ins, defect)

Not set





(Reporter: jaas, Assigned: jaas)



(1 file, 1 obsolete file)

Attached patch Fix v1.0 (obsolete) — Splinter Review
I was working on some code that started with copy and paste from some plugin child process init code:

42  bool
43  PluginProcessChild::Init()
122     mPlugin.Init(pluginFilename, ParentHandle(),
123                  IOThreadChild::message_loop(),
124                  IOThreadChild::channel());
126     return true;

In my code there was a serious problem detected in Init, and Init properly returned 'false'. However, because that status is ignored the process kept running in an unexpected state rather than shutting down cleanly. It made the host process behave in a glitchy way, including not shutting down cleanly.

I fixed the problem in my code by returning the result of Init instead of 'true'. We should probably do the same in the NPAPI code.
Attachment #8357495 - Flags: review?(benjamin)
Comment on attachment 8357495 [details] [diff] [review]
Fix v1.0

The fianl effect of this patch appears to be:

* XRE_InitChildProcess returns a failure code
* plugin-container exits with a code of 1
* in the parent, SyncLaunch should fail with `false` immediately (it should not hang on a timeout)
* PluginProcessParent::Launch should fail with `false`
* PluginModuleParent::LoadModule should fail by returning a null pointer

Is this what actually happens?
Attached patch Fix v1.1Splinter Review
Turns out we need to go a little further down with the fix. The child process exits with an error code of 1. Then the parent IPC code detects a broken pipe and attempts to notify the GeckoChildProcessHost, but it turns out the GeckoChildProcessHost isn't listening. 'GeckoChildProcessHost::OnChannelError()' is totally unimplemented. This means that after the process quits and a broken pipe is detected, we continue to wait on the launch timeout! That's a 45 second parent process hang in the case of NPAPI plugins. Because we retry failed loads at a higher level in our NPAPI implementation, that situation always becomes a 90 second wait. Multiply that by the number of plugins on the page.

This patch changes the process state to an error state when a channel error happens while trying to connect. It also updates the launch timeout loop to detect and handle an error state properly.

Without this patch, if you return false from 'PluginModuleChild::Init' and visit YouTube, the browser essentially hangs up indefinitely - one 45 second hang after another. With this patch there are no hangs, the browser continues to work normally. A lot happens in that Init code, we really need to be handling failure gracefully.
Attachment #8357495 - Attachment is obsolete: true
Attachment #8357495 - Flags: review?(benjamin)
Attachment #8358661 - Flags: review?(benjamin)
Summary: check return valid of Init() in plugin child process → don't hang browser when PluginModuleChild::Init returns false
Comment on attachment 8358661 [details] [diff] [review]
Fix v1.1

I'm a little curious why we only set mProcessState to PROCESS_ERROR if mProcessState is less than PROCESS_CONNECTED. Why not do it in that case as well?

r=me with that changed or an explanation of why it won't work ;-)
Attachment #8358661 - Flags: review?(benjamin) → review+
I don't really know what should happen if a channel error happens after we're connected. I know that before we're connected (< PROCESS_CONNECTED) we're sitting in a timeout loop that is watching the process state and will do the right thing on error. After we're connected, we'd have to make sure 1) the change to an error state was noticed and 2) that the expected thing happens. That's a different issue which I don't fully understand yet, so I chose to leave that case alone.
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
You need to log in before you can comment on or make changes to this bug.