[OOPP] Incorrect assertion "NP_Shutdown didn't" when plugin process crashes

RESOLVED FIXED

Status

()

Core
Plug-ins
RESOLVED FIXED
8 years ago
8 years ago

People

(Reporter: Benjamin Smedberg, Assigned: cjones)

Tracking

Trunk
x86
Windows 7
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [land m-c])

Attachments

(2 attachments, 1 obsolete attachment)

(Reporter)

Description

8 years ago
I've seen an incorrect assertion "NP_Shutdown didn't" a couple times when running mochitest-ipcplugins locally. As far as I can tell, the sequence is something like this:

* plugin is being unloaded by refreshing plugins, nsPluginTag::TryUnloadPlugin
* plugin process crashes in NP_Shutdown, so that CallNP_Shutdown returns false. IPC dispatches a runnable to call ActorDestroy and whatnot
* PluginModuleParent::NP_Shutdown calls Close(), which calls AsyncChannel::Close. That method early-returns because ChannelError == mChannelState.
* At this point PluginModuleParent::ActorDestroy has not been called and mShutdown is false.
* nsPluginTag::TryUnloadPlugin releases the last reference to the nsNPAPIPlugin, which ends up deleting the PluginModuleParent
* in PluginModuleParent::~PluginModuleParent, we try calling NP_Shutdown again (failing again), and mShutdown is still false

I think I can fix this by just setting `mShutdown = true` in PluginModuleParent::NPP_Shutdown, but I'm not sure this is the correct solution, because I'm not sure that the ActorDestroy sequence is ever being called on the actor tree... trying to determine that now.

I'm also not sure how to reproduce this, i.e. what is causing us to reload plugins. I hope to have a stack to figure that out in a bit.
(Reporter)

Comment 1

8 years ago
Created attachment 420739 [details] [diff] [review]
Test

This is the test. I've verified that we aren't firing ActorDestroy correctly.
(Reporter)

Comment 2

8 years ago
->cjones because I'm not sure where ActorDestroy is actually supposed to be firing in this sequence.
Assignee: benjamin → jones.chris.g
For crashes, ActorDestroy is invoked when the main thread gets the "connection error" event from the IO thread.  It's possible that we're shutting down before then and the event getting ignored ... but I'll take a closer look.
(Reporter)

Comment 4

8 years ago
Yes, I'm sure that's what's happening because the PluginModuleParent is being destroyed immediately.
This is actually pretty tricky.  I think we have two options

 (1) Assume that deleting the top-level actor means clients don't care about ActorDestroy() being called, and remove this assertion.  Kinda sucks, because it kinda sorta could also be construed as breaking the ActorDestroy() contract.

 (2) Kick off IPDL cleanup from AsyncChannel::Close() if it's called after a channel error (and if cleanup hasn't already been done).  Kinda sucks too, because ActorDestroy handlers would need to be mindful of destroying things below Close() on the stack.

Thoughts?  The IPDL side of me leans towards (2), but the side that has seen modules/plugin code leans towards (1).
(In reply to comment #5)
> Thoughts?  The IPDL side of me leans towards (2), but the side that has seen
> modules/plugin code leans towards (1).

Hmm ... on second thought, even if we went with (1) we'd have to worry about the sub-protocols' actors being deleted from ~PluginModuleParent, with other crud on the stack.  So (2) looks like the way to go, maybe.
(Reporter)

Comment 7

8 years ago
From AsyncChannel::Close or from the toplevel destructor itself, yeah.

What would ActorDestroy impls need to be mindful of specifically? They are already pretty limited, because almost any kind of re-entry into IPDL causes crashes.
(In reply to comment #7)
> What would ActorDestroy impls need to be mindful of specifically? They are
> already pretty limited, because almost any kind of re-entry into IPDL causes
> crashes.

ActorDestroy doesn't necessarily have to care, it's the nsNPAPI* code that might have a transitive reference to an IPDL actor that has to care.  For example, if some nsNPAPI*Stream[Wrapper|Listener|Peer] thing can get to a actor, and the nsNPAPI* thing is destroyed, we have to be careful not to reference the actor.

I'm sure we'll have plenty of these kinds of bugs arise, guess we can assign blame/responsibility for caring case-by-case.
Created attachment 420799 [details] [diff] [review]
deterministic IPDL test of this behavior

On IRC, bsmedberg noted that he doesn't want to check in the mochitest because it relies on somewhat fiddly plugin reloading behavior.
Attachment #420739 - Attachment is obsolete: true
Created attachment 420809 [details] [diff] [review]
notify clients of channel errors if they call Close() before deleting themselves, before the error notification event is delivered

Not my favorite patch ever, but it passes all our IPC tests.
Attachment #420809 - Flags: review?(bent.mozilla)
Comment on attachment 420809 [details] [diff] [review]
notify clients of channel errors if they call Close() before deleting themselves, before the error notification event is delivered

Seems ok.
Attachment #420809 - Flags: review?(bent.mozilla) → review+
Pushed http://hg.mozilla.org/projects/electrolysis/rev/e5b6f530110b
Pushed http://hg.mozilla.org/projects/electrolysis/rev/d2ddb1901a44

Not terribly high-priority, no need to go directly to m-c.
Whiteboard: [land m-c]
(Reporter)

Comment 13

8 years ago
http://hg.mozilla.org/mozilla-central/rev/5468db693bfc
http://hg.mozilla.org/mozilla-central/rev/ea2bc4204d83
Status: NEW → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.