Open Bug 1364200 Opened 7 years ago Updated 2 years ago

Drop messages sent to deleted protocols rather than crashing

Categories

(Core :: IPC, enhancement, P3)

enhancement

Tracking

()

People

(Reporter: mccr8, Unassigned)

References

Details

In bug 1363614, Ehsan said:
> (I really wish our IPC layer would protect us against this, I'm sick of
> fixing crashes like this honestly!)

I talked to Bill about it and he said it didn't sound unreasonable (though maybe if we're sending a constructor it might be problematic?). Hopefully it won't be too hard to improve this.
The rationale for this change is that random classes end up hitting this, then they have to add a manual flag, and check that before doing a send, so why make everybody implement that code? I think the argument against this is that at some level needing this means you don't have a good idea of what is going on with the protocol. (Hopefully that's not just a strawman of the argument.)
The general issue here is that the IPC actor C++ objects have three states rather than two: dead, alive, and ActorDesyroyed() called but destructor not called yet (similar to our cycle collected objects).  This distinction does matter in some cases and doesn't in others.  It's just that I have personally mostly ran into the cases where the distinction hadn't mattered, in which case you'd be just working around this lifetime weirdness.  But really the core of the issue is that everyone (myself included) *forgets* to do the "mIPCOpen" check from the get-go!

(I'm semi-worried about extrapolating my personal experience on this on the entire code base here because I may have seen the outliers or something.)
I think this is a really bad idea. If we don't know what the effects of the message are, dropping it could have serious issues; if it's part of a request/response pair, then the response will never be received and the caller may have resources waiting on it for a long time or forever.

Fundamentally I think we need more precise IPC logic where sending something to a dead protocol is an assertion-worthy logic error. But of course I also think we need stateful checked protocols and that isn't happening, so maybe it's time to give up :-(
Assignee: continuation → nobody
Priority: -- → P3
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.