Closed Bug 545591 Opened 14 years ago Closed 15 days ago

IPDL: Pending messages aren't flushed on *Channel::Close()

Categories

(Core :: IPC, defect)

defect

Tracking

()

RESOLVED INCOMPLETE

People

(Reporter: cjones, Unassigned)

Details

This means that the Goodbye message, used to signal "normal" termination, can be dropped.  The other side would spuriously detect abnormal shutdown.  This causes TestArrays to fail on windows, but it's a problem on windows and Posix.
Pushed http://hg.mozilla.org/projects/electrolysis/rev/2f9e351ca2b1 as a temporary workaround.

I'm not sure of the best way to handle this.  It would certainly be nice to warn the "other side" when we're about to Close() a channel.  There are three general options

 (1) Make Close() behavior only partially specified.  If a channel error is *not* reported, then shutdown was definitely clean.  If an error *is* reported, shutdown may or may not have been clean (and messages may or may not have been lost), and other means need to be employed to check.  That is, s/error/maybe-error/.

 (2) Fully flush the out-queue before actually closing the underlying transport.  Note that at this point the outqueue can only contain async messages.  This would incur an unknown amount of blocking IO on shutdown.

 (3) Bypass the outqueue when signaling normal shutdown.  One way is to do a blocking send of Goodbye() only, ignoring the outqueue.  Another way is to write a status code into a shmem segment.  The downside here of course is that we could leave messages unsent on exit.

Needs some thought.  I lean towards (1), because (a) it's not that hard to ensure that the outqueue is empty before Close()ing (see the workaround I pushed); and (b) it's easy to determine whether a process crashed by using system APIs.
Why are we allowing there to be pending messages on Channel::Close to begin with? The only time this would happen is if the shutdown/Close were triggered from a sync message, right? Can we just disallow that?
I think there's an ambiguity around "pending message".  The Messages have been "sent" as far as ipc/glue/*Channel is concerned, but they haven't yet been written to the pipe by the transport layer.  It's only a problem for async messages.  Example below, where a1/a2/a3 are async Messages, as is Goodbye.

 worker thread            IO thread
---------------          -----------
 Send(a1)
 Send(a2)                 [OnSend(a1)]
 Send(a3)                 write(a1)
 ...                      [OnSend(a2)]
 Close()                  write(a2)
 Send(Goodbye)            [OnSend(a3)]
 PostTask(ReallyClose)    oops, write() is blocked; keep a3 in outqueue
                          [OnSend(Goodbye)]
                          oops, write() is still blocked; keep Goodbye in outqueue
                          [OnReallyClose()]
                          transport->Close()
                            transport->outqueue.clear(); drops a3, Goodbye

A simple workaround is to send a synchronous message before Close(), which will flush the outqueue, allowing Goodbye to be write()d without blocking.  That's what 2f9e351ca2b1 does.
Oh yeah, in that case I would just expect OnSend(Goodbye) to block until all pending is flushed.
Severity: normal → S3

Protocol shutdown has been reworked at least once in the last 14 years so let's close this.

Status: NEW → RESOLVED
Closed: 15 days ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.