Closed
Bug 942118
Opened 11 years ago
Closed 10 years ago
Potential IPC deadlock
Categories
(Core :: IPC, defect)
Tracking
()
RESOLVED
FIXED
1.3 C2/1.4 S2(17jan)
People
(Reporter: gerard-majax, Assigned: gerard-majax)
Details
(Whiteboard: [systemsfe])
Attachments
(2 files, 2 obsolete files)
261.38 KB,
image/png
|
Details | |
1.35 KB,
patch
|
bent.mozilla
:
review+
|
Details | Diff | Splinter Review |
After looking at the locking code of mMonitor in ipc/glue/MessageChannel, there is a couple of call sites where we are making use of mListener while holding a lock on mMonitor.
Assignee | ||
Comment 1•11 years ago
|
||
ipc/glue/MessageChannel.cpp:1305: mListener->OnProcessingError(MsgDropped); ipc/glue/MessageChannel.cpp:233: ReportConnectionError("MessageChannel"); -> called under lock, took at ipc/glue/MessageChannel.cpp:230 --> possible issue ipc/glue/MessageChannel.cpp:257: ReportConnectionError("MessageChannel"); -> called under lock, took at ipc/glue/MessageChannel.cpp:255 --> possible issue ipc/glue/MessageChannel.cpp:494: ReportConnectionError("MessageChannel::SendAndWait"); ipc/glue/MessageChannel.cpp:479 if (!SendAndWait(aMsg, aReply)) -> called under lock, took at ipc/glue/MessageChannel.cpp:470 --> possible issue ipc/glue/MessageChannel.cpp:514: ReportConnectionError("MessageChannel::SendAndWait"); ipc/glue/MessageChannel.cpp:479 if (!SendAndWait(aMsg, aReply)) -> called under lock, took at ipc/glue/MessageChannel.cpp:470 --> possible issue at least for the first iteration of loop ipc/glue/MessageChannel.cpp:575: ReportConnectionError("MessageChannel::Call"); -> called under lock, took at ipc/glue/MessageChannel.cpp:573 --> possible issue ipc/glue/MessageChannel.cpp:602: ReportConnectionError("MessageChannel::InterruptCall"); -> called under lock, took at ipc/glue/MessageChannel.cpp:573 --> possible issue at least for the first iteration of loop ipc/glue/MessageChannel.cpp:833: ReportConnectionError("MessageChannel::OnMaybeDequeueOne"); ipc/glue/MessageChannel.cpp:869: if (!DequeueOne(&recvd)) -> called under lock, took at ipc/glue/MessageChannel.cpp:868 --> possible issue
Assignee | ||
Comment 2•11 years ago
|
||
On all those call sites, I see only one set: ipc/glue/MessageChannel.cpp:233: ReportConnectionError("MessageChannel"); ipc/glue/MessageChannel.cpp:257: ReportConnectionError("MessageChannel"); ipc/glue/MessageChannel.cpp:575: ReportConnectionError("MessageChannel::Call"); ipc/glue/MessageChannel.cpp:602: ReportConnectionError("MessageChannel::InterruptCall"); ipc/glue/MessageChannel.cpp:833: ReportConnectionError("MessageChannel::OnMaybeDequeueOne"); ipc/glue/MessageChannel.cpp:494: ReportConnectionError("MessageChannel::SendAndWait"); ipc/glue/MessageChannel.cpp:514: ReportConnectionError("MessageChannel::SendAndWait"); -> This set of call sites must perform a MonitorAutoLock() before callong ReportConnectionError() because they are referring to the Connected() method, which makes use of mMonitor. In other words, every ReportConnectionError() call MUST NOT have a lock on mMonitor(), but all those have one because of Connected() being called just before.
Assignee | ||
Comment 3•11 years ago
|
||
In fast, those call sides are also concerned: ipc/glue/MessageChannel.cpp:664: ReportConnectionError("MessageChannel::DispatchMessage"); ipc/glue/MessageChannel.cpp:722: ReportConnectionError("MessageChannel::DispatchInterruptMessage"); ipc/glue/MessageChannel.cpp:775: ReportConnectionError("MessageChannel::DispatchUrgentMessage"); ipc/glue/MessageChannel.cpp:812: ReportConnectionError("MessageChannel::DispatchRPCMessage"); The structure for those is: { // In order to send the parent RPC messages and guarantee it will // wake up, we must re-use its transaction. AutoEnterRPCTransaction transaction(this, recvd); MonitorAutoUnlock unlock(*mMonitor); DispatchUrgentMessage(*recvd); } if (!Connected()) { ReportConnectionError("MessageChannel::DispatchUrgentMessage"); return false; } So we unlock mMonitor in its scope, meaning we re-lock it when we call Connected(). So we still have the lock on ReportConnectionError call.
Assignee | ||
Comment 4•11 years ago
|
||
The MaybeHandleError() method does the same kind of mListener call, but it is always called under an explicit MonitorAutoUnlock.
Assignee | ||
Comment 5•11 years ago
|
||
I've tested my patch against try, everything is green: https://tbpl.mozilla.org/?tree=Try&rev=4c2edea8fb28
Assignee | ||
Comment 6•11 years ago
|
||
Attachment #8337230 -
Flags: review?(bent.mozilla)
Assignee | ||
Comment 7•11 years ago
|
||
Please find attached an explanatory graph of the locking.
Assignee | ||
Comment 8•11 years ago
|
||
Attachment #8337760 -
Attachment is obsolete: true
Updated•11 years ago
|
Whiteboard: [systemsfe]
Target Milestone: --- → 1.3 Sprint 6 - 12/6
Comment on attachment 8337230 [details] [diff] [review] 0001-Bug-942118-Ensure-that-we-call-MessageChannel-Report.patch Review of attachment 8337230 [details] [diff] [review]: ----------------------------------------------------------------- ::: ipc/glue/MessageChannel.cpp @@ +233,1 @@ > ReportConnectionError("MessageChannel"); All of this looks right to me but I wonder if we can't make it cleaner. Can we add |mMonitor->AssertCurrentThreadOwns()| to the beginning of ReportConnectionError, then add the |MonitorAutoUnlock unlock(*mMonitor);| there?
Assignee | ||
Comment 10•11 years ago
|
||
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #9) > Comment on attachment 8337230 [details] [diff] [review] > 0001-Bug-942118-Ensure-that-we-call-MessageChannel-Report.patch > > Review of attachment 8337230 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: ipc/glue/MessageChannel.cpp > @@ +233,1 @@ > > ReportConnectionError("MessageChannel"); > > All of this looks right to me but I wonder if we can't make it cleaner. > > Can we add |mMonitor->AssertCurrentThreadOwns()| to the beginning of > ReportConnectionError, then add the |MonitorAutoUnlock unlock(*mMonitor);| > there? I was not sure it would be the correct way to deal with it, since we perform lock/unlocking before calling functions at all the other call sites. But if you prefer it this way, and since I see no callpath where it would not be valid, I'll update the patch :)
Assignee | ||
Comment 11•10 years ago
|
||
Please find attached an updated version of the patch that addresses your comments. The Try instance is available at https://tbpl.mozilla.org/?tree=Try&rev=3592891d1b2b
Attachment #8337230 -
Attachment is obsolete: true
Attachment #8337230 -
Flags: review?(bent.mozilla)
Attachment #8345743 -
Flags: review?(bent.mozilla)
Comment on attachment 8345743 [details] [diff] [review] 0001-Bug-942118-Ensure-that-we-call-MessageChannel-Report.patch Review of attachment 8345743 [details] [diff] [review]: ----------------------------------------------------------------- Looks great, thanks!
Attachment #8345743 -
Flags: review?(bent.mozilla) → review+
Assignee | ||
Comment 13•10 years ago
|
||
Looks like this one has been forgotten :)
Keywords: checkin-needed
Comment 14•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/ee221e1e179f
Keywords: checkin-needed
Comment 15•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/ee221e1e179f
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: 1.3 Sprint 6 - 12/6 → 1.3 C2/1.4 S2(17jan)
You need to log in
before you can comment on or make changes to this bug.
Description
•