Closed Bug 900062 Opened 11 years ago Closed 11 years ago

Don't MOZ_ASSERT when trying to send nested sync messages

Categories

(Core :: IPC, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla26

People

(Reporter: dvander, Assigned: dvander)

References

Details

Attachments

(1 file)

Right now it's easy to crash a debug e10s build because sendSyncMessage can be called with another sendSyncMessage on the stack, and there's a debug assert that catches this. It's easy to trigger because addons can send urgent messages while replying to a synchronous message. If it were just C++ code, I'd say keep the assert, but as-is it's easy to trigger with addons. I couldn't see an easy way to test from nsFrameMessageManager either, but if exposing something and moving the soft-fail check there is better, I can do that instead. (FWIW, we're aiming to introduce a new protocol soon to solve this problem anyway.)
Attachment #783815 - Flags: review?(cjones.bugs)
Comment on attachment 783815 [details] [diff] [review] SoftFailNestedSync.patch Probably unnecessary to say that this doesn't make me all that happy overall, but I think it's fine as an evolutionary step. Thanks for adding the check to the test. r+ with a comment in Send() noting that a better solution is "Under Construction", and the hard assertion will come back.
Attachment #783815 - Flags: review?(cjones.bugs) → review+
I have two concerns about this: Right now it's an invariant that if an IPDL message returns "false" then the entire IPDL connection is torn down (the child process is killed). It does not appear that this patch maintains that invariant, and I'm worried that this will potentially cause leaks because there are some plugin cases which rely on this invariant, and I believe the DOM-IPC B2G code does also. I think that this patch ought to cause the content process to be killed and channel torn down if it encounters this situation. > Right now it's easy to crash a debug e10s build because sendSyncMessage can > be called with another sendSyncMessage on the stack, and there's a debug > assert that catches this. It's easy to trigger because addons can send > urgent messages while replying to a synchronous message. In what cases are addons replying to synchronous messages? Typically this represents a design flaw: the code which replies to a synchronous message should be very tight and well-contained, and shouldn't end up running arbitrary code.
(In reply to Benjamin Smedberg [:bsmedberg] PTO 8-Aug until 18-Aug, workweek high latency 19-Aug through 23-Aug from comment #2) > Right now it's an invariant that if an IPDL message returns "false" then the > entire IPDL connection is torn down (the child process is killed). It does > not appear that this patch maintains that invariant, and I'm worried that > this will potentially cause leaks because there are some plugin cases which > rely on this invariant, and I believe the DOM-IPC B2G code does also. I > think that this patch ought to cause the content process to be killed and > channel torn down if it encounters this situation. Keeping in mind, this is a really temporary hack since we're working on a new protocol to solve the underlying problem. Sync messages only awake for urgent messages, so the only way to encounter this problem is with an addon and browser.tabs.remote=true. Since it's fairly easy for addons to trigger, I'd rather not have the entire child process die. For example, "Flash and Video Downloader" + navigating to youtube.com would be impossible on an e10s build. Outside of that specific combination, the !mPendingReply assert should not be reachable unless something is terribly wrong. (Worth noting, the assert was added by the CPOW code in the first place since we knew we didn't have the support yet.) > In what cases are addons replying to synchronous messages? Typically this > represents a design flaw: the code which replies to a synchronous message > should be very tight and well-contained, and shouldn't end up running > arbitrary code. The control-flow is something like: (1) Child process calls sendSyncMessage informing addons about an event with a DOM CPOW. (2) Parent process receives event, queries something on the CPOW. (3) Parent process, blocked, sends an urgent CPOW message to the child process. (4) Child process wakes up, process the urgent message while still waiting for a sync reply. (5) Touching the DOM node through the CPOW causes a new event to be fired, and sendSyncMessage() is called again, while the original is still on the stack. At this point the parent process is now deadlocked, since it will not wake up to process an in-sync message, and the child is blocked on the second reply. Single-process Firefox handles this fine since it can just nest calls to event handlers. The motivation for bug 901789 is to fix this. It will restrict urgent messages such that they can only be initiated from the parent process. Then in another bug, we can easily introduce a new protocol that mirrors "urgent" - something that can be initiated only from the child process, interrupt urgent outcalls on the parent, and is re-entrant.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla26
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: