Closed Bug 546035 Opened 14 years ago Closed 14 years ago

[OOPP] Assertion in RPCChannel's OnMaybeDequeueOne: ABORT: wrong message type [@ _filbuf]

Product/Component: Core Graveyard :: Plug-ins
Type: defect
Hardware: x86
OS: All
Priority: Not set
Severity: normal
Tracking: Not tracked
Status: RESOLVED FIXED
Reporter: jimm
Assignee: cjones

Attachments

(3 files, 1 obsolete file)

Attached file parent stack
Console output:

###!!! [RPCChannel][Parent][f:/Mozilla/firefox/mozilla-central/ipc/glue/RPCChannel.cpp:358] Assertion (call.is_rpc() && !call.is_reply()) failed.  wrong message type (triggered by rpc)
  local RPC stack size: 2
  remote RPC stack guess: 0
  deferred stack size: 0
  out-of-turn RPC replies stack size: 0
  Pending queue size: 3, front to back:
    [ async ]
    [ async ]
    [ async ]
    [ async ]
###!!! ABORT: wrong message type: file f:/Mozilla/firefox/mozilla-central/ipc/glue/RPCChannel.cpp, line 583

Parent stack attached. This happened while selecting flash context menu options.
This is likely the topcrash @_filbuf... do you have steps to reproduce?
(In reply to comment #1)
> This is likely the topcrash @_filbuf... you have steps to reproduce?

I was testing the url Ria mentioned in bug 539835 - 

http://www.rtl.nl/components/actueel/rtlboulevard/miMedia/2009/week26/do_krezip.avi_plain.xml

I right clicked the advertisement, selected something, right clicked again, selected something, got the crash.
Here's |this| in a google spreadsheet in a subsequent crash.

http://spreadsheets.google.com/ccc?key=0AsaQJjYaC4GYdDlkOWdzTnh4cG16Zm9TZG1fSHk4Umc&hl=en
Something is going wrong in the child here. Whenever I get into this situation, the child is wrapped up in WaitForNotify with mStack.size() == 1, and either the IO thread signaled via PostThreadMessage and that message was lost, or the message was never sent.

We can't ever get stuck in WaitForNotify in the child, especially when context menus are being displayed: user input in the UI will never get processed. This might be bug 545338. I was thinking it might be a case where NotifyWorkerThread is being called while the TrackPopupMenu dispatch loop is running, but afaict that can't happen. All work should be processed through OnMaybeDequeueOne in those cases.
Blocks: 536666
Getting stuck in WaitForNotify is not bad in itself: it's only bad if the parent doesn't respond (or the response gets lost). bent, can you record?
Assignee: nobody → bent.mozilla
The nested event loop will now allow the child to receive more than one deferred message. We have to support that.
Attachment #427465 - Flags: review?(jones.chris.g)
Attachment #427465 - Flags: review?(jmathies)
Comment on attachment 427465 [details] [diff] [review]
Allow more than one deferred messages, v1

Moved that to bug 546797. Different assertion.
Attachment #427465 - Attachment is obsolete: true
Attachment #427465 - Flags: review?(jones.chris.g)
Attachment #427465 - Flags: review?(jmathies)
Should be relatively easy to write an IPDL/C++ test for this.
Assignee: bent.mozilla → jones.chris.g
Attachment #431464 - Flags: review?(benjamin)
Attachment #431464 - Flags: review?(benjamin) → review+
Windows hang detector landed in

Ben Turner <bent.mozilla@gmail.com>
	Thu Feb 11 12:19:21 2010 -0800 (at Thu Feb 11 12:19:21 2010 -0800)

so the timing seems right for this hypothesis.
I noticed that triggering this is more complicated: we also need a pending OnMaybeDequeueOneEvent() event at the moment the RPC reply that races with the hang kill is received.  This is definitely possible if a lot of InvalidateRect()s are flying around.
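For readers following along, here is a rough toy model of that interleaving. It is only a sketch under assumptions: ToyChannel, Kind, HangKill, RunDequeueEvent and Replay are hypothetical names invented for illustration, not the real RPCChannel API, and the model ignores threads and the RPC stack entirely. It shows one dequeue event that is already pending when an RPC reply lands in the queue and the channel is torn down, with and without a "still connected?" check before dispatch.

```cpp
#include <deque>
#include <string>
#include <vector>

// Hypothetical message kinds; the real IPC message type carries far more state.
enum class Kind { Async, RpcCall, RpcReply };

// Toy stand-in for a channel. Not the real RPCChannel.
struct ToyChannel {
    bool connected = true;
    std::deque<Kind> pending;        // messages queued for the worker thread
    std::vector<std::string> log;    // what each dequeue event ended up doing

    // A message arrives and is queued for a later dequeue event.
    void Receive(Kind k) { pending.push_back(k); }

    // Assumption: killing a hung peer simply marks the channel as closed.
    void HangKill() { connected = false; }

    // Models one already-posted dequeue event finally running.
    // checkConnected toggles the guard discussed in this bug.
    void RunDequeueEvent(bool checkConnected) {
        if (checkConnected && !connected) {
            log.push_back("skipped: channel closed");
            return;
        }
        if (pending.empty()) return;
        Kind k = pending.front();
        pending.pop_front();
        // A dequeue event expects an incoming call, never a reply.
        if (k == Kind::RpcReply) {
            log.push_back("wrong message type");
        } else {
            log.push_back("dispatched");
        }
    }
};

// Replays the assumed interleaving: reply queued, peer killed, stale event runs.
std::vector<std::string> Replay(bool checkConnected) {
    ToyChannel ch;
    ch.Receive(Kind::RpcReply);
    ch.HangKill();
    ch.RunDequeueEvent(checkConnected);
    return ch.log;
}
```

In this model, Replay(false) records "wrong message type" while Replay(true) records "skipped: channel closed". The real code is multi-threaded and the actual patch may be structured quite differently.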
Yay!

$ ipdltest TestHangs
 (child process is 'hanging' now)

###!!! [Parent][RPCChannel] Error: Channel error: cannot send/recv


###!!! [Parent][RPCChannel] Error: Channel error: cannot send/recv

###!!! [RPCChannel][Parent][/home/cjones/mozilla/mozilla-central/ipc/glue/RPCChannel.cpp:376] Assertion (call.is_rpc() && !call.is_reply()) failed.  wrong message type (triggered by rpc)
  local RPC stack size: 3
  remote RPC stack guess: 0
  deferred stack size: 0
  out-of-turn RPC replies stack size: 0
  Pending queue size: 0, front to back:
###!!! ABORT: wrong message type: file /home/cjones/mozilla/mozilla-central/ipc/glue/RPCChannel.cpp, line 601

###!!! [Child][RPCChannel] Error: Channel error: cannot send/recv


###!!! [Child][RPCChannel] Error: Channel error: cannot send/recv

The test is somewhat complicated and nondeterministic, but c'est la guerre.
Confirmed that the fix handles this.
Summary: [OOPP] Assertion in RPCChannel's OnMaybeDequeueOne: ABORT: wrong message type → [OOPP] Assertion in RPCChannel's OnMaybeDequeueOne: ABORT: wrong message type [@ _filbuf]
We've just been lucky that this hasn't surfaced on Linux yet.
OS: Windows 7 → All
(In reply to comment #8)
> Created an attachment (id=431464) [details]
> Check if we're still connected before dispatching a received message
> 

This patch was interfering with the delivery of the GOODBYE message, so I'll actually be landing a modified version.  Should have run *all* the unit tests when checking it!
Chris: Please see https://bugzilla.mozilla.org/show_bug.cgi?id=552002#c1 - I got this stack today in that bug while trying to repro a Mac crash when navigating to http://www.scribd.com/doc/25368889/Tina-Konyvek-Csodaszep-es-finom-tortak-sutemenyek and right clicking in the content area.
That bug definitely isn't dispatch-after-disconnected, which I'd like this bug to cover.  Looks like something else, and bad :(.
Product: Core → Core Graveyard