[OOPP] Assertion in RPCChannel's OnMaybeDequeueOne: ABORT: wrong message type [@ _filbuf]

RESOLVED FIXED

Status

Core
Plug-ins
8 years ago
7 years ago

People

(Reporter: jimm, Assigned: cjones)

Tracking

Trunk
x86
All

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments, 1 obsolete attachment)

(Reporter)

Description

8 years ago
Created attachment 426849 [details]
parent stack

Console output:

###!!! [RPCChannel][Parent][f:/Mozilla/firefox/mozilla-central/ipc/glue/RPCChannel.cpp:358] Assertion (call.is_rpc() && !call.is_reply()) failed.  wrong message type (triggered by rpc)
  local RPC stack size: 2
  remote RPC stack guess: 0
  deferred stack size: 0
  out-of-turn RPC replies stack size: 0
  Pending queue size: 3, front to back:
    [ async ]
    [ async ]
    [ async ]
    [ async ]
###!!! ABORT: wrong message type: file f:/Mozilla/firefox/mozilla-central/ipc/glue/RPCChannel.cpp, line 583

Parent stack attached. This happened while selecting flash context menu options.

Comment 1

8 years ago
This is likely the topcrash @_filbuf... you have steps to reproduce?
(Reporter)

Comment 2

8 years ago
(In reply to comment #1)
> This is likely the topcrash @_filbuf... you have steps to reproduce?

I was testing the URL Ria mentioned in bug 539835:

http://www.rtl.nl/components/actueel/rtlboulevard/miMedia/2009/week26/do_krezip.avi_plain.xml

I right clicked the advertisement, selected something, right clicked again, selected something, got the crash.
(Reporter)

Comment 3

8 years ago
Here's |this| in a Google spreadsheet, captured in a subsequent crash.

http://spreadsheets.google.com/ccc?key=0AsaQJjYaC4GYdDlkOWdzTnh4cG16Zm9TZG1fSHk4Umc&hl=en
(Reporter)

Comment 4

8 years ago
Something is going wrong in the child here. Whenever I get into this situation, the child is wrapped up in WaitForNotify with mStack.size() == 1, and either the I/O thread signaled via PostThreadMessage and that message was lost, or the message was never sent.

We can't ever get stuck in WaitForNotify in the child, especially when context menus are being displayed, because user input in the UI will never get processed. This might be bug 545338. I was thinking it might be a case where NotifyWorkerThread is being called while the TrackPopupMenu dispatch loop is running, but afaict that can't happen. All work should be processed through OnMaybeDequeueOne in those cases.

Updated

8 years ago
Blocks: 536666

Comment 5

8 years ago
Getting stuck in WaitForNotify is not bad in itself: it's only bad if the parent doesn't respond (or the response gets lost). bent can you record?
Assignee: nobody → bent.mozilla
Created attachment 427465 [details] [diff] [review]
Allow more than one deferred messages, v1

The nested event loop will now allow the child to receive more than one deferred message. We have to support that.
Attachment #427465 - Flags: review?(jones.chris.g)
Attachment #427465 - Flags: review?(jmathies)
Comment on attachment 427465 [details] [diff] [review]
Allow more than one deferred messages, v1

Moved that to bug 546797. Different assertion.
Attachment #427465 - Attachment is obsolete: true
Attachment #427465 - Flags: review?(jones.chris.g)
Attachment #427465 - Flags: review?(jmathies)
Created attachment 431464 [details] [diff] [review]
Check if we're still connected before dispatching a received message

Should be relatively easy to write an IPDL/C++ test for this.
Assignee: bent.mozilla → jones.chris.g
Attachment #431464 - Flags: review?(benjamin)
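
For illustration, the idea named in the attachment title can be sketched as follows. This is a minimal, hypothetical model and not the patch itself: `SketchChannel`, `OnChannelError`, and `MaybeDequeueOne` are made-up names, and the threading and RPC stack state of the real RPCChannel are left out. The sketch only shows that a dequeue event firing after the channel has gone away drops its queued messages instead of dispatching them.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical stand-in for a channel; not the real RPCChannel API.
class SketchChannel {
public:
    // I/O side: a message arrived and a "dequeue one" event was scheduled.
    void Enqueue(const std::string& name) { mPending.push_back(name); }

    // I/O side: the peer went away (for example, it was killed as hung).
    void OnChannelError() { mConnected = false; }

    // Worker side: a previously scheduled dequeue event finally runs.
    // Returns true only if a message was actually dispatched.
    bool MaybeDequeueOne() {
        if (!mConnected) {
            // The guard: an event scheduled before the error can still
            // fire afterwards, so drop stale messages rather than dispatch.
            mPending.clear();
            return false;
        }
        if (mPending.empty()) {
            return false;
        }
        std::string name = mPending.front();
        mPending.pop_front();
        Dispatch(name);
        return true;
    }

    const std::vector<std::string>& Dispatched() const { return mDispatched; }
    std::size_t PendingCount() const { return mPending.size(); }

private:
    void Dispatch(const std::string& name) {
        // Invariant the guard protects: dispatch only on a live channel.
        assert(mConnected);
        mDispatched.push_back(name);
    }

    std::deque<std::string> mPending;
    std::vector<std::string> mDispatched;
    bool mConnected = true;
};
```

With two messages queued, a `MaybeDequeueOne()` call before `OnChannelError()` dispatches the first one; a call after it returns false and leaves nothing pending.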

Updated

8 years ago
Attachment #431464 - Flags: review?(benjamin) → review+
The Windows hang detector landed in a changeset by Ben Turner <bent.mozilla@gmail.com> on Thu Feb 11 12:19:21 2010 -0800, so the timing seems right for this hypothesis.
I noticed that triggering this is more complicated than it first looked: we also need a pending OnMaybeDequeueOneEvent() event at the moment the RPC reply racing with the hang kill is received. This is definitely possible if a lot of InvalidateRect()s are flying around.
Yay!

$ ipdltest TestHangs
 (child process is 'hanging' now)

###!!! [Parent][RPCChannel] Error: Channel error: cannot send/recv


###!!! [Parent][RPCChannel] Error: Channel error: cannot send/recv

###!!! [RPCChannel][Parent][/home/cjones/mozilla/mozilla-central/ipc/glue/RPCChannel.cpp:376] Assertion (call.is_rpc() && !call.is_reply()) failed.  wrong message type (triggered by rpc)
  local RPC stack size: 3
  remote RPC stack guess: 0
  deferred stack size: 0
  out-of-turn RPC replies stack size: 0
  Pending queue size: 0, front to back:
###!!! ABORT: wrong message type: file /home/cjones/mozilla/mozilla-central/ipc/glue/RPCChannel.cpp, line 601

###!!! [Child][RPCChannel] Error: Channel error: cannot send/recv


###!!! [Child][RPCChannel] Error: Channel error: cannot send/recv

The test is somewhat complicated and nondeterministic, but c'est la guerre.
Confirmed that the fix handles this.

Updated

8 years ago
Summary: [OOPP] Assertion in RPCChannel's OnMaybeDequeueOne: ABORT: wrong message type → [OOPP] Assertion in RPCChannel's OnMaybeDequeueOne: ABORT: wrong message type [@ _filbuf]

Updated

8 years ago
Duplicate of this bug: 536666
We've just been lucky that this hasn't surfaced on Linux yet.
OS: Windows 7 → All
(In reply to comment #8)
> Created an attachment (id=431464) [details]
> Check if we're still connected before dispatching a received message
> 

This patch was interfering with the delivery of the GOODBYE message, so I'll actually be landing a modified version.  Should have run *all* the unit tests when checking it!
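
To make the GOODBYE problem concrete, here is a hedged sketch of one possible way a connected-check could avoid swallowing a shutdown message. It is an assumption for illustration only and does not describe the modified version that landed: `GuardedQueue`, `Kind::Goodbye`, and `Drain` are hypothetical names, and the real channel logic is more involved.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical message kinds; the real channel has its own message types.
enum class Kind { Normal, Goodbye };

struct Msg {
    Kind kind;
    std::string name;
};

// Hypothetical queue that drops stale messages after a disconnect but
// still lets the shutdown notification through.
class GuardedQueue {
public:
    void Enqueue(Msg m) { mPending.push_back(std::move(m)); }

    void Disconnect() { mConnected = false; }

    // Drains the queue and returns the names of the delivered messages.
    std::vector<std::string> Drain() {
        std::vector<std::string> delivered;
        while (!mPending.empty()) {
            Msg m = std::move(mPending.front());
            mPending.pop_front();
            // A plain "if (!mConnected) drop" would also discard Goodbye;
            // exempting it is the assumption this sketch illustrates.
            if (!mConnected && m.kind != Kind::Goodbye) {
                continue;
            }
            delivered.push_back(m.name);
        }
        return delivered;
    }

private:
    std::deque<Msg> mPending;
    bool mConnected = true;
};
```

If a normal message and a goodbye message are both queued when `Disconnect()` is called, `Drain()` delivers only the goodbye message.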
Chris: Please see https://bugzilla.mozilla.org/show_bug.cgi?id=552002#c1 - I got this stack today in that bug while trying to repro a Mac crash by navigating to http://www.scribd.com/doc/25368889/Tina-Konyvek-Csodaszep-es-finom-tortak-sutemenyek and right-clicking in the content area.
That bug definitely isn't dispatch-after-disconnected, which I'd like this bug to cover.  Looks like something else, and bad :(.