[e10s] Answering call in same Firefox instance crashes the content process

RESOLVED FIXED in mozilla35



3 years ago
3 years ago


(Reporter: Jan Melcher, Assigned: schien)




Firefox Tracking Flags



(crash signature)


(1 attachment, 3 obsolete attachments)



3 years ago
User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:31.0) Gecko/20100101 Firefox/31.0 (Beta/Release)
Build ID: 20140715214327

Steps to reproduce:

* Activate e10s
* Click the message icon in the menu bar
* Paste the link into the window (or in a new window of the same Nightly instance)
* Start the call
* Answer

Actual results:

The content process freezed, for 2-3 seconds, then crashed (Tab crashed).

[Parent 19710] WARNING: pipe error (3): Connection reset by peer: file /build/buildd/firefox-trunk-34.0~a1~hg20140803r197418/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 452

###!!! [Parent][MessageChannel] Error: Channel error: cannot send/recv

Expected results:

If this is due to calling oneself, an error should occur.


3 years ago
Summary: [e10s] Answering call in same window crashes the content process → [e10s] Answering call in same Firefox instance crashes the content process

Comment 1

3 years ago

Frame 	Module 	Signature 	Source
0 	xul.dll 	mozilla::dom::UDPSocketChild::RecvCallback(nsCString const&, UDPCallbackData const&, nsCString const&) 	dom/network/src/UDPSocketChild.cpp
1 	xul.dll 	mozilla::net::PUDPSocketChild::OnMessageReceived(IPC::Message const&) 	obj-firefox/ipc/ipdl/PUDPSocketChild.cpp
2 	xul.dll 	mozilla::dom::PContentChild::OnMessageReceived(IPC::Message const&) 	obj-firefox/ipc/ipdl/PContentChild.cpp
3 	xul.dll 	mozilla::ipc::MessageChannel::DispatchAsyncMessage(IPC::Message const&) 	ipc/glue/MessageChannel.cpp
4 	xul.dll 	mozilla::ipc::MessageChannel::DispatchMessageW(IPC::Message const&) 	ipc/glue/MessageChannel.cpp
5 	xul.dll 	mozilla::ipc::MessageChannel::OnMaybeDequeueOne() 	ipc/glue/MessageChannel.cpp
6 	xul.dll 	MessageLoop::RunTask(Task*) 	ipc/chromium/src/base/message_loop.cc
7 	xul.dll 	MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) 	ipc/chromium/src/base/message_loop.cc
8 	xul.dll 	MessageLoop::DoWork() 	ipc/chromium/src/base/message_loop.cc
9 	xul.dll 	mozilla::ipc::DoWorkRunnable::Run() 	ipc/glue/MessagePump.cpp
10 	xul.dll 	nsThread::ProcessNextEvent(bool, bool*) 	xpcom/threads/nsThread.cpp
11 	xul.dll 	NS_ProcessNextEvent(nsIThread*, bool) 	xpcom/glue/nsThreadUtils.cpp
Blocks: 899758
Severity: normal → critical
Crash Signature: [@ mozilla::dom::UDPSocketChild::RecvCallback(nsCString const&, UDPCallbackData const&, nsCString const&) ]
tracking-e10s: --- → ?
Component: Untriaged → Networking
Ever confirmed: true
Keywords: crash
Product: Firefox → Core
tracking-e10s: ? → -


3 years ago
Component: Networking → DOM

Comment 2

3 years ago
This looks like the UDPSocketChild is already freed, given the 0x5a5a5a76 signature (presumably mSocket is offset appropriately from 5a).
While I reproduce this on debug build, it actually assert in here: http://dxr.mozilla.org/mozilla-central/source/media/mtransport/nr_socket_prsock.cpp#810.

This also explains why mSocket is freed because the NrSocketIpc will be delete explicitly while error occurred in open. http://dxr.mozilla.org/mozilla-central/source/media/mtransport/nr_socket_prsock.cpp#1153

Two actions need to take:
1. find out why UDPSocketChild provide an socket that binds to an unexpected interface
2. fix the error handling for socket creation.
The |expected_addr| has an unexpected protocol type |IPPROTO_TCP| and I found the incorrect protocol type is given from the parameter of |nr_socket_local_create|. I discovered that we use the same addr struct for both ICE UDP and ICE TCP [1] and it might be the root cause of unmatched NrSocketIpc.my_addr_.
ni? @bwc about the decision of where to fix.

[1] http://dxr.mozilla.org/mozilla-central/source/media/mtransport/third_party/nICEr/src/ice/ice_component.c#404
Flags: needinfo?(docfaraday)
Created attachment 8479771 [details] [diff] [review]

I've confirmed that NrSocketIpc.my_addr_.protocol is the root cause. With this workaround patch the WebRTC call can establish in e10s window successfully.

Comment 6

3 years ago
Comment on attachment 8479771 [details] [diff] [review]

Review of attachment 8479771 [details] [diff] [review]:

This patch breaks TURN-TCP.

The fix is for someone to implement TCP socket remoting
and then plug it in here. That will also get us TURN-TCP
on B2G
Attachment #8479771 - Flags: review-

Comment 7

3 years ago
Shouldn't fixing the error handling (ie; make the failure path for TCP graceful) solve this problem in a way that doesn't present a barrier to implementing IPC for TCP later?
Flags: needinfo?(docfaraday)
Ok, I forgot that we are using |nr_socket_local_create| to create both TCP and UDP socket. Before Bug 950660 is landed, I'll suggest to make Remote TCP socket creation fail.

ps. Firefox OS can survive from this because TURN-TCP is disable on b2g.
Created attachment 8480346 [details] [diff] [review]
Assignee: nobody → schien
Attachment #8479771 - Attachment is obsolete: true
Attachment #8480346 - Flags: review?(ekr)
Comment on attachment 8480346 [details] [diff] [review]

Review of attachment 8480346 [details] [diff] [review]:

I'm not sure this is what we want since it ig going to bubble up errors
when socket creation fails. Can we instead conditionalize nr_ice_component_initialize_tcp
by adding a registry entry for doing TCP?


Byron, what do you think?
Attachment #8480346 - Flags: review?(ekr)
If you are going to follow this path, please add a log message so that when
nr_ice_component_initialize_tcp fails, we don't ahve to look far to know
Or maybe we can prevent doing TURN-TCP in PeerConnectionImpl.cpp, just like what we did for Firefox OS? http://dxr.mozilla.org/mozilla-central/source/media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp?from=PeerConnectionImpl.cpp&case=true#725
(In reply to Shih-Chiang Chien [:schien] (UTC+8) from comment #12)
> Or maybe we can prevent doing TURN-TCP in PeerConnectionImpl.cpp, just like
> what we did for Firefox OS?
> http://dxr.mozilla.org/mozilla-central/source/media/webrtc/signaling/src/
> peerconnection/PeerConnectionImpl.cpp?from=PeerConnectionImpl.
> cpp&case=true#725

I mean prevent TURN-TCP in content process specifically.
Duplicate of this bug: 1060654
Created attachment 8485417 [details] [diff] [review]

This patch disable TURN-TCP for both b2g and e10s window. Content process will not crash after applying this patch, however the WebRTC from content process to remote cannot establish successfully. The video stream from remote is displayed normally. I will flag r? if we only want to prevent content process crashing in this bug. Or, if we also want WebRTC works on Firefox e10s window, I'll unassign myself for WebRTC network peer to investigate the protocol stack.
Attachment #8480346 - Attachment is obsolete: true
Attachment #8485417 - Flags: feedback?(ekr)
Attachment #8485417 - Flags: feedback?(docfaraday)
Comment on attachment 8485417 [details] [diff] [review]

Review of attachment 8485417 [details] [diff] [review]:

This looks correct. However, we need to determine how we are prioritizing Bug 950660.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp
@@ +727,5 @@
>        NS_ConvertUTF16toUTF8 credential(server.mCredential);
>        NS_ConvertUTF16toUTF8 username(server.mUsername);
> +      // Bug 1039655 - TURN TCP is not e10s ready
> +      if (transport == kNrIceTransportTcp &&

Please add parentheses to make precedence clear.
Attachment #8485417 - Flags: feedback?(ekr) → feedback+
Created attachment 8485613 [details] [diff] [review]

Attachment #8485417 - Attachment is obsolete: true
Attachment #8485417 - Flags: feedback?(docfaraday)
Attachment #8485613 - Flags: review?(ekr)
Comment on attachment 8485613 [details] [diff] [review]

Review of attachment 8485613 [details] [diff] [review]:

Attachment #8485613 - Flags: review?(ekr) → review+
Keywords: checkin-needed
Keywords: checkin-needed
Last Resolved: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
You need to log in before you can comment on or make changes to this bug.