UnixSocketConsumer::OnDisconnect() won't be called when remote socket closed

RESOLVED FIXED

Status

--
major
RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: echou, Assigned: echou)

Tracking

unspecified
ARM
Gonk (Firefox OS)
Dependency tree / graph

Firefox Tracking Flags

(blocking-b2g:-)

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Assignee)

Description

6 years ago
I tried to send a file via BluetoothOppManager, which inherits UnixSocketConsumer. After the whole sending process ended, BluetoothOppManager::OnDisconnect() wouldn't be called.

The patches landed through Bug 836523 changed the flow of handling passive disconnection to avoid mConsumer being used after deleted. However, NotifyDisconnect() wouldn't be called if the connected remote device dropped the socket connection.
(Assignee)

Comment 1

6 years ago
Created attachment 715316 [details] [diff] [review]
patch 1: v1: Ensure the flows of handling "socket disconnection" are the same

For "actively calling CloseSocket()" and "passively noticing the socket has been closed", we want to ensure that the process of closing a socket are the same. Therefore I use a runnable to call CloseSocket() on the main thread when it can't read data from the socket anymore.
Assignee: nobody → echou
Attachment #715316 - Flags: review?(kyle)
Attachment #715316 - Flags: feedback?(tzimmermann)
(Assignee)

Comment 2

6 years ago
Created attachment 715416 [details] [diff] [review]
patch 1: v2: Ensure the flows of handling "socket disconnection" are the same

Re-added 

  mReadWatcher.StopWatchingFileDescriptor();
  mWriteWatcher.StopWatchingFileDescriptor();

before new RequestClosingSocketTask(), or main process would seem to be blocked.
Attachment #715316 - Attachment is obsolete: true
Attachment #715316 - Flags: review?(kyle)
Attachment #715316 - Flags: feedback?(tzimmermann)
Attachment #715416 - Flags: review?(kyle)
Attachment #715416 - Flags: feedback?(tzimmermann)
I can't test this patch, because Bluetooth on  m-i is broken for me since last Friday. Just two thoughts about it:

- The new Runnable gets created in the error path of the read call. Is this really the correct place to do this? Please correct me, but the function OnFileCanReadWithoutBlocking will only run if there is data available for reading. And then read needs to fail with an error to run the new code. This just doesn't look right to me.

- I though this patch is supposed to fix bug 840925. I don't understand how.
Thomas, I agree with your second point and I'm still looking for the root cause for bug 840925. It seems like the dbus connection is broken after we remove device without close socket first.
Sorry, my bad. I think I mean bug 841984.

(In reply to Gina Yeh from comment #4)
> Thomas, I agree with your second point and I'm still looking for the root
> cause for bug 840925. It seems like the dbus connection is broken after we
> remove device without close socket first.
Blocks: 841984
(Assignee)

Comment 6

6 years ago
Hi Thomas,

(In reply to Thomas Zimmermann [:tzimmermann] from comment #3)
> I can't test this patch, because Bluetooth on  m-i is broken for me since
> last Friday. Just two thoughts about it:
> 
> - The new Runnable gets created in the error path of the read call. Is this
> really the correct place to do this? Please correct me, but the function
> OnFileCanReadWithoutBlocking will only run if there is data available for
> reading. And then read needs to fail with an error to run the new code. This
> just doesn't look right to me.
> 

Function read() would return -1 when error occurs. If you try to send a photo to remote device, at the end of sending process, many devices will close the socket directly. In this case, you may find -1 is returned with error number 0x68, which is ECONNRESET (Connection reset by peer).

The problem we've got now is, we don't treat "Actively close socket" and "Passively notice that remote user has closed the socket" in the same way: One calls NotifyDisconnect(), and the other one doesn't. That will lead to:

  (1) Wrong socket status
  (2) UnixSocketConsumer won't be notified

That's the reason why this patch works.

> - I though this patch is supposed to fix bug 840925. I don't understand how.

I don't know either. I suggest that we should fix this first then check how those crashes happened.
(Assignee)

Comment 7

6 years ago
ok, this patch can also fix bug 840544. The root cause of Bug 840544 is accessing an invalid impl in CloseSocket(). The detailed scenario is:

1. Remote connected device close its socket.
2. In UnixSocketImpl::OnFileCanReadWithoutBlocking(), function read() return -1 with errno = 0x68 (Connection reset by peer).
3. PostTask(FROM_HERE, new SocketCloseTask(this)) will be called to put SocketCloseTask on IO thread.
4. In SocketCloseTask::Run(), mImpl->Close() will be called, and that will create a DeleteInstanceRunnable object for deleting UnixSocketImpl object, which is also the same as mConsumer.mImpl.
5. Once BluetoothOppManager found that the remote device sent a Disconnect packet to us but never closed the socket, we would call CloseSocket() after delaying 1 sec.
6. In function CloseSocket(), mImpl has already been deleted but will still be used. Crash.
Wondering is this'll fix bug 841925 too. Doubt it, but will give it a test.
Comment on attachment 715416 [details] [diff] [review]
patch 1: v2: Ensure the flows of handling "socket disconnection" are the same

Review of attachment 715416 [details] [diff] [review]:
-----------------------------------------------------------------

Looks ok to me, though I'm interested to hear tzimmermann's feedback too.
Attachment #715416 - Flags: review?(kyle) → review+
Comment on attachment 715416 [details] [diff] [review]
patch 1: v2: Ensure the flows of handling "socket disconnection" are the same

Review of attachment 715416 [details] [diff] [review]:
-----------------------------------------------------------------

Oh, I was wondering how this patch fixes bug 840925. But since this isn't the intention, there isn't much else for me to comment about. Maybe you could add some error checking to lines 697 and 698.

Regards
Attachment #715416 - Flags: feedback?(tzimmermann) → feedback+
(Assignee)

Comment 12

6 years ago
Kyle & Thomas, thanks for your review and suggestion.
https://hg.mozilla.org/mozilla-central/rev/3994cedc448c
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
blocking-b2g: --- → leo+
This is going to need to be rebased for b2g18 uplift on top of bug 845148 and bug 836523.
(Assignee)

Comment 16

6 years ago
(In reply to Ryan VanderMeulen [:RyanVM] from comment #15)
> This is going to need to be rebased for b2g18 uplift on top of bug 845148
> and bug 836523.

The problem doesn't appear on b2g18. OnDisconnect() would be called either CloseSocket() actively or get an error in UnixSocketImpl::OnFileCanReadWithoutBlocking(). So the patch doesn't need to be merged into b2g18. In addition, re-nominate the blocking-b2g flag since it won't affect b2g18.
blocking-b2g: leo+ → ---
(Assignee)

Comment 17

6 years ago
sorry, forgot to switch to leo?
blocking-b2g: --- → leo?
Eric, bug 840544 claims to be fixed by this. Should that also be leo?
(Assignee)

Comment 19

6 years ago
(In reply to Ryan VanderMeulen [:RyanVM] from comment #18)
> Eric, bug 840544 claims to be fixed by this. Should that also be leo?

Updated. Thank you.
blocking-b2g: leo? → -
triage: blocking- per comment #16
You need to log in before you can comment on or make changes to this bug.