Closed
Bug 843857
Opened 11 years ago
Closed 10 years ago
Fix connection error handling in UnixSocket
Categories
(Firefox OS Graveyard :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
1.4 S1 (14feb)
People
(Reporter: qdot, Assigned: qdot)
References
Details
(Whiteboard: [systemsfe][p=8])
Attachments
(3 files, 4 obsolete files)
9.25 KB,
patch
|
Details | Diff | Splinter Review | |
1.03 KB,
patch
|
echou
:
review+
|
Details | Diff | Splinter Review |
7.52 KB,
patch
|
tzimmermann
:
review+
|
Details | Diff | Splinter Review |
UnixSocketImpl::Connect() can fail in multiple ways, but we'll never fire back an error event or even clean up the implementation.
Assignee | ||
Comment 2•10 years ago
|
||
Been cleaning up old bugs, figured I'd do this one real quick, and realized that we've got more problems than just not reporting back to the main thread at every point where failure happens. The places we call reset() on the socket are pretty much random, and we never clean up our watchers when we fail as part of listen/connect finishes in onFileCan*. Should we be calling the CloseSocket function there instead of just reset()? Also, copy/pasting the NS_WARNING and task creation/send everywhere is messy, but I did it that way to keep the line numbers for the NS_WARNINGs valid. Thomas: Any thoughts on how to improve this situation? Or am I worrying about the wrong things? :)
Attachment #8345454 -
Flags: feedback?(tzimmermann)
Comment 3•10 years ago
|
||
Comment on attachment 8345454 [details] [diff] [review] 0001-Bug-843857-Make-sure-all-socket-errors-fire-events-b.patch Review of attachment 8345454 [details] [diff] [review]: ----------------------------------------------------------------- Hi Kyle, You're right, this code still has lots of problems. I think we should take a cleaned up version of you're patch to improve things a bit. After an error happened, the user will (hopefully) call UnixSocketConsumer::CloseSocket, which will remove the file descriptor. For 1.4, I have plans for improving the overall implementation of UnixSocket and fixing many of its problems. IMHO UnixSocketImpl should be split up into different classes for accept, listen, connect and data transfer, with each only handling its own operation. When an impl finishes, either with success or failure, it removes its watchers from the I/O loop and informs UnixSocketConsumer about the result. UnixSocketConsumer is then responsible for creating a new impl for the next phase. The overhead is probably slightly higher than now, but the code should be much cleaner and state handling should be simplified. ::: ipc/unixsocket/UnixSocket.cpp @@ +490,4 @@ > } > > void > +UnixSocketImpl::FireSocketError(const nsAString& aErrorMsg) I would leave the error messages were they are now and only do the work here. @@ +494,5 @@ > +{ > + NS_WARNING(aErrorMsg); > + nsRefPtr<OnSocketEventTask> t = > + new OnSocketEventTask(this, OnSocketEventTask::CONNECT_ERROR); > + NS_DispatchToMainThread(t); Make sure you set mConnectionStatus to SOCKET_DISCONNECTED, because of bug 936402. And this method could also remove the watchers from the I/O loop. @@ +502,5 @@ > UnixSocketImpl::Accept() > { > MOZ_ASSERT(!NS_IsMainThread()); > > if (!mConnector) { I think all tests for mConnector can be turned to asserts. There are already MOZ_ASSERTs for aConnector when creating UnixSocketImpl in ConnectSocket or ListenSocket, so mConnector should never be null. @@ +516,5 @@ > if (!mConnector->CreateAddr(true, mAddrSize, mAddr, nullptr)) { > NS_WARNING("Cannot create socket address!"); > + nsRefPtr<OnSocketEventTask> t = > + new OnSocketEventTask(this, OnSocketEventTask::CONNECT_ERROR); > + NS_DispatchToMainThread(t); Again, mConnectionStatus = SOCKET_DISCONNECTED. Here and in all instances below. Or just replace all of these patterns with calls to FireSocketError.
Attachment #8345454 -
Flags: feedback?(tzimmermann) → feedback+
Comment 4•10 years ago
|
||
OMG, sorry for the awful grammar and orthography. :(
Assignee | ||
Comment 5•10 years ago
|
||
Implemented suggestions from feedback.
Attachment #8345454 -
Attachment is obsolete: true
Attachment #8347862 -
Flags: review?(tzimmermann)
Comment 6•10 years ago
|
||
Comment on attachment 8347862 [details] [diff] [review] Patch 1 (v2) - Make sure all connect errors fire to main thread and clean up Review of attachment 8347862 [details] [diff] [review]: ----------------------------------------------------------------- Looks good, thanks!
Attachment #8347862 -
Flags: review?(tzimmermann) → review+
Assignee | ||
Comment 7•10 years ago
|
||
https://hg.mozilla.org/integration/b2g-inbound/rev/f3c055585a4f
Comment 8•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/f3c055585a4f
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → 1.3 C3/1.4 S3(31jan)
Comment 9•10 years ago
|
||
Backed out for blowing up B2G emulator tests. Please run this through Try before landing again. https://hg.mozilla.org/mozilla-central/rev/eb8c4ec52bc0 https://tbpl.mozilla.org/php/getParsedLog.php?id=33546421&tree=B2g-Inbound https://tbpl.mozilla.org/php/getParsedLog.php?id=33546743&tree=B2g-Inbound
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: 1.3 C3/1.4 S3(31jan) → ---
Comment 10•10 years ago
|
||
Comment on attachment 8347862 [details] [diff] [review] Patch 1 (v2) - Make sure all connect errors fire to main thread and clean up Review of attachment 8347862 [details] [diff] [review]: ----------------------------------------------------------------- Hi Kyle, I've tried to trace the root cause of Mochitest failure with this patch. https://tbpl.mozilla.org/php/getParsedLog.php?id=33546421&tree=B2g-Inbound I believe the reason of the testing timeout is that B2G emulator failed to initialize. The emulator can't be initiated correctly by "./run-emulator.sh" or Mochitest since it would file a socket error at line 527 (as I comment below). The Mochitest would fail if the emulator is not ready. I'm not familiar with B2G emulator boot process, it seems like we can't create socket file descriptor during B2G booting. Best regards ::: ipc/unixsocket/UnixSocket.cpp @@ +523,5 @@ > if (mFd.get() < 0) { > mFd = mConnector->Create(); > if (mFd.get() < 0) { > + NS_WARNING("Cannot create socket fd!"); > + FireSocketError(); It would file a socket error during the emulator boot process.
Comment 11•10 years ago
|
||
Hi Jamin, Thanks for leaving a comment. Could you provide us with the test case that triggers the bug?
Flags: needinfo?(jaliu)
Comment 12•10 years ago
|
||
Hi Thomas, Step 1. ./config.sh emulator;./build.sh Step 2. cd $B2G_DIR Step 3. ./mach mochitest-remote Step 3 would run every Mochitests on B2G emulator and take a lot of time to complete. (more than one hour) You can find the complete list of test set in gecko/objdir-gonk/_tests/testing/mochitest/tests/mochitest.ini In normal case, the the emulator would turn on and show the Mochitest UI to display the testing progress. However, the B2G emulator can't get to homescreen with change f3c055585a4f. You can use ./run-emulator.ph to trigger the same behavior too.
Flags: needinfo?(jaliu)
Comment 13•10 years ago
|
||
I did some debugging and found that the emulator does not support sockets of type EL2CAP. With some extra debugging output added, the logcat says I/Gecko ( 45): [45] WARNING: Could not open bluetooth socket! errno=94 'Socket type not supported' mType=4: file /home/mozilla/Projects/mozilla/src/mozilla-central/dom/bluetooth/bluez/BluetoothUnixSocketConnector.cpp, line 207 I/Gecko ( 45): [45] WARNING: Cannot create socket fd!: file /home/mozilla/Projects/mozilla/src/mozilla-central/ipc/unixsocket/UnixSocket.cpp, line 526 I/Gecko ( 45): [45] WARNING: NS_ENSURE_TRUE(gDBusConnection) failed: file /home/mozilla/Projects/mozilla/src/mozilla-central/ipc/dbus/DBusThread.cpp, line 320 I/GeckoBluetooth( 45): OnSocketConnectError: [server] I/GonkDBus( 45): ...bind(61) gave errno 98 I/Gecko ( 45): [45] WARNING: Could not open bluetooth socket! errno=94 'Socket type not supported' mType=4: file /home/mozilla/Projects/mozilla/src/mozilla-central/dom/bluetooth/bluez/BluetoothUnixSocketConnector.cpp, line 207 I/Gecko ( 45): [45] WARNING: Cannot create socket fd!: file /home/mozilla/Projects/mozilla/src/mozilla-central/ipc/unixsocket/UnixSocket.cpp, line 526 I/Gecko ( 45): [45] WARNING: Bluetooth has already been enabled/disabled beforeor the toggling is failed.: file /home/mozilla/Projects/mozilla/src/mozilla-central/dom/bluetooth/BluetoothService.cpp, line 551 I/Gecko ( 45): [45] WARNING: No observer registered for path /B2G/bluetooth/manager: file /home/mozilla/Projects/mozilla/src/mozilla-central/dom/bluetooth/BluetoothService.cpp, line 453 I/GeckoBluetooth( 45): OnSocketConnectError: [server] I/GonkDBus( 45): ...bind(62) gave errno 98 I/Gecko ( 45): [45] WARNING: Could not open bluetooth socket! errno=94 'Socket type not supported' mType=4: file /home/mozilla/Projects/mozilla/src/mozilla-central/dom/bluetooth/bluez/BluetoothUnixSocketConnector.cpp, line 207 I/Gecko ( 45): [45] WARNING: Cannot create socket fd!: file /home/mozilla/Projects/mozilla/src/mozilla-central/ipc/unixsocket/UnixSocket.cpp, line 526 I/GeckoBluetooth( 45): OnSocketConnectError: [server] I/GonkDBus( 45): ...bind(62) gave errno 98 I/Gecko ( 45): [45] WARNING: Could not open bluetooth socket! errno=94 'Socket type not supported' mType=4: file /home/mozilla/Projects/mozilla/src/mozilla-central/dom/bluetooth/bluez/BluetoothUnixSocketConnector.cpp, line 207 and so on.
Comment 14•10 years ago
|
||
According to the blog post at [1], eL2CAP support requires a kernel version of at least 2.6.36. But the ICS emulator runs 2.6.29, so there probably isn't a way to simply 'switch on' the protocol and be done with this problem. Let's rather fix the error handling in BT. [1] http://www.bluez.org/l2cap-extended-features-on-2-6-36/
Comment 15•10 years ago
|
||
BluetoothOppManager calls Listen in its error handler for failed connecting attempts. Calling Listen consequently fails with the same problem (i.e., EL2CAP not available) and triggers the same error handler. This is a q'n'd fix that works for me; no idea if this has side effects. ;)
Attachment #8366029 -
Flags: review?(echou)
Attachment #8366029 -
Flags: feedback?(kyle)
Assignee | ||
Comment 16•10 years ago
|
||
Comment on attachment 8366029 [details] [diff] [review] [01] Bug 843857: Don't try to listen on OPP socket if connecting failed I am soooo not qualified to even fb OOP stuff at this point so I'm just gonna let eric review this. :)
Attachment #8366029 -
Flags: feedback?(kyle)
Comment 17•10 years ago
|
||
Comment on attachment 8366029 [details] [diff] [review] [01] Bug 843857: Don't try to listen on OPP socket if connecting failed Review of attachment 8366029 [details] [diff] [review]: ----------------------------------------------------------------- Unfortunately the patch must not work. Imagine that the remote device BT is disabled. Then we won't listen to incoming connection anymore once the patch applies. :( Moreover, I'm not sure if listening to el2cap is the root cause since Listen() is called in BluetoothOppManager::Init() as well, which should be called once Bluetooth is up.
Attachment #8366029 -
Flags: review?(echou) → review-
Comment 18•10 years ago
|
||
Hi Eric > Unfortunately the patch must not work. Imagine that the remote device BT is > disabled. Then we won't listen to incoming connection anymore once the patch > applies. :( This confuses me. OnSocketConnectError will only be called during while the connection gets established. If the remote device disconnects, you should see a OnSocketDisconnect instead. > Moreover, I'm not sure if listening to el2cap is the root cause since > Listen() is called in BluetoothOppManager::Init() as well, which should be > called once Bluetooth is up. It is. Init calls Listen. Listen fails because of EL2CAP and calls OnSocketConnectError. OnSocketConnectError calls Listen, which again fails because of EL2CAP, and so on.
Comment 19•10 years ago
|
||
FYI: This patch adds some debugging noise on top of 'Patch 1'. I used this yesterday, but don't have the logcat any longer. :(
Comment 20•10 years ago
|
||
(In reply to Thomas Zimmermann [:tzimmermann] [:tdz] from comment #18) > Hi Eric > > > Unfortunately the patch must not work. Imagine that the remote device BT is > > disabled. Then we won't listen to incoming connection anymore once the patch > > applies. :( > > This confuses me. OnSocketConnectError will only be called during while the > connection gets established. If the remote device disconnects, you should > see a OnSocketDisconnect instead. > The scenario I mentioned is not remote device disconnects actively. Imagine that you have a Bluetooth headset, which has been already paired with your FxOS device. Keep your headset off, and then turn on the FxOS device and try to use it to connect with the remote device (which means mSocket->Connect() is called). OnSocketConnectError should be fired instead of OnSocketDisconnect, IIRC. > > Moreover, I'm not sure if listening to el2cap is the root cause since > > Listen() is called in BluetoothOppManager::Init() as well, which should be > > called once Bluetooth is up. > > It is. Init calls Listen. Listen fails because of EL2CAP and calls > OnSocketConnectError. OnSocketConnectError calls Listen, which again fails > because of EL2CAP, and so on. Got it. Now I understand the problem. ;)
Comment 21•10 years ago
|
||
What about this approach? If we run as BT client (i.e., !mIsServer) and connect fails, we'll switch over to server mode and start listening. If this fails in server mode, we give up.
Attachment #8366029 -
Attachment is obsolete: true
Attachment #8366627 -
Flags: review?(echou)
Comment 22•10 years ago
|
||
(In reply to Kyle Machulis [:kmachulis] [:qdot] from comment #5) > Created attachment 8347862 [details] [diff] [review] > Patch 1 (v2) - Make sure all connect errors fire to main thread and clean up > > Implemented suggestions from feedback. BTW, there is a semicolon missing in this patch.
Assignee | ||
Comment 23•10 years ago
|
||
Uploading new patch version with semicolon fixed (the backed out version that I landed was the fixed version but forgot to update bug)
Attachment #8347862 -
Attachment is obsolete: true
Comment 24•10 years ago
|
||
Comment on attachment 8366627 [details] [diff] [review] [01] Bug 843857: Don't listen if connect fails on BT server Review of attachment 8366627 [details] [diff] [review]: ----------------------------------------------------------------- Yes, we can do this. :)
Attachment #8366627 -
Flags: review?(echou) → review+
Updated•10 years ago
|
Target Milestone: --- → 1.4 S1 (14feb)
Updated•10 years ago
|
Whiteboard: [systemsfe]
Comment 25•10 years ago
|
||
V3 is missing a header. This patch (v4) is v2 + the missing semicolon; still attributed to Kyle.
Attachment #8366775 -
Attachment is obsolete: true
Attachment #8370706 -
Flags: review+
Comment 26•10 years ago
|
||
Thanks everyone. I'll land the patches as soon as b2g-inbound is open again.
Comment 27•10 years ago
|
||
https://hg.mozilla.org/integration/b2g-inbound/rev/3d8caa7ec5eb https://hg.mozilla.org/integration/b2g-inbound/rev/d5907ae0d95a https://tbpl.mozilla.org/?tree=B2g-Inbound&rev=d5907ae0d95a
Assignee | ||
Comment 28•10 years ago
|
||
(In reply to Thomas Zimmermann [:tzimmermann] [:tdz] from comment #25) > Created attachment 8370706 [details] [diff] [review] > [02] Make sure all connect errors fire to main thread and clean up (v4) > > V3 is missing a header. This patch (v4) is v2 + the missing semicolon; still > attributed to Kyle. Oops. Thanks for fixing.
Comment 29•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/3d8caa7ec5eb https://hg.mozilla.org/mozilla-central/rev/d5907ae0d95a
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Updated•10 years ago
|
Whiteboard: [systemsfe] → [systemsfe][p=8]
You need to log in
before you can comment on or make changes to this bug.
Description
•