Closed Bug 1052304 Opened 10 years ago Closed 10 years ago

[NFC][Bluetooth] Transfer file many times would hit a fail that will make sender could not BT transfer file anymore

Categories

(Firefox OS Graveyard :: Bluetooth, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(firefox33 wontfix, firefox34 fixed, firefox35 fixed, b2g-v2.0 wontfix, b2g-v2.1 verified, b2g-v2.2 verified)

VERIFIED FIXED
2.1 S6 (10oct)
Tracking Status
firefox33 --- wontfix
firefox34 --- fixed
firefox35 --- fixed
b2g-v2.0 --- wontfix
b2g-v2.1 --- verified
b2g-v2.2 --- verified

People

(Reporter: ashiue, Assigned: jaliu)

References

Details

(Whiteboard: [p=3])

Attachments

(2 files, 7 obsolete files)

3.89 KB, patch
Details | Diff | Splinter Review
3.92 KB, patch
Details | Diff | Splinter Review
Gaia      de28796a8956a48bb98ca67df6a33e0622d642d1
Gecko     https://hg.mozilla.org/releases/mozilla-b2g32_v2_0/rev/5256345f62bd
BuildID   20140810160202
Version   32.0

STR:
1. Enable NFC, disable BT on both phones
2. Share file via NFC many times (maybe over 30)

Expected result:
No matter encounter any fail transfer, BT transfer function would not be blocked.

Actual result:
Hit a fail transfer which will make sender could not BT transfer file anymore unless reboot device
QA Whiteboard: [COM=Bluetooth]
According to https://bugzilla.mozilla.org/show_bug.cgi?id=1043878#c12.
Assignee: nobody → shuang
See Also: → 1051652
Steal!
Assignee: shuang → echou
[Blocking Requested - why for this release]:
Nominate this to triage because it breaks the user experience of NFC sharing.
blocking-b2g: --- → 2.1?
Triage: blocking
Assignee: echou → jaliu
blocking-b2g: 2.1? → 2.1+
QA Whiteboard: [COM=Bluetooth] → [COM=NFC]
This was reported on 2.0 but nominated to block on 2.1. Adding qawanted for branch checks to verify all the affected branches.
Keywords: qawanted
This issue occurs on Flame KK 2.2, Flame KK 2.1, Flame KK 2.0, and the v180 Flame KK 2.0 base.  OpenC cannot use NFC and could not be checked.

Environmental Variables:
Device: Flame 2.2
BuildID: 20140923065343
Gaia: 37b8a812c642ca616bf9457cb9b71e45261cdfa8
Gecko: 9e193395b912
Version: 35.0a1 (2.2) 
Firmware Version: 
User Agent: Mozilla/5.0 (Mobile; rv:35.0) Gecko/35.0 Firefox/35.0

Environmental Variables:
Device: Flame 2.1
BuildID: 20140923102943
Gaia: b542080231bb9cdbdb282ab926965ad3f91f7460
Gecko: 34a66e70a67d
Version: 34.0a2 (2.1) 
Firmware Version: 
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0

Environmental Variables:
Device: Flame 2.0
BuildID: 20140923115841
Gaia: 6449cc35a8f0704d95acac374ba857bde4b86d6c
Gecko: 4edd7f40672b
Version: 32.0 (2.0) 
Firmware Version: 
User Agent: Mozilla/5.0 (Mobile; rv:32.0) Gecko/32.0 Firefox/32.0
QA Whiteboard: [COM=NFC] → [QAnalyst-Triage?][COM=NFC]
Flags: needinfo?(jmitchell)
Keywords: qawanted
QA Whiteboard: [QAnalyst-Triage?][COM=NFC] → [QAnalyst-Triage+][COM=NFC]
Flags: needinfo?(jmitchell)
Hi Jamin,
Just a reminder. This is 2.1+ so I assume it's prioritized in sprint6.
Mind to propose an ETA?
Flags: needinfo?(jaliu)
Flags: needinfo?(jaliu)
Target Milestone: --- → 2.1 S6 (10oct)
I'd reproduced this bug few times on flame-kk 2.1.
It usually occurs when user try to send photos many times via NFC in a short period of time.
When two file transfer processes are overlaping, the status of BluetoothOppManager might be messed up.

Under the general conditions, mSocket would be nullptr and mServerSocket is connected when BluetoothOppManager::SendFile() is triggered by NFC. However, the value of mSocket and mServerSocket are abnormal when I hit the bug.
I suspect that some of the member variables could be read and modifred by different file transfer processes with incorrect order, such like 'mIsServer', 'mSocket', 'mServerSocket'.
P.S. We use 'PostDelayedTask()' to delay mSocket->CloseSocket() and use 'setTimeout()' to delay BluetoothAdapter.sendFile(). Thus, the order of code execution is not very intuitive.
 
If file transfer process encounter some kind of error and can't finish the entire process, "mSocket->CloseSocket()" might not be called and remains connected.
The incorrect status would affect next file transfer process and it can't be recovered until reboot.

I'll try to avoid this situation by revising the usage of BT API in Gaia or provide an error recovery mechanism in Gecko.
Regards
Any update for this bug? It is a 2.1+. I wonder if it can be fixed before 10/10. Thanks.
Flags: needinfo?(jaliu)
The root cause of this bug is a unfinished file transfer that was interrupted by previous file transfer. Gallery app and Blutooth API allow user to transfer multiple photos and put them into a queue.

In practice, user can send multi-photos by selecting multiple photos and send them at once, or by clicking |Share -> Bluetooth -> Paired device| multiple times in a short period of time. Both actions will use one Bluetooth socket to send multiple files.

However, if user click |Share -> Bluetooth -> Paired device| slowly, the first OPP procedure would close the Bluetooth socket first and the second OPP procedure would open a socket and close it again, it depends on the queue is empty or not.

If a user add a OPP task to a long queue, the task would use same socket connection of previous task.
If a user add a OPP task to an empty queue, the task would create a new socket connection.
However, if a user add a OPP task to a queue which contains only one task and previous task finish right after we put this OPP task into queue, the procedure of two task would overlap and mess up the status of OPP manager.

The really problem is that it can't be recovered from this status.
I'll make a patch to fix this incorrect status of OPP manager. 

P.S. Consider to physical conditions and Bluetooth stack behavior, we can't guarantee every file transfer is successful. But we have to guarantee it wouldn't break the functionally of next usage.
Flags: needinfo?(jaliu)
Comment on attachment 8500465 [details] [diff] [review]
(for 2.1) Avoid to add Bluetooth OPP task to file queue when socket is going to be disconnected. (v0)

Review of attachment 8500465 [details] [diff] [review]:
-----------------------------------------------------------------

f- since this change impacts normal ongoing transfer as well. How do you tell whether the transfer is ongoing normally or not?

::: dom/bluetooth/bluedroid/BluetoothOppManager.cpp
@@ +362,5 @@
> +  // mSocket isn't nullptr and the role is not a server. It means previous file
> +  // transfer hasn't finished.
> +  if (mSocket && !mIsServer) {
> +      bool isFileSending = mCurrentBlobIndex != -1;
> +      bool noPendingFile = mBatches.Length() && ((mBatches[0].mBlobs.Length() - mCurrentBlobIndex) > 1);

I think the condition is hasPendingFile rather than no.

@@ +364,5 @@
> +  if (mSocket && !mIsServer) {
> +      bool isFileSending = mCurrentBlobIndex != -1;
> +      bool noPendingFile = mBatches.Length() && ((mBatches[0].mBlobs.Length() - mCurrentBlobIndex) > 1);
> +      if (isFileSending && noPendingFile) {
> +        BT_WARNING("[OPP] Socket is going to be disconnected, a separate file

nit: remove 'a'
Attachment #8500465 - Flags: feedback?(btian) → feedback-
Status: NEW → ASSIGNED
QA Whiteboard: [QAnalyst-Triage+][COM=NFC] → [QAnalyst-Triage+][COM=Bluetooth]
- Support the scenarios that multiple receivers involve.
Attachment #8500465 - Attachment is obsolete: true
Attachment #8500881 - Flags: feedback?(btian)
(In reply to Ben Tian [:btian] from comment #12)
> Comment on attachment 8500465 [details] [diff] [review]
> (for 2.1) Avoid to add Bluetooth OPP task to file queue when socket is going
> to be disconnected. (v0)
> 
> Review of attachment 8500465 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: dom/bluetooth/bluedroid/BluetoothOppManager.cpp
> @@ +362,5 @@
> > +  // mSocket isn't nullptr and the role is not a server. It means previous file
> > +  // transfer hasn't finished.
> > +  if (mSocket && !mIsServer) {
> > +      bool isFileSending = mCurrentBlobIndex != -1;
> > +      bool noPendingFile = mBatches.Length() && ((mBatches[0].mBlobs.Length() - mCurrentBlobIndex) > 1);
> 
> I think the condition is hasPendingFile rather than no.
Yes, you're right.
It should be |hasPendingFile| and use |!hasPendingFile| as condition of |if| statement.
Thank you.

> f- since this change impacts normal ongoing transfer as well. How do you
> tell whether the transfer is ongoing normally or not?

If user send multiple photos by selecting multiple photos and send them at once, the mCurrentBlobIndex should be default value when the second file call adapter.sendFile() and it wouldn't be affected by this patch.

If user sends separate files in a short period of time *and* one of task hit the condition in #Attachment #8500465 [details] [diff], it's reasonable to make that task invalid, even it triggered by gallery app without NFC.

When bluetooth socket is going to be disconnected, the Blob queue will be discarded by DiscardBlobsToSend() in OnSocketDisconnect() anyway. It could reduce the risk of messing up OPP manager if we don't put the invalid task into queue at the first place.

For example, patch Attachment #8500881 [details] [diff] don't affect these user scenarios even the transfer is ongoing when new request is sending.
1. Send mutiple photos via Bluetooth sharing
2. Send a photo via NFC sharing and repeat this action multiple times
3. Send mutiple photos via Bluetooth sharing and do it again immediately
Comment on attachment 8500881 [details] [diff] [review]
(for 2.1) Avoid to add Bluetooth OPP task to file queue when socket is going to be disconnected. (v1)

Review of attachment 8500881 [details] [diff] [review]:
-----------------------------------------------------------------

I'm still concerned that the patch may jeopardize normal file transfers since it handles only the listed scenarios rather than fixing a defect of OPP manager/socket. The real root cause is still unclear to me. Per offline discussion, please investigate SocketMessageWatcher for more clue for this problem. Thanks.

(In reply to Jamin Liu [:jaliu] from comment #14)
> For example, patch Attachment #8500881 [details] [diff] [diff] doesn't affect these
> user scenarios even the transfer is ongoing when new request is sending.
> 1. Send multiple photos via Bluetooth sharing
> 2. Send a photo via NFC sharing and repeat this action multiple times
> 3. Send multiple photos via Bluetooth sharing and do it again immediately
Attachment #8500881 - Flags: feedback?(btian)
- removed a typo
Attachment #8501601 - Attachment is obsolete: true
Comment on attachment 8501614 [details] [diff] [review]
(for 2.1) Notify OPP manager when Bluetooth socket message error occurs. (v1)

Review of attachment 8501614 [details] [diff] [review]:
-----------------------------------------------------------------

Please see my comment below.

::: dom/bluetooth/BluetoothService.cpp
@@ +439,5 @@
>    profile = BluetoothOppManager::Get();
>    NS_ENSURE_TRUE(profile, NS_ERROR_FAILURE);
>    if (profile->IsConnected()) {
>      profile->Disconnect(nullptr);
> +  } else {

I suggest to remove this Reset() as v2.0 behavior.

::: dom/bluetooth/bluedroid/BluetoothSocket.cpp
@@ +365,5 @@
>        return;
>      }
>  
> +    if (aConnectionStatus != 0) {
> +      if (!mImpl->IsShutdownOnMainThread()) {

Remove this condition since it's already checked.

@@ +366,5 @@
>      }
>  
> +    if (aConnectionStatus != 0) {
> +      if (!mImpl->IsShutdownOnMainThread()) {
> +        mImpl->mConsumer->NotifyError();

return here.

@@ +378,5 @@
>    void OnError(BluetoothStatus aStatus) MOZ_OVERRIDE
>    {
>      MOZ_ASSERT(NS_IsMainThread());
>      BT_LOGR("BluetoothSocketInterface::Accept failed: %d", (int)aStatus);
> +    if (!mImpl->IsShutdownOnMainThread()) {

nit: a newline before. Also leave comment to explain behavior here.

@@ +509,5 @@
>                 int aConnectionStatus) MOZ_OVERRIDE
>    {
>      MOZ_ASSERT(NS_IsMainThread());
>  
> +    if (aConnectionStatus !=0) {

I suggest to revise this function as AcceptResultHandler::Accept().

nit: space after !=

@@ +526,5 @@
>    void OnError(BluetoothStatus aStatus) MOZ_OVERRIDE
>    {
>      MOZ_ASSERT(NS_IsMainThread());
>      BT_WARNING("Connect failed: %d", (int)aStatus);
> +    if (!mImpl->IsShutdownOnMainThread()) {

Ditto.
Attachment #8501614 - Flags: review?(btian)
Attachment #8501614 - Attachment is obsolete: true
Attachment #8501664 - Flags: review?(btian)
(In reply to Ben Tian [:btian] from comment #19)
> Comment on attachment 8501614 [details] [diff] [review]
> (for 2.1) Notify OPP manager when Bluetooth socket message error occurs. (v1)
> 
> Review of attachment 8501614 [details] [diff] [review]:
> -----------------------------------------------------------------
Thank you for reviewing the patch.
I've uploaded a new patch based on you comment. :)
Comment on attachment 8500881 [details] [diff] [review]
(for 2.1) Avoid to add Bluetooth OPP task to file queue when socket is going to be disconnected. (v1)

When OppManager.sendFile() is called, it check mSocket and mServerSocker to 
determine whether to use current socket connection or create a new socket.
If Bluetooth connection socket encounters an error and OppManager doesn't aware that, mSocket would be treated as a valid socket and causes incorrect status which I described at #comment 8.
attachment #8500881 [details] [diff] [review] try to avoid to encounter socket connection error by discarding Bluetooth OPP task when socket is going to be disconnected. 
attachment #8501664 [details] [diff] [review] provides a way to notify OPP manager when Bluetooth socket message error occurs for error handling.

I believe we only need one of them to resolve this bug.
Attachment #8500881 - Attachment is obsolete: true
Comment on attachment 8501664 [details] [diff] [review]
(for 2.1) Notify OPP manager when Bluetooth socket message error occurs. (v2)

Review of attachment 8501664 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with comment addressed. Please also prepare m-c patch and request for aurora approval. Thanks.

::: dom/bluetooth/bluedroid/BluetoothSocket.cpp
@@ +379,5 @@
>      MOZ_ASSERT(NS_IsMainThread());
>      BT_LOGR("BluetoothSocketInterface::Accept failed: %d", (int)aStatus);
> +
> +    if (!mImpl->IsShutdownOnMainThread()) {
> +      // This notification would trigger BluetoothOppManager.OnSocketDisconnect

I suggest to write as following:

// Instead of NotifyError(), call NotifyDisconnect() to trigger BluetoothOppManager::OnSocketDisconnect() as DroidSocketImpl::OnFileCanReadWithoutBlocking() in Firefox OS 2.0 in order to keep the same behavior and reduce regression risk.

@@ +514,5 @@
>    {
>      MOZ_ASSERT(NS_IsMainThread());
>  
> +    if (mImpl->IsShutdownOnMainThread()) {
> +      BT_LOGD("mConsumer is null, aborting receive!");

nit: aborting 'send'

@@ +534,5 @@
>      MOZ_ASSERT(NS_IsMainThread());
>      BT_WARNING("Connect failed: %d", (int)aStatus);
> +
> +    if (!mImpl->IsShutdownOnMainThread()) {
> +      // This notification would trigger BluetoothOppManager.OnSocketDisconnect

Ditto.
Attachment #8501664 - Flags: review?(btian) → review+
Thank Ben for reviewing the patch.
Attachment #8501664 - Attachment is obsolete: true
Comment on attachment 8502197 [details] [diff] [review]
(for 2.1) Notify OPP manager when Bluetooth socket message error occurs. (v3), r=btian

Approval Request Comment
[Feature/regressing bug #]:
Bug 894320 - [B2G][NFC][User Story]: Support sharing of Image/Video content

[User impact if declined]:
Transfer file many times would hit a fail that will make sender could not perform BT transfer file anymore. (including NFC file sharing and Blutooth file sharing)

[Describe test coverage new/current, TBPL]:
NFC file sharing use Bluetooth as transfer channel.
There are few Bluetooth and NFC marionette tests on TBPL. However, tinder box currently runs test on ICS-emulator. Since the Bluetooth use different back-end stack on KK, the testes on TBPL can't really cover this fix.

Currently, the fix can only be verified by manual test.

The correctness of Bluetooth file transfer can be covered by scheduled Bluetooth PTS test.
Bug 1064739 - [BlueDroid][PTS][Meta] Bluetooth PTS test (v2.1 in aurora)

However, it don't include the NFC part.

[Risks and why]: 
It modified Bluetooth socket which is used by Bluetooth file transfer, including photo sharing in Gallery app. (with or without NFC) 
It's related to few user stories.

[String/UUID change made/needed]: No
Attachment #8502197 - Flags: approval-mozilla-aurora?
(In reply to Jamin Liu [:jaliu] from comment #27)
> [Describe test coverage new/current, TBPL]:
> NFC file sharing use Bluetooth as transfer channel.
> There are few Bluetooth and NFC marionette tests on TBPL. However, tinder
> box currently runs test on ICS-emulator. Since the Bluetooth use different
> back-end stack on KK, the testes on TBPL can't really cover this fix.
> 
> Currently, the fix can only be verified by manual test.

I've tested this patch on flame-kk with latest gecko/gaia v2.1.

Here is the verification process.
1) Reproduce the symptom of this bug without this patch on v2.1 branch.
2) Follow the STR at #Description and make sure BT file transfer is functional even after it encounters a connection failure.
3) Test these user stories and make sure there is no regression.
  1. Send a photo via Bluetooth sharing
  2. Send mutiple photos via Bluetooth sharing
  3. Send a photo via NFC sharing
  4. Send mutiple photos via Bluetooth sharing and do it again immediately
  5. Receive a photo via Bluetooth sharing
  6. Receive mutiple photos via Bluetooth sharing
  7. Receive a photo via NFC sharing.
This bug was marked as "status-b2g-v2.0: affected", however, I think the symptom and root cause on 2.0 are different.

From my observe, NFC sharing can't work on flame-kk on 2.0.
If user try to send a photo via NFC, a notification "File count not be sent" would pop up immediately.
A error message "sendNDEF(hr) failed" can be found by using adb logcat to indicate the request nfcPeer.sendNDEF(hr) failed. Since |sendNDEF| failed, the process stopped before it call Bluetooth API.

Please refer to Bug 1080382.
In my experiment, it can be fixed by replacing Gonk layer.
To be specific, if I flash v184 flame-kk image and replace gecko/gaia 2.0 by PVT shallow flash, the NFC sharing works fine.
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/996f6e82d206
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Flags: needinfo?(bbajaj)
Comment on attachment 8502197 [details] [diff] [review]
(for 2.1) Notify OPP manager when Bluetooth socket message error occurs. (v3), r=btian

Requesting QA verification around Bletooth/NFC and this bug once this is landed on 2.1
Attachment #8502197 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Keywords: verifyme
Attachment #8501607 - Attachment is obsolete: true
Verified on
[2.1]
Gaia-Rev        f5d4ff60ffed8961f7d0380ada9d0facfdfd56b1
Gecko-Rev       https://hg.mozilla.org/releases/mozilla-aurora/rev/75cd3a8fa031
Build-ID        20141012160202
Version         34.0a2

[2.2]
Gaia-Rev        3b81896f04a02697e615fa5390086bd5ecfed84f
Gecko-Rev       https://hg.mozilla.org/mozilla-central/rev/199fb29c3467
Build-ID        20141012160201
Version         35.0a1
Status: RESOLVED → VERIFIED
set 2.0? to see if it needs to be uplifted to 2.0
blocking-b2g: 2.1+ → 2.0?
no partner is requested this on 2.0; suggesting to denominate it for now...
blocking-b2g: 2.0? → ---
(In reply to Mike Lien[:mlien] from comment #36)
> set 2.0? to see if it needs to be uplifted to 2.0

I think the same symptom of this bug can't be reproduced on Firefox 2.0.
Please refer to comment 29.
refer to comment 29 and comment 37, mark v2.0 status as wontfix
You need to log in before you can comment on or make changes to this bug.