Closed Bug 1191715 Opened 9 years ago Closed 9 years ago

gecko crash during nfc/bluetooth transferring

Categories

(Firefox OS Graveyard :: Bluetooth, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(blocking-b2g:2.2r+, firefox43 fixed, b2g-v2.2r fixed, b2g-master verified)

VERIFIED FIXED
FxOS-S7 (18Sep)
blocking-b2g 2.2r+
Tracking Status
firefox43 --- fixed
b2g-v2.2r --- fixed
b2g-master --- verified

People

(Reporter: gasolin, Assigned: ben.tian)

References

(Blocks 1 open bug)

Details

(Whiteboard: [ETA:9/7])

Attachments

(4 files, 1 obsolete file)

during testing NFC on 8/5 flame pvt img,

Sending file When NFC on and BT on or off, sometimes I got logs (monitored via `adb logcat | grep "Nfc"` command):

D/nfcd    ( 3145): void NfcIpcSocket::Loop(const char*, bool):  116 of bytes to be sent... data=0x0 ret=0
D/NfcNci  ( 3145): void RouteDataSet::DeleteDatabase(): default db size=0; sec elem db size=0

After these logs, the device will turn off the screen and auto reboot. Is it a known issue?


Here is the more detailed log:

I/BrcmNfcNfa( 3145): LLCP_SendData () Local SAP:0x20, Remote SAP:0x10
D/NfcNci  ( 3145): static void PeerToPeer::NfaClientCallback(tNFA_P2P_EVT, tNFA_P2P_EVT_DATA*): enter; event=7
D/NfcNci  ( 3145): static void PeerToPeer::NfaClientCallback(tNFA_P2P_EVT, tNFA_P2P_EVT_DATA*): NFA_P2P_DATA_EVT; h=0x581; remote sap=0x10
I/BrcmNfcNfa( 3145): NFA_P2pReadData (): handle:0x581
I/BrcmNfcNfa( 3145): LLCP_ReadDataLinkData () Local SAP:0x20, Remote SAP:0x10
D/NfcNci  ( 3145): bool PeerToPeer::Receive(unsigned int, uint8_t*, uint16_t, uint16_t&): exit; nfa h: 0x581  ok: 1  actual len: 60
D/nfcd    ( 3145): void MessageHandler::ProcessResponse(NfcResponseType, NfcErrorCode, void*): enter response=2, error=0
D/nfcd    ( 3145): void NfcIpcSocket::WriteToOutgoingQueue(uint8_t*, size_t): enter, data=0xb76e8ec8, dataLen=16
D/nfcd    ( 3145): void NfcIpcSocket::WriteToOutgoingQueue(uint8_t*, size_t): Writing 16 bytes to gecko
D/nfcd    ( 3145): void* NfcService::EventLoop(): NFCService msg=14
D/nfcd    ( 3145): void MessageHandler::ProcessNotification(NfcNotificationType, void*): processNotificaton notification=4
D/nfcd    ( 3145): void NfcIpcSocket::WriteToOutgoingQueue(uint8_t*, size_t): enter, data=0xb76e8eb8, dataLen=112
D/nfcd    ( 3145): void NfcIpcSocket::WriteToOutgoingQueue(uint8_t*, size_t): Writing 112 bytes to gecko
D/nfcd    ( 3145): void NfcIpcSocket::Loop(const char*, bool):  116 of bytes to be sent... data=0x0 ret=0
D/NfcNci  ( 3145): void RouteDataSet::DeleteDatabase(): default db size=0; sec elem db size=0
[Blocking Requested - why for this release]: crash during nfc transferring
blocking-b2g: --- → 2.5?
Attached file transfer_crash_log.txt
full adb logcat log from pairing to crash during transferring
QA Whiteboard: [COM=NFC]
From the log the crash did not happen in nfcd but in gecko. you always see the message[1] because when gecko crash, nfcd will loss connection with gecko and close it self.

And i saw following error message but i am not sure if it is related to this bug
I/GeckoBluetooth(  205): OnError: BluetoothSocketInterface::Accept failed: 5

need Arno or BT team's help to check this

[1]
D/nfcd    ( 3145): void NfcIpcSocket::Loop(const char*, bool):  116 of bytes to be sent... data=0x0 ret=0
D/NfcNci  ( 3145): void RouteDataSet::DeleteDatabase(): default db size=0; sec elem db size=0
No longer blocks: b2g-nfcd
@ben could you find someone to check this?
Flags: needinfo?(btian)
(In reply to Dimi Lee[:dimi][:dlee] from comment #3)
> And i saw following error message but i am not sure if it is related to this
> bug
> I/GeckoBluetooth(  205): OnError: BluetoothSocketInterface::Accept failed: 5

The log simply means the listening socket is closed for incoming transfer. You can see the log every time you sends a file to another device, since we have to close the listening socket and create another socket for outgoing transfer.
Flags: needinfo?(btian)
Is it a regression? What about the repro rate?
I would say this should be a blocker given it's a basic function break.
blocking-b2g: 2.5? → 2.5+
seems like a dup of bug 1182644?
Flags: needinfo?(btian)
I don't know. Do they share similar call stack trace?
Flags: needinfo?(btian)
Ben, in this test case, do you know why BT connection is terminated? Is it a normal disconnection procedure?
Flags: needinfo?(btian)
Ken, it's normal as I mentioned in comment 5.
Flags: needinfo?(btian)
Hi Arno, we need your suggestions, thanks.
Flags: needinfo?(arno)
(In reply to Ken Chang[:ken] from comment #11)
> Hi Arno, we need your suggestions, thanks.

From the comments so far it seems (1) nfcd does not crash, (2) termination of BT connection is normal and (3) Gecko seems to crash. I would suggest to enable full debugging of the NFC/BT stacks to generate better logs and see where things go south.
Flags: needinfo?(arno)
I just tried a flame kk build on master but it looks okay.
ni? for Alison to check if this still exists, or could be related to Bug 1198968
Flags: needinfo?(ashiue)
Summary: nfcd crash during nfc/bluetooth transferring → gecko crash during nfc/bluetooth transferring
Attached file nfc_crash_0831.log
I tried with following build, and encountered twice device reboot issue. Please refer the attached log.

Build ID               20150831150202
Gaia Revision          c80e8ff25425b007181fd6e3de0500a0358fab37
Gaia Date              2015-08-31 16:35:09
Gecko Revision         https://hg.mozilla.org/mozilla-central/rev/cafb1c90f794a73100a8f0afb9fe3301df0f2bde
Gecko Version          43.0a1
Device Name            flame
Firmware(Release)      4.4.2
Firmware(Incremental)  eng.cltbld.20150831.183117
Firmware Date          Mon Aug 31 18:31:30 EDT 2015
Bootloader             L1TC000118D0

However, transfer files between 2 master builds is easily fail, I would fire another bug to track this issue.
Flags: needinfo?(ashiue)
I tried in my own debug build and see the assertion

08-31 23:28:20.887: A/MOZ_Assert(21623): Assertion failure: !IsConnected(), at ../../../gecko/dom/bluetooth/bluedroid/BluetoothOppManager.cpp:383

This requires lots of times to reproduce.
Component: NFC → Bluetooth
Fred & Alison, do you use debug build to reproduce in comment 0 and comment 14? I'd like to confirm whether we are looking at the same problem since the assertion is only hit on debug build.

(In reply to Yoshi Huang[:allstars.chh](OOO 9.4 ~ 9.20) from comment #15)
> I tried in my own debug build and see the assertion
> 
> 08-31 23:28:20.887: A/MOZ_Assert(21623): Assertion failure: !IsConnected(),
> at ../../../gecko/dom/bluetooth/bluedroid/BluetoothOppManager.cpp:383
> 
> This requires lots of times to reproduce.
Flags: needinfo?(gasolin)
Flags: needinfo?(ashiue)
I full flash the pvt build (https://goo.gl/k71BHj).
Flags: needinfo?(ashiue)
Call stack. It seems there's re-entry problem.

==
#0  mozilla::dom::bluetooth::BluetoothOppManager::AfterOppDisconnected (this=this@entry=0xadb19ec0) at ../../../../gecko/central/dom/bluetooth/bluedroid/BluetoothOppManager.cpp:608
#1  0xb44c5606 in OnSocketDisconnect (this=0xadb19ec0, aSocket=<optimized out>) at ../../../../gecko/central/dom/bluetooth/bluedroid/BluetoothOppManager.cpp:1568
#2  mozilla::dom::bluetooth::BluetoothOppManager::OnSocketDisconnect (this=0xadb19ec0, aSocket=<optimized out>) at ../../../../gecko/central/dom/bluetooth/bluedroid/BluetoothOppManager.cpp:1537
#3  0xb44c9dcc in mozilla::dom::bluetooth::BluetoothSocket::OnDisconnect (this=<optimized out>) at ../../../../gecko/central/dom/bluetooth/bluedroid/BluetoothSocket.cpp:821
#4  0xb4209d7e in mozilla::ipc::SocketBase::NotifyDisconnect (this=this@entry=0xaa97e100) at ../../../../gecko/central/ipc/unixsocket/SocketBase.cpp:233
#5  0xb44c9fd8 in mozilla::dom::bluetooth::BluetoothSocket::Close (this=0xaa97e100) at ../../../../gecko/central/dom/bluetooth/bluedroid/BluetoothSocket.cpp:796
#6  0xb4209958 in mozilla::ipc::SocketRequestClosingTask::Run (this=<optimized out>) at ../../../../gecko/central/ipc/unixsocket/SocketBase.cpp:376
#7  0xb40e3cb4 in MessageLoop::RunTask (this=0xb21d41a0, task=0xaa9a5348) at /home/bentian/WORKSPACE/projects/gecko/central/ipc/chromium/src/base/message_loop.cc:364
#8  0xb40e637e in MessageLoop::DeferOrRunPendingTask (this=<optimized out>, pending_task=...) at /home/bentian/WORKSPACE/projects/gecko/central/ipc/chromium/src/base/message_loop.cc:372
#9  0xb40e7fc4 in DoWork (this=<optimized out>) at /home/bentian/WORKSPACE/projects/gecko/central/ipc/chromium/src/base/message_loop.cc:459
#10 MessageLoop::DoWork (this=0xb21d41a0) at /home/bentian/WORKSPACE/projects/gecko/central/ipc/chromium/src/base/message_loop.cc:438
#11 0xb40f0730 in mozilla::ipc::DoWorkRunnable::Run (this=<optimized out>) at /home/bentian/WORKSPACE/projects/gecko/central/ipc/glue/MessagePump.cpp:220
#12 0xb3f94978 in ProcessNextEvent (aResult=0xbefd36b7, aMayWait=<optimized out>, this=0xb6a02550) at /home/bentian/WORKSPACE/projects/gecko/central/xpcom/threads/nsThread.cpp:874
#13 nsThread::ProcessNextEvent (this=0xb6a02550, aMayWait=<optimized out>, aResult=0xbefd36b7) at /home/bentian/WORKSPACE/projects/gecko/central/xpcom/threads/nsThread.cpp:762
#14 0xb3fa5dde in NS_ProcessNextEvent (aThread=<optimized out>, aMayWait=aMayWait@entry=true) at /home/bentian/WORKSPACE/projects/gecko/central/xpcom/glue/nsThreadUtils.cpp:277
#15 0xb3f964da in Shutdown (this=0xaa74c190) at /home/bentian/WORKSPACE/projects/gecko/central/xpcom/threads/nsThread.cpp:675
#16 nsThread::Shutdown (this=0xaa74c190) at /home/bentian/WORKSPACE/projects/gecko/central/xpcom/threads/nsThread.cpp:634
#17 0xb44c4160 in mozilla::dom::bluetooth::BluetoothOppManager::AfterOppDisconnected (this=0xadb19ec0) at ../../../../gecko/central/dom/bluetooth/bluedroid/BluetoothOppManager.cpp:609
#18 0xb44c4fe2 in mozilla::dom::bluetooth::BluetoothOppManager::ClientDataHandler (this=0xadb19ec0, aMessage=<optimized out>) at ../../../../gecko/central/dom/bluetooth/bluedroid/BluetoothOppManager.cpp:1095
#19 0xb44ca6ce in mozilla::dom::bluetooth::BluetoothSocket::ReceiveSocketData (this=<optimized out>, aBuffer=...) at ../../../../gecko/central/dom/bluetooth/bluedroid/BluetoothSocket.cpp:754
#20 0xb40e3cb4 in MessageLoop::RunTask (this=0xb21d41a0, task=0xace68680) at /home/bentian/WORKSPACE/projects/gecko/central/ipc/chromium/src/base/message_loop.cc:364
#21 0xb40e637e in MessageLoop::DeferOrRunPendingTask (this=<optimized out>, pending_task=...) at /home/bentian/WORKSPACE/projects/gecko/central/ipc/chromium/src/base/message_loop.cc:372
#22 0xb40e7fc4 in DoWork (this=<optimized out>) at /home/bentian/WORKSPACE/projects/gecko/central/ipc/chromium/src/base/message_loop.cc:459
#23 MessageLoop::DoWork (this=0xb21d41a0) at /home/bentian/WORKSPACE/projects/gecko/central/ipc/chromium/src/base/message_loop.cc:438
#24 0xb40f0730 in mozilla::ipc::DoWorkRunnable::Run (this=<optimized out>) at /home/bentian/WORKSPACE/projects/gecko/central/ipc/glue/MessagePump.cpp:220
#25 0xb3f94978 in ProcessNextEvent (aResult=0xbefd384f, aMayWait=<optimized out>, this=0xb6a02550) at /home/bentian/WORKSPACE/projects/gecko/central/xpcom/threads/nsThread.cpp:874
#26 nsThread::ProcessNextEvent (this=0xb6a02550, aMayWait=<optimized out>, aResult=0xbefd384f) at /home/bentian/WORKSPACE/projects/gecko/central/xpcom/threads/nsThread.cpp:762
#27 0xb3fa5dde in NS_ProcessNextEvent (aThread=<optimized out>, aMayWait=aMayWait@entry=false) at /home/bentian/WORKSPACE/projects/gecko/central/xpcom/glue/nsThreadUtils.cpp:277
#28 0xb40f2c60 in mozilla::ipc::MessagePump::Run (this=0xb6a55910, aDelegate=0xb21d41a0) at /home/bentian/WORKSPACE/projects/gecko/central/ipc/glue/MessagePump.cpp:95
#29 0xb40e3c40 in MessageLoop::RunInternal (this=this@entry=0xb21d41a0) at /home/bentian/WORKSPACE/projects/gecko/central/ipc/chromium/src/base/message_loop.cc:234
#30 0xb40e3cf4 in RunHandler (this=0xb21d41a0) at /home/bentian/WORKSPACE/projects/gecko/central/ipc/chromium/src/base/message_loop.cc:227
#31 MessageLoop::Run (this=0xb21d41a0) at /home/bentian/WORKSPACE/projects/gecko/central/ipc/chromium/src/base/message_loop.cc:201
#32 0xb4b0db6e in nsBaseAppShell::Run (this=0xb0e47640) at /home/bentian/WORKSPACE/projects/gecko/central/widget/nsBaseAppShell.cpp:156
#33 0xb4e69f0a in nsAppStartup::Run (this=0xb20996d0) at /home/bentian/WORKSPACE/projects/gecko/central/toolkit/components/startup/nsAppStartup.cpp:281
---Type <return> to continue, or q <return> to quit---
#34 0xb4e9000a in XREMain::XRE_mainRun (this=this@entry=0xbefd39c4) at ../../../../gecko/central/toolkit/xre/nsAppRunner.cpp:4292
#35 0xb4e90238 in XREMain::XRE_main (this=this@entry=0xbefd39c4, argc=argc@entry=1, argv=argv@entry=0xb6a2b190, aAppData=aAppData@entry=0xb6f88c70 <_ZL8sAppData>) at ../../../../gecko/central/toolkit/xre/nsAppRunner.cpp:4389
#36 0xb4e903aa in XRE_main (argc=1, argv=0xb6a2b190, aAppData=0xb6f88c70 <_ZL8sAppData>, aFlags=<optimized out>) at ../../../../gecko/central/toolkit/xre/nsAppRunner.cpp:4478
#37 0xb6f6a83c in do_main (argc=argc@entry=1, argv=argv@entry=0xb6a2b190) at ../../../../gecko/central/b2g/app/nsBrowserApp.cpp:167
#38 0xb6f6a94a in b2g_main (argc=argc@entry=1, argv=argv@entry=0xbefd4c84) at ../../../../gecko/central/b2g/app/nsBrowserApp.cpp:299
#39 0xb6f6a6cc in RunProcesses (aReservedFds=..., argv=0xbefd4c84, argc=1) at ../../../../gecko/central/b2g/app/B2GLoader.cpp:232
#40 main (argc=1, argv=0xbefd4c84) at ../../../../gecko/central/b2g/app/B2GLoader.cpp:297
The reentry problem results from weird nsThread shutdown behavior.

==
In bluetooth file transfer case, main thread shuts down |mReadFileThread| that reads file after disconnection [1]. Main thread then puts a shutdown event to |mReadFileThread| [2] and continues processing other event [3]. However before |mReadFileThread| is shutdown, main thread re-enters the same function and causes double free crash here [4].

[1] https://dxr.mozilla.org/mozilla-central/source/dom/bluetooth/bluedroid/BluetoothOppManager.cpp#608
[2] https://dxr.mozilla.org/mozilla-central/source/xpcom/threads/nsThread.cpp?case=true&from=nsThread.cpp#666
[3] https://dxr.mozilla.org/mozilla-central/source/xpcom/threads/nsThread.cpp?case=true&from=nsThread.cpp#675
[4] https://dxr.mozilla.org/mozilla-central/source/nsprpub/pr/src/pthreads/ptthread.c?from=ptthread.c#580

Program received signal SIGSEGV, Segmentation fault.
PR_JoinThread (thred=0x5a5a5a5a) at ../../../../../../gecko/central/nsprpub/pr/src/pthreads/ptthread.c:580
580	    if ((0xafafafaf == thred->state)
(gdb) bt
#0  PR_JoinThread (thred=0x5a5a5a5a) at ../../../../../../gecko/central/nsprpub/pr/src/pthreads/ptthread.c:580
#1  0xb3f964ea in Shutdown (this=0xa9682a20) at /home/bentian/WORKSPACE/projects/gecko/central/xpcom/threads/nsThread.cpp:680
#2  nsThread::Shutdown (this=0xa9682a20) at /home/bentian/WORKSPACE/projects/gecko/central/xpcom/threads/nsThread.cpp:634
jesup,

Can you suggest possible fix on comment 19 problem regarding nsThread shutdown? I saw you've modified related code [1] in bug 998880. Or anyone you know who can help on this?

[1] https://hg.mozilla.org/mozilla-central/diff/2e2566a604f1/xpcom/threads/nsThread.cpp
Flags: needinfo?(rjesup)
It seems |nsThread::Shutdown()| would be triggered under several error/tear-down conditions. Providing some facilities to deal with the double-shutdown issue from the fundamental thread class, or handling this double-shutdown issue inside it directly would be good.

Probably using a local variable or another member data to take over the the value of |nsThread::mThread| before waiting for the while-loop[1] could solve this crash issue?

[1] https://dxr.mozilla.org/mozilla-central/source/xpcom/threads/nsThread.cpp?case=true&from=nsThread.cpp&offset=0#674
(In reply to Ben Tian [:btian] from comment #20)
> Can you suggest possible fix on comment 19 problem regarding nsThread
> shutdown? I saw you've modified related code [1] in bug 998880. Or anyone
> you know who can help on this?

In general, don't call Shutdown() twice on the same thread.  If processing more events could lead to another Shutdown(mBlahThread), do something like this:
 nsCOMPtr<nsIThread> temp = mBlahThread.forget();
 Shutdown(temp);

That way events processed while in Shutdown() can't try to recurse.
Flags: needinfo?(rjesup)
jesup,

Do you think Bruce's suggestion feasible to go? I agree with him that fixing it inside thread class is better to workaround in everywhere using the class.

(In reply to Bruce Sun [:brsun] from comment #21)
> It seems |nsThread::Shutdown()| would be triggered under several
> error/tear-down conditions. Providing some facilities to deal with the
> double-shutdown issue from the fundamental thread class, or handling this
> double-shutdown issue inside it directly would be good.
> 
> Probably using a local variable or another member data to take over the the
> value of |nsThread::mThread| before waiting for the while-loop[1] could
> solve this crash issue?
> 
> [1]
> https://dxr.mozilla.org/mozilla-central/source/xpcom/threads/nsThread.
> cpp?case=true&from=nsThread.cpp&offset=0#674
Flags: needinfo?(gasolin) → needinfo?(rjesup)
Steal the bug.

jesup, can you feedback this patch per your comment 22 suggestion?

Changes:
- adopt jesup's suggestion in comment 22.
- move |!IsConnected()| assertion from |StartSendingNextFile| to |ProcessNextBatch| since connection may be established before for multiple file sending.
- fix typo
Assignee: shuang → btian
Attachment #8656990 - Flags: review?(shuang)
Attachment #8656990 - Flags: feedback?(rjesup)
Comment on attachment 8656990 [details] [diff] [review]
Patch 1 (v1): Store local pointer of thread to shutdwon to avoid reentry crash

Review of attachment 8656990 [details] [diff] [review]:
-----------------------------------------------------------------

Please add BlueZ part.
Add BlueZ part per reviewer's comment.
Attachment #8656990 - Attachment is obsolete: true
Attachment #8656990 - Flags: review?(shuang)
Attachment #8656990 - Flags: feedback?(rjesup)
Attachment #8657009 - Flags: review?(shuang)
Attachment #8657009 - Flags: feedback?(rjesup)
Attachment #8657009 - Flags: feedback?(rjesup) → feedback+
Forwarding the question about nsThread::Shutdown() reentrancy to XPCOM module owner.  If that were to be done, it would need to be done in a separate (XPCOM) bug.  I don't have a strong opinion on it, though it might be smart of MOZ_ASSERT/MOZ_CRASH if we don't handle re-entrant shutdown there.
Flags: needinfo?(rjesup) → needinfo?(nfroyd)
Ben,
Shall we uplift to v2.2R?
Flags: needinfo?(btian)
Actually there is already a check against nsThread::mThread at the beginning of nsThread::Shutdown(). Wouldn't it be better to steal mThread during the first entry of nsThread::Shutdown() so that it will be safe to reenter the function? Or at least we should abort on reentrance to not make it less obvious use-after-free crash.
(In reply to Ben Tian [:btian] from comment #19)
> The reentry problem results from weird nsThread shutdown behavior.
> 
> ==
> In bluetooth file transfer case, main thread shuts down |mReadFileThread|
> that reads file after disconnection [1]. Main thread then puts a shutdown
> event to |mReadFileThread| [2] and continues processing other event [3].
> However before |mReadFileThread| is shutdown, main thread re-enters the same
> function and causes double free crash here [4].
> 
> [1]
> https://dxr.mozilla.org/mozilla-central/source/dom/bluetooth/bluedroid/
> BluetoothOppManager.cpp#608
> [2]
> https://dxr.mozilla.org/mozilla-central/source/xpcom/threads/nsThread.
> cpp?case=true&from=nsThread.cpp#666
> [3]
> https://dxr.mozilla.org/mozilla-central/source/xpcom/threads/nsThread.
> cpp?case=true&from=nsThread.cpp#675
> [4]
> https://dxr.mozilla.org/mozilla-central/source/nsprpub/pr/src/pthreads/
> ptthread.c?from=ptthread.c#580
> 
> Program received signal SIGSEGV, Segmentation fault.
> PR_JoinThread (thred=0x5a5a5a5a) at
> ../../../../../../gecko/central/nsprpub/pr/src/pthreads/ptthread.c:580
> 580	    if ((0xafafafaf == thred->state)
> (gdb) bt
> #0  PR_JoinThread (thred=0x5a5a5a5a) at
> ../../../../../../gecko/central/nsprpub/pr/src/pthreads/ptthread.c:580
> #1  0xb3f964ea in Shutdown (this=0xa9682a20) at
> /home/bentian/WORKSPACE/projects/gecko/central/xpcom/threads/nsThread.cpp:680
> #2  nsThread::Shutdown (this=0xa9682a20) at
> /home/bentian/WORKSPACE/projects/gecko/central/xpcom/threads/nsThread.cpp:634

Please include more context in your stack next time.

Spinning event loops in Shutdown is obviously bad.  I think what you really want here is something like the patch in bug 1200922, which adds nsIThread::asyncShutdown.  I would be *really* curious why BluetoothOppManager::AfterOppDisconnected() is getting called twice...that seems like a bug in and of itself.  Can you explain that to me?

(In reply to Cervantes Yu [:cyu] [:cervantes] from comment #30)
> Actually there is already a check against nsThread::mThread at the beginning
> of nsThread::Shutdown(). Wouldn't it be better to steal mThread during the
> first entry of nsThread::Shutdown() so that it will be safe to reenter the
> function? Or at least we should abort on reentrance to not make it less
> obvious use-after-free crash.

I think things left in the event queue might try to touch mThread, so we can't null it out at the beginning of nsThread::Shutdown().  Trying to abort on re-entering nsThread::Shutdown() seems like a hard problem to me, since storing information in the nsThread object itself would be subject to the same kind of use-after-free that the above stack shows (assuming I am understanding things correctly).
Flags: needinfo?(nfroyd)
(In reply to Nathan Froyd [:froydnj] from comment #31)
> Please include more context in your stack next time.
> 
> Spinning event loops in Shutdown is obviously bad.  I think what you really
> want here is something like the patch in bug 1200922, which adds
> nsIThread::asyncShutdown.  I would be *really* curious why
> BluetoothOppManager::AfterOppDisconnected() is getting called twice...that
> seems like a bug in and of itself.  Can you explain that to me?

|BluetoothOppManager::AfterOppDisconnected| is called whenever bluetooth connection is lost, either the device actively disconnects or passively be notified of disconnection. With the check [1] we thought the function can be called multiple times since subsequent calls do nothing as |mReadFileThread| is already set as nullptr.

[1] https://dxr.mozilla.org/mozilla-central/source/dom/bluetooth/bluedroid/BluetoothOppManager.cpp#608
Flags: needinfo?(btian)
(In reply to Shawn Huang [:shawnjohnjr] from comment #29)
> Ben,
> Shall we uplift to v2.2R?

Yes since it's a potential crash bug. Set 2.2R affected.
Blocks: 1202535
(In reply to Randell Jesup [:jesup] from comment #27)
> Forwarding the question about nsThread::Shutdown() reentrancy to XPCOM
> module owner.  If that were to be done, it would need to be done in a
> separate (XPCOM) bug.  I don't have a strong opinion on it, though it might
> be smart of MOZ_ASSERT/MOZ_CRASH if we don't handle re-entrant shutdown
> there.

Opened bug 1202535 to track fix inside nsThread. I'll land fix in bluetooth first for this bug.
https://hg.mozilla.org/mozilla-central/rev/802ca06788d9
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → FxOS-S7 (18Sep)
(In reply to Ben Tian [:btian] from comment #33)
> (In reply to Shawn Huang [:shawnjohnjr] from comment #29)
> > Ben,
> > Shall we uplift to v2.2R?
> 
> Yes since it's a potential crash bug. Set 2.2R affected.

Then we'll need it in 2.2R.
blocking-b2g: 2.5+ → 2.2r+
Hi, this failed to apply to 2.2r :

renamed 1191715 -> opp-crash.patch
applying opp-crash.patch
patching file dom/bluetooth/bluedroid/BluetoothOppManager.cpp
Hunk #4 FAILED at 786
1 out of 4 hunks FAILED -- saving rejects to file dom/bluetooth/bluedroid/BluetoothOppManager.cpp.rej
patching file dom/bluetooth/bluez/BluetoothOppManager.cpp
Hunk #4 FAILED at 773
1 out of 4 hunks FAILED -- saving rejects to file dom/bluetooth/bluez/BluetoothOppManager.cpp.rej
patch failed, unable to continue (try -v)
patch failed, rejects left in working directory
errors during apply, please fix and refresh opp-crash.patch

could you take a look ? thanks!
Flags: needinfo?(btian)
Please apply the rebased 2.2r patch. Thanks.
Flags: needinfo?(btian)
This issue is verified fixed in the latest Flame and Spark 2.5 Master builds.
The device is able to transfer files successfully, and does not experience a crash or reboot once the transfer is complete.
Confirmed using STR from bug 1182644, which provided reliable reproduction of the issue on affected builds.

Environmental Variables:
Device: Aries 2.5
BuildID: 20150909215207
Gaia: 47459eead04385e22f967012b824f5abdddcfb7c
Gecko: dd2a1d737a64d9a3f23714ec5cc623ec8933b51f
Gonk: 2916e2368074b5383c80bf5a0fba3fc83ba310bd
Version: 43.0a1 (2.5)
Firmware Version: D5803_23.1.A.1.28_NCB.ftf
User Agent: Mozilla/5.0 (Mobile; rv:43.0) Gecko/43.0 Firefox/43.0

Environmental Variables:
Device: Flame 2.5
BuildID: 20150910030223
Gaia: 47459eead04385e22f967012b824f5abdddcfb7c
Gecko: dd2a1d737a64d9a3f23714ec5cc623ec8933b51f
Gonk: c4779d6da0f85894b1f78f0351b43f2949e8decd
Version: 43.0a1 (2.5)
Firmware Version: v18D
User Agent: Mozilla/5.0 (Mobile; rv:43.0) Gecko/43.0 Firefox/43.0
Status: RESOLVED → VERIFIED
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(jmercado)
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(jmercado)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: