Closed Bug 1011110 Opened 10 years ago Closed 10 years ago

Crash while toggling bluetooth and running other stability scripts

Categories

(Firefox OS Graveyard :: Bluetooth, defect)

ARM
Gonk (Firefox OS)
defect
Not set
critical

Tracking

(blocking-b2g:1.4+, firefox30 wontfix, firefox31 wontfix, firefox32 fixed, b2g-v1.4 fixed, b2g-v2.0 fixed)

RESOLVED FIXED
2.0 S3 (6june)
blocking-b2g 1.4+
Tracking Status
firefox30 --- wontfix
firefox31 --- wontfix
firefox32 --- fixed
b2g-v1.4 --- fixed
b2g-v2.0 --- fixed

People

(Reporter: ggrisco, Assigned: khuey)

Details

(Keywords: crash, Whiteboard: [caf-crash 214][caf priority: p2][CR 664942][b2g-crash][p=2])

Attachments

(3 files, 2 obsolete files)

Crash in PBluetoothRequest::Transition while turning bluetooth on/off

signature:

[@ mozalloc_abort(char const*) | NS_DebugBreak | mozilla::dom::bluetooth::PBluetoothRequest::Transition(mozilla::dom::bluetooth::PBluetoothRequest::State, mozilla::ipc::Trigger, mozilla::dom::bluetooth::PBluetoothRequest::State*) | mozilla::dom::bluetooth::PBluetoothRequestParent::Send__delete__(mozilla::dom::bluetooth::PBluetoothRequestParent*, mozilla::dom::bluetooth::BluetoothReply const&) ]
xpcom_runtime_abort([Parent 205] ###!!! ABORT: __delete__()d actor: file PBluetoothRequest.cpp, line 29)
This happened while Bluetooth was being enabled.

05-15 15:00:14.080   205 15378 E bt-btm  : BTM_SecRegister:p_cb_info->p_le_callback == 0xb0a371a1 
05-15 15:00:14.080   205 15378 E bt-btm  : BTM_SecRegister: btm_cb.api.p_le_callback = 0xb0a371a1 
05-15 15:00:14.090   205   741 E bt-btif : btif_config_get(L189): assert failed: section && *section && key && *key && name && *name && bytes && type
05-15 15:00:14.090   205   741 E bt-btif : btif_config_get(L189): assert failed: section && *section && key && *key && name && *name && bytes && type
05-15 15:00:14.090   205   741 E bt-btif : btif_storage_get_adapter_property service_mask:0x40040
05-15 15:00:14.090   205 15378 W bt-l2cap: L2CAP - L2CA_Register() called for PSM: 0x0019
05-15 15:00:14.090   205 15378 W bt-l2cap: L2CAP - L2CA_Register() called for PSM: 0x0017
05-15 15:00:14.090   205 15378 W bt-l2cap: L2CAP - L2CA_Register() called for PSM: 0x001b
05-15 15:00:14.100   205 15381 E bt_mct  : hci lib postload completed
05-15 15:00:14.100   205   741 I bte_conf: Attempt to load did conf from /etc/bluetooth/bt_did.conf
05-15 15:00:14.100   205   741 I bte_conf: [1] primary_record=1 vendor_id=0x001D vendor_id_source=0x0001 product_id=0x1200 version=0x1436
05-15 15:00:14.100   205   741 I bte_conf: Attempt to load did conf from /etc/bluetooth/bt_did.conf
05-15 15:00:14.100   205   741 I bte_conf: Attempt to load did conf from /etc/bluetooth/bt_did.conf
05-15 15:00:14.100   205   741 I GeckoBluetooth: AdapterStateChangeCallback: BT_STATE 1
05-15 15:00:14.100   205   741 D bt-btif : btif_av_state_idle_handler event:BTA_AV_ENABLE_EVT flags 0
05-15 15:00:14.100   205   741 D bt-btif : btif_av_state_idle_handler event:BTA_AV_REGISTER_EVT flags 0
05-15 15:00:14.110   205   205 I Gecko   : [Parent 205] ###!!! ABORT: __delete__()d actor: file PBluetoothRequest.cpp, line 29
05-15 15:00:14.110   205   205 E Gecko   : mozalloc_abort: [Parent 205] ###!!! ABORT: __delete__()d actor: file PBluetoothRequest.cpp, line 29
blocking-b2g: 1.4? → 1.4+
Keywords: crash
Whiteboard: [CR 664942] → [CR 664942][b2g-crash]
I think we can try to reproduce this bug under more extreme conditions.
It looks like Bluetooth had just been disabled, but 90 ms later bluedroid was enabled again.

05-15 15:00:12.480   205   741 I bt_vendor: bt-vendor : BT_VND_OP_POWER_CTRL: Off
05-15 15:00:12.480   205   741 I bt_vendor: Starting hciattach daemon
05-15 15:00:12.480   205   741 I bt_vendor: try to set false
05-15 15:00:12.480   205 15156 I GKI_LINUX: gki_task_entry: gki_task task_id=0 [BTU] terminating
05-15 15:00:12.480   205   741 I GKI_LINUX: GKI_exit_task: GKI_exit_task 0 done
05-15 15:00:12.480   205   741 I GKI_LINUX: GKI_destroy_task: GKI_shutdown(): task [BTU] terminated
05-15 15:00:12.480   205   741 I GeckoBluetooth: AdapterStateChangeCallback: BT_STATE 0
05-15 15:00:12.570   205   205 I bluedroid: enable
05-15 15:00:12.570   205   205 I bt_hci_bdroid: init
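As a quick check of the timing, the gap between the `BT_STATE 0` callback and the next `bluedroid: enable` line can be computed from the logcat timestamps. This is only a minimal sketch: the helper names `parse_logcat_time` and `logcat_gap_ms` are made up for illustration, and since logcat omits the year, one is assumed arbitrarily so the timestamps can be parsed.

```python
from datetime import datetime

def parse_logcat_time(line, year=2014):
    """Parse the leading 'MM-DD HH:MM:SS.mmm' stamp of a logcat line.

    logcat does not print the year, so `year` is an assumption.
    """
    stamp = " ".join(line.split()[:2])  # e.g. '05-15 15:00:12.480'
    return datetime.strptime(f"{year}-{stamp}", "%Y-%m-%d %H:%M:%S.%f")

def logcat_gap_ms(first_line, second_line):
    """Milliseconds elapsed between two logcat lines."""
    delta = parse_logcat_time(second_line) - parse_logcat_time(first_line)
    return round(delta.total_seconds() * 1000)

# Two lines copied from the excerpt above.
disabled = "05-15 15:00:12.480   205   741 I GeckoBluetooth: AdapterStateChangeCallback: BT_STATE 0"
enabled = "05-15 15:00:12.570   205   205 I bluedroid: enable"
print(logcat_gap_ms(disabled, enabled))
```

For the two lines shown, the difference is 90 ms.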
(In reply to Shawn Huang [:shuang] [:shawnjohnjr] from comment #4)
> I think we can try to reproduce this bug using some extreme ways:

Hi Shawn, can you clarify if you are going to attempt to reproduce or if you are asking for some action from me?

Thanks,

Greg
Flags: needinfo?(shuang)
Summary: Crash while toggline bluetooth and running other stability scripts → Crash while toggling bluetooth and running other stability scripts
Greg,
I was just trying to work out STR for this bug. After checking the 'EXTRA file', I found that the interval between disable and enable was 90 ms, after which mozalloc_abort occurred.

I'm not asking for any action, because this looks like a simple enable/disable test case.
What does the test script do?
Flags: needinfo?(shuang)
Crash observed on: 

Device: msm8226
Firmware: AU_LABEL
Moz BuildID: 20140511000204
B2G Version: 1.4
Gecko Version: 30.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=17fb44880e95bc7ae363a609d811bf5a9a067b5b
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=2f11e3aba98eb785ec24504fe9988ab61a03b64d
(In reply to Shawn Huang [:shuang] [:shawnjohnjr] from comment #6)

> What's the test script's behavior?

This is what I have from test team:

1. Make calls from phone (mobile originated)
2. Open camera
3. Receive calls (mobile terminated)
4. Device kept in idle mode for some time.
5. Performed Bluetooth on/off multiple times.
Whiteboard: [CR 664942][b2g-crash] → [CR 664942][b2g-crash][ETA=22][p=2]
AFAIK, this only happens when something goes wrong with BluetoothReplyRunnable, because the actor's previous state becomes unknown. That could lead PBluetoothRequest::Transition to hit NS_RUNTIMEABORT.
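To illustrate the kind of check that fires here, the following is a toy model and not the IPDL-generated code: an actor starts live, becomes dead after `__delete__`, and aborts if `__delete__` is sent a second time. The class names `ToyActor` and `ActorAbort` and the state names are assumptions made for the sketch; only the abort message is taken from the log.

```python
class ActorAbort(RuntimeError):
    """Stands in for NS_RUNTIMEABORT in this toy model."""

class ToyActor:
    def __init__(self):
        self.state = "start"  # hypothetical state name

    def send_delete(self):
        # Sending __delete__ is only legal while the actor is still live.
        if self.state == "dead":
            raise ActorAbort("__delete__()d actor")
        self.state = "dead"

actor = ToyActor()
actor.send_delete()          # first delete: fine
try:
    actor.send_delete()      # second delete: the toy equivalent of the abort
    aborted = False
except ActorAbort as exc:
    aborted = True
    print("ABORT:", exc)
```

In this model the only way to reach the abort is to send `__delete__` on an actor that is already dead, which matches the use-after-delete suspicion in this bug.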
Whiteboard: [CR 664942][b2g-crash][ETA=22][p=2] → [CR 664942][b2g-crash][ETA:5/22][p=2]
I'm running the Bluetooth on/off auto-test overnight with gdb attached, to see if I can get more debug information. Right now I can only tell that the actor had already been deleted.

Thomas, do you have any suggestions to share?
Flags: needinfo?(tzimmermann)
Sorry, no. Someone with deeper insight into our IPDL implementation might be able to help.
Flags: needinfo?(tzimmermann)
Crash observed on: 

Device: msm8226
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.5.01.04.00.113.102
Moz BuildID: 20140515000202
B2G Version: 1.4
Gecko Version: 30.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=2e97bee6bb79d3577dba1bf2a1bbfcba64ee99ab
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=35f27a8e9b3f651748aa22095553024556272de8
https://bug997962.bugzilla.mozilla.org/attachment.cgi?id=8426077
There is a race condition when static nsTArrays are accessed from different threads. That is problematic in general, and especially so because BluetoothReplyRunnable pointers are stored in those nsTArrays in Bug 997962. We should try this patch first and see whether stability improves.
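For readers unfamiliar with this class of problem, here is a generic sketch of guarding a shared list with a single lock so that the check and the removal happen atomically. It is not what the Bug 997962 patch does (that patch is not shown in this thread); the class name `PendingReplies` and its methods are hypothetical.

```python
import threading

class PendingReplies:
    """A shared list of pending replies guarded by one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = []

    def push(self, reply):
        with self._lock:
            self._items.append(reply)

    def pop_first(self):
        # The emptiness check and the removal happen under the same lock.
        with self._lock:
            return self._items.pop(0) if self._items else None

pending = PendingReplies()
taken = []

def producer():
    for i in range(1000):
        pending.push(i)

def consumer():
    while len(taken) < 1000:
        item = pending.pop_first()
        if item is not None:
            taken.append(item)

threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(taken))
```

An alternative design is to confine all access to one thread and dispatch work to it, which avoids the lock entirely.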
Greg,
Can we apply the patches from https://bugzilla.mozilla.org/show_bug.cgi?id=997962#c43 and see if that fixes this in general?
Flags: needinfo?(ggrisco)
Whiteboard: [CR 664942][b2g-crash][ETA:5/22][p=2] → [CR 664942][b2g-crash][ETA:5/27][p=2]
(In reply to Shawn Huang [:shuang] [:shawnjohnjr] from comment #15)
> Greg,
> Can we apply patches based on
> https://bugzilla.mozilla.org/show_bug.cgi?id=997962#c43 and see if we can
> fix in general?

That patch has been obsoleted. Please use https://bugzilla.mozilla.org/attachment.cgi?id=8426118 instead.
Another thing I'm trying is to enable the IPC debug log.
Here is what I did:
In adb shell, run:
#stop b2g
#export MOZ_IPC_MESSAGE_LOG=1
#b2g.sh

This writes the IPC transaction log to logcat, which should help explain what happened.
(In reply to Shawn Huang [:shuang] [:shawnjohnjr] from comment #17)
> Another thing that I'm trying to do is to enable IPC debug log.
> Here is what I did:
> In adb shell, do:
> #stop b2g
> #export MOZ_IPC_MESSAGE_LOG=1
> #b2g.sh
> 
> This will output IPC transaction log to logcat for better understanding why
> things happened.
This only works when b2g is built with the flag B2G_DEBUG=1.
When Bluetooth is enabled normally, the IPC message Msg___delete__ is sent from PBluetoothRequestParent to PBluetoothRequestChild. I'm still trying to enable these logs and reproduce this bug locally.

05-21 09:42:54.100 I/GeckoIPC( 4229): [time:1400679774116858][4229->4659][PBluetoothParent] Sending Msg_Enabled([TODO])
05-21 09:42:54.100 I/GeckoIPC( 4229): [time:1400679774117074][4229->4659][PBluetoothParent] Sending Msg_Notify([TODO])
05-21 09:42:54.110 I/GeckoIPC( 4229): [time:1400679774118763][4229->4659][PBluetoothParent] Sending Msg_Notify([TODO])
05-21 09:42:54.110 I/GeckoIPC( 4659): [time:1400679774120410][4659<-4229][PBluetoothChild] Received Msg_Enabled([TODO])
05-21 09:42:54.110 I/GeckoIPC( 4659): [time:1400679774120525][4659<-4229][PBluetoothChild] Received Msg_Notify([TODO])
05-21 09:42:54.110 I/GeckoIPC( 4659): [time:1400679774120651][4659<-4229][PBluetoothChild] Received Msg_Notify([TODO])
05-21 09:42:54.110 I/GeckoIPC( 4659): [time:1400679774122496][4659->4229][PBluetoothChild] Sending Msg_PBluetoothRequestConstructor([TODO])
05-21 09:42:54.110 I/GeckoIPC( 4229): [time:1400679774123173][4229<-4659][PBluetoothParent] Received Msg_PBluetoothRequestConstructor([TODO])
05-21 09:42:54.110 I/GeckoIPC( 4659): [time:1400679774123420][4659->4229][PBluetoothChild] Sending Msg_PBluetoothRequestConstructor([TODO])
05-21 09:42:54.110 I/GeckoIPC( 4229): [time:1400679774123516][4229->4659][PBluetoothRequestParent] Sending Msg___delete__([TODO])
05-21 09:42:54.110 I/GeckoIPC( 4229): [time:1400679774123763][4229<-4659][PBluetoothParent] Received Msg_PBluetoothRequestConstructor([TODO])
05-21 09:42:54.110 I/GeckoIPC( 4229): [time:1400679774124058][4229->4659][PBluetoothRequestParent] Sending Msg___delete__([TODO])
05-21 09:42:54.110 I/GeckoIPC( 4659): [time:1400679774124122][4659<-4229][PBluetoothRequestChild] Received Msg___delete__([TODO])
05-21 09:42:54.110 I/GeckoIPC( 4659): [time:1400679774124578][4659<-4229][PBluetoothRequestChild] Received Msg___delete__([TODO])
05-21 09:42:54.110 I/GeckoIPC( 4659): [time:1400679774125342][4659->4229][PBluetoothChild] Sending Msg_RegisterSignalHandler([TODO])
05-21 09:42:54.110 I/GeckoIPC( 4229): [time:1400679774126175][4229<-4659][PBluetoothParent] Received Msg_RegisterSignalHandler([TODO])
05-21 09:42:54.120 I/GeckoIPC( 4659): [time:1400679774127825][4659->4229][PBluetoothChild] Sending Msg_PBluetoothRequestConstructor([TODO])
05-21 09:42:54.120 I/GeckoIPC( 4229): [time:1400679774128475][4229<-4659][PBluetoothParent] Received Msg_PBluetoothRequestConstructor([TODO])
05-21 09:42:54.120 I/GeckoIPC( 4229): [time:1400679774128660][4229->4659][PBluetoothRequestParent] Sending Msg___delete__([TODO])
05-21 09:42:54.130 I/GeckoIPC( 4659): [time:1400679774142196][4659->4229][PBluetoothChild] Sending Msg_PBluetoothRequestConstructor([TODO])
05-21 09:42:54.130 I/GeckoIPC( 4229): [time:1400679774143253][4229<-4659][PBluetoothParent] Received Msg_PBluetoothRequestConstructor([TODO])
05-21 09:42:54.130 I/GeckoIPC( 4229): [time:1400679774143406][4229->4659][PBluetoothRequestParent] Sending Msg___delete__([TODO])
05-21 09:42:54.130 I/GeckoIPC( 4659): [time:1400679774144221][4659->4229][PBluetoothChild] Sending Msg_PBluetoothRequestConstructor([TODO])
05-21 09:42:54.130 I/GeckoIPC( 4229): [time:1400679774144748][4229<-4659][PBluetoothParent] Received Msg_PBluetoothRequestConstructor([TODO])
05-21 09:42:54.130 I/GeckoIPC( 4659): [time:1400679774146490][4659<-4229][PBluetoothRequestChild] Received Msg___delete__([TODO])
05-21 09:42:54.170 I/GeckoIPC( 4229): [time:1400679774181173][4229->4659][PBluetoothRequestParent] Sending Msg___delete__([TODO])
05-21 09:42:54.170 I/GeckoIPC( 4659): [time:1400679774183142][4659<-4229][PBluetoothRequestChild] Received Msg___delete__([TODO])
05-21 09:42:54.170 I/GeckoIPC( 4659): [time:1400679774186305][4659<-4229][PBluetoothRequestChild] Received Msg___delete__([TODO])
05-21 09:42:54.270 I/GeckoIPC( 4229): [time:1400679774285863][4229->4659][PBluetoothParent] Sending Msg_Notify([TODO])
05-21 09:42:54.270 I/GeckoIPC( 4659): [time:1400679774286129][4659<-4229][PBluetoothChild] Received Msg_Notify([TODO])
05-21 09:42:54.270 I/GeckoIPC( 4659): [time:1400679774286274][4659->4229][PBluetoothChild] Sending Msg_RegisterSignalHandler([TODO])
05-21 09:42:54.270 I/GeckoIPC( 4229): [time:1400679774286536][4229<-4659][PBluetoothParent] Received Msg_RegisterSignalHandler([TODO])
05-21 09:42:55.180 I/GeckoIPC( 4229): [time:1400679775190585][4229->4659][PBluetoothParent] Sending Msg_Notify([TODO])
05-21 09:42:55.240 I/GeckoIPC( 4659): [time:1400679775252220][4659<-4229][PBluetoothChild] Received Msg_Notify([TODO])
05-21 09:42:55.240 I/GeckoIPC( 4659): [time:1400679775252937][4659->4229][PBluetoothChild] Sending Msg_RegisterSignalHandler([TODO])
05-21 09:42:55.240 I/GeckoIPC( 4229): [time:1400679775253775][4229<-4659][PBluetoothParent] Received Msg_RegisterSignalHandler([TODO])
05-21 09:42:57.140 I/GeckoIPC( 4229): [time:1400679777154398][4229->4659][PBluetoothParent] Sending Msg_Notify([TODO])
05-21 09:42:57.160 I/GeckoIPC( 4659): [time:1400679777176431][4659<-4229][PBluetoothChild] Received Msg_Notify([TODO])
05-21 09:42:57.170 I/GeckoIPC( 4659): [time:1400679777179584][4659->4229][PBluetoothChild] Sending Msg_RegisterSignalHandler([TODO])
05-21 09:42:57.170 I/GeckoIPC( 4229): [time:1400679777184305][4229<-4659][PBluetoothParent] Received Msg_RegisterSignalHandler([TODO])
05-21 09:42:58.680 I/GeckoIPC( 4229): [time:1400679778688316][4229->4659][PBluetoothParent] Sending Msg_Notify([TODO])
05-21 09:42:58.690 I/GeckoIPC( 4659): [time:1400679778700842][4659<-4229][PBluetoothChild] Received Msg_Notify([TODO])
05-21 09:42:58.690 I/GeckoIPC( 4659): [time:1400679778701496][4659->4229][PBluetoothChild] Sending Msg_RegisterSignalHandler([TODO])
05-21 09:42:58.690 I/GeckoIPC( 4229): [time:1400679778702906][4229<-4659][PBluetoothParent] Received Msg_RegisterSignalHandler([TODO])
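One way to read a log like the one above is to count, on the parent side, how many PBluetoothRequest constructors were received versus how many Msg___delete__ messages were sent. In a healthy run each request actor should be deleted exactly once, so the two counts should match. This is a minimal sketch that assumes the line format shown above; the function name `count_request_lifecycle` is made up.

```python
import re

# Matches the tail of a GeckoIPC line: "[Actor] Sending|Received Msg_Name("
IPC_RE = re.compile(r"\[(\w+)\] (Sending|Received) (Msg_\w+)\(")

def count_request_lifecycle(lines):
    """Return (constructors received by parent, deletes sent by parent)."""
    constructed = deleted = 0
    for line in lines:
        match = IPC_RE.search(line)
        if not match:
            continue
        event = match.groups()
        if event == ("PBluetoothParent", "Received", "Msg_PBluetoothRequestConstructor"):
            constructed += 1
        elif event == ("PBluetoothRequestParent", "Sending", "Msg___delete__"):
            deleted += 1
    return constructed, deleted

# Four lines copied from the log above.
sample = [
    "05-21 09:42:54.110 I/GeckoIPC( 4229): [time:1400679774123173][4229<-4659][PBluetoothParent] Received Msg_PBluetoothRequestConstructor([TODO])",
    "05-21 09:42:54.110 I/GeckoIPC( 4229): [time:1400679774123516][4229->4659][PBluetoothRequestParent] Sending Msg___delete__([TODO])",
    "05-21 09:42:54.110 I/GeckoIPC( 4229): [time:1400679774123763][4229<-4659][PBluetoothParent] Received Msg_PBluetoothRequestConstructor([TODO])",
    "05-21 09:42:54.110 I/GeckoIPC( 4229): [time:1400679774124058][4229->4659][PBluetoothRequestParent] Sending Msg___delete__([TODO])",
]
print(count_request_lifecycle(sample))
```

A run where the delete count exceeds the constructor count would be a candidate for the double-delete seen in the crash.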
I still cannot reproduce this bug with my automated Bluetooth on/off test scripts on Nexus 5/Flame, even after 8000 on/off cycles.

I think this may be one of the use-after-free-of-IPDL-actor bugs.
When Bluetooth is turned on, the following request types are performed:
1. TDefaultAdapterPathRequest
2. TPairedDevicePropertiesRequest
3. TDefaultAdapterPathRequest
4. TStartDiscoveryRequest (if no headset was previously connected)
5. TPairedDevicePropertiesRequest
6. TStopDiscoveryRequest (if discovery times out, which triggers StopDiscovery)

If the IPDL actor has already been deleted during any of these six operations, the crash happens.
http://dxr.mozilla.org/mozilla-central/source/dom/bluetooth/ipc/BluetoothParent.cpp#49
I can't see any way that mRequest could be deleted.
The actor is deleted only in DeallocPBluetoothRequestParent, and the path is:
PBluetoothRequestParent::Send__delete__ -> PBluetoothParent::RemoveManagee -> BluetoothParent::DeallocPBluetoothRequestParent -> BluetoothRequestParent::~BluetoothRequestParent.
It looks like using a raw pointer is a bad idea. Other modules use NS ref-counting, so I guess adding ref-counting to BluetoothRequestParent can probably fix this bug. I will add NS_INLINE_DECL_THREADSAFE_REFCOUNTING(BluetoothRequestParent) and test it. Meanwhile, I'm still looking for the root cause of why the BluetoothRequestParent destructor was called before PBluetoothRequestParent::Send__delete__.
Attachment #8427000 - Flags: feedback? → feedback?(tzimmermann)
libxul.so!mozilla::dom::bluetooth::PBluetoothRequestParent::Send__delete__(mozilla::dom::bluetooth::PBluetoothRequestParent*, mozilla::dom::bluetooth::BluetoothReply const&) [PBluetoothRequestParent.cpp : 69 + 0x7]
     r0 = 0x0000001d    r1 = 0x00000000    r2 = 0x00000000    r3 = 0x000c0001
     r4 = 0x97339f40    r5 = 0x9fbff700    r6 = 0xbedc36a8    r7 = 0x000c0001
     r8 = 0x9fbff2a0    r9 = 0x00000001   r10 = 0x00000000    fp = 0x00000000
     sp = 0xbedc3690    pc = 0xb4ed62a7

r0 = 0x0000001d and r1 = 0x00000000 look invalid, so something has gone wrong with both the PBluetoothRequestParent* and the BluetoothReply*. The patch in comment 23 is not enough; the BluetoothReply* still needs to be checked.
Flags: needinfo?(ggrisco)
(In reply to Shawn Huang [:shuang] [:shawnjohnjr] from comment #24)
> libxul.so!mozilla::dom::bluetooth::PBluetoothRequestParent::
> Send__delete__(mozilla::dom::bluetooth::PBluetoothRequestParent*,
> mozilla::dom::bluetooth::BluetoothReply const&) [PBluetoothRequestParent.cpp
> : 69 + 0x7]
>      r0 = 0x0000001d    r1 = 0x00000000    r2 = 0x00000000    r3 = 0x000c0001
>      r4 = 0x97339f40    r5 = 0x9fbff700    r6 = 0xbedc36a8    r7 = 0x000c0001
>      r8 = 0x9fbff2a0    r9 = 0x00000001   r10 = 0x00000000    fp = 0x00000000
>      sp = 0xbedc3690    pc = 0xb4ed62a7
> 
> r0 = 0x0000001d, r1 = 0x00000000 seems invalid. So something goes wrong with
> both PBluetoothRequestParent* and  BluetoothReply*. Patch on Comment 23 is
> not enough. Still need to check BluetoothReply*

If r0 is the 'this' pointer and r1, r2 are function parameters, these registers must have been modified.
So all I can say is that adding reference counting to BluetoothRequestParent should help with the use-after-free of the IPDL actor. But I still cannot explain why registers r0-r2 are all invalid.
Crash observed on: 

Device: msm8226
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.5.01.04.00.113.102
Moz BuildID: 20140515000202
B2G Version: 1.4
Gecko Version: 30.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=2e97bee6bb79d3577dba1bf2a1bbfcba64ee99ab
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=35f27a8e9b3f651748aa22095553024556272de8
Comment on attachment 8427000 [details] [diff] [review]
0001-Add-thread-safe-ref-counting-for-BluetoothRequestPar.patch

Per discussion with echou, redirecting to bent. Thanks.
Attachment #8427000 - Flags: review?(benjamin) → review?(bent.mozilla)
Crash observed on: 

Device: msm8226
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.5.01.04.00.113.102
Moz BuildID: 20140515000202
B2G Version: 1.4
Gecko Version: 30.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=2e97bee6bb79d3577dba1bf2a1bbfcba64ee99ab
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=35f27a8e9b3f651748aa22095553024556272de8
Comment on attachment 8427000 [details] [diff] [review]
0001-Add-thread-safe-ref-counting-for-BluetoothRequestPar.patch

Review of attachment 8427000 [details] [diff] [review]:
-----------------------------------------------------------------

How can this fix your problem?  You are only reference counting this in the exact spots where you new/deleted it before, so this doesn't change the lifetime at all.
Attachment #8427000 - Flags: review-
Comment on attachment 8427000 [details] [diff] [review]
0001-Add-thread-safe-ref-counting-for-BluetoothRequestPar.patch

Review of attachment 8427000 [details] [diff] [review]:
-----------------------------------------------------------------

Following STR does not work with this patch. 

1) pair with device A
2) unpair with device A
3) Repeat (1) and (2) 10 times. 

Please upload a new fix
Flags: needinfo?(shuang)
Attachment #8427000 - Flags: review?(bent.mozilla)
Flags: needinfo?(shuang)
Attachment #8427000 - Attachment is obsolete: true
Attachment #8427000 - Flags: feedback?(echou)
Eric,

Please help move this ahead as this is blocking partner testing.
Flags: needinfo?(echou)
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #32)
> Comment on attachment 8427000 [details] [diff] [review]
> 0001-Add-thread-safe-ref-counting-for-BluetoothRequestPar.patch
> 
> Review of attachment 8427000 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> How can this fix your problem?  You are only reference counting this in the
> exact spots where you new/deleted it before, so this doesn't change the
> lifetime at all.
I can't reproduce this locally, so it's hard for me to speed things up.
ReplyRunnable is constructed in the BluetoothRequestParent constructor. BluetoothRequestParent is constructed in |BluetoothParent::AllocPBluetoothRequestParent()| and destructed in |BluetoothParent::DeallocPBluetoothRequestParent()|. |DeallocPBluetoothRequestParent()| only happens after |mRequest->Send__delete__()| has been executed.

http://dxr.mozilla.org/mozilla-central/source/dom/bluetooth/ipc/BluetoothParent.cpp#49
I really can't explain this: based on crash stack frame #3, mRequest had already been freed, if I interpreted it correctly. Kyle, do you have any suggestions from an IPC perspective?
Flags: needinfo?(khuey)
Attached patch: Patch (obsolete)
Try this.
Flags: needinfo?(khuey)
Tapas, can you try the patch in comment 36?
Flags: needinfo?(tkundu)
Comment on attachment 8430977 [details] [diff] [review]
Patch

Review of attachment 8430977 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/bluetooth/ipc/BluetoothParent.cpp
@@ +59,2 @@
>    void
>    Revoke()

I tried this and found that |Revoke()| is never executed during Bluetooth on/off. When is |Revoke()| called?
#comment 38 has given this info already :) . Clearing NI on me.
Flags: needinfo?(tkundu)
(In reply to Shawn Huang [:shuang] [:shawnjohnjr] from comment #38)
> Comment on attachment 8430977 [details] [diff] [review]
> Patch
> 
> Review of attachment 8430977 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: dom/bluetooth/ipc/BluetoothParent.cpp
> @@ +59,2 @@
> >    void
> >    Revoke()
> 
> I tried, and found this |Revoke()| function will never be executed during
> Bluetooth on/off. I wonder when will this |Revoke| be called?

It's called when the other process dies unexpectedly (say if the app doing bluetooth is OOM killed in the middle of it).
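As a toy illustration of that revocation idea (these are not the actual Gecko classes, and this is not a description of the attached patch): a reply object holds a back-pointer to the request actor, and revoking it clears the pointer so that a late reply becomes a no-op instead of touching an actor that is gone. All names below are hypothetical.

```python
class ToyRequestActor:
    """Hypothetical stand-in for a parent-side request actor."""

    def __init__(self):
        self.replies = []

    def send_delete(self, reply):
        self.replies.append(reply)

class ToyReplyRunnable:
    """Holds a back-pointer to the actor until it runs or is revoked."""

    def __init__(self, actor):
        self._actor = actor

    def revoke(self):
        # The other side went away: drop the back-pointer.
        self._actor = None

    def run(self, reply):
        if self._actor is None:
            return False  # revoked or already used: do nothing
        # Clear the back-pointer before use so a second run cannot reuse it.
        actor, self._actor = self._actor, None
        actor.send_delete(reply)
        return True

# Normal case: the reply is delivered exactly once.
actor = ToyRequestActor()
runnable = ToyReplyRunnable(actor)
delivered_first = runnable.run("ok")
delivered_again = runnable.run("ok")

# Revoked case: the reply is silently dropped.
revoked = ToyReplyRunnable(ToyRequestActor())
revoked.revoke()
delivered_after_revoke = revoked.run("ok")
print(delivered_first, delivered_again, delivered_after_revoke)
```

Clearing the back-pointer on first use is a design choice of this sketch: it makes a second delivery attempt harmless rather than fatal.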

(In reply to Tapas Kumar Kundu from comment #39)
> #comment 38 has given this info already :) . Clearing NI on me.

Unless I'm missing something comment 38 doesn't give the results of a test run on my patch ...
Flags: needinfo?(khuey) → needinfo?(tkundu)
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #40)

The patch looks fine in my own testing of Bluetooth on/off. I didn't see issues like those in comment 33. I have asked my colleagues to run the stability tests with this patch, and I will confirm here once they report back.
Comment on attachment 8430977 [details] [diff] [review]
Patch

Review of attachment 8430977 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/bluetooth/ipc/BluetoothParent.cpp
@@ +68,5 @@
>      MOZ_CRASH("This should never be called!");
>    }
> +
> +  virtual void
> +  ReleaseMembers() MOZ_OVERRIDE

Please go ahead and mark this class as MOZ_FINAL.
Attachment #8430977 - Flags: review?(bent.mozilla) → review+
Attached patch: Patch
Attachment #8430977 - Attachment is obsolete: true
Attachment #8432840 - Flags: review+
Just waiting on the confirmation from comment 41 to land this.
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #44)
> Just waiting on the confirmation from comment 41 to land this.

This change is still under testing. You can land it if it looks ok to you.
https://hg.mozilla.org/mozilla-central/rev/1cd1d27985e9
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
https://hg.mozilla.org/releases/mozilla-b2g30_v1_4/rev/775862aee0ea
Whiteboard: [CR 664942][b2g-crash][ETA:5/27][p=2] → [CR 664942][b2g-crash][p=2]
Target Milestone: --- → 2.0 S3 (6june)
Whiteboard: [CR 664942][b2g-crash][p=2] → [caf priority: p2][CR 664942][b2g-crash][p=2]
Flags: needinfo?(echou)
We haven't seen this anymore, so it seems to be fixed. Thanks for your help.
Flags: needinfo?(tkundu)
Whiteboard: [caf priority: p2][CR 664942][b2g-crash][p=2] → [caf-crash 214][caf priority: p2][CR 664942][b2g-crash][p=2]
A test case needs to be added to cover this issue.
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
Flags: in-moztrap?(dharris)
Test case added in moztrap:

https://moztrap.mozilla.org/manage/case/14325/
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
Flags: in-moztrap?(dharris)
Flags: in-moztrap+