Closed
Bug 976883
Opened 10 years ago
Closed 10 years ago
[Sora][BT] Firefox system crash after pair and unpair 15 times.
Categories
(Firefox OS Graveyard :: Bluetooth, defect, P1)
Firefox OS Graveyard
Bluetooth
Tracking
(blocking-b2g:1.3+, firefox28 wontfix, firefox29 wontfix, firefox30 fixed, b2g-v1.3 fixed, b2g-v1.3T fixed, b2g-v1.4 fixed)
People
(Reporter: sync-1, Assigned: echou)
References
()
Details
(Keywords: crash, Whiteboard: [b2g-crash][osrestartcrash])
Attachments
(9 files)
635.68 KB,
application/octet-stream
|
Details | |
6.21 MB,
application/x-zip-compressed
|
Details | |
38.80 KB,
application/zip
|
Details | |
56.87 KB,
application/zip
|
Details | |
91.41 KB,
text/plain
|
Details | |
84.51 KB,
text/plain
|
Details | |
120.76 KB,
text/plain
|
Details | |
1.75 KB,
patch
|
shawnjohnjr
:
review+
ben.tian
:
feedback+
|
Details | Diff | Splinter Review |
1.68 KB,
patch
|
bajaj
:
approval-mozilla-b2g28+
|
Details | Diff | Splinter Review |
DEFECT DESCRIPTION: Firefox system crash after pair and unpair about 15 times. REPRODUCING PROCEDURES: 1. Active BT; 2. Pair and unpair with other device for about 15 tiems; 3. Come across system crash and request sent report to mozilla.--KO EXPECTED BEHAVIOUR: Should come across system crash.
(In reply to sync-1 from comment #2) > Created attachment 8381826 [details] > Firefox OS crashed log "service 'xxx' died" is the root reason for this issue.
Comment 4•10 years ago
|
||
It's more possible relative with platform, I think.
Assignee | ||
Comment 5•10 years ago
|
||
Jamin will take a first look.
Component: Gaia::Bluetooth File Transfer → Bluetooth
Assignee | ||
Updated•10 years ago
|
Assignee: nobody → jaliu
Assignee | ||
Comment 6•10 years ago
|
||
Quick update: Sora uses FxOS 1.3 / JB / BlueZ. Jamin will analyze the attached log and see if he can reproduce on Buri first (FxOS 1.3 / ICS / BlueZ). Reporter, could you attach the hcidump log for further investigation?
Updated•10 years ago
|
Updated•10 years ago
|
Severity: normal → critical
Whiteboard: [b2g-crash] → [b2g-crash][osrestartcrash]
Comment 8•10 years ago
|
||
Hi reporter, Thank you for your report. I can't reproduce this bug on my device Buri with v1.3_last. I have few questions for reporter. I've watched the attached videos and I noticed there is no device icon in BT setting page. 1. Could you please tell me what kind of BT device you try to pair to ? (headset?, phone?, notebook?) Would it be possible for you to provide the 'class' of the BT device ? P.S. You can find the "class of device" by checking the "classes file" in your FxOS phone. The path of classes list: /data/misc/bluetoothd/$YOUR_BT_ADDRESS/classes The path name list: /data/misc/bluetoothd/$YOUR_BT_ADDRESS/names For example: # adb shell cat /data/misc/bluetoothd/0C:BD:51:20:A2:0E/classes B4:98:42:7E:0A:58 0x58020c # adb shell cat /data/misc/bluetoothd/0C:BD:51:20:A2:0E/names B4:98:42:7E:0A:58 Sony_Headset_A ---> The class of 'Sony_Headset_A' is 0x58020c. 2. Could you please provide the info. of the build version you used? BuildID: YYYYMMDDHHMMSS (Settings App -> Device Information -> More Information -> Build Identifier) Gecko Version: [hg changeset hash] or [git commit hash]. Thank you.
Flags: needinfo?(sync-1)
Is it related to bug 828860? It really looks like connection made with LE devices. Seeing this bug, I just suspect this bug is related to LE. Which device that you tried to pair with? < HCI Command: LE Create Connection (0x08|0x000d) plen 25 bdaddr C2:40:4F:F5:FA:C7 type 1 > HCI Event: Command Status (0x0f) plen 4 LE Create Connection (0x08|0x000d) status 0x00 ncmd 1 < HCI Command: LE Create Connection Cancel (0x08|0x000e) plen 0 > HCI Event: Command Complete (0x0e) plen 4 LE Create Connection Cancel (0x08|0x000e) ncmd 1 > HCI Event: LE Meta Event (0x3e) plen 19 LE Connection Complete status 0x02 handle 0, role master < HCI Command: LE Create Connection (0x08|0x000d) plen 25 bdaddr CC:60:DD:53:22:F5 type 1 > HCI Event: Command Status (0x0f) plen 4 LE Create Connection (0x08|0x000d) status 0x00 ncmd 1 > HCI Event: LE Meta Event (0x3e) plen 19 LE Connection Complete status 0x00 handle 2, role master < ACL data: handle 2 flags 0x00 dlen 11 L2CAP(d): cid 0x0006 len 7 [psm 0] > HCI Event: Disconn Complete (0x05) plen 4 status 0x00 handle 2 reason 0x3e
bug reporter: We need to minidump to check this bug. But I seriously suspect this bug is related to bug 828860 based on hcidump. Can you attached minidump? Michael, See Comment 39 that you posted before, https://bugzilla.mozilla.org/show_bug.cgi?id=828860#c39, can you share with partner workaround see if the problem still happen? Although we have not gotta minidump it's the same bug.
Flags: needinfo?(mvines)
Comment 11•10 years ago
|
||
Hi reporter, If you try to make a minidump for the crash, could you save hcidump for the entire procedure as well ? (i.e. from BT Enable to BT Pairing) That would be very useful. Best regards
Comment 12•10 years ago
|
||
(In reply to Shawn Huang [:shuang] [:shawnjohnjr] from comment #10) > Michael, > See Comment 39 that you posted before, > https://bugzilla.mozilla.org/show_bug.cgi?id=828860#c39, can you share with > partner workaround see if the problem still happen? Although we have not > gotta minidump it's the same bug. https://www.codeaurora.org/cgit/quic/la/platform/vendor/qcom/msm8610/commit/?h=b2g_jb_3.2&id=241bd7f147edf4fba447f007ff36055d3fe342b7
Flags: needinfo?(mvines)
Comment 13•10 years ago
|
||
Had a go at this one on my latest sora build, with RIL from ro.build.date=Mon Feb 17 10:59:04 CST 2014 I'm curious as to how fast you are doing the pair/unpair cycles. If I'm allowing up to a minute in between each, then I'm not seeing the issue.
Comment 14•10 years ago
|
||
Information of our code is "FFOS1.3 Mozilla build ID: 20140208004002". The device I try to pairing name is "Boom Band" and class is 0x000000. "Boom Band" is a customized bluetooth device for TCL-Android phone by Baidu. Is the minidump log is in /data/b2g/mozilla/$BYPHONE.default/minidumps? If yes, i will provide latter.
(In reply to Kunny Liu from comment #14) > Information of our code is "FFOS1.3 Mozilla build ID: 20140208004002". > The device I try to pairing name is "Boom Band" and class is 0x000000. "Boom > Band" is a customized bluetooth device for TCL-Android phone by Baidu. Yes. It looks like Boom Band is 'LE' device, which at least match on my Comment 10. Based on Comment 12, I believe bluez LE support has been disabled already. So it might be something else cause crash. > > Is the minidump log is in /data/b2g/mozilla/$BYPHONE.default/minidumps? If > yes, i will provide latter. Ya. But please use minidump_stackwalk to resolve all the symbols, since we don't have your symbol files. Or you can directly run gdb and attach b2g process.
Comment 16•10 years ago
|
||
(In reply to Shawn Huang [:shuang] [:shawnjohnjr] from comment #15) > Based on Comment 12, I believe bluez LE support has been disabled already. > So it might be something else cause crash. It would be good to confirm that the patch in comment 12 has actually been applied though as Sora probably has a different device/... board .mk files.
Comment 17•10 years ago
|
||
(In reply to Michael Vines [:m1] [:evilmachines] from comment #16) > (In reply to Shawn Huang [:shuang] [:shawnjohnjr] from comment #15) > > Based on Comment 12, I believe bluez LE support has been disabled already. > > So it might be something else cause crash. > > It would be good to confirm that the patch in comment 12 has actually been > applied though as Sora probably has a different device/... board .mk files. I have check our code in ./device/qcom/msm8610/msm8610.mk : # Bluetooth configuration files PRODUCT_COPY_FILES += \ system/bluetooth/data/main.conf:system/etc/bluetooth/main.conf \ system/bluetooth/data/audio.conf:system/etc/bluetooth/audio.conf I think this patch has been included.
Comment 18•10 years ago
|
||
Shawn Is this related to: https://bugzilla.mozilla.org/show_bug.cgi?id=976458?
Flags: needinfo?(shuang)
Comment 19•10 years ago
|
||
I can't get the minidump file because the directory("/data/b2g/mozilla/<path_to_profile_dir>/minidumps") without any file after crash.
(In reply to Preeti Raghunath(:Preeti) from comment #18) > Shawn > > Is this related to: https://bugzilla.mozilla.org/show_bug.cgi?id=976458? No. This bug is not related to bug 976458.
Flags: needinfo?(shuang)
(In reply to Kunny Liu from comment #19) > Created attachment 8382795 [details] > hcidump&adb log 20140227 > > I can't get the minidump file because the > directory("/data/b2g/mozilla/<path_to_profile_dir>/minidumps") without any > file after crash. Two actions we can do here: 1. Get a 'Boom Band' BLE device and we can test ourselves. But I'm not sure it's available for us to buy. 2. Does anyone team member from the reporter's team know how to attach gdb to get backtrace or know how to get minidump? You can check this article: http://tech.mozilla.com.tw/posts/2470/%E9%82%84%E5%8E%9F-b2g-crash-%E6%A1%88%E7%99%BC%E7%8F%BE%E5%A0%B4
Flags: needinfo?(sync-1)
Hi Kunny - For attaching gdb, please refer to the following steps: $ cd your-source-code-folder $ adb shell b2g-ps(this might change since I am not sure if QCT has wrapped this script to something else) # Find the pid of the app that is going to crash in the reproduce steps $ ./run-gdb.sh attach <pid> (gdb) continue # reproduce the issue to crash the handset. (gdb) bt Dear Shawn, please kindly correct me if the above information is not right.
Flags: needinfo?(shuang)
Flags: needinfo?(liukun)
I guess QC build did not include run-gdb.sh and they have different build environment. Overall, the steps are correct.
Flags: needinfo?(shuang)
Comment 24•10 years ago
|
||
We need more information to know if this happens with other bluetooth headsets.
Comment 25•10 years ago
|
||
Does this happen with any other BT devices? Need more info to move ahead
Comment 26•10 years ago
|
||
(In reply to Shawn Huang [:shuang] [:shawnjohnjr] from comment #23) > I guess QC build did not include run-gdb.sh and they have different build environment. Please use |rungdb| instead. The rest looks the same.
Comment 27•10 years ago
|
||
(In reply to Michael Vines [:m1] [:evilmachines] from comment #26) > (In reply to Shawn Huang [:shuang] [:shawnjohnjr] from comment #23) > > I guess QC build did not include run-gdb.sh and they have different build environment. > > Please use |rungdb| instead. The rest looks the same. I can't find “rungdb” program in my project or b2g phone. I try to get the minidump file and will provide it latter.
Flags: needinfo?(liukun)
Comment 28•10 years ago
|
||
(In reply to Kunny Liu from comment #27) > I can't find “rungdb” program in my project or b2g phone. $ . build/envsetup.sh $ lunch ... $ rungdb attach <pid>
Comment 29•10 years ago
|
||
Comment 30•10 years ago
|
||
Can you parse minidump with symbols? Since we don't have your symbol files. I cannot see the backtrace.
Flags: needinfo?(liukun)
Comment 32•10 years ago
|
||
(In reply to Shawn Huang [:shuang] [:shawnjohnjr] from comment #31) > Can you parse minidump with symbols? Since we don't have your symbol files. > I cannot see the backtrace. Yes, I have run this command: "./minidump_stackwalk /local/source/ffos/minidump/7029e7f1-08fc-7745-32e4a005-307f1da3.dmp /local/source/ffos/soul3.5/out/target/product/msm8610/obj/objdir-gecko/dist/crashreporter-symbols/ > dmp_result2.txt"
Flags: needinfo?(liukun)
Comment 33•10 years ago
|
||
into the b2g source code directory: 1. ". ./build/envsetup.sh" 2. "lunch msm8610-eng" 3. "export B2G_NOOPT=1" 4. "make -j8" 5. "make buildsymbols" download images to phone. run cmd: "MOZ_CRASHREPORTER="1" /system/bin/b2g.sh" in adb shell. catch minidump log and parse it by: "./minidump_stackwalk /local/source/ffos/minidump/1b090a4a-e7cd-8923-13a409b1-7b4ba113.dmp /local/source/ffos/soul3.5/out/target/product/msm8610/obj/objdir-gecko/dist/crashreporter-symbols/ > dmp_result4.txt" i think above steps is right.
Kunny, Thanks a lot. This log is very helpful. I will update as soon as possible.
Michael, It looks like patch in https://www.codeaurora.org/cgit/quic/la/platform/vendor/qcom/msm8610/commit/?h=b2g_jb_3.2&id=241bd7f147edf4fba447f007ff36055d3fe342b7, it won't stop LE operation from stack level. What we saw LeConnParams property changed sent back from bt stack. See: https://www.codeaurora.org/cgit/quic/la/platform/external/bluetooth/bluez/tree/src/device.c?h=b2g/jb_mr1_rb2.17#n1320 device_update_le_conn_params(device, interval, latency, timeout). It was reported from kernel hci_event.c function hci_le_conn_complete_evt or hci_le_conn_update_complete_evt. There are two interfaces for bluez to control hci interface. Since the new bluez goes through mgmtops.c instead of hciopts, in plugins/mgmtopts.c the function mgmt_enable_le does not disable le actually. The old way (hciopts) actually implemented enable/disable LE function. See: https://www.codeaurora.org/cgit/quic/la/platform/external/bluetooth/bluez/tree/plugins/mgmtops.c?h=b2g/jb_mr1_rb2.17#n2192 https://www.codeaurora.org/cgit/quic/la/platform/external/bluetooth/bluez/tree/plugins/hciops.c?h=b2g/jb_mr1_rb2.17#n3506 Since we finally know the root cause, we can fix from gecko side. However, this is a just kindly reminder that mgmt interface cannot disable LE.
Flags: needinfo?(mvines)
Comment 36•10 years ago
|
||
Ah, thanks for the info. We can easily change hciops_enable_le() to return -ENOSYS as well, just like mgmt_enable_le(), as well. But regardless seems like Gecko shouldn't crash due as well. LMK
Flags: needinfo?(mvines)
Comment 37•10 years ago
|
||
(In reply to Michael Vines [:m1] [:evilmachines] from comment #36) > Ah, thanks for the info. We can easily change hciops_enable_le() to return > -ENOSYS as well, just like mgmt_enable_le(), as well. But regardless seems > like Gecko shouldn't crash due as well. LMK https://www.codeaurora.org/cgit/quic/la/platform/external/bluetooth/bluez/tree/plugins/hciops.c?h=b2g/jb_mr1_rb2.17#n3506 But this file has not been compiled into the system.img. I think change hciops_enable_le() to return -ENOSYS is useless for our project.
(In reply to Michael Vines [:m1] [:evilmachines] from comment #36) > Ah, thanks for the info. We can easily change hciops_enable_le() to return > -ENOSYS as well, just like mgmt_enable_le(), as well. But regardless seems > like Gecko shouldn't crash due as well. LMK Yes. This is a gecko bug for sure. I'm not saying we won't fix this, I just tried to explain the problem. Indeed, we made a mistake.
(In reply to Kunny Liu from comment #37) > https://www.codeaurora.org/cgit/quic/la/platform/external/bluetooth/bluez/ > tree/plugins/hciops.c?h=b2g/jb_mr1_rb2.17#n3506 > > But this file has not been compiled into the system.img. I think change > hciops_enable_le() to return -ENOSYS is useless for our project. Current bluez uses mgmtops interface, not hciops. mgmt shall send hci command to enable/disable LE, instead of return ENOSYS. cp.simul = (dev->features[6] & LMP_LE_BREDR) ? 0x01 : 0x00; if (hci_send_cmd(dev->sk, OGF_HOST_CTL, OCF_WRITE_LE_HOST_SUPPORTED, WRITE_LE_HOST_SUPPORTED_CP_SIZE, &cp) < 0) return -errno; Anyway, I don't want to confuse people here, this bug shall be fixed in gecko to handle unknown PropertyChanged event from bluez. Thanks.
Assignee | ||
Comment 40•10 years ago
|
||
* Handle unknown property sent from bt stack properly. Since Gina won't be able to review for several days, ask for Shawn's help to review.
Assignee: jaliu → echou
Attachment #8385898 -
Flags: review?(shuang)
Attachment #8385898 -
Flags: feedback?(btian)
Assignee | ||
Comment 41•10 years ago
|
||
Hi Kunny, Please try the patch I just attached in comment 40 and see if it can solve the issue. Thanks.
Flags: needinfo?(liukun)
Comment 42•10 years ago
|
||
Comment on attachment 8385898 [details] [diff] [review] patch 1: v1: add error handing for unknown property Review of attachment 8385898 [details] [diff] [review]: ----------------------------------------------------------------- f=me with comment addressed. ::: dom/bluetooth/bluez/linux/BluetoothDBusService.cpp @@ +675,4 @@ > } > > if (i == aPropertyTypeLen) { > + BT_LOGR("unknown property: %s", property); We should ensure 'property' is non-nullptr here or before.
Attachment #8385898 -
Flags: feedback?(btian) → feedback+
Assignee | ||
Comment 43•10 years ago
|
||
(In reply to Ben Tian [:btian] from comment #42) > Comment on attachment 8385898 [details] [diff] [review] > patch 1: v1: add error handing for unknown property > > Review of attachment 8385898 [details] [diff] [review]: > ----------------------------------------------------------------- > > f=me with comment addressed. > > ::: dom/bluetooth/bluez/linux/BluetoothDBusService.cpp > @@ +675,4 @@ > > } > > > > if (i == aPropertyTypeLen) { > > + BT_LOGR("unknown property: %s", property); > > We should ensure 'property' is non-nullptr here or before. property must be non-null since it passes the if-statement: if (dbus_message_iter_get_arg_type(&aIter) != DBUS_TYPE_STRING) { return false; }
Comment 44•10 years ago
|
||
ok, I will merge this patch to our project and test.
Flags: needinfo?(liukun)
Comment on attachment 8385898 [details] [diff] [review] patch 1: v1: add error handing for unknown property Review of attachment 8385898 [details] [diff] [review]: ----------------------------------------------------------------- It looks good to me, but need to wait for confirmation from Kunny.
Attachment #8385898 -
Flags: review?(shuang) → review+
Comment 46•10 years ago
|
||
I have tried about 50 times to pair with "Boom Band" and can't reproduce this issue.
Comment 47•10 years ago
|
||
Thanks for your work!
Assignee | ||
Comment 48•10 years ago
|
||
(In reply to Kunny Liu from comment #46) > I have tried about 50 times to pair with "Boom Band" and can't reproduce > this issue. Great. Thanks. :)
Assignee | ||
Comment 50•10 years ago
|
||
https://hg.mozilla.org/integration/b2g-inbound/rev/ab45561e141f
Updated•10 years ago
|
blocking-b2g: 1.3? → 1.3+
Comment 51•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/ab45561e141f
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → 1.4 S3 (14mar)
Assignee | ||
Comment 52•10 years ago
|
||
* 1.3-specific patch
Comment 53•10 years ago
|
||
Please request approval-mozilla-b2g28 on this patch when this is ready for uplift.
status-b2g-v1.3:
--- → affected
status-b2g-v1.4:
--- → fixed
status-firefox28:
--- → wontfix
status-firefox29:
--- → wontfix
status-firefox30:
--- → fixed
Assignee | ||
Comment 54•10 years ago
|
||
Comment on attachment 8386508 [details] [diff] [review] patch1: for v1.3 NOTE: This flag is now for security issues only. Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings. [Approval Request Comment] Bug caused by (feature/regressing bug #): not a regression User impact if declined: random crash during pair/unpair (low occurrence rate) Testing completed: m-c and manual test (by me and Kunny, see comment 46) Risk to taking this patch (and alternatives if risky): very low since the patch handles an unexpected result. String or UUID changes made by this patch: none
Attachment #8386508 -
Flags: approval-mozilla-b2g28?
Comment 55•10 years ago
|
||
Comment on attachment 8386508 [details] [diff] [review] patch1: for v1.3 Approving the low risk patch and requesting verification once this lands on 1.3
Attachment #8386508 -
Flags: approval-mozilla-b2g28? → approval-mozilla-b2g28+
Updated•10 years ago
|
status-b2g-v1.3T:
--- → fixed
Updated•10 years ago
|
Flags: in-moztrap?
Updated•10 years ago
|
Flags: in-moztrap? → in-moztrap+
You need to log in
before you can comment on or make changes to this bug.
Description
•