[Bluetooth] crash in libxul.so!EventFilter

RESOLVED DUPLICATE of bug 976883

Status

Firefox OS
General
P1
critical
6 years ago
4 years ago

People

(Reporter: m1, Assigned: ericchou)

Tracking

({crash})

unspecified
B2G C4 (2jan on)
ARM
Gonk (Firefox OS)
crash

Firefox Tracking Flags

(blocking-b2g:tef+, blocking-basecamp:-)

Details

(Whiteboard: [b2g-crash] [cr 439030][triage:1/16], crash signature)

Attachments

(9 attachments)

Created attachment 700240 [details]
decoded minidump of crash

Seen during BT testing, 100% reproducible.

STR:
1. Turn on Bluetooth on the phone and start an inquiry.
2. Make sure you have at least one BT Low Energy device in the vicinity.
3. Once you find the LE device, try to connect to it.

---

Top frames:
Crash reason:  SIGSEGV
Crash address: 0x4c

Thread 39 (crashed)
 0  libxul.so!EventFilter [nsTSubstring.h : 292 + 0x8]
     r4 = 0x434e7550    r5 = 0x00000000    r6 = 0x4189b394    r7 = 0x4163ad77
     r8 = 0x00000001    r9 = 0x418b1dbc   r10 = 0x49f93538    fp = 0x47affe34
     sp = 0x47aff820    lr = 0x40dffdeb    pc = 0x40e0c694
    Found by: given as instruction pointer in context
 1  libdbus.so!dbus_connection_dispatch [dbus-connection.c : 4679 + 0x7]
     r4 = 0x47ffe480    r5 = 0x434e7550    r6 = 0x00000000    r7 = 0x00000001
     r8 = 0x00000021    r9 = 0x41673509   r10 = 0x49f93538    fp = 0x47affe34
     sp = 0x47affdc0    pc = 0x41f2e8fd
    Found by: call frame info
 2  libxul.so!mozilla::ipc::DBusThread::EventLoop [DBusThread.cpp : 392 + 0x5]
     r4 = 0x486e1c00    r5 = 0x487a7b24    r6 = 0x486e1c08    r7 = 0x487043a8
     r8 = 0x4163b2c7    r9 = 0x41673509   r10 = 0x4167353c    fp = 0x47affe34
     sp = 0x47affe08    pc = 0x411503c1
    Found by: call frame info
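A crash address as small as 0x4c is often read as a load through a null base pointer plus a member offset (r5 is 0 in frame 0). The following is a minimal C++ sketch of that arithmetic only, assuming a purely hypothetical struct layout; `Example`, `earlier_members`, `field`, and `FaultAddressFor` are illustrative names, not Gecko types, and nothing in this report confirms the actual layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout: 0x4c bytes of other members, then the field being read.
struct Example {
    unsigned char earlier_members[0x4c];
    std::uint32_t field;
};

// Address that a load of `field` would touch for a given base address.
// With a null base (0), the result is just the member offset.
std::uintptr_t FaultAddressFor(std::uintptr_t base) {
    return base + offsetof(Example, field);
}
```

Under this assumption, a null object pointer would fault at exactly the member's offset, which is why low fault addresses usually point at a missing null check rather than at memory corruption.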
I'd take a look if you have an LE device for me.
blocking-basecamp: --- → ?
(Reporter)

Comment 2

6 years ago
Can you just debug based on the call stack?  Otherwise you may need to fly to Asia
:D Ok, I'll try with the stack for now.
Can you provide a logcat?
Assignee: nobody → tzimmermann
Thomas,
I have BLE HRM. I can try to reproduce it tomorrow morning.

Updated

6 years ago
Severity: normal → critical
Crash Signature: [@ EventFilter ]
Keywords: crash
Whiteboard: [b2g-crash]
blocking-b2g: tef? → tef+
blocking-basecamp: ? → -
(Reporter)

Updated

6 years ago
Flags: needinfo?(mvines)
Whiteboard: [b2g-crash] → [b2g-crash] [cr 439030]
Created attachment 700297 [details] [diff] [review]
Don't acces empty property array

The crash happens in EqualsLiteral. The only call to this function in EventFilter is near line 1440.

Since I cannot reproduce the error, my guess is that the array of properties that we're using for the call is empty. Could you apply the patch and test whether it prevents the crash? You won't be able to pair with the device, though.

The patch is in git format against a recent b2g/gecko.
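The guess above (an empty property array being indexed before the string comparison) can be sketched as follows. This is a minimal illustration under stated assumptions, not the content of attachment 700297: `NamedValue` and `FirstPropertyIs` are hypothetical names, and standard C++ containers stand in for the Gecko array and string types.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one parsed D-Bus property; not the Gecko type.
struct NamedValue {
    std::string name;
    std::string value;
};

// Returns true only when a first property exists and carries the expected name.
// The emptiness check runs first, so an empty array is never indexed.
bool FirstPropertyIs(const std::vector<NamedValue>& properties,
                     const std::string& expectedName) {
    if (properties.empty()) {
        return false;  // nothing to inspect; the caller skips further handling
    }
    return properties[0].name == expectedName;
}
```

A guard of this kind trades functionality for safety: the signal is ignored when no properties arrive, which matches the note that pairing will not work with the patch applied.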
Attachment #700297 - Flags: feedback?(mvines)
Do we actually support Bluetooth 4.0 (aka LE)?

I did some searching on the state of Linux support and found that LE has only been supported since 2.6.39 and was then disabled for several subsequent releases. Since the phone runs Linux 3.0 (the release after 2.6.39), it wouldn't surprise me if the protocol is not actually enabled.
(Reporter)

Comment 8

6 years ago
Comment on attachment 700297 [details] [diff] [review]
Don't acces empty property array

I have the logcat if you still want it (long story) but looks like maybe not needed.  lmk
Attachment #700297 - Flags: feedback?(mvines) → feedback+
Flags: needinfo?(mvines)
(In reply to Michael Vines [:m1] from comment #8)
> Comment on attachment 700297 [details] [diff] [review]
> Don't acces empty property array
> 
> I have the logcat if you still want it (long story) but looks like maybe not
> needed.  lmk

Please attach it. Thanks!
Hmm, nothing suspicious.

Could you run 'hcidump' from the adb shell while trying to pair, and attach its output? Thanks.
Flags: needinfo?(mvines)
(Reporter)

Comment 12

6 years ago
I don't have the device here, and test folks are offline right now.   But are you asking to run |hcidump| with attachment 700297 [details] [diff] [review] or without?
Flags: needinfo?(mvines)
(In reply to Michael Vines [:m1] from comment #12)
> I don't have the device here, and test folks are offline right now.   But
> are you asking to run |hcidump| with attachment 700297 [details] [diff] [review]
> [review] or without?

Ok, thanks so far. Please run hcidump without the patch. I want to see what HCI messages crash b2g.
(Reporter)

Comment 14

6 years ago
Sure, I'll ask our test folks to run this.  I get on a plane pretty early tomorrow so might have some lag in turnaround time, maybe not.   ni=me
Flags: needinfo?(mvines)
@tzimmermann,
We took the entire BlueZ stack from codeaurora. It actually enables LE.
I've tested with my Polar LE Heart Rate Monitor using an Unagi.
Unfortunately, I cannot reproduce the crash.
I can see that the name is retrieved from the Scan Response LE Advertising Report. After that, the LE connection succeeds, but it fails at authentication. I'm not sure LE SMP is ready, since the Polar LE HRM has security enabled.
Created attachment 700436 [details]
Hcidump Ascii format-Unagi with Polar H7 HRM

Attached hcidump sample with LE device. No crash.
Oh, strange; but thanks for trying anyway.

I skimmed over the hcidump you provided. Do you have a chance of getting the authentication to succeed?
@tzimmermann,
I've managed to try a "Wahoo LE HRM" (which has security disabled) and a "Polar H7 HRM". I cannot hit the crash. It seems we have to wait for their logs to figure it out.
1. "Polar H7 HRM": It seems BlueZ does not initiate the LE SMP pairing procedure, and I don't think LE SMP in the partner's kernel can work.
2. "Wahoo LE HRM": It does not enable authentication. From what I can see, the LE connection can still be established successfully. After that, the LE connection drops because there is no further transaction from the GATT layer.
Regarding Michael's step "3. Once you find LE device try to connect to it.": I don't think we can establish an LE GATT connection to an LE device from the current Settings app on the device; there is no such option. Sorry, I don't have any clue what kind of LE device causes the crash.

Comment 19

6 years ago
Created attachment 701140 [details]
hcidump when BLE is in field
Flags: needinfo?(mvines)
Hi Michael,
I saw that ggrisco attached an hcidump. It establishes an L2CAP connection. But the Bluetooth Settings app should never connect to an LE device; for now it only connects to HFP devices. May we know how you perform the connection? Probably not from the Settings application?
Flags: needinfo?(mvines)
(Reporter)

Comment 21

6 years ago
Feedback from the test folks:
-------
I performed it from settings menu only. 

Can we suggest the following fix for better software operation on BT: if we intend to support only HFP, then consider disabling LE inquiry from settings. This will benefit in two ways:

1. Avoid this crash as LE support is not enabled from frameworks.
2. Make Bluetooth inquiry faster.
-------
Flags: needinfo?(mvines)
Shawn, what do you think of Michael's suggestion in comment #21?
Flags: needinfo?(shuang)
Hey Eric,

as we discussed last week, here is the bug. I can't do much anyway. Do you know how long it will take to fix this problem? I got an email from Faramarz, asking for an ETA.
Assignee: tzimmermann → echou
Regarding comment 21,
LE support cannot simply be disabled by a flag inside BlueZ. It actually checks whether the Bluetooth chipset supports the LE feature or not. That is not something b2g-bluetooth can control.
The reason is that the BlueZ D-Bus API doesn't differentiate between traditional BR/EDR devices and LE devices, so there are essentially three possible address types: BR/EDR, LE public and LE random.
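To make the suggestion from comment 21 concrete, here is a minimal sketch of hiding LE devices from an inquiry list using the three address types named above. It assumes the address type were available to the caller, which this comment says the BlueZ D-Bus API does not provide; `AddressType`, `DiscoveredDevice`, `IsLowEnergy`, and `KeepClassicOnly` are hypothetical names, not BlueZ or Gecko APIs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// The three address types named in the comment above.
enum class AddressType { BrEdr, LePublic, LeRandom };

// Hypothetical record for one device found during inquiry.
struct DiscoveredDevice {
    std::string address;
    AddressType type;
};

// Both LE address types count as Low Energy; only BR/EDR is classic.
bool IsLowEnergy(AddressType type) {
    return type == AddressType::LePublic || type == AddressType::LeRandom;
}

// Returns only the classic (BR/EDR) devices, preserving discovery order.
std::vector<DiscoveredDevice> KeepClassicOnly(
        const std::vector<DiscoveredDevice>& found) {
    std::vector<DiscoveredDevice> classic;
    for (const DiscoveredDevice& device : found) {
        if (!IsLowEnergy(device.type)) {
            classic.push_back(device);
        }
    }
    return classic;
}
```

Without a reliable type field on each result, a filter like this has nothing to key on, which is the limitation described here.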
Flags: needinfo?(shuang)
(Assignee)

Comment 25

6 years ago
Hi Michael,

Could you please try again with the feedback+ patch applied? We still can't be 100% sure what the root cause is, since we can't reproduce this by pairing with other LE devices. It's worth making a small test to see if Thomas' guess is right. If the symptom persists, we will send you another patch that only adds logging. Then we would have more evidence to figure out what the problem is.

Eric
(Reporter)

Comment 26

6 years ago
(Greg, can you help w/ ^^^ please)
Flags: needinfo?(ggrisco)
Whiteboard: [b2g-crash] [cr 439030] → [b2g-crash] [cr 439030][triage:1/16]

Comment 27

6 years ago
I made a build with the feedback+ patch now and will send to tester who has BLE device (in India).  Hopefully I can report results tomorrow.
Flags: needinfo?(ggrisco)
Hi Greg, any update from testing?

Comment 29

6 years ago
Testing did not get back to me with results yet, sorry for the delay.  I will try to get it ASAP.

Comment 30

6 years ago
Update from testing is that the patch did not solve the issue and crash is still seen.  I'm attaching the hcidump from this test.

Comment 31

6 years ago
Created attachment 703753 [details]
hcidump when BLE is in field (after applying feedback patch)
Created attachment 703796 [details]
Dbus related tool

Here is how to provide more logs. We need both the logcat and dbus-log.txt.
Usage on Ubuntu Linux:
1. Unzip dbus-tool.zip
2. adb push dbus-monitor /system/bin
   adb shell chmod 777 /system/bin/dbus-monitor
3. adb push dbus-uuidgen /system/bin
   adb shell chmod 777 /system/bin/dbus-uuidgen

4. Before running dbus-monitor, you need to run "dbus-uuidgen --ensure" on the Unagi first.
5. adb shell dbus-monitor --system "type='signal'" |tee dbus-log.txt
6. Start to test. Now dbus log will be recorded.
Created attachment 703798 [details]
bluez lib

Added extra logging because I suspect the crash is related to primary service discovery.
adb push libbluetoothd.so /system/lib
(Assignee)

Comment 34

6 years ago
Just checked the adb logcat attached in comment 10. The crash should happen at:

01-05 16:36:37.379   457   458 I Gecko   : [Child 457] WARNING: pipe error (3): Connection reset by peer: file /local/mnt/workspace/lnxbuild/project/release_dev_msm7627a_597578/checkout/gecko/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 431
01-05 16:36:37.379   457   457 I Gecko   : 
01-05 16:36:37.379   457   457 I Gecko   : ###!!! [Child][SyncChannel] Error: Channel error: cannot send/recv
01-05 16:36:37.379   457   457 I Gecko   : 

I'll try to figure out what the root cause is next week. Before that, please take the tool and the modified libbluetoothd.so provided by Shawn, then upload the log after test is finished, and we'll check if we could find any clues ASAP.

Greg, please feel free to ask if you have any questions. Thanks for your help.

Eric
Flags: needinfo?(mvines)
Please help get the logs: dbus log (Comment 32) and logcat (Comment 33).
Shawn
(Reporter)

Comment 36

6 years ago
(passing ni? along to Greg)
Flags: needinfo?(mvines) → needinfo?(ggrisco)

Comment 37

6 years ago
Created attachment 705569 [details]
dbus log (response to comment 32)
Flags: needinfo?(ggrisco)

Comment 38

6 years ago
I was able to create a patch that disables LE in firmware settings.  With this patch, LE device does not show up in list and therefore crash can no longer be seen.  I attached the dbus logs as requested in case you still want to take a look at the crash, but I will probably close this bug now, thanks.
(Reporter)

Comment 39

6 years ago
With an out-of-gecko workaround in place we can let this gecko crash stand until LE devices become a requirement.
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → WORKSFORME
LE?  I tried connecting the hamachi to leo and I got this crash:

https://crash-stats.mozilla.com/report/index/76c85aa2-074c-450b-bec0-6fe672130503

Apparently there was another bluetooth fix that landed today so I'll try again tomorrow.
Can you describe how to reproduce from your side?
Flags: needinfo?(nhirata.bugzilla)
I no longer crash in today's builds.

I think fixes from 
https://bugzilla.mozilla.org/show_bug.cgi?id=848414
may have stopped the reproducibility of the crash.

Basically if you take the commercial ril builds from both hamachi and leo for 5/2/2013 and then try to connect the hamachi to the leo, you should crash with this bug.
Flags: needinfo?(nhirata.bugzilla)
Leaving this as WFM until I find another way to reproduce the crash.
We have found the root cause. This bug was fixed in Bug 976883. They are the same bug.
Resolution: WORKSFORME → DUPLICATE
Duplicate of bug: 976883