Closed
Bug 838212
Opened 12 years ago
Closed 12 years ago
Crash in bluetooth ipc code while making calls and BT connect/disconnect
Categories
(Firefox OS Graveyard :: General, defect)
Tracking
(blocking-b2g:-, b2g18 affected, b2g18-v1.0.1 affected)
RESOLVED
WORKSFORME
blocking-b2g | - |
People
(Reporter: ggrisco, Assigned: echou)
Details
(Keywords: crash, Whiteboard: [b2g-crash][cr 448014])
1. Connect to bluetooth headset (Motorola S9 HD)
2. Make some calls and verify if audio is present on the headset
While doing some random testing by connecting/disconnecting headset and making calls, this crash was observed.
minidump wasn't very useful, but EXTRA file contained this:
Notes=EGL? EGL+ GL Context? GL Context+ GL Layers? GL Layers+ xpcom_runtime_abort([Parent 125] ###!!! ABORT: unknown union type: file /local/mnt/workspace/ggrisco/ics.b2g.blue/out/target/product/msm7627a/obj/objdir-gecko/ipc/ipdl/PBluetoothRequestParent.cpp, line 449)
Found the abort in PBluetoothRequestParent::write(). The reply type didn't match either "success" or "error" thus causing the abort.
Reporter | ||
Updated•12 years ago
|
blocking-b2g: --- → tef?
Updated•12 years ago
|
Assignee: nobody → echou
blocking-b2g: tef? → tef+
Updated•12 years ago
|
Assignee | ||
Comment 1•12 years ago
|
||
Based on the log provided by Greg, I traced our code and figured out that the problem could be:
The crash happened when a BluetoothRequestParent::ReplyRunnable object began to "Run()". When processed to the line |if (!mRequest->Send__delete__(mRequest, *mReply))| in the function Run(), it crashed because mReply was invalid, which may have been freed or not even initialized (by calling SetReply()).
However, I've checked all BluetoothReplyRunnable objects used in our code but couldn't find anything suspected. Unfortunately, I tried to reproduce this but failed, instead I kept hitting another crash described in bug 836523.
I'll keep working on this and see if I can find something out. At the mean time, please provide more information as much as possible (STR, frequency).
Reporter | ||
Comment 2•12 years ago
|
||
(In reply to Eric Chou [:ericchou] [:echou] from comment #1)
> I'll keep working on this and see if I can find something out. At the mean
> time, please provide more information as much as possible (STR, frequency).
It was only seen once and so far cannot be reproduced, although I'm not sure how many attempts we've made. So, unfortunately, there isn't a more definite set of STR.
Comment 3•12 years ago
|
||
Hi Eric,
I had a meeting today about closing the remaining tef+ bugs. Is there anything I can do to help with this one?
Updated•12 years ago
|
status-b2g18:
--- → affected
status-b2g18-v1.0.1:
--- → affected
Comment 4•12 years ago
|
||
Thomas, Eric is likely out on holiday for this week. Help digging into this would definitely be appreciated!
Assignee | ||
Comment 5•12 years ago
|
||
(In reply to Thomas Zimmermann [:tzimmermann] from comment #3)
> Hi Eric,
>
> I had a meeting today about closing the remaining tef+ bugs. Is there
> anything I can do to help with this one?
Hi Thomas,
Sure. Please take this bug and see if you could figure something out. I've investigated but couldn't find the potential root cause. You may want to check comment 1 for more information.
I will check mail as possible as I can during holidays, so please feel free to have me review your patches. Of course I think Kyle would love to help, too.
Thanks.
Eric
Updated•12 years ago
|
Assignee: echou → tzimmermann
Status: NEW → ASSIGNED
Updated•12 years ago
|
Status: ASSIGNED → NEW
Comment 6•12 years ago
|
||
So far, I had no luck with reproducing this bug, but at least I found and (hopefully) fixed bug 840544. I'll try again tomorrow.
Comment 7•12 years ago
|
||
I've successfully reproduced the bug using the latest mozilla-b2g18 checkout with the patches for fixing bug 836523 applied. It seems to me that this is not triggered by pure connect/disconnect cycles, I have to turn off and on Bluetooth or unpair the device and reconnect it later for it to happen. If I reboot the phone with the device already paired no amount of connect/disconnect cycles causes it to crash. The logcat output when the crash happens always looks like this (1713 is the b2g process):
F/libc ( 1713): Fatal signal 11 (SIGSEGV) at 0x003a0074 (code=1)
I/DEBUG ( 112): debuggerd committing suicide to free the zombie!
I/DEBUG ( 2023): debuggerd: Feb 11 2013 16:18:24
E/AudioHardwareMSM76XXA( 2021): failed to open AUDIO_NORMAL_FILTER /system/etc/AudioFilter.csv: No such file or directory (2).
E/AudioHardwareMSM76XXA( 2021): failed to open AUTO_VOLUME_CONTROL /system/etc/AutoVolumeControl.txt: No such file or directory (2)
E/AudioHardwareMSM76XXA( 2021): setMode:mode = 0, status = -17
E/AudioHardwareMSM76XXA( 2021): rpc_snd_set_volume(0, 0, 7)
E/AudioHardwareMSM76XXA( 2021): rpc_snd_set_volume(6, 0, 7)
E/AudioHardwareMSM76XXA( 2021): rpc_snd_set_volume(12, 0, 7)
E/AudioHardwareMSM76XXA( 2021): rpc_snd_set_volume(3, 0, 7)
E/AudioHardwareMSM76XXA( 2021): rpc_snd_set_volume(16, 0, 7)
E/AudioHardwareMSM76XXA( 2021): rpc_snd_set_volume(25, 0, 7)
E/AudioHardwareMSM76XXA( 2021): rpc_snd_set_volume(8, 0, 1)
E/AudioHardwareMSM76XXA( 2021): rpc_snd_set_volume(9, 0, 1)
E/QualcommCamera( 2021): Qint android::get_number_of_cameras(): E
I/QualcommCameraHardware( 2021): getCameraInfo: IN
I/QualcommCameraHardware( 2021): getCameraInfo: loading libqcamera at 0xb000ee28
V/QualcommCameraHardware( 2021): Storing the current target type as 3
I/QualcommCameraHardware( 2021): getCameraInfo: numOfCameras = 1
I/QualcommCameraHardware( 2021): Camera sensor 0 info:
I/QualcommCameraHardware( 2021): camera_id: 0
I/QualcommCameraHardware( 2021): modes_supported: 5
I/QualcommCameraHardware( 2021): position: 0
I/QualcommCameraHardware( 2021): sensor_mount_angle: 90
V/QualcommCameraHardware( 2021): getCameraInfo: dlclose(libqcamera)
I/QualcommCameraHardware( 2021): getCameraInfo: OUT
W/AudioFlinger( 2021): Thread AudioOut_1 cannot connect to the power manager service
E/AudioHardwareMSM76XXA( 2021): Routing audio to Speakerphone
E/AudioHardwareMSM76XXA( 2021): doAudioRouteOrMute, mFM_route = -1 , device=6
E/AudioHardwareMSM76XXA( 2021): rpc_snd_set_device(6, 1, 0)
E/AudioHardwareMSM76XXA( 2021): rpc_snd_set_volume(38, 0, 7)
I've captured a back-trace with GDB when it happens:
#0 AppendElements<mozilla::ipc::UnixSocketRawData*> (this=0x48ee0830) at ../../dist/include/nsTArray.h:877
#1 AppendElement<mozilla::ipc::UnixSocketRawData*> (this=0x48ee0830) at ../../dist/include/nsTArray.h:894
#2 mozilla::ipc::UnixSocketImpl::QueueWriteData (this=0x48ee0830) at /home/gsvelto/projects/mozilla-b2g18/ipc/unixsocket/UnixSocket.cpp:68
#3 mozilla::ipc::SocketSendTask::Run (this=0x48ee0830) at /home/gsvelto/projects/mozilla-b2g18/ipc/unixsocket/UnixSocket.cpp:372
#4 0x411d7048 in MessageLoop::RunTask (this=0x4217edf0, task=0x48ee0830) at /home/gsvelto/projects/mozilla-b2g18/ipc/chromium/src/base/message_loop.cc:333
#5 0x411d7e7a in MessageLoop::DeferOrRunPendingTask (this=0x48ee0830, pending_task=<value optimized out>)
at /home/gsvelto/projects/mozilla-b2g18/ipc/chromium/src/base/message_loop.cc:341
#6 0x411d8a58 in MessageLoop::DoWork (this=0x4217edf0) at /home/gsvelto/projects/mozilla-b2g18/ipc/chromium/src/base/message_loop.cc:441
#7 0x411e8eee in base::MessagePumpLibevent::Run (this=0x40402430, delegate=0x4217edf0) at /home/gsvelto/projects/mozilla-b2g18/ipc/chromium/src/base/message_pump_libevent.cc:310
#8 0x411d6ff8 in MessageLoop::RunInternal (this=0x48ee0830) at /home/gsvelto/projects/mozilla-b2g18/ipc/chromium/src/base/message_loop.cc:215
#9 0x411d70ae in MessageLoop::RunHandler (this=0x4217edf0) at /home/gsvelto/projects/mozilla-b2g18/ipc/chromium/src/base/message_loop.cc:208
#10 MessageLoop::Run (this=0x4217edf0) at /home/gsvelto/projects/mozilla-b2g18/ipc/chromium/src/base/message_loop.cc:182
#11 0x411df3bc in base::Thread::ThreadMain (this=0x40407280) at /home/gsvelto/projects/mozilla-b2g18/ipc/chromium/src/base/thread.cc:156
#12 0x411e9390 in ThreadFunc (closure=0x1) at /home/gsvelto/projects/mozilla-b2g18/ipc/chromium/src/base/platform_thread_posix.cc:39
#13 0x4011ce18 in __thread_entry (func=0x411e9389 <ThreadFunc>, arg=0x40407280, tls=<value optimized out>) at bionic/libc/bionic/pthread.c:217
#14 0x4011c96c in pthread_create (thread_out=<value optimized out>, attr=0xbeaad8a8, start_routine=0x411e9389 <ThreadFunc>, arg=0x40407280) at bionic/libc/bionic/pthread.c:357
#15 0x00000000 in ?? ()
What causes the segfault is |this| being null in the invocation of nsTArray::AppendElement, crawling back the stack this leads to the mOutgoingQ field in UnixSocketImpl. I'm tying to reproduce it in a debug build to see if I can extract a little more data out of it.
Comment 8•12 years ago
|
||
This looks more like bug 840925 to me. Something in the Bluetooth stack tries to write to a socket which as actually been closed already. But thank you for trying to reproduce this, anyway.
Comment 9•12 years ago
|
||
(In reply to Thomas Zimmermann [:tzimmermann] from comment #8)
> This looks more like bug 840925 to me. Something in the Bluetooth stack
> tries to write to a socket which as actually been closed already. But thank
> you for trying to reproduce this, anyway.
Yes indeed, it looks like this is a potential duplicate. I've narrowed it down a bit and I can make it happen a little bit more frequently when doing the following:
- pair your headset
- connect, disconnect then connect again
- unpair while connected -> the crash happens
There's no need to be making calls or anything else. connecting/disconnecting alone doesn't seem to cause the issue either. I've tried tens of connect/disconnect cycles during a call w/o being able to crash the phone. BTW the fact that the memory had been overwritten is consistent with what I saw in the stack trace which was a funnily messed up UnixSocketImpl instance.
I would say we should mark this as a duplicate of bug 840925 or vice-versa.
Comment 10•12 years ago
|
||
Now that I think about it a little more marking it as a duplicate is not right because the original report tripped over a different problem. I'll try to cause a crash by further connect/disconnect cycles and if I can't then it might be worth considering closing it as WORKSFORME if nobody else is capable of reproducing it.
Comment 11•12 years ago
|
||
Yeah, I just wanted to say that. This problem here only happened once, and the bug you triggered is 840943. I saw you added an STR to 840943. Thanks! I think we should try to un-tef+ this bug here, and rather try to fix some of the others.
Comment 12•12 years ago
|
||
I agree, for what it's worth I've tried another call in which I run 50 quick connect/disconnect cycles and never hit the problem. This was on an Otoro with a Jabra EASYGO headset, <wild speculation> since the hardware is different than the original reporter it might just be that this could be a hardware-specific issue </wild speculation>.
Comment 13•12 years ago
|
||
This bug only ever happened once and can't be reproduced. Reconsidering blocker status.
blocking-b2g: tef+ → tef?
Comment 15•12 years ago
|
||
He Eric!
Here is the ticket. I had no luck reproducing this specific issue. I think we should just close it.
Assignee: tzimmermann → echou
Assignee | ||
Comment 16•12 years ago
|
||
Please reopen if it happens again.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WORKSFORME
You need to log in
before you can comment on or make changes to this bug.
Description
•