B2G Bluetooth: unexpected crash in DBusThread::StopEventLoop

RESOLVED FIXED in mozilla15

Status


Core
DOM: Device Interfaces
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: vicamo, Assigned: qdot)

Tracking

Trunk
mozilla15
ARM
Gonk (Firefox OS)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Reporter)

Description

5 years ago
(gdb) c
Continuing.
-*- RadioInterfaceLayer: Starting RIL Worker
[New Thread 2620.2622]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 2620.2622]
mozilla::ipc::DBusThread::StopEventLoop (this=0x0) at /home/vicamo/WorkSpace/mozilla/b2g/sgs2-ics/gecko/ipc/dbus/DBusThread.cpp:506
506	  MutexAutoLock lock(mMutex);
(gdb) bt
#0  mozilla::ipc::DBusThread::StopEventLoop (this=0x0) at /home/vicamo/WorkSpace/mozilla/b2g/sgs2-ics/gecko/ipc/dbus/DBusThread.cpp:506
#1  0x40c99fa6 in DisconnectDBus (aMonitor=0xbebdd630, aSuccess=0xbebdd63f)
    at /home/vicamo/WorkSpace/mozilla/b2g/sgs2-ics/gecko/ipc/dbus/DBusThread.cpp:554
#2  0x40c923e4 in DispatchToFunction<void (*)(mozilla::hal::SwitchDevice, mozilla::Monitor*), mozilla::hal::SwitchDevice, mozilla::Monitor*> (
    this=0x4690e8) at /home/vicamo/WorkSpace/mozilla/b2g/sgs2-ics/gecko/ipc/chromium/src/base/tuple.h:454
#3  RunnableFunction<void (*)(mozilla::hal::SwitchDevice, mozilla::Monitor*), Tuple2<mozilla::hal::SwitchDevice, mozilla::Monitor*> >::Run (
    this=0x4690e8) at /home/vicamo/WorkSpace/mozilla/b2g/sgs2-ics/gecko/ipc/chromium/src/base/task.h:415
#4  0x40ce2706 in MessageLoop::RunTask (this=0x100ffdf4, task=0xbebdd63f)
    at /home/vicamo/WorkSpace/mozilla/b2g/sgs2-ics/gecko/ipc/chromium/src/base/message_loop.cc:318
#5  0x40ce3700 in MessageLoop::DeferOrRunPendingTask (this=0x0, pending_task=<value optimized out>)
    at /home/vicamo/WorkSpace/mozilla/b2g/sgs2-ics/gecko/ipc/chromium/src/base/message_loop.cc:326
#6  0x40ce438e in MessageLoop::DoWork (this=0x100ffdf4)
    at /home/vicamo/WorkSpace/mozilla/b2g/sgs2-ics/gecko/ipc/chromium/src/base/message_loop.cc:426
#7  0x40cf6048 in base::MessagePumpLibevent::Run (this=0xf130, delegate=0x100ffdf4)
    at /home/vicamo/WorkSpace/mozilla/b2g/sgs2-ics/gecko/ipc/chromium/src/base/message_pump_libevent.cc:310
#8  0x40ce26a2 in MessageLoop::RunInternal (this=0xbebdd63f)
    at /home/vicamo/WorkSpace/mozilla/b2g/sgs2-ics/gecko/ipc/chromium/src/base/message_loop.cc:208
#9  0x40ce2782 in MessageLoop::RunHandler (this=0x100ffdf4)
    at /home/vicamo/WorkSpace/mozilla/b2g/sgs2-ics/gecko/ipc/chromium/src/base/message_loop.cc:201
#10 MessageLoop::Run (this=0x100ffdf4) at /home/vicamo/WorkSpace/mozilla/b2g/sgs2-ics/gecko/ipc/chromium/src/base/message_loop.cc:175
#11 0x40ceb6ce in base::Thread::ThreadMain (this=0xf0f8) at /home/vicamo/WorkSpace/mozilla/b2g/sgs2-ics/gecko/ipc/chromium/src/base/thread.cc:156
#12 0x40cf6542 in ThreadFunc (closure=0x0) at /home/vicamo/WorkSpace/mozilla/b2g/sgs2-ics/gecko/ipc/chromium/src/base/platform_thread_posix.cc:27
#13 0x40106c28 in __thread_entry (func=0x40cf6539 <ThreadFunc>, arg=0xf0f8, tls=<value optimized out>) at bionic/libc/bionic/pthread.c:217
#14 0x4010677c in pthread_create (thread_out=<value optimized out>, attr=0xbebdd910, start_routine=0x40cf6539 <ThreadFunc>, arg=0xf0f8)
    at bionic/libc/bionic/pthread.c:357
#15 0x00000000 in ?? ()
I got the same crash. The problem seems to be that bluedroid is not initialized properly.

This is the relevant code (SystemWorkerManager.cpp:402):

  if(EnsureBluetoothInit()) {
#endif
    StartDBus();
#ifdef MOZ_WIDGET_GONK
  }

If bluedroid is not initialized, StartDBus() is never called. The crash then happens when Shutdown() runs in:

    if (NS_FAILED(instance->Init())) {
      instance->Shutdown();
      return nsnull;
    }

We might need to add a check like IsBluetoothInitialized() before calling StopDBus() in SystemWorkerManager.
Hi Kyle, could you help fix this?
Assignee: nobody → kyle
(In reply to Cervantes Yu from comment #1)

> We might need to add a check like IsBluetoothInitialized() before calling
> StopDBus() in SystemWorkerManager.

It's actually a little simpler than that. I have MOZ_ASSERTs where I should have checks that always run and validate the compilation-unit static thread pointers, which is as good as an init'd check. I just need to change the MOZ_ASSERTs to if()s.

That said, how'd you guys run into this? Is there a phone we're working on that doesn't have bluedroid in the place we expect it, or doesn't have bluetooth for the radio yet? We might need a more tolerant failure case that just doesn't let bluetooth come up but the rest of the system will continue to run.
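The MOZ_ASSERT-to-if() change described above can be illustrated with a minimal sketch. This is not the actual patch: FakeDBusThread, sThread, StartFakeDBus and StopFakeDBus are hypothetical names standing in for the real Gecko code, and std::mutex stands in for the Mozilla mutex types. The point it tries to show is that a debug-only assertion does nothing in a release build, while an if() check on the static pointer always runs.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the DBus thread object; not the real Gecko class.
class FakeDBusThread {
public:
  void StartEventLoop() {
    std::lock_guard<std::mutex> lock(mMutex);
    mRunning = true;
  }
  void StopEventLoop() {
    // Touches a member, so calling this through a null pointer would crash,
    // much like frame #0 of the backtrace (this=0x0).
    std::lock_guard<std::mutex> lock(mMutex);
    mRunning = false;
  }

private:
  std::mutex mMutex;
  bool mRunning = false;
};

// Compilation-unit static pointer; stays null until StartFakeDBus() runs.
static FakeDBusThread* sThread = nullptr;

bool StartFakeDBus() {
  if (sThread) {
    return true;  // already started
  }
  sThread = new FakeDBusThread();
  sThread->StartEventLoop();
  return true;
}

// Returns false instead of crashing when the thread was never started.
bool StopFakeDBus() {
  // A debug-only assertion here would be compiled out of release builds,
  // leaving the null dereference in place. The if() below always runs.
  if (!sThread) {
    return false;
  }
  sThread->StopEventLoop();
  delete sThread;
  sThread = nullptr;
  return true;
}
```

With this shape, calling StopFakeDBus() before StartFakeDBus() is a harmless no-op that reports failure, and a second stop after a successful one behaves the same way.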
(Reporter)

Comment 4

5 years ago
(In reply to Kyle Machulis [:kmachulis] [:qdot] from comment #3)
> That said, how'd you guys run into this? Is there a phone we're working on
> that doesn't have bluedroid in the place we expect it, or doesn't have
> bluetooth for the radio yet? We might need a more tolerant failure case that
> just doesn't let bluetooth come up but the rest of the system will continue
> to run.

Hi Kyle,

Actually, it crashed on our SGS2 devices for some unknown reason. To be more specific, I can't tell whether this issue appears only inside gdb or whether it's actually a different defect. The logcat/debuggerd output doesn't say anything related, but gdb shows two possible crashes on our devices: one for a corrupted libxpcom, another for the DisconnectDBus segfault. The former can sometimes, but not always, be solved by re-installing gecko, while the latter has no known steps to reproduce or solution yet.
Oh, I see what's going on. It's nothing as bad as library corruption on my side, just a wrong check. I actually already have bluetooth in a "fail but continue" state: InitBluetooth ALWAYS returns NS_OK (which now needs to be another bug entirely for devices without bluetooth...). The RIL is failing on your devices for some reason (the log says "Starting RIL Worker" right before the crash), so it's not bringing up the thread and NS_ENSURE_SUCCESS kicks out early (Yay macro'd returns :( ). This calls Shutdown, but since we haven't even made it to InitBluetooth yet, we get a StopDBus call that isn't handled, because the check is in an ASSERT. Patch forthcoming.
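The failure path described in this comment can be modelled roughly as follows. This is only a sketch under assumptions: FakeManager, InitRIL, RunStartup and the log strings are hypothetical and do not reproduce SystemWorkerManager; the early `return false` plays the role of the macro'd return, and the caller mirrors the "Init() failed, so call Shutdown()" pattern quoted in comment 1.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of the startup sequence; not the real SystemWorkerManager.
struct FakeManager {
  explicit FakeManager(bool rilWorks) : rilOk(rilWorks) {}

  bool rilOk;
  bool dbusStarted = false;
  std::vector<std::string> log;

  bool InitRIL() {
    log.push_back("InitRIL");
    return rilOk;
  }

  bool InitBluetooth() {
    log.push_back("InitBluetooth");
    dbusStarted = true;  // stands in for StartDBus()
    return true;
  }

  bool Init() {
    // Early return, analogous to a macro that returns on failure:
    // when the RIL step fails, InitBluetooth() is never reached.
    if (!InitRIL()) {
      return false;
    }
    return InitBluetooth();
  }

  void Shutdown() {
    // Shutdown can run before bluetooth was ever brought up, so the
    // teardown has to check state at runtime instead of asserting it.
    if (!dbusStarted) {
      log.push_back("StopDBus skipped");
      return;
    }
    dbusStarted = false;
    log.push_back("StopDBus");
  }
};

// Mirrors the factory pattern from comment 1: on failed Init(), call Shutdown().
std::vector<std::string> RunStartup(bool rilOk) {
  FakeManager manager(rilOk);
  if (!manager.Init()) {
    manager.Shutdown();
  }
  return manager.log;
}
```

Calling RunStartup(false) walks the path from the bug: the RIL step fails, bluetooth init is skipped, and Shutdown() still runs, which is why the stop path needs a runtime check.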
Created attachment 625187 [details] [diff] [review]
Fix crash on bluetooth thread destruction when thread not up
Attachment #625187 - Flags: review?(jones.chris.g)
Comment on attachment 625187 [details] [diff] [review]
Fix crash on bluetooth thread destruction when thread not up

r=me to also remove the NS_ENSURE_* bullshit.
Attachment #625187 - Flags: review?(jones.chris.g) → review+
Created attachment 626660 [details] [diff] [review]
v2: Fix crash on bluetooth thread destruction when thread not up (final)

Final version, changed NS_ENSURE_SUCCESS calls.
Attachment #625187 - Attachment is obsolete: true
Version: unspecified → Trunk
https://hg.mozilla.org/integration/mozilla-inbound/rev/0e26544d7731
https://hg.mozilla.org/mozilla-central/rev/0e26544d7731
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla15