Closed Bug 841648 Opened 11 years ago Closed 11 years ago

Bluetooth: SIGSEGV in mozilla::ipc::UnixSocketImpl::SetUpIO

Categories

(Core :: DOM: Device Interfaces, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla22

People

(Reporter: gwagner, Assigned: echou)

References

Details

Attachments

(2 files)

STR: Try to send a file via bluetooth to the device:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 15675.15678]
0x425dc850 in mozilla::ipc::UnixSocketImpl::SetUpIO (this=0x487ba040) at /Volumes/2mac/gaia/3src/ipc/unixsocket/UnixSocket.cpp:117
117	    MOZ_ASSERT(!mIOLoop);
(gdb) bt
#0  0x425dc850 in mozilla::ipc::UnixSocketImpl::SetUpIO (this=0x487ba040) at /Volumes/2mac/gaia/3src/ipc/unixsocket/UnixSocket.cpp:117
#1  0x425dcf6a in mozilla::ipc::StartImplReadingTask::Run (this=0x48740900) at /Volumes/2mac/gaia/3src/ipc/unixsocket/UnixSocket.cpp:413
#2  0x426a8968 in MessageLoop::RunTask (this=0x44d89dec, task=0x48740900) at /Volumes/2mac/gaia/3src/ipc/chromium/src/base/message_loop.cc:333
#3  0x426a89c4 in MessageLoop::DeferOrRunPendingTask (this=0x44d89dec, pending_task=...) at /Volumes/2mac/gaia/3src/ipc/chromium/src/base/message_loop.cc:341
#4  0x426a8d2e in MessageLoop::DoWork (this=0x44d89dec) at /Volumes/2mac/gaia/3src/ipc/chromium/src/base/message_loop.cc:441
#5  0x426e2752 in base::MessagePumpLibevent::Run (this=0x40402460, delegate=0x44d89dec) at /Volumes/2mac/gaia/3src/ipc/chromium/src/base/message_pump_libevent.cc:311
#6  0x426a8524 in MessageLoop::RunInternal (this=0x44d89dec) at /Volumes/2mac/gaia/3src/ipc/chromium/src/base/message_loop.cc:215
#7  0x426a84be in MessageLoop::RunHandler (this=0x44d89dec) at /Volumes/2mac/gaia/3src/ipc/chromium/src/base/message_loop.cc:208
#8  0x426a8466 in MessageLoop::Run (this=0x44d89dec) at /Volumes/2mac/gaia/3src/ipc/chromium/src/base/message_loop.cc:182
#9  0x426c1aa2 in base::Thread::ThreadMain (this=0x40407280) at /Volumes/2mac/gaia/3src/ipc/chromium/src/base/thread.cc:156
#10 0x426e2f2a in ThreadFunc (closure=0x40407280) at /Volumes/2mac/gaia/3src/ipc/chromium/src/base/platform_thread_posix.cc:39
#11 0x400fbe18 in __thread_entry (func=0x426e2f11 <ThreadFunc>, arg=0x40407280, tls=<value optimized out>) at bionic/libc/bionic/pthread.c:217
#12 0x400fb96c in pthread_create (thread_out=<value optimized out>, attr=0xbea467e4, start_routine=0x426e2f11 <ThreadFunc>, arg=0x40407280) at bionic/libc/bionic/pthread.c:357
#13 0x00000000 in ?? ()

(gdb) p this
$1 = (class mozilla::ipc::UnixSocketImpl * const) 0x487ba040
(gdb) p *this
$2 = {<base::MessagePumpLibevent::Watcher> = {_vptr.Watcher = 0x44614a80}, mConsumer = {ptr = 0x496fe5f0}, mIOLoop = 0x44d89dec, mOutgoingQ = warning: can't find linker symbol for virtual table for `nsTArray<mozilla::ipc::UnixSocketRawData*>' value
warning:   found `EmptyEnumeratorImpl::kInstance' instead

{<nsTArray_Impl<mozilla::ipc::UnixSocketRawData*, nsTArrayInfallibleAllocator>> = {<nsTArray_base<nsTArrayInfallibleAllocator>> = {
        mHdr = 0x446667e4}, <nsTArray_SafeElementAtHelper<mozilla::ipc::UnixSocketRawData*, nsTArray_Impl<mozilla::ipc::UnixSocketRawData*, nsTArrayInfallibleAllocator> >> = {<No data fields>}, <No data fields>}, <No data fields>}, mIncoming = {mRawPtr = 0x0}, mReadWatcher = {is_persistent_ = true, event_ = 0x0}, mWriteWatcher = {is_persistent_ = false, event_ = 0x0}, mFd = {value = 128, 
    _mCheckNotUsedAsTemporary = {statementDone = true}}, mConnector = {mRawPtr = 0x4a6c6a10}, mCurrentTaskIsCanceled = false, mTask = 0x0, mAddress = warning: can't find linker symbol for virtual table for `nsCString' value
warning:   found `EmptyString()::sEmpty' instead
{<nsACString_internal> = {
      mData = 0x44666b50 "", mLength = 0, mFlags = 1}, <No data fields>}, mAddrSize = 10, mAddr = {sa_family = 31, sa_data = "\307rH\215\270\f\245\245\245\245\245\245"}, 
  mLock = {<mozilla::BlockingResourceBase> = {static kResourceTypeName = 0x446464a0, mChainPrev = 0x0, mDDEntry = 0x4a6c6a20, static sCallOnce = {initialized = 1, inProgress = 1, 
        status = PR_SUCCESS}, static sResourceAcqnChainFrontTPI = 3, static sDeadlockDetector = 0x4040a130}, mLock = 0x46d5d650}}
OS: Mac OS X → Gonk (Firefox OS)
Hardware: x86 → ARM
Blocks: 836523
Blocks: 846586
Assignee: nobody → echou
The root cause is that UnixSocketImpl::SetUpIO() would be called more than once when a connection is accepted. Let's take BluetoothOppManager as an example, the scenario is like:

(1) A listening socket is created for waiting for incoming socket connections of OPP service. UnixSocketConsumer::ListenSocket() is called, then a SocketAcceptTask instance is created and run, and UnixSocketImpl::SetUpIO() is called in the end. It's the first time mIOLoop is set.
(2) A socket connection request sent from a remote device is received, so UnixSocketImpl::OnFileCanReadWithoutBlocking() is called. Since the socket status is SOCKET_LISTENING, it enters the second if-block.
(3) At the end of this if-block, a StartImplReadingTask() instance will be dispatched to IO thread. The only thing done by this task is calling mImpl->SetUpIO(). In that function, since mIOLoop had already been assigned in step (1), MOZ_ASSERT(!mIOLoop) would fail.
Clear mIOLoop to nullptr when mFd of the socket resets
Attachment #719848 - Flags: review?(kyle)
According to bug 776182 comment 15, the objective of this assertion is "prevent from not initializing mIOLoop" but not "don't let mIOLoop be assigned more than once". If so, then I think we can just use MessageLoopForIO::current() instead of maintaining mIOLoop.
Attachment #719850 - Flags: review?(kyle)
(In reply to Eric Chou [:ericchou] [:echou] from comment #1)
> The root cause is that UnixSocketImpl::SetUpIO() would be called more than
> once when a connection is accepted.

But this is the way that the code is supposed to work.

> (1) A listening socket is created for waiting for incoming socket
> connections of OPP service. UnixSocketConsumer::ListenSocket() is called,
> then a SocketAcceptTask instance is created and run, and
> UnixSocketImpl::SetUpIO() is called in the end. It's the first time mIOLoop
> is set.

Here we tell the IO thread to watch the listening socket. Otherwise...

> (2) A socket connection request sent from a remote device is received, so
> UnixSocketImpl::OnFileCanReadWithoutBlocking() is called. Since the socket
> status is SOCKET_LISTENING, it enters the second if-block.

... we woudn't receive the connection request here.

> (3) At the end of this if-block, a StartImplReadingTask() instance will be
> dispatched to IO thread. The only thing done by this task is calling
> mImpl->SetUpIO(). In that function, since mIOLoop had already been assigned
> in step (1), MOZ_ASSERT(!mIOLoop) would fail.

We replaced the file descriptor within the second if-block. Running SetUpIO again tells the IO thread to now watch the connection's file descriptor. I think the patch you just provided does the right thing by simply clearing mIOLoop.
(In reply to Thomas Zimmermann [:tzimmermann] from comment #4)
> (In reply to Eric Chou [:ericchou] [:echou] from comment #1)
> > The root cause is that UnixSocketImpl::SetUpIO() would be called more than
> > once when a connection is accepted.
> 
> But this is the way that the code is supposed to work.
> 

Yes, I know. So we have to deal with that.

> > (1) A listening socket is created for waiting for incoming socket
> > connections of OPP service. UnixSocketConsumer::ListenSocket() is called,
> > then a SocketAcceptTask instance is created and run, and
> > UnixSocketImpl::SetUpIO() is called in the end. It's the first time mIOLoop
> > is set.
> 
> Here we tell the IO thread to watch the listening socket. Otherwise...
> 

Otherwise the 'accept()' won't even be called because now it's in UnixSocketImpl::OnFileCanReadWithoutBlocking().

> > (2) A socket connection request sent from a remote device is received, so
> > UnixSocketImpl::OnFileCanReadWithoutBlocking() is called. Since the socket
> > status is SOCKET_LISTENING, it enters the second if-block.
> 
> ... we woudn't receive the connection request here.
> 

Sorry for the unclear sentence. I meant that the watcher got libevent and called OnFileCanReadWithoutBlocking().

> > (3) At the end of this if-block, a StartImplReadingTask() instance will be
> > dispatched to IO thread. The only thing done by this task is calling
> > mImpl->SetUpIO(). In that function, since mIOLoop had already been assigned
> > in step (1), MOZ_ASSERT(!mIOLoop) would fail.
> 
> We replaced the file descriptor within the second if-block. Running SetUpIO
> again tells the IO thread to now watch the connection's file descriptor. 

Yeah, I know how it's working and why. I was just trying to record the entire flow for those people who are interested in this bug.

> I think the patch you just provided does the right thing by simply clearing
> mIOLoop.

That's great! So it's like you f+ for solution 1. :)
Ah, Ok, I see. Your comment about SetUpIO being called twice lend me to think you wanted to remove one of the calls. Sorry if my reply sounded a bit offensive.
(In reply to Thomas Zimmermann [:tzimmermann] from comment #6)
> Ah, Ok, I see. Your comment about SetUpIO being called twice lend me to
> think you wanted to remove one of the calls. Sorry if my reply sounded a bit
> offensive.

No problem! Thanks for making it more clear, Thomas. :)
Kyle, I came up with two solutions to this issue. I don't know which one is better, so please help with choosing one. Feel free to let me know if you think both of these won't work. Thanks.
I, uh, think you uploaded the same patch twice. That said, I think solution 1 is fine also. :)

BTW, note that we're probably going to need to start porting the patch from bug 845148 to m-c, which may change some of this stuff also. However I /think/ we can make this change for the moment with no real issues.
(In reply to Kyle Machulis [:kmachulis] [:qdot] from comment #9)
> I, uh, think you uploaded the same patch twice. That said, I think solution
> 1 is fine also. :)
> 

Oops :P

> BTW, note that we're probably going to need to start porting the patch from
> bug 845148 to m-c, which may change some of this stuff also. However I
> /think/ we can make this change for the moment with no real issues.

Yes, thanks for reminding. For m-c, Ben has created a new bug (bug 846615). I think it'll be fine uploading this patch before dealing with that bug. Since the patch can really fix the problem, this issue can be closed after it lands, then people can focus on bug 846615.
Attachment #719848 - Attachment description: solution 1: clear variable mIOLoop when the socket fd resets → solution 1: clear variable mIOLoop when the socket fd resets, f=tzimmerman, r=qdot
Attachment #719848 - Flags: review?(kyle)
https://hg.mozilla.org/mozilla-central/rev/6dcde763a186
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
Comment on attachment 719850 [details] [diff] [review]
solution 2: remove mIOLoop

Review of attachment 719850 [details] [diff] [review]:
-----------------------------------------------------------------

Clearing this review since this landed?
Attachment #719850 - Flags: review?(kyle)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: