Closed Bug 756036 Opened 10 years ago Closed 10 years ago

Race condition in Ril.cpp


(Firefox OS Graveyard :: General, defect)

Not set


(Not tracked)



(Reporter: cyu, Assigned: cyu)



(2 files)

I encountered a crash gdb. Here is the debugging excerpt:

======================= webapi+apps.js =======================
[New Thread 4743.4750]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 4743.4750]
0x40c76162 in mozilla::ipc::RilClient::OnFileCanWriteWithoutBlocking (this=0x2da0d0, fd=21) at /home/cervantes/git/mozilla-b2g/B2G/gecko/ipc/ril/Ril.cpp:333
333             while (mCurrentWriteOffset < mCurrentRilRawData->mSize) {
(gdb) bt
#0  0x40c76162 in mozilla::ipc::RilClient::OnFileCanWriteWithoutBlocking (this=0x2da0d0, fd=21) at /home/cervantes/git/mozilla-b2g/B2G/gecko/ipc/ril/Ril.cpp:333
#1  0x40c75f36 in mozilla::ipc::RilWriteTask::Run (this=<value optimized out>) at /home/cervantes/git/mozilla-b2g/B2G/gecko/ipc/ril/Ril.cpp:177
#2  0x40cbf706 in MessageLoop::RunTask (this=0x100ffdf4, task=0x0) at /home/cervantes/git/mozilla-b2g/B2G/gecko/ipc/chromium/src/base/
#3  0x40cc0700 in MessageLoop::DeferOrRunPendingTask (this=0x0, pending_task=<value optimized out>) at /home/cervantes/git/mozilla-b2g/B2G/gecko/ipc/chromium/src/base/
#4  0x40cc138e in MessageLoop::DoWork (this=0x100ffdf4) at /home/cervantes/git/mozilla-b2g/B2G/gecko/ipc/chromium/src/base/
#5  0x40cd3048 in base::MessagePumpLibevent::Run (this=0xf130, delegate=0x100ffdf4) at /home/cervantes/git/mozilla-b2g/B2G/gecko/ipc/chromium/src/base/
#6  0x40cbf6a2 in MessageLoop::RunInternal (this=0x0) at /home/cervantes/git/mozilla-b2g/B2G/gecko/ipc/chromium/src/base/
#7  0x40cbf782 in MessageLoop::RunHandler (this=0x100ffdf4) at /home/cervantes/git/mozilla-b2g/B2G/gecko/ipc/chromium/src/base/
#8  MessageLoop::Run (this=0x100ffdf4) at /home/cervantes/git/mozilla-b2g/B2G/gecko/ipc/chromium/src/base/
#9  0x40cc86ce in base::Thread::ThreadMain (this=0xf0f8) at /home/cervantes/git/mozilla-b2g/B2G/gecko/ipc/chromium/src/base/
#10 0x40cd3542 in ThreadFunc (closure=0x0) at /home/cervantes/git/mozilla-b2g/B2G/gecko/ipc/chromium/src/base/
#11 0x40087c28 in __thread_entry (func=0x40cd3539 <ThreadFunc>, arg=0xf0f8, tls=<value optimized out>) at bionic/libc/bionic/pthread.c:217
#12 0x4008777c in pthread_create (thread_out=<value optimized out>, attr=0xbe836910, start_routine=0x40cd3539 <ThreadFunc>, arg=0xf0f8) at bionic/libc/bionic/pthread.c:357
#13 0x00000000 in ?? ()
(gdb) p mCurrentRilRawData
$6 = {mRawPtr = 0x0}

Since mCurrentRilRawData is assigned from popping mOutgoingQ. This could be that the queue contains a NULL pointer.

The mOutgoingQ is popped on IO thread, but is pushed from ril worker.
In SenRilRawData() mOutgoingQ mutex is locked before pushing, but in RilClient::OnFileCanWriteWithoutBlocking() it doesn't lock the mutex when reading and popping the queue. This is equivalent to locking at all. Adding MutexAutoLock lock(mMutex); before reading and popping the queue should fix the crash.
Assignee: nobody → cyu
Attachment #625017 - Flags: review?(kyle)
Attachment #625017 - Flags: review?(kyle) → review+
Looks good. Do you need me to land it, or do you have rights to?
Please land it. Thanks.
Keywords: checkin-needed
I assumed that the test code was not intended to land. Let me know if I was wrong.
Closed: 10 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.