Closed
Bug 840294
Opened 11 years ago
Closed 11 years ago
FxOS Desktop debug on m-c crashes in IOThread
Categories
(Firefox OS Graveyard :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: qdot, Assigned: qdot)
References
Details
(Keywords: crash, Whiteboard: [b2g-crash])
Crash Data
Attachments
(1 file)
1.33 KB, patch | tzimmermann: review+
When trying to get stacks for bug 840286, I found that running gdb on FxOS Desktop on current m-c (not b2g-18) crashes on the IOThread issue. (Stack coming soon)
Updated•11 years ago
Assignee
Comment 1•11 years ago
Stack:

###!!! ABORT: file /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/message_pump_libevent.cc, line 155
UNKNOWN [/share/code/mozbuild/mozilla-central/obj-debug/dist/bin/libxul.so +0x0188E6B3]
UNKNOWN [/share/code/mozbuild/mozilla-central/obj-debug/dist/bin/libxul.so +0x0187F399]
UNKNOWN [/share/code/mozbuild/mozilla-central/obj-debug/dist/bin/libxul.so +0x01880AB7]
UNKNOWN [/share/code/mozbuild/mozilla-central/obj-debug/dist/bin/libxul.so +0x01880BE0]
UNKNOWN [/share/code/mozbuild/mozilla-central/obj-debug/dist/bin/libxul.so +0x0188E0BF]
UNKNOWN [/share/code/mozbuild/mozilla-central/obj-debug/dist/bin/libxul.so +0x0187F55C]
UNKNOWN [/share/code/mozbuild/mozilla-central/obj-debug/dist/bin/libxul.so +0x0187F584]
UNKNOWN [/share/code/mozbuild/mozilla-central/obj-debug/dist/bin/libxul.so +0x01884E35]
UNKNOWN [/share/code/mozbuild/mozilla-central/obj-debug/dist/bin/libxul.so +0x0188EA80]
UNKNOWN [/lib/x86_64-linux-gnu/libpthread.so.0 +0x00007E9A]
clone+0x0000006D [/lib/x86_64-linux-gnu/libc.so.6 +0x000F3CBD]
###!!! ABORT: file /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/message_pump_libevent.cc, line 155

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe6740700 (LWP 25213)]
mozalloc_abort (msg=<optimized out>) at /share/code/mozbuild/mozilla-central/memory/mozalloc/mozalloc_abort.cpp:30
30 MOZ_CRASH();
(gdb) bt
#0 mozalloc_abort (msg=<optimized out>) at /share/code/mozbuild/mozilla-central/memory/mozalloc/mozalloc_abort.cpp:30
#1 0x00007ffff3a06098 in Abort (aMsg=0x7fffe673f58c "###!!! ABORT: file /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/message_pump_libevent.cc, line 155") at /share/code/mozbuild/mozilla-central/xpcom/base/nsDebugImpl.cpp:422
#2 NS_DebugBreak_P (aSeverity=<optimized out>, aStr=<optimized out>, aExpr=0x0, aFile=0x7ffff4635ec2 "/share/code/mozbuild/mozilla-central/ipc/chromium/src/base/message_pump_libevent.cc", aLine=155) at /share/code/mozbuild/mozilla-central/xpcom/base/nsDebugImpl.cpp:409
#3 0x00007ffff3a2c1f2 in mozilla::Logger::~Logger (this=0x7fffe673fa10, __in_chrg=<optimized out>) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/logging.cc:47
#4 0x00007ffff3a3b6b3 in ~LogWrapper (this=0x7fffe673fa10, __in_chrg=<optimized out>) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/logging.h:59
#5 base::MessagePumpLibevent::WatchFileDescriptor (this=0x7fffe0001000, fd=-1, persistent=false, mode=base::MessagePumpLibevent::WATCH_WRITE, controller=0x127c458, delegate=0x127c420) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/message_pump_libevent.cc:155
#6 0x00007ffff3a2c399 in MessageLoop::RunTask (this=0x7fffe673fcd8, task=0x1589010) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/message_loop.cc:333
#7 0x00007ffff3a2dab7 in MessageLoop::DeferOrRunPendingTask (this=<optimized out>, pending_task=...) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/message_loop.cc:341
#8 0x00007ffff3a2dbe0 in DoWork (this=<optimized out>) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/message_loop.cc:441
#9 MessageLoop::DoWork (this=0x7fffe673fcd8) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/message_loop.cc:420
#10 0x00007ffff3a3b0bf in base::MessagePumpLibevent::Run (this=0x7fffe0001000, delegate=0x7fffe673fcd8) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/message_pump_libevent.cc:311
#11 0x00007ffff3a2c55c in MessageLoop::RunInternal (this=0x7fffe673fcd8) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/message_loop.cc:215
#12 0x00007ffff3a2c584 in RunHandler (this=0x7fffe673fcd8) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/message_loop.cc:208
#13 MessageLoop::Run (this=0x7fffe673fcd8) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/message_loop.cc:182
#14 0x00007ffff3a31e35 in base::Thread::ThreadMain (this=0x6dea40) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/thread.cc:156
#15 0x00007ffff3a3ba80 in ThreadFunc (closure=<optimized out>) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/platform_thread_posix.cc:39
#16 0x00007ffff73c4e9a in start_thread (arg=0x7fffe6740700) at pthread_create.c:308
#17 0x00007ffff70f1cbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112

Most obvious thing from stack: WatchFileDescriptor's fd=-1. That can't be good.
Assignee
Comment 2•11 years ago
Ok, so looks like this is RIL related (as I removed --enable-b2g-bt completely). I added a MOZ_ASSERT a couple of places, got this more helpful stack back:

Assertion failure: aFd >= 0, at /share/code/mozbuild/mozilla-central/ipc/unixsocket/UnixSocket.cpp:703

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffe6740700 (LWP 31683)]
0x00007ffff39b38fe in mozilla::ipc::UnixSocketImpl::OnFileCanWriteWithoutBlocking (this=0x127c160, aFd=-1) at /share/code/mozbuild/mozilla-central/ipc/unixsocket/UnixSocket.cpp:703
703 MOZ_ASSERT(aFd >= 0);
(gdb) bt
#0 0x00007ffff39b38fe in mozilla::ipc::UnixSocketImpl::OnFileCanWriteWithoutBlocking (this=0x127c160, aFd=-1) at /share/code/mozbuild/mozilla-central/ipc/unixsocket/UnixSocket.cpp:703
#1 0x00007ffff3a2c3e1 in MessageLoop::RunTask (this=0x7fffe673fcd8, task=0x1478ff0) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/message_loop.cc:333
#2 0x00007ffff3a2daff in MessageLoop::DeferOrRunPendingTask (this=<optimized out>, pending_task=...) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/message_loop.cc:341
#3 0x00007ffff3a2dc28 in DoWork (this=<optimized out>) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/message_loop.cc:441
#4 MessageLoop::DoWork (this=0x7fffe673fcd8) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/message_loop.cc:420
#5 0x00007ffff3a3b107 in base::MessagePumpLibevent::Run (this=0x7fffe0001000, delegate=0x7fffe673fcd8) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/message_pump_libevent.cc:311
#6 0x00007ffff3a2c5a4 in MessageLoop::RunInternal (this=0x7fffe673fcd8) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/message_loop.cc:215
#7 0x00007ffff3a2c5cc in RunHandler (this=0x7fffe673fcd8) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/message_loop.cc:208
#8 MessageLoop::Run (this=0x7fffe673fcd8) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/message_loop.cc:182
#9 0x00007ffff3a31e7d in base::Thread::ThreadMain (this=0x6dea40) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/thread.cc:156
#10 0x00007ffff3a3bac8 in ThreadFunc (closure=<optimized out>) at /share/code/mozbuild/mozilla-central/ipc/chromium/src/base/platform_thread_posix.cc:39
#11 0x00007ffff73c4e9a in start_thread (arg=0x7fffe6740700) at pthread_create.c:308
#12 0x00007ffff70f1cbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#13 0x0000000000000000 in ?? ()

So RIL is somehow sending data even when it doesn't have a socket to open. Also, this crash happens all the time, not just in gdb. It was just flying by too fast for me to see it in the non-gdb logs.
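For readers following along, the fail-fast idea from this comment can be sketched in isolation. This is only an illustration: RequireValidFd and OnFileCanWriteWithoutBlockingSketch are hypothetical names, not the Gecko code, and the sketch throws where MOZ_ASSERT would abort a debug build so that it can be exercised from a test.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical precondition helper. Gecko's MOZ_ASSERT aborts in debug
// builds; this sketch throws instead so callers can observe the failure.
inline void RequireValidFd(int aFd, const char* aWhere) {
  if (aFd < 0) {
    throw std::invalid_argument(std::string(aWhere) + ": aFd >= 0 violated");
  }
}

// Hypothetical write callback. The name echoes the frame in the stack
// above, but the body is illustrative only.
inline int OnFileCanWriteWithoutBlockingSketch(int aFd) {
  // Check the descriptor first, so a bad value is reported here rather
  // than several frames later inside the message pump.
  RequireValidFd(aFd, "OnFileCanWriteWithoutBlockingSketch");
  return aFd;  // real code would write queued data to aFd at this point
}
```

Calling the sketch with -1 fails immediately at the callback, which is the property that made the second stack more useful than the first.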
Assignee
Updated•11 years ago
Summary: FxOS Desktop debug on m-c crashes in gdb → FxOS Desktop debug on m-c crashes in IOThread
Assignee
Comment 4•11 years ago
This patch adds a check to make sure we aren't trying to send data to a non-opened RIL socket. It also adds a couple of asserts to blow up earlier if this happens elsewhere. Feel free to r- this if you feel we should be doing this lower than where I'm putting this check, I can try to figure out some way to do it in UnixSocket too. Mainly just want to get the review process started now since I'm PTO tomorrow.
Attachment #712725 -
Flags: review?(vyang)
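A minimal standalone sketch of the kind of guard this patch describes, for illustration only. SketchConsumer, SendRilRawDataSketch and the SOCKET_DISCONNECTED value are assumptions made for the example; only SOCKET_CONNECTING and SOCKET_CONNECTED are named in this bug, and the real check lives in dom/system/gonk/SystemWorkerManager.cpp.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Assumed status values; SOCKET_DISCONNECTED is added for the sketch.
enum SketchStatus { SOCKET_DISCONNECTED, SOCKET_CONNECTING, SOCKET_CONNECTED };

// Hypothetical stand-in for a RIL socket consumer.
struct SketchConsumer {
  SketchStatus status = SOCKET_DISCONNECTED;
  std::vector<std::string> sent;  // records what reached the "socket"

  SketchStatus GetConnectionStatus() const { return status; }
  void SendSocketData(const std::string& aData) { sent.push_back(aData); }
};

// Returns false without touching the socket unless the client id is in
// range, the consumer exists, and the consumer reports SOCKET_CONNECTED.
inline bool SendRilRawDataSketch(
    std::vector<std::unique_ptr<SketchConsumer>>& aConsumers,
    std::size_t aClientId, const std::string& aData) {
  if (aConsumers.size() <= aClientId ||
      !aConsumers[aClientId] ||
      aConsumers[aClientId]->GetConnectionStatus() != SOCKET_CONNECTED) {
    return false;  // out of range, never created, or not connected
  }
  aConsumers[aClientId]->SendSocketData(aData);
  return true;
}
```

The three conditions are checked in that order so the vector is never indexed out of range and a null consumer is never dereferenced.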
Comment 5•11 years ago
It's similar to bug 805754.
Crash Signature: [@ mozalloc_abort(char const*) | NS_DebugBreak_P | mozilla::Logger::~Logger ]
Assignee
Comment 6•11 years ago
Agreeing with comment 34 on bug 805754. Not similar.
Assignee
Comment 7•11 years ago
Returning to normal. This is not critical since I don't think it'll happen on b2g-18.
Severity: critical → normal
Assignee
Comment 8•11 years ago
Comment on attachment 712725 [details] [diff] [review]
Patch 1 (v1) - Check RIL validity before writing to socket

Thomas, since vicamo is out, can you take a look at this? Same idea applies, if you think it should be moved to UnixSocket just r- and let me know.
Attachment #712725 -
Flags: review?(vyang) → review?(tzimmermann)
Comment 9•11 years ago
Comment on attachment 712725 [details] [diff] [review]
Patch 1 (v1) - Check RIL validity before writing to socket

Review of attachment 712725 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good.

::: dom/system/gonk/SystemWorkerManager.cpp
@@ +449,5 @@
> UnixSocketRawData* aRaw)
> {
> if ((gInstance->mRilConsumers.Length() <= aClientId) ||
> + !gInstance->mRilConsumers[aClientId] ||
> + gInstance->mRilConsumers[aClientId]->GetConnectionStatus() != SOCKET_CONNECTED) {

Just some nitpicking: could it happen that the socket is in the state SOCKET_CONNECTING when this line gets executed? My impression is that we should be able to send data (as in: add it to the send queue) in this case.
Attachment #712725 -
Flags: review?(tzimmermann) → review+
Assignee
Comment 10•11 years ago
Yeah, at that point we should be ok to add to the queue since we'll be CONNECTED by the time the task ends. I'll add that, update patch, and land.

(In reply to Thomas Zimmermann [:tzimmermann] from comment #9)
> Comment on attachment 712725 [details] [diff] [review]
> Patch 1 (v1) - Check RIL validity before writing to socket
>
> Review of attachment 712725 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Looks good.
>
> ::: dom/system/gonk/SystemWorkerManager.cpp
> @@ +449,5 @@
> > UnixSocketRawData* aRaw)
> > {
> > if ((gInstance->mRilConsumers.Length() <= aClientId) ||
> > + !gInstance->mRilConsumers[aClientId] ||
> > + gInstance->mRilConsumers[aClientId]->GetConnectionStatus() != SOCKET_CONNECTED) {
>
> Just some nitpicking: could it happen that the socket is in the state
> SOCKET_CONNECTING when this line gets executed? My impression is that we
> should be able to send data (as in: add it to the send queue) in this case.
Assignee
Comment 11•11 years ago
Actually, I'm going to leave as is for the moment. CONNECTING does not imply the connection will be successful, and since this queues work via tasks which we'd have to cancel as well as queue later destruction of the UnixSocketImpl (which is a followup I'll be filing here in a sec), we might as well just wait 'til we're actually connected.
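To illustrate the trade-off in this comment, here is a rough sketch of what accepting data during CONNECTING would involve. It is the alternative that was not adopted, not the landed patch, and QueueingSocketSketch with all of its members is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical socket that accepts writes while still connecting.
class QueueingSocketSketch {
 public:
  enum State { DISCONNECTED, CONNECTING, CONNECTED };

  void BeginConnect() { mState = CONNECTING; }

  // Accepts data while CONNECTING or CONNECTED. Data sent while
  // CONNECTING is only parked; nothing has been written yet.
  bool Send(std::string aData) {
    if (mState == DISCONNECTED) {
      return false;
    }
    if (mState == CONNECTING) {
      mPending.push_back(std::move(aData));
    } else {
      mWritten.push_back(std::move(aData));
    }
    return true;
  }

  // Connect succeeded: flush everything that was parked.
  void OnConnectSuccess() {
    mState = CONNECTED;
    for (std::string& data : mPending) {
      mWritten.push_back(std::move(data));
    }
    mPending.clear();
  }

  // Connect failed: parked data has to be dropped explicitly. This is
  // the extra cancellation step that queuing early makes necessary.
  void OnConnectError() {
    mState = DISCONNECTED;
    mPending.clear();
  }

  std::size_t PendingCount() const { return mPending.size(); }
  std::size_t WrittenCount() const { return mWritten.size(); }

 private:
  State mState = DISCONNECTED;
  std::vector<std::string> mPending;  // accepted but not yet written
  std::vector<std::string> mWritten;  // stands in for bytes on the wire
};
```

In this sketch Send can return true for data that OnConnectError later discards, so the caller is told a write was accepted that never happens. Requiring CONNECTED before sending avoids that case and the cleanup that goes with it.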
Assignee
Comment 12•11 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/87ac03700d5d
Comment 13•11 years ago
https://hg.mozilla.org/mozilla-central/rev/87ac03700d5d
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED