Closed Bug 1026864 Opened 8 years ago Closed 8 years ago

[Nuwa] deadlock when a thread frozen inside a call to malloc's mutex

Categories

(Core :: IPC, defect)

x86_64
Linux
defect
Not set
normal

Tracking

RESOLVED FIXED
mozilla33

People

(Reporter: kk1fff, Assigned: kk1fff)

Attachments

(1 file, 2 obsolete files)

If a thread is frozen inside a call to malloc while it holds malloc's mutex, calls to malloc from other threads will block. Eventually the main thread blocks as well, resulting in a deadlock.
An example stack trace of this case:

Thread 4 (Thread 654.740):
#0  __futex_syscall3 () at bionic/libc/arch-arm/bionic/atomics_arm.S:183
#1  0x4006e284 in _normal_lock (mutex=0x400574f8) at bionic/libc/bionic/pthread.c:951
#2  pthread_mutex_lock (mutex=0x400574f8) at bionic/libc/bionic/pthread.c:1041
#3  0x4002874c in __wrap_pthread_mutex_lock (mtx=<value optimized out>) at /home/patrick/w/hgpool/emu1/mcgit/mozglue/build/Nuwa.cpp:1124
#4  0x4002bade in malloc_mutex_lock (mutex=0xfffffe00) at /home/patrick/w/hgpool/emu1/mcgit/memory/mozjemalloc/jemalloc.c:1629
#5  0x4002d6b0 in arena_dalloc (ptr=0x435bb300, offset=128) at /home/patrick/w/hgpool/emu1/mcgit/memory/mozjemalloc/jemalloc.c:4580
#6  0x4002e6a0 in je_free (ptr=0xfffffe00) at /home/patrick/w/hgpool/emu1/mcgit/memory/mozjemalloc/jemalloc.c:6502
#7  0x4002a6a2 in free (ptr=0x435bb300) at /home/patrick/w/hgpool/emu1/mcgit/memory/build/replace_malloc.c:200
#8  0x416d5352 in js_free (this=<value optimized out>, fop=0x43141ccc) at ../../dist/include/js/Utility.h:122
#9  js::SystemAllocPolicy::free_ (this=<value optimized out>, fop=0x43141ccc) at /home/patrick/w/hgpool/emu1/mcgit/js/src/jsalloc.h:31
#10 js::detail::HashTable<js::Shape* const, js::HashSet<js::Shape*, js::ShapeHasher, js::SystemAllocPolicy>::SetOps, js::SystemAllocPolicy>::destroyTable (
    this=<value optimized out>, fop=0x43141ccc) at ../../dist/include/js/HashTable.h:1061
#11 ~HashTable (this=<value optimized out>, fop=0x43141ccc) at ../../dist/include/js/HashTable.h:1129
#12 ~HashSet (this=<value optimized out>, fop=0x43141ccc) at ../../dist/include/js/HashTable.h:300
#13 delete_<js::KidsHash> (this=<value optimized out>, fop=0x43141ccc) at /home/patrick/w/hgpool/emu1/mcgit/js/src/vm/Runtime.h:387
#14 js::Shape::finalize (this=<value optimized out>, fop=0x43141ccc) at /home/patrick/w/hgpool/emu1/mcgit/js/src/jspropertytree.cpp:266
#15 0x41697ad8 in finalize<js::Shape> (fop=<value optimized out>, src=<value optimized out>, dest=<value optimized out>, thingKind=<value optimized out>,
    budget=...) at /home/patrick/w/hgpool/emu1/mcgit/js/src/jsgc.cpp:486
#16 FinalizeTypedArenas<js::Shape> (fop=<value optimized out>, src=<value optimized out>, dest=<value optimized out>, thingKind=<value optimized out>,
    budget=...) at /home/patrick/w/hgpool/emu1/mcgit/js/src/jsgc.cpp:543
#17 FinalizeArenas (fop=<value optimized out>, src=<value optimized out>, dest=<value optimized out>, thingKind=<value optimized out>, budget=...)
    at /home/patrick/w/hgpool/emu1/mcgit/js/src/jsgc.cpp:590
#18 0x4169803c in js::gc::ArenaLists::backgroundFinalize (this=<value optimized out>, onBackgroundThread=<value optimized out>)
    at /home/patrick/w/hgpool/emu1/mcgit/js/src/jsgc.cpp:1956
#19 js::gc::GCRuntime::sweepBackgroundThings (this=<value optimized out>, onBackgroundThread=<value optimized out>)
    at /home/patrick/w/hgpool/emu1/mcgit/js/src/jsgc.cpp:2516
#20 0x41699a44 in js::GCHelperState::doSweep (this=0x402d6bd0) at /home/patrick/w/hgpool/emu1/mcgit/js/src/jsgc.cpp:2798
#21 0x41699b10 in js::GCHelperState::work (this=0x402d6bd0) at /home/patrick/w/hgpool/emu1/mcgit/js/src/jsgc.cpp:2657
#22 0x41731d3a in js::HelperThread::handleGCHelperWorkload (arg=0x40235460) at /home/patrick/w/hgpool/emu1/mcgit/js/src/vm/HelperThreads.cpp:1042
#23 js::HelperThread::threadLoop (arg=0x40235460) at /home/patrick/w/hgpool/emu1/mcgit/js/src/vm/HelperThreads.cpp:1094
#24 js::HelperThread::ThreadMain (arg=0x40235460) at /home/patrick/w/hgpool/emu1/mcgit/js/src/vm/HelperThreads.cpp:749
#25 0x42286b38 in _pt_root (arg=<value optimized out>) at /home/patrick/w/hgpool/emu1/mcgit/nsprpub/pr/src/pthreads/ptthread.c:212
#26 0x40029d4c in _thread_create_startup (arg=0x4025fc00) at /home/patrick/w/hgpool/emu1/mcgit/mozglue/build/Nuwa.cpp:607
#27 thread_create_startup (arg=0x4025fc00) at /home/patrick/w/hgpool/emu1/mcgit/mozglue/build/Nuwa.cpp:638
#28 0x4006ee4c in __thread_entry (func=0x40029d0d <thread_create_startup>, arg=0x4025fc00, tls=<value optimized out>) at bionic/libc/bionic/pthread.c:217
#29 0x4006e99c in pthread_create (thread_out=<value optimized out>, attr=0x4025fc14, start_routine=0x40029d0d <thread_create_startup>, arg=0x4025fc00)
    at bionic/libc/bionic/pthread.c:357
#30 0x00000000 in ?? ()


Thread 1 (Thread 654.654):
#0  __futex_syscall3 () at bionic/libc/arch-arm/bionic/atomics_arm.S:183
#1  0x4006e284 in _normal_lock (mutex=0x400cd044) at bionic/libc/bionic/pthread.c:951
#2  pthread_mutex_lock (mutex=0x400cd044) at bionic/libc/bionic/pthread.c:1041
#3  0x4002871a in __wrap_pthread_mutex_lock (mtx=<value optimized out>) at /home/patrick/w/hgpool/emu1/mcgit/mozglue/build/Nuwa.cpp:1123
#4  0x4002bade in malloc_mutex_lock (mutex=0xfffffe00) at /home/patrick/w/hgpool/emu1/mcgit/memory/mozjemalloc/jemalloc.c:1629
#5  0x4002db02 in arena_malloc_small (arena=0x400cd040, size=128, zero=false) at /home/patrick/w/hgpool/emu1/mcgit/memory/mozjemalloc/jemalloc.c:4066
#6  arena_malloc (arena=0x400cd040, size=128, zero=false) at /home/patrick/w/hgpool/emu1/mcgit/memory/mozjemalloc/jemalloc.c:4145
#7  0x4002dd7a in imalloc (size=28) at /home/patrick/w/hgpool/emu1/mcgit/memory/mozjemalloc/jemalloc.c:4157
#8  0x4002dd9a in je_malloc (size=28) at /home/patrick/w/hgpool/emu1/mcgit/memory/mozjemalloc/jemalloc.c:6207
#9  0x4002a7b6 in malloc (size=28) at /home/patrick/w/hgpool/emu1/mcgit/memory/build/replace_malloc.c:150
#10 0x4186a06a in moz_xmalloc (size=28) at /home/patrick/w/hgpool/emu1/mcgit/memory/mozalloc/mozalloc.cpp:52
#11 0x407c34a4 in operator new (this=0x4025c418) at ../../dist/include/mozilla/mozalloc.h:201
#12 mozilla::dom::PContentChild::SendNuwaReady (this=0x4025c418) at /home/patrick/w/hgpool/emu1/B2G/objdir-gecko/ipc/ipdl/PContentChild.cpp:2473
#13 0x40c75ac2 in OnNuwaProcessReady () at /home/patrick/w/hgpool/emu1/mcgit/dom/ipc/ContentChild.cpp:2016
#14 0x40029156 in MakeNuwaProcess () at /home/patrick/w/hgpool/emu1/mcgit/mozglue/build/Nuwa.cpp:1728
#15 0x40c761ca in OnFinishNuwaPreparation () at /home/patrick/w/hgpool/emu1/mcgit/dom/ipc/ContentChild.cpp:1594
#16 0x40790ef0 in DispatchToFunction<void (*)()> (this=0x400cd040) at /home/patrick/w/hgpool/emu1/mcgit/ipc/chromium/src/base/tuple.h:439
#17 RunnableFunction<void (*)(), Tuple0>::Run (this=0x400cd040) at /home/patrick/w/hgpool/emu1/mcgit/ipc/chromium/src/base/task.h:415
#18 0x4078c410 in MessageLoop::RunTask (this=0xbe9987a4, task=0x1c) at /home/patrick/w/hgpool/emu1/mcgit/ipc/chromium/src/base/message_loop.cc:357
#19 0x4078d172 in MessageLoop::DeferOrRunPendingTask (this=0x400cd040, pending_task=<value optimized out>)
    at /home/patrick/w/hgpool/emu1/mcgit/ipc/chromium/src/base/message_loop.cc:365
#20 0x4078dd30 in MessageLoop::DoWork (this=0xbe9987a4) at /home/patrick/w/hgpool/emu1/mcgit/ipc/chromium/src/base/message_loop.cc:443
#21 0x40799f76 in mozilla::ipc::DoWorkRunnable::Run (this=<value optimized out>) at /home/patrick/w/hgpool/emu1/mcgit/ipc/glue/MessagePump.cpp:228
#22 0x40649f62 in nsThread::ProcessNextEvent (this=0x402024e0, aMayWait=false, aResult=0xbe997e97)
    at /home/patrick/w/hgpool/emu1/mcgit/xpcom/threads/nsThread.cpp:773
#23 0x406181c4 in NS_ProcessNextEvent (thread=0x400cd040, mayWait=false) at /home/patrick/w/hgpool/emu1/mcgit/xpcom/glue/nsThreadUtils.cpp:263
#24 0x4079a01c in mozilla::ipc::MessagePump::Run (this=0x40201af0, aDelegate=0xbe9987a4) at /home/patrick/w/hgpool/emu1/mcgit/ipc/glue/MessagePump.cpp:95
#25 0x4079a0e6 in mozilla::ipc::MessagePumpForChildProcess::Run (this=0x40201af0, aDelegate=0xbe9987a4)
    at /home/patrick/w/hgpool/emu1/mcgit/ipc/glue/MessagePump.cpp:283
#26 0x4078c3d4 in MessageLoop::RunInternal (this=0x1000000) at /home/patrick/w/hgpool/emu1/mcgit/ipc/chromium/src/base/message_loop.cc:229
#27 0x4078c452 in MessageLoop::RunHandler (this=0xbe9987a4) at /home/patrick/w/hgpool/emu1/mcgit/ipc/chromium/src/base/message_loop.cc:222
#28 MessageLoop::Run (this=0xbe9987a4) at /home/patrick/w/hgpool/emu1/mcgit/ipc/chromium/src/base/message_loop.cc:196
#29 0x40caac24 in nsBaseAppShell::Run (this=0x43517f40) at /home/patrick/w/hgpool/emu1/mcgit/widget/xpwidgets/nsBaseAppShell.cpp:164
#30 0x41311b9a in XRE_RunAppShell () at /home/patrick/w/hgpool/emu1/mcgit/toolkit/xre/nsEmbedFunctions.cpp:692
#31 0x4079a0b4 in mozilla::ipc::MessagePumpForChildProcess::Run (this=0x40201af0, aDelegate=0xbe9987a4)
    at /home/patrick/w/hgpool/emu1/mcgit/ipc/glue/MessagePump.cpp:253
#32 0x4078c3d4 in MessageLoop::RunInternal (this=0x43517f40) at /home/patrick/w/hgpool/emu1/mcgit/ipc/chromium/src/base/message_loop.cc:229
#33 0x4078c452 in MessageLoop::RunHandler (this=0xbe9987a4) at /home/patrick/w/hgpool/emu1/mcgit/ipc/chromium/src/base/message_loop.cc:222
#34 MessageLoop::Run (this=0xbe9987a4) at /home/patrick/w/hgpool/emu1/mcgit/ipc/chromium/src/base/message_loop.cc:196
#35 0x41312014 in XRE_InitChildProcess (aArgc=-1097234120, aArgv=0xbe9988c0, aProcess=3197733044)
    at /home/patrick/w/hgpool/emu1/mcgit/toolkit/xre/nsEmbedFunctions.cpp:529
#36 0x000087a0 in main (argc=8, argv=0xbe998934) at /home/patrick/w/hgpool/emu1/mcgit/ipc/app/MozillaRuntimeMain.cpp:149
Make malloc use the real pthread_mutex_lock. We can also get rid of LibcAllocator in Nuwa, since calling new with sThreadCountLoad held will no longer result in a deadlock.
Assignee: nobody → pwang
Attachment #8441971 - Flags: review?(mh+mozilla)
Attachment #8441971 - Flags: review?(khuey)
Attachment #8441971 - Flags: review?(cyu)
Comment on attachment 8441971 [details] [diff] [review]
Prevent malloc from using wrapped pthread_mutex_lock

Review of attachment 8441971 [details] [diff] [review]:
-----------------------------------------------------------------

You should just build jemalloc with -Dpthread_mutex_lock=__real_pthread_mutex_lock, and do the same in memory/jemalloc to avoid any nice surprise when we change which jemalloc we use.
Attachment #8441971 - Flags: review?(mh+mozilla) → review-
Attachment #8441971 - Attachment is obsolete: true
Attachment #8441971 - Flags: review?(khuey)
Attachment #8441971 - Flags: review?(cyu)
Attachment #8442647 - Flags: review?(mh+mozilla)
Attachment #8442647 - Flags: review?(khuey)
Attachment #8442647 - Flags: review?(cyu)
Attachment #8442647 - Flags: review?(cyu) → review+
Comment on attachment 8442647 [details] [diff] [review]
Prevent malloc from using wrapped pthread_mutex_lock v2

Review of attachment 8442647 [details] [diff] [review]:
-----------------------------------------------------------------

::: memory/mozjemalloc/moz.build
@@ +31,5 @@
>  # See bug 419470
>  if CONFIG['OS_TARGET'] == 'Linux':
>      NO_PGO = True
>  
> +if CONFIG['MOZ_NUWA_PROCESS'] != '':

if CONFIG['MOZ_NUWA_PROCESS']:

@@ +32,5 @@
>  if CONFIG['OS_TARGET'] == 'Linux':
>      NO_PGO = True
>  
> +if CONFIG['MOZ_NUWA_PROCESS'] != '':
> +    DEFINES['pthread_mutex_lock'] = '__real_pthread_mutex_lock';

As mentioned in my previous comment, you also need to do that in memory/jemalloc, and make it conditional on CONFIG['MOZ_JEMALLOC3'] as well.
Attachment #8442647 - Flags: review?(mh+mozilla) → feedback+
Attachment #8444930 - Flags: review?(mh+mozilla) → review+
https://hg.mozilla.org/mozilla-central/rev/c6574383501a
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
See Also: → 1121269