Closed
Bug 1132908
Opened 10 years ago
Closed 4 years ago
TSan: data race tools/profiler/platform-linux.cc:132 paf_prepare
Categories
(Core :: Gecko Profiler, defect)
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: froydnj, Unassigned)
References
(Blocks 1 open bug)
Details
(Whiteboard: [tsan])
Attachments
(2 files)
11.10 KB,
text/plain
|
Details | |
1.39 KB,
patch
|
Details | Diff | Splinter Review |
[cribbing from decoder's script, hopefully he won't mind]
The attached logfile shows a thread/data race detected by TSan (ThreadSanitizer).
Typically, races reported by TSan are not false positives, but it is possible that the race is benign. Even in this case though, we should try to come up with a fix unless this would cause unacceptable performance issues. Also note that seemingly benign races can possibly be harmful (also depending on the compiler and the architecture) [1].
If the bug cannot be fixed, then this bug should be used to either make a compile-time annotation for blacklisting or add an entry to the runtime blacklist.
If I understand the stack traces correctly, it looks like this is just multiple fork() calls writing to the same variable. (I don't know why we're forking multiple times during a mochitest run, though.) Seems like making this Atomic<> should be sufficient...
[1] http://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
Reporter | ||
Updated•10 years ago
|
Summary: TSan: data race → TSan: data race tools/profiler/platform-linux.cc:132 paf_prepare
Updated•10 years ago
|
Flags: needinfo?(bgirard)
Comment 1•10 years ago
|
||
Comment 2•10 years ago
|
||
If you have the setup to test this patch it would be handy but if not I'll get around to it later.
Flags: needinfo?(nfroyd)
Reporter | ||
Comment 3•10 years ago
|
||
I don't understand what's going on here, but I still see a race even with the patch applied, though with different stacks:
WARNING: ThreadSanitizer: data race (pid=32087)
Write of size 1 at 0x7ff866f87258 by thread T2:
#0 paf_prepare /home/froydnj/src/gecko-dev.git/tools/profiler/platform-linux.cc:134 (libxul.so+0x000003e7fbe0)
#1 __libc_fork /build/glibc-NmptCx/glibc-2.19/posix/../nptl/sysdeps/unix/sysv/linux/x86_64/../fork.c:95 (libc.so.6+0x0000000b75c1)
#2 LaunchApp /home/froydnj/src/gecko-dev.git/ipc/chromium/src/base/process_util_linux.cc:281 (libxul.so+0x000000e1d080)
#3 PerformAsyncLaunchInternal /home/froydnj/src/gecko-dev.git/ipc/glue/GeckoChildProcessHost.cpp:721 (libxul.so+0x000000e70a5c)
#4 PerformAsyncLaunch /home/froydnj/src/gecko-dev.git/ipc/glue/GeckoChildProcessHost.cpp:447 (libxul.so+0x000000e6fdb4)
#5 RunPerformAsyncLaunch /home/froydnj/src/gecko-dev.git/ipc/glue/GeckoChildProcessHost.cpp:485 (libxul.so+0x000000e6f65d)
#6 DispatchToMethod<mozilla::ipc::GeckoChildProcessHost, bool (mozilla::ipc::GeckoChildProcessHost::*)(std::vector<std::basic_string<char>, std::allocator<std::basic_string<char> > >, base::ProcessArchitecture), std::vector<std::basic_string<char>, std::allocator<std::basic_string<char> > >, base::ProcessArchitecture> /home/froydnj/src/gecko-dev.git/ipc/chromium/src/base/tuple.h:400 (libxul.so+0x000000e72c07)
#7 RunTask /home/froydnj/src/gecko-dev.git/ipc/chromium/src/base/message_loop.cc:361 (libxul.so+0x000000e42f96)
#8 Run /home/froydnj/src/gecko-dev.git/ipc/chromium/src/base/message_pump_libevent.cc:311 (libxul.so+0x000000e1c328)
#9 RunInternal /home/froydnj/src/gecko-dev.git/ipc/chromium/src/base/message_loop.cc:233 (libxul.so+0x000000e420bf)
#10 ThreadMain /home/froydnj/src/gecko-dev.git/ipc/chromium/src/base/thread.cc:170 (libxul.so+0x000000e52097)
#11 ThreadFunc /home/froydnj/src/gecko-dev.git/ipc/chromium/src/base/platform_thread_posix.cc:39 (libxul.so+0x000000e1cd4e)
Previous write of size 1 at 0x7ff866f87258 by thread T41:
#0 paf_prepare /home/froydnj/src/gecko-dev.git/tools/profiler/platform-linux.cc:134 (libxul.so+0x000003e7fbe0)
#1 __libc_fork /build/glibc-NmptCx/glibc-2.19/posix/../nptl/sysdeps/unix/sysv/linux/x86_64/../fork.c:95 (libc.so.6+0x0000000b75c1)
#2 g_test_get_filename ??:0 (libglib-2.0.so.0+0x000000070926)
Thread T2 'Gecko_IOThread' (tid=32109, running) created by main thread at:
#0 pthread_create ??:0 (exe+0x00000002bdd2)
#1 CreateThread /home/froydnj/src/gecko-dev.git/ipc/chromium/src/base/platform_thread_posix.cc:144 (libxul.so+0x000000e1cc4f)
#2 StartWithOptions /home/froydnj/src/gecko-dev.git/ipc/chromium/src/base/thread.cc:92 (libxul.so+0x000000e51e5f)
#3 NS_InitXPCOM2 /home/froydnj/src/gecko-dev.git/xpcom/build/XPCOMInit.cpp:548 (libxul.so+0x000000a077ae)
#4 Initialize /home/froydnj/src/gecko-dev.git/toolkit/xre/nsAppRunner.cpp:1406 (libxul.so+0x0000040dede2)
#5 XRE_main /home/froydnj/src/gecko-dev.git/toolkit/xre/nsAppRunner.cpp:4456 (libxul.so+0x0000040df52c)
#6 do_main /home/froydnj/src/gecko-dev.git/browser/app/nsBrowserApp.cpp:294 (exe+0x00000005b07c)
#7 __libc_start_main /build/glibc-NmptCx/glibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021b44)
Thread T41 'gmain' (tid=32155, running) created by main thread at:
#0 pthread_create ??:0 (exe+0x00000002bdd2)
#1 g_mutex_init ??:0 (libglib-2.0.so.0+0x00000008e1bf)
#2 HandlerExists /home/froydnj/src/gecko-dev.git/uriloader/exthandler/unix/nsGNOMERegistry.cpp:22 (libxul.so+0x0000016c31a5)
#3 OSProtocolHandlerExists /home/froydnj/src/gecko-dev.git/uriloader/exthandler/unix/nsOSHelperAppService.cpp:1154 (libxul.so+0x0000016abb89)
#4 GetProtocolHandlerInfoFromOS /home/froydnj/src/gecko-dev.git/uriloader/exthandler/unix/nsOSHelperAppService.cpp:1516 (libxul.so+0x0000016ae6af)
#5 non-virtual thunk to nsOSHelperAppService::GetProtocolHandlerInfoFromOS(nsACString_internal const&, bool*, nsIHandlerInfo**) /home/froydnj/src/gecko-dev.git/uriloader/exthandler/unix/nsOSHelperAppService.cpp:1537 (libxul.so+0x0000016ae832)
#6 NS_InvokeByIndex /home/froydnj/src/gecko-dev.git/xpcom/reflect/xptcall/md/unix/xptcinvoke_x86_64_unix.cpp:174 (libxul.so+0x0000009de942)
#7 Invoke /home/froydnj/src/gecko-dev.git/js/xpconnect/src/XPCWrappedNative.cpp:2110 (libxul.so+0x000001482d13)
#8 XPC_WN_CallMethod /home/froydnj/src/gecko-dev.git/js/xpconnect/src/XPCWrappedNativeJSOps.cpp:1144 (libxul.so+0x00000148564f)
#9 CallJSNative /home/froydnj/src/gecko-dev.git/js/src/jscntxtinlines.h:226 (libxul.so+0x000004cc79e5)
#10 Invoke /home/froydnj/src/gecko-dev.git/js/src/vm/Interpreter.cpp:554 (libxul.so+0x000004ca9077)
#11 DoCallFallback /home/froydnj/src/gecko-dev.git/js/src/jit/BaselineIC.cpp:9571 (libxul.so+0x000004f09b97)
#12 <null> <null>:0 (0x7ff84ef14510)
#13 EnterBaselineAtBranch /home/froydnj/src/gecko-dev.git/js/src/jit/BaselineJIT.cpp:210 (libxul.so+0x000004f1ddf1)
#14 Interpret /home/froydnj/src/gecko-dev.git/js/src/vm/Interpreter.cpp:1737 (libxul.so+0x000004cf08ff)
#15 RunScript /home/froydnj/src/gecko-dev.git/js/src/vm/Interpreter.cpp:448 (libxul.so+0x000004cd80f8)
#16 Invoke /home/froydnj/src/gecko-dev.git/js/src/vm/Interpreter.cpp:517 (libxul.so+0x000004cc7e3d)
#17 Invoke /home/froydnj/src/gecko-dev.git/js/src/vm/Interpreter.cpp:554 (libxul.so+0x000004ca9077)
#18 JS_CallFunctionValue /home/froydnj/src/gecko-dev.git/js/src/jsapi.cpp:4216 (libxul.so+0x0000051c3e0d)
#19 CallMethod /home/froydnj/src/gecko-dev.git/js/xpconnect/src/XPCWrappedJSClass.cpp:1210 (libxul.so+0x00000147a60c)
#20 CallMethod /home/froydnj/src/gecko-dev.git/js/xpconnect/src/XPCWrappedJS.cpp:532 (libxul.so+0x000001474587)
#21 PrepareAndDispatch /home/froydnj/src/gecko-dev.git/xpcom/reflect/xptcall/md/unix/xptcstubs_x86_64_linux.cpp:122 (libxul.so+0x0000009df6a5)
#22 SharedStub /home/froydnj/src/gecko-dev.git/xpcom/reflect/xptcall/md/unix/xptcstubs_x86_64_linux.cpp:0 (libxul.so+0x0000009de9ea)
#23 GetServiceByContractID /home/froydnj/src/gecko-dev.git/xpcom/components/nsComponentManager.cpp:1561 (libxul.so+0x0000009b8821)
#24 CallGetService /home/froydnj/src/gecko-dev.git/xpcom/glue/nsComponentManagerUtils.cpp:67 (libxul.so+0x0000009f19a1)
#25 nsCOMPtr /opt/build/froydnj/build-tsan/uriloader/exthandler/../../dist/include/nsCOMPtr.h:514 (libxul.so+0x0000016bde9a)
#26 GetTypeFromFile /home/froydnj/src/gecko-dev.git/uriloader/exthandler/nsExternalHelperAppService.cpp:2835 (libxul.so+0x0000016bf05b)
#27 non-virtual thunk to nsExternalHelperAppService::GetTypeFromFile(nsIFile*, nsACString_internal&) /opt/build/froydnj/build-tsan/uriloader/exthandler/Unified_cpp_uriloader_exthandler0.cpp:2836 (libxul.so+0x0000016bf0b9)
#28 MakeFileInputStream /home/froydnj/src/gecko-dev.git/netwerk/protocol/file/nsFileChannel.cpp:317 (libxul.so+0x000000becdab)
#29 OpenContentStream /home/froydnj/src/gecko-dev.git/netwerk/protocol/file/nsFileChannel.cpp:390 (libxul.so+0x000000bed640)
#30 BeginPumpingData /home/froydnj/src/gecko-dev.git/netwerk/base/nsBaseChannel.cpp:215 (libxul.so+0x000000a7c6d9)
#31 AsyncOpen /home/froydnj/src/gecko-dev.git/netwerk/base/nsBaseChannel.cpp:630 (libxul.so+0x000000a7f85d)
#32 non-virtual thunk to nsBaseChannel::AsyncOpen(nsIStreamListener*, nsISupports*) /opt/build/froydnj/build-tsan/netwerk/base/Unified_cpp_netwerk_base1.cpp:650 (libxul.so+0x000000a7f9d9)
#33 NS_NewStreamLoaderInternal /opt/build/froydnj/build-tsan/dom/xul/../../dist/include/nsNetUtil.h:909 (libxul.so+0x000003315014)
#34 operator class nsILoadGroup * /opt/build/froydnj/build-tsan/dom/xul/../../dist/include/nsNetUtil.h:931 (libxul.so+0x0000032e5e5e)
#35 ResumeWalk /home/froydnj/src/gecko-dev.git/dom/xul/XULDocument.cpp:2905 (libxul.so+0x0000032d9c3e)
#36 OnPrototypeLoadDone /home/froydnj/src/gecko-dev.git/dom/xul/XULDocument.cpp:590 (libxul.so+0x0000032d8fcd)
#37 DidBuildModel /home/froydnj/src/gecko-dev.git/dom/xul/nsXULContentSink.cpp:240 (libxul.so+0x0000032ee7a5)
#38 DidBuildModel /home/froydnj/src/gecko-dev.git/parser/htmlparser/nsParser.cpp:902 (libxul.so+0x0000016fd1f7)
#39 OnStopRequest /home/froydnj/src/gecko-dev.git/parser/htmlparser/nsParser.cpp:1879 (libxul.so+0x0000016ffc3a)
#40 non-virtual thunk to nsParser::OnStopRequest(nsIRequest*, nsISupports*, nsresult) /opt/build/froydnj/build-tsan/parser/htmlparser/Unified_cpp_parser_htmlparser0.cpp:1894 (libxul.so+0x0000016ffcd1)
#41 OnStopRequest /home/froydnj/src/gecko-dev.git/uriloader/base/nsURILoader.cpp:323 (libxul.so+0x0000016a389d)
#42 OnStopRequest /home/froydnj/src/gecko-dev.git/netwerk/base/nsBaseChannel.cpp:771 (libxul.so+0x000000a80696)
#43 non-virtual thunk to nsBaseChannel::OnStopRequest(nsIRequest*, nsISupports*, nsresult) /opt/build/froydnj/build-tsan/netwerk/base/Unified_cpp_netwerk_base1.cpp:784 (libxul.so+0x000000a8078f)
#44 OnStateStop /home/froydnj/src/gecko-dev.git/netwerk/base/nsInputStreamPump.cpp:721 (libxul.so+0x000000a9c7f2)
#45 OnInputStreamReady /home/froydnj/src/gecko-dev.git/netwerk/base/nsInputStreamPump.cpp:440 (libxul.so+0x000000a9ba1a)
#46 non-virtual thunk to nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream*) /opt/build/froydnj/build-tsan/netwerk/base/Unified_cpp_netwerk_base1.cpp:502 (libxul.so+0x000000a9c9a9)
#47 Run /home/froydnj/src/gecko-dev.git/xpcom/io/nsStreamUtils.cpp:90 (libxul.so+0x0000009b314c)
#48 ProcessNextEvent /home/froydnj/src/gecko-dev.git/xpcom/threads/nsThread.cpp:855 (libxul.so+0x0000009cf229)
#49 NS_ProcessNextEvent /home/froydnj/src/gecko-dev.git/xpcom/glue/nsThreadUtils.cpp:265 (libxul.so+0x0000009fc761)
#50 Run /home/froydnj/src/gecko-dev.git/ipc/glue/MessagePump.cpp:99 (libxul.so+0x000000e7faec)
#51 RunInternal /home/froydnj/src/gecko-dev.git/ipc/chromium/src/base/message_loop.cc:233 (libxul.so+0x000000e420bf)
#52 Run /home/froydnj/src/gecko-dev.git/widget/nsBaseAppShell.cpp:164 (libxul.so+0x00000338cf28)
#53 Run /home/froydnj/src/gecko-dev.git/toolkit/components/startup/nsAppStartup.cpp:281 (libxul.so+0x000004069999)
#54 XRE_mainRun /home/froydnj/src/gecko-dev.git/toolkit/xre/nsAppRunner.cpp:4160 (libxul.so+0x0000040de9b7)
#55 XRE_main /home/froydnj/src/gecko-dev.git/toolkit/xre/nsAppRunner.cpp:4236 (libxul.so+0x0000040dee48)
#56 XRE_main /home/froydnj/src/gecko-dev.git/toolkit/xre/nsAppRunner.cpp:4456 (libxul.so+0x0000040df52c)
#57 do_main /home/froydnj/src/gecko-dev.git/browser/app/nsBrowserApp.cpp:294 (exe+0x00000005b07c)
#58 __libc_start_main /build/glibc-NmptCx/glibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021b44)
Flags: needinfo?(nfroyd)
Comment 4•10 years ago
|
||
Both write locations are guarded by an auto lock. Not sure what's going wrong here TBH.
Comment 5•10 years ago
|
||
I don't know what's happening here. It looks ok to me. I wonder if because we're forking there's something more complicated happening here with the copy-on-write behavior.
Flags: needinfo?(bgirard)
Reporter | ||
Comment 6•10 years ago
|
||
(In reply to Benoit Girard (:BenWa) from comment #5)
> I don't know what's happening here. It looks ok to me. I wonder if because
> we're forking there's something more complicated happening here with the
> copy-on-write behavior.
I wondered about that too, but we'd have to fork from the initial parent, and then fork again from the newly spawned child, as paf_prepare should only be called in the parent. I think I was running media tests, so maybe that's plausible?
I haven't verified, but it looks like the patch also introduces other weird races because of the newly adding locking.
Comment 7•10 years ago
|
||
(In reply to Nathan Froyd [:froydnj] [:nfroyd] from comment #3)
> I don't understand what's going on here, but I still see a race even with
> the patch applied, though with different stacks:
>
> WARNING: ThreadSanitizer: data race (pid=32087)
> Write of size 1 at 0x7ff866f87258 by thread T2:
> #0 paf_prepare
> /home/froydnj/src/gecko-dev.git/tools/profiler/platform-linux.cc:134
> (libxul.so+0x000003e7fbe0)
> #1 __libc_fork
> /build/glibc-NmptCx/glibc-2.19/posix/../nptl/sysdeps/unix/sysv/linux/x86_64/.
> ./fork.c:95 (libc.so.6+0x0000000b75c1)
> #2 LaunchApp
> /home/froydnj/src/gecko-dev.git/ipc/chromium/src/base/process_util_linux.cc:
> 281 (libxul.so+0x000000e1d080)
[...]
> Previous write of size 1 at 0x7ff866f87258 by thread T41:
> #0 paf_prepare
> /home/froydnj/src/gecko-dev.git/tools/profiler/platform-linux.cc:134
> (libxul.so+0x000003e7fbe0)
> #1 __libc_fork
> /build/glibc-NmptCx/glibc-2.19/posix/../nptl/sysdeps/unix/sysv/linux/x86_64/.
> ./fork.c:95 (libc.so.6+0x0000000b75c1)
> #2 g_test_get_filename ??:0 (libglib-2.0.so.0+0x000000070926)
[...]
> Thread T2 'Gecko_IOThread' (tid=32109, running) created by main thread at:
[...]
> Thread T41 'gmain' (tid=32155, running) created by main thread at:
> #0 pthread_create ??:0 (exe+0x00000002bdd2)
> #1 g_mutex_init ??:0 (libglib-2.0.so.0+0x00000008e1bf)
> #2 HandlerExists
> /home/froydnj/src/gecko-dev.git/uriloader/exthandler/unix/nsGNOMERegistry.
> cpp:22 (libxul.so+0x0000016c31a5)
> #3 OSProtocolHandlerExists
> /home/froydnj/src/gecko-dev.git/uriloader/exthandler/unix/
> nsOSHelperAppService.cpp:1154 (libxul.so+0x0000016abb89)
> #4 GetProtocolHandlerInfoFromOS
> /home/froydnj/src/gecko-dev.git/uriloader/exthandler/unix/
> nsOSHelperAppService.cpp:1516 (libxul.so+0x0000016ae6af)
So… if we're forking on two threads concurrently, paf_prepare can race on was_paused, is what I think it's saying. Would it work to change paused_ to an atomic counter holding the number of threads (or other entities) currently pausing the profiler, instead of a boolean?
Comment 8•10 years ago
|
||
Ahh yes you're right, even with the lock. But that's a logic error, not a data race. Can TSan detect something that advanced? With my patch all the access to was_paused should be protected.
Comment 9•4 years ago
|
||
Presumably fixed long ago
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WORKSFORME
You need to log in
before you can comment on or make changes to this bug.
Description
•