Closed
Bug 1093823
Opened 10 years ago
Closed 10 years ago
ThreadLocal<bool> yields valgrind warning
Categories
(Core :: MFBT, defect)
Tracking
()
RESOLVED
FIXED
mozilla36
People
(Reporter: tromey, Assigned: tromey)
Details
Attachments
(1 file, 2 obsolete files)
3.09 KB,
patch
|
Details | Diff | Splinter Review |
I ran some tests under valgrind and saw this in the logs: ==14094== Conditional jump or move depends on uninitialised value(s) ==14094== at 0x4A2EBC0: pthread_setspecific (/usr/src/debug/glibc-2.18/nptl/pthread_setspecific.c:49) ==14094== by 0x6EAAF2D: mozilla::ThreadLocal<bool>::set(bool) (/obj-x86_64-unknown-linux-gnu/xpcom/threads/../../dist/include/mozilla/ThreadLocal.h:142) ==14094== by 0x6EA019C: NS_SetMainThread() (/xpcom/threads/nsThreadManager.cpp:43) ==14094== by 0x6E3287A: NS_LogInit (/xpcom/base/nsTraceRefcnt.cpp:1016) ==14094== by 0x41F814: NS_LogInit (/xpcom/glue/standalone/nsXPCOMGlue.cpp:884) ==14094== by 0x403C3F: InitXPCOMGlue(char const*, nsIFile**) (/browser/app/nsBrowserApp.cpp:571) ==14094== by 0x403CEE: main (/browser/app/nsBrowserApp.cpp:625) After that, there are many instance of this: ==14094== Conditional jump or move depends on uninitialised value(s) ==14094== at 0x4A2EB04: pthread_getspecific (/usr/src/debug/glibc-2.18/nptl/pthread_getspecific.c:57) ==14094== by 0x6EAAE2C: mozilla::ThreadLocal<bool>::get() const (/obj-x86_64-unknown-linux-gnu/xpcom/threads/../../dist/include/mozilla/ThreadLocal.h:127) ==14094== by 0x6EA0137: NS_IsMainThread() (/xpcom/threads/nsThreadManager.cpp:33) The bug here is that ThreadLocal::set creates a Helper object on the stack. If the ThreadLocal's "T" type is narrower than "void *", then some bits of this Helper are not initialized. This leads valgrind to report the issue during set(), and to report it again in get(). One possible fix is to memset the Helper to 0 in ThreadLocal::set. GCC will optimize out this memset in the pointer cases and convert it to a zero store in the bool case. However, I am not sure what other compilers might do. I verified that this silences the valgrind complaint. Another approach would be to add this to one of the valgrind suppression files. I think it's worth fixing this as the extra noise makes the valgrind logs harder to read.
Assignee | ||
Comment 1•10 years ago
|
||
A few more details. I am using x86-64 Fedora 20. I reproduced this with both the system valgrind and with a locally-built valgrind 3.10.0. My mozconfig is: ac_add_options --disable-optimize ac_add_options --enable-debug ac_add_options --enable-warnings-as-errors ac_add_options --disable-crashreporter ac_add_options --disable-jemalloc ac_add_options --enable-valgrind One way to reproduce: ./mach mochitest-devtools --debugger valgrind --debugger-args=--smc-check=all-non-file browser/devtools/profiler/test/browser_profiler_console-record-09.js
Comment 2•10 years ago
|
||
(In reply to Tom Tromey :tromey from comment #0) > I think it's worth fixing this as the extra noise makes the valgrind > logs harder to read. Yes please!
Assignee | ||
Comment 3•10 years ago
|
||
(In reply to Nathan Froyd (:froydnj) from comment #2) > (In reply to Tom Tromey :tromey from comment #0) > > I think it's worth fixing this as the extra noise makes the valgrind > > logs harder to read. > > Yes please! If you have a preferred way to fix it, I'm happy to implement it. I wasn't sure if the memset thing is a good way or not.
Comment 4•10 years ago
|
||
(In reply to Tom Tromey :tromey from comment #3) > If you have a preferred way to fix it, I'm happy to implement it. > I wasn't sure if the memset thing is a good way or not. Does something like: h.mPtr = 0; h.mValue = aValue; do the right thing? (I'd be fine with the memset version except that I don't know that some GCCs on some architectures (*cough* ARM) would a) inline the memset and eliminate it that way; or b) eliminate the memset without inlining. This way doesn't involve calling memset if something goes wrong.) A comment here is probably warranted, too.
Assignee | ||
Comment 5•10 years ago
|
||
I will test that. Another couple ideas: * Write get and set to cast via uintptr_t. This will either be a no-op, or cause the compiler to extend the value, in the bool case. We can remove Helper entirely. With a suitable comment I think this is both clean and clear; it's my preferred approach right now. * Specialize ThreadLocal<bool>. Seems grosser but it would work.
Assignee | ||
Comment 6•10 years ago
|
||
(In reply to Nathan Froyd (:froydnj) from comment #4) > Does something like: > > h.mPtr = 0; > h.mValue = aValue; > > do the right thing? I tried this on my reduced test case and it does work -- valgrind complains before the addition and is silent after.
Comment 7•10 years ago
|
||
(This is perhaps due to a lack of caffeine at this end, but ..) IMO there's something strange about all this. pthread_{set,get}_specific simply provide a thread-local key-value storage facility, afaics. I can see that the keys need to be defined for it to work properly, but I don't get why pthread_setspecific should care what the values contain or whether they are defined. It just copies them around, right? In which case V should not complain. I looked at __pthread_setspecific at glibc-2.19/nptl/pthread_setspecific.c:49 but am none the wiser. What it is complaining about is this "if" /* Remember that we stored at least one set of data. */ if (value != NULL) THREAD_SETMEM (self, specific_used, true); but I don't get why libc would special-case the value NULL. If I had to guess, I'd say it's some hack which allows libc to avoid storing the (k,v) pair at all if v == NULL, based on the observation that getspecific(k) returns NULL if k isn't known.
Comment 8•10 years ago
|
||
> * Write get and set to cast via uintptr_t. This will either be a no-op,
> or cause the compiler to extend the value, in the bool case. We can
> remove Helper entirely.
I think this would be preferable (1) because it gets rid of the redundant
store and load and replaces it with a mov{s,z}{b,w,l}q widener, and (2)
as a result of (1) we don't wind up potentially messing with the processor's
store forwarding logic, which typically doesn't like wide-load-after-narrow-store.
But maybe it's not worth the hassle. Nathan, any idea if this stuff gets
used on fast paths?
Assignee | ||
Comment 9•10 years ago
|
||
(In reply to Julian Seward [:jseward] from comment #7) > (This is perhaps due to a lack of caffeine at this end, but ..) Not at all -- thanks for digging deeper. > I looked at __pthread_setspecific at glibc-2.19/nptl/pthread_setspecific.c:49 > but am none the wiser. What it is complaining about is this "if" > > /* Remember that we stored at least one set of data. */ > if (value != NULL) > THREAD_SETMEM (self, specific_used, true); > > but I don't get why libc would special-case the value NULL. What is going on is that this line sets the "specific_used" flag in the non-NULL case, so that later on pthread_create.c:__nptl_deallocate_tsd can take a faster path when no thread-local values have been set. It is a performance optimization to try to avoid the loop that looks at values and calls destructors. There's also another value-checking branch: if (value == NULL) /* We don't have to do anything. The value would in any case be NULL. We can save the memory allocation. */ return 0; This one is more self-explanatory -- it is in the "level 2" block and tries to avoid allocating the second level array when possible. I think this all seems reasonable enough that I wouldn't try to pursue it in glibc. > But maybe it's not worth the hassle. Nathan, any idea if this stuff gets > used on fast paths? I wonder that too. There's only the one <bool> instantiation; all the other ones, I believe, use pointers, and so wouldn't pay any price.
Comment 10•10 years ago
|
||
(In reply to Julian Seward [:jseward] from comment #8) > > * Write get and set to cast via uintptr_t. This will either be a no-op, > > or cause the compiler to extend the value, in the bool case. We can > > remove Helper entirely. > > I think this would be preferable (1) because it gets rid of the redundant > store and load and replaces it with a mov{s,z}{b,w,l}q widener, and (2) > as a result of (1) we don't wind up potentially messing with the processor's > store forwarding logic, which typically doesn't like > wide-load-after-narrow-store. > > But maybe it's not worth the hassle. Nathan, any idea if this stuff gets > used on fast paths? I do not know about hot paths here. We're paying a through-the-PLT function call here anyway, I'm not going to get too worked up about this. uintptr_t storage works for me, too.
Assignee | ||
Comment 11•10 years ago
|
||
The resulting patch looks pretty ugly, I'm afraid. The problem is that when writing "get", with pointers we want to use reinterpret_cast, but with bool we want to use static_cast. So we need a new template function. And then to get the compiler to pick the right one I used type traits... seems like a lot of machinery! Hopefully I'm missing something obvious. A C-style cast would work but that seems pretty gross as well.
Assignee | ||
Comment 12•10 years ago
|
||
I thought of a prettier way to do this; attached. However, I am not sure it is acceptable. It occurs to me that with the current approach it is fine to instantiate ThreadLocal using a POD, provided the size of the struct is small enough. That will not work with the approach taken in this patch. If that case is intended to work then I think we're back to zero-initializing, or a suppression.
Comment 13•10 years ago
|
||
I am OK with disallowing ThreadLocal<PODStruct>. If you want a structure, you should use a pointer. If you really want a small structure, you can do the necessary dances with integers and casting. Enforcing this with |static_assert(IsIntegral<T>::value || IsPointer<T>::value, ...);| would be good, though.
Assignee | ||
Comment 14•10 years ago
|
||
This version adds the static assert. I'll send through "try" shortly.
Attachment #8518256 -
Attachment is obsolete: true
Attachment #8518284 -
Flags: review?(nfroyd)
Comment 15•10 years ago
|
||
Comment on attachment 8518284 [details] [diff] [review] avoid valgrind report from ThreadLocal<bool> Review of attachment 8518284 [details] [diff] [review]: ----------------------------------------------------------------- WFM. Thanks for fixing this.
Attachment #8518284 -
Flags: review?(nfroyd) → review+
Assignee | ||
Comment 16•10 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=30e2b7c4d7e9
Assignee | ||
Comment 17•10 years ago
|
||
... this time with the missing header included.
Assignee | ||
Comment 18•10 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=b3cef8608076
Assignee | ||
Comment 19•10 years ago
|
||
Looking through some other try builds, I think the R-e10s orange there is an intermittent failure. Also I re-ran the one red and it worked the second time. So I'm marking this checkin-needed. Thanks.
Keywords: checkin-needed
Updated•10 years ago
|
Attachment #8518284 -
Attachment is obsolete: true
Updated•10 years ago
|
Assignee: nobody → ttromey
Comment 20•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/e6e89ea52d09
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/e6e89ea52d09
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
You need to log in
before you can comment on or make changes to this bug.
Description
•