Last Comment Bug 753119 - Move thread_helper into MFBT
: Move thread_helper into MFBT
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: MFBT (show other bugs)
: Trunk
: All All
: -- normal (vote)
: mozilla15
Assigned To: Benoit Jacob [:bjacob] (mostly away)
:
:
Mentors:
Depends on: 749678 756965
Blocks: 756439 755904
  Show dependency treegraph
 
Reported: 2012-05-08 14:44 PDT by Robert O'Callahan (:roc) (email my personal email if necessary)
Modified: 2012-05-23 12:57 PDT (History)
10 users (show)
ryanvm: in‑testsuite-
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Patch (v1) (15.36 KB, patch)
2012-05-09 13:56 PDT, :Ehsan Akhgari
no flags Details | Diff | Splinter Review
Patch (v2) (15.09 KB, patch)
2012-05-11 10:58 PDT, :Ehsan Akhgari
jwalden+bmo: review-
Details | Diff | Splinter Review
Patch (v3) (14.02 KB, patch)
2012-05-16 14:12 PDT, :Ehsan Akhgari
no flags Details | Diff | Splinter Review
Patch (v1) (14.02 KB, patch)
2012-05-16 14:14 PDT, :Ehsan Akhgari
jwalden+bmo: review+
Details | Diff | Splinter Review
patch v2 (14.66 KB, patch)
2012-05-17 13:35 PDT, Benoit Jacob [:bjacob] (mostly away)
jwalden+bmo: review+
ehsan: review+
Details | Diff | Splinter Review
patch v3 (14.79 KB, patch)
2012-05-17 20:20 PDT, Benoit Jacob [:bjacob] (mostly away)
jacob.benoit.1: review+
Details | Diff | Splinter Review

Description Robert O'Callahan (:roc) (email my personal email if necessary) 2012-05-08 14:44:45 PDT
http://hg.mozilla.org/integration/mozilla-inbound/rev/16fe44589079 uses thread_helper in gfx/GL, which is currently in tools/profiler. This seems like a bad dependency.

If we want a C++ TLS class then it probably should be in MFBT. It will need to be cleaned up to conform to Gecko/MFBT coding conventions. In particular it shouldn't have its own mozilla::tls namespace.
Comment 1 Benoit Jacob [:bjacob] (mostly away) 2012-05-08 14:46:15 PDT
Punting this on Ehsan if he's OK with that.
Comment 2 Benoit Jacob [:bjacob] (mostly away) 2012-05-08 14:46:43 PDT
MFBT coding style is found in mfbt/STYLE
Comment 3 Benoit Jacob [:bjacob] (mostly away) 2012-05-08 14:48:25 PDT
If you want compiled unit tests for this (you should!), support for compiled tests in mfbt/tests is coming in bug 732875.
Comment 4 :Ehsan Akhgari 2012-05-09 13:56:08 PDT
Created attachment 622502 [details] [diff] [review]
Patch (v1)
Comment 5 :Ms2ger (⌚ UTC+1/+2) 2012-05-09 14:04:03 PDT
Comment on attachment 622502 [details] [diff] [review]
Patch (v1)

Review of attachment 622502 [details] [diff] [review]:
-----------------------------------------------------------------

::: mfbt/TLS.h
@@ +1,4 @@
> +/* -*- Mode: C++; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
> +/* ***** BEGIN LICENSE BLOCK *****
> + * Version: MPL 1.1/GPL 2.0/LGPL 2.1
> + *

MPL2

@@ +91,5 @@
> +    typedef unsigned long key;
> +
> +    template <typename T>
> +    static inline T* get(key mykey) {
> +      return (T*) TlsGetValue(mykey);

C-style casts--
Comment 6 Landry Breuil (:gaston) 2012-05-10 03:24:49 PDT
I'm interested in this, since #749678 broke builds on platforms without SPS profiler. Testing this fix.
Comment 7 Landry Breuil (:gaston) 2012-05-10 09:26:45 PDT
And the proposed fix in att #622502 works for me (let aside styling issues...)
Comment 8 Jeff Walden [:Waldo] (remove +bmo to email) 2012-05-10 17:56:13 PDT
Prior art:

http://trac.webkit.org/browser/trunk/Source/WTF/wtf/ThreadSpecific.h

Their ThreadSpecific suggests an improvement: make the key into a class that wraps the sig_safe_t, templatized on the type of the pointer being stored.  Then you don't have to make sure to specify the correct T every time you want to get a TLS value, nor do you have to expose a key type.  That seems like a big improvement to me over having to be consistent about it for every use.  Is there any reason we can't change the interface here to work like that?
Comment 9 Benoit Jacob [:bjacob] (mostly away) 2012-05-11 05:00:50 PDT
I've landed Landry's patch to reduce a bit the urgency of this:
http://hg.mozilla.org/mozilla-central/rev/94d9ddb6fed8

The patch here will have to be (minorly) adapted to that.
Comment 10 :Ehsan Akhgari 2012-05-11 10:58:01 PDT
Created attachment 623219 [details] [diff] [review]
Patch (v2)
Comment 11 Jeff Walden [:Waldo] (remove +bmo to email) 2012-05-15 17:13:06 PDT
Comment on attachment 623219 [details] [diff] [review]
Patch (v2)

Review of attachment 623219 [details] [diff] [review]:
-----------------------------------------------------------------

::: mfbt/TLS.h
@@ +33,5 @@
> + * and other provisions required by the GPL or the LGPL. If you do not delete
> + * the provisions above, a recipient may use your version of this file under
> + * the terms of any one of the MPL, the GPL or the LGPL.
> + *
> + * ***** END LICENSE BLOCK ***** */

Still MPL2 here.

@@ +47,5 @@
> +// winbase.h which are sufficient to get the prototypes for the Tls* functions.
> +// # include <windef.h>
> +// # include <winbase.h>
> +// Unfortunately, even including these headers causes us to add a bunch of ugly
> +// stuff to our namespace e.g #define CreateEvent CreateEventW

Better than the alternative, I guess, but blech.

@@ +55,5 @@
> +__declspec(dllimport) unsigned long __stdcall TlsAlloc();
> +}
> +#else
> +# include <pthread.h>
> +# include <signal.h>

These should be indented one more, e.g.:

#else
#  include <pthread.h>
#  include <signal.h>
#endif

@@ +60,5 @@
> +#endif
> +
> +namespace mozilla {
> +
> +// sig_safe_t is a type which is safe to be accessed atomically.

I guess this kind of muddies up the filename-matches-classname idea below to name this ThreadLocal.h...but not that much, really.

This comment is kind of vague about exactly what it's promising, when it'd be used, what "safe" means, etc.  Does this mean that accesses to variables and well-aligned locations of that type, even when not done with obviously-locking routines, will be atomic?  When would this be used?  Clarify this comment as much as you can, preferably with some sort of simple example.  If mfbt is to be documented by its source code, we need to hold the line here.

@@ +85,5 @@
> + * // Get the TLS value
> + * int value = tlsKey.get();
> + */
> +template <class T>
> +struct ThreadLocal

class ThreadLocal, please.

This file should be named ThreadLocal.h, not TLS.h.  The latter has some conceptual ambiguity to it with the TLS network protocol (ugh, I even had to disambiguate *that* with "network").  It's also a bit more pleasant for a class to be found in a file with the same name.  That's the case here and makes sense here, as the header's not defining a cluster of similar functionalities deserving of an umbrella name of sorts.

@@ +88,5 @@
> +template <class T>
> +struct ThreadLocal
> +{
> +#if defined(XP_WIN)
> +    typedef unsigned long key_t;

Given we have two completely separate implementations, each dealing with a different type in a different way, the typedef doesn't seem super-useful to me.  Let's just have |unsigned long key;| and |pthread_key_t key;| at the top of each section.

@@ +90,5 @@
> +{
> +#if defined(XP_WIN)
> +    typedef unsigned long key_t;
> +
> +    ThreadLocal() : key(0) {

For the typical global-variable use, this will induce static initializers.  (Static initializers interacting with threads at load time, even.  What could possibly go wrong?  :-) )  And we are on The Warpath against static initializers.  So this should be a fallible |MOZ_WARN_UNUSED_RESULT bool init()| method.

Somewhat unfortunately, since it appears it's valid to have a key of 0 for both Windows and pthreads, we probably should add a |bool initialized;| that gets set once init's succeeded, and have accessors assert that this has been initialized.  Bleh.

@@ +92,5 @@
> +    typedef unsigned long key_t;
> +
> +    ThreadLocal() : key(0) {
> +      key_t newkey = TlsAlloc();
> +      if (newkey != (unsigned long)0xFFFFFFFF /* TLS_OUT_OF_INDEXES */) {

Use 0xffffffffUL instead to not be so verbose, maybe?  (Still with the comment, sigh.)

@@ +94,5 @@
> +    ThreadLocal() : key(0) {
> +      key_t newkey = TlsAlloc();
> +      if (newkey != (unsigned long)0xFFFFFFFF /* TLS_OUT_OF_INDEXES */) {
> +        key = newkey;
> +      }

No braces around a single-line if.

@@ +98,5 @@
> +      }
> +    }
> +
> +    T* get() const {
> +      return reinterpret_cast<T*> (TlsGetValue(key));

No space between > and (.

@@ +112,5 @@
> +      key_t newkey;
> +      bool valid = !pthread_key_create(&newkey, NULL);
> +      if (valid) {
> +        key = newkey;
> +      }

No braces here.

@@ +116,5 @@
> +      }
> +    }
> +
> +    inline T* get() const {
> +      return reinterpret_cast<T*> (pthread_getspecific(key));

>(

@@ +125,5 @@
> +    }
> +#endif
> +
> +  public:
> +    bool isSet() const {

This should probably be |initialized()|, in case anyone wants to store NULL in a TLS entry.

@@ +126,5 @@
> +#endif
> +
> +  public:
> +    bool isSet() const {
> +      return !!key;

Hm, this looks dodgy.  There's nothing preventing a pthreads implementation, or Windows TlsAlloc implementation, from returning a key of 0, is there?  I guess you need the initialized member for this too.

::: tools/profiler/TableTicker.cpp
@@ +102,4 @@
>  #endif
>  
>  
> +mozilla::ThreadLocal<ProfileStack> pkey_stack;

The pkey_ prefix doesn't much make sense for these any more.  They should probably be renamed, but what to, I don't know.  tlsStack, tlsTicker?

@@ +691,5 @@
>  void mozilla_sampler_init()
>  {
>    // TODO linux port: Use TLS with ifdefs
> +  if (!pkey_stack.isSet() ||
> +      !pkey_ticker.isSet()) {

Single line?  And that TODO comment looks like it can be removed now (and should have been when the tls::create use was added initially, even).
Comment 12 :Ehsan Akhgari 2012-05-16 13:45:55 PDT
Comment on attachment 623219 [details] [diff] [review]
Patch (v2)

Review of attachment 623219 [details] [diff] [review]:
-----------------------------------------------------------------

::: mfbt/TLS.h
@@ +88,5 @@
> +template <class T>
> +struct ThreadLocal
> +{
> +#if defined(XP_WIN)
> +    typedef unsigned long key_t;

I disagree.  ThreadLocal::key_t might be useful in consumers of this type.  It also avoids a little bit of code duplication.

::: tools/profiler/TableTicker.cpp
@@ +102,4 @@
>  #endif
>  
>  
> +mozilla::ThreadLocal<ProfileStack> pkey_stack;

This doesn't really have anything to do with this bug.  I'd rather do this in another bug.
Comment 13 Luke Wagner [:luke] 2012-05-16 13:55:47 PDT
Seems like key_t, if it is necessary at all, should be a private typedef since it is an implementation detail of ThreadLocal.
Comment 14 Jeff Walden [:Waldo] (remove +bmo to email) 2012-05-16 14:06:14 PDT
Comment on attachment 623219 [details] [diff] [review]
Patch (v2)

Review of attachment 623219 [details] [diff] [review]:
-----------------------------------------------------------------

::: mfbt/TLS.h
@@ +88,5 @@
> +template <class T>
> +struct ThreadLocal
> +{
> +#if defined(XP_WIN)
> +    typedef unsigned long key_t;

What Luke said.  I would also suggest that "might be useful" is not exactly the level I think we should require when considering whether to make something public.

::: tools/profiler/TableTicker.cpp
@@ +102,4 @@
>  #endif
>  
>  
> +mozilla::ThreadLocal<ProfileStack> pkey_stack;

I think this is exactly associated with this patch's changes.  Prior to the patch, this was a key for TLS pointer lookups.  After the patch, it's no longer a key, just an abstraction around accessing TLS entries.  The name change is intrinsically tied to the changing nature of the variable.

But as long as you actually promptly make a name change, fine, I don't care if it happens in another bug.  I just don't want the intermediate misnamed state to persist.
Comment 15 :Ehsan Akhgari 2012-05-16 14:12:15 PDT
Created attachment 624527 [details] [diff] [review]
Patch (v3)
Comment 16 :Ehsan Akhgari 2012-05-16 14:14:46 PDT
Created attachment 624531 [details] [diff] [review]
Patch (v1)

Oops, made a small mistake!
Comment 17 :Ehsan Akhgari 2012-05-16 14:17:37 PDT
(In reply to Jeff Walden [:Waldo] (busy, try to prefer other reviewers if possible) from comment #14)

> ::: tools/profiler/TableTicker.cpp
> @@ +102,4 @@
> >  #endif
> >  
> >  
> > +mozilla::ThreadLocal<ProfileStack> pkey_stack;
> 
> I think this is exactly associated with this patch's changes.  Prior to the
> patch, this was a key for TLS pointer lookups.  After the patch, it's no
> longer a key, just an abstraction around accessing TLS entries.  The name
> change is intrinsically tied to the changing nature of the variable.
> 
> But as long as you actually promptly make a name change, fine, I don't care
> if it happens in another bug.  I just don't want the intermediate misnamed
> state to persist.

Sure, yeah, it just doesn't make sense to convolute this patch with what is essentially just running a sed script!  :-)  I've filed bug 755904 for that.
Comment 18 Jeff Walden [:Waldo] (remove +bmo to email) 2012-05-16 16:03:15 PDT
Comment on attachment 624531 [details] [diff] [review]
Patch (v1)

Review of attachment 624531 [details] [diff] [review]:
-----------------------------------------------------------------

glandium, see the static initializer comment specifically -- looking for feedback on that alone.

::: gfx/gl/GLContext.h
@@ +570,1 @@
>      static bool sTLSKeyAlreadyCreated;

You don't need this bool, just use sTLSKey.initialized().

@@ +573,5 @@
>  
>      GLContextTLSStorage() {
>          if (!sTLSKeyAlreadyCreated) {
> +            bool result = sTLSKey.init();
> +            mozilla::unused << result;

Assert that this was initialized successfully, or NS_RUNTIMEABORT() or something, on failure, if you're not going to propagate failure.

::: mfbt/ThreadLocal.h
@@ +59,5 @@
> +#if defined(XP_WIN)
> +    typedef unsigned long key_t;
> +
> +  public:
> +    NS_WARN_UNUSED_RESULT bool init() {

MOZ_WARN_UNUSED_RESULT, and you'll need to add a #include "mozilla/Attributes.h" to get this.

@@ +69,5 @@
> +      return initialized();
> +    }
> +
> +    T* get() const {
> +      return reinterpret_cast<T*>(TlsGetValue(key));

MOZ_ASSERT(initialized()); and you'll need to add #include "mozilla/Assertions.h" to get this.

@@ +73,5 @@
> +      return reinterpret_cast<T*>(TlsGetValue(key));
> +    }
> +
> +    bool set(const T* value) {
> +      return TlsSetValue(key, const_cast<T*>(value));

MOZ_ASSERT(initialized());

@@ +79,5 @@
> +#else
> +    typedef pthread_key_t key_t;
> +
> +  public:
> +    NS_WARN_UNUSED_RESULT bool init() {

MOZ_WARN_UNUSED_RESULT

@@ +90,5 @@
> +      return initialized();
> +    }
> +
> +    inline T* get() const {
> +      return reinterpret_cast<T*>(pthread_getspecific(key));

MOZ_ASSERT(initialized());

@@ +94,5 @@
> +      return reinterpret_cast<T*>(pthread_getspecific(key));
> +    }
> +
> +    inline bool set(const T* value) {
> +      return !pthread_setspecific(key, value);

MOZ_ASSERT(initialized());

@@ +99,5 @@
> +    }
> +#endif
> +
> +  public:
> +    ThreadLocal() : key(0), inited(false) {}

I think this will cause static initializers, sadly, so you probably need to remove this.  Forwarding r?glandium to verify this, you're good to go with changes once you get his say-so on this point.

::: tools/profiler/TableTicker.cpp
@@ +691,3 @@
>  void mozilla_sampler_init()
>  {
>    // TODO linux port: Use TLS with ifdefs

Comment still needs to go.
Comment 19 Mike Hommey [:glandium] 2012-05-16 22:46:45 PDT
Comment on attachment 624531 [details] [diff] [review]
Patch (v1)

Review of attachment 624531 [details] [diff] [review]:
-----------------------------------------------------------------

::: mfbt/ThreadLocal.h
@@ +99,5 @@
> +    }
> +#endif
> +
> +  public:
> +    ThreadLocal() : key(0), inited(false) {}

Indeed, this kind of construct creates static initializers if you use for global variables. Removing the constructor should still have the right behavior, though.

As a side note, we should make an effort using __thread on non-Windows, when supported. It has (much) less overhead. Also, why not make the API work more conveniently (like, say, nsCOMPtr), and allow tlsKey = 123; and int value = tlsKey?
Comment 20 Mike Hommey [:glandium] 2012-05-16 22:57:45 PDT
Another note, you can rid of inited on windows if your initial value for the key is TLS_OUT_OF_INDEXES. Which is fine on Windows because MSVC is smarter than gcc and doesn't create a static initializer for that. If you are concerned by mingw, you can use key == 0 as initial value, (key + TLS_OUT_OF_INDEXES) when using the key, and TlsAlloc() - TLS_OUT_OF_INDEXES when creating it.
Comment 21 Robert O'Callahan (:roc) (email my personal email if necessary) 2012-05-16 23:04:27 PDT
(In reply to Mike Hommey [:glandium] from comment #19)
> As a side note, we should make an effort using __thread on non-Windows, when
> supported. It has (much) less overhead.

MSVC has __declspec(thread). It sure would be great if we can use these!
Comment 22 Mike Hommey [:glandium] 2012-05-16 23:17:41 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #21)
> (In reply to Mike Hommey [:glandium] from comment #19)
> > As a side note, we should make an effort using __thread on non-Windows, when
> > supported. It has (much) less overhead.
> 
> MSVC has __declspec(thread). It sure would be great if we can use these!

Unfortunately, we can't use __declspec(thread)
http://msdn.microsoft.com/en-us/library/9w1sdazb%28v=vs.80%29.aspx#1
Comment 23 Jeff Walden [:Waldo] (remove +bmo to email) 2012-05-17 12:03:00 PDT
(In reply to Mike Hommey [:glandium] from comment #19)
> As a side note, we should make an effort using __thread on non-Windows, when
> supported. It has (much) less overhead.

I'd be interested in a link to details on how this is implemented, such that its performance is much better than thread-specific data.  As a practical matter, however, it seems to me that in the absence of a test where TLS access is a demonstrable perf bottleneck, we should stick with the thing that works, and doesn't have compiler- and linker- and other system-specific dependencies.

> Also, why not make the API work more conveniently (like, say, nsCOMPtr),
> and allow tlsKey = 123; and int value = tlsKey?

I am not expressing a strong opinion on this, but this style would seem to obscure when TLS gets accessed.  Better to be explicit about it so the developer knows when he's paying the access cost.  Local variables don't seem like too much of a burden in order to be more explicit here.
Comment 24 :Ehsan Akhgari 2012-05-17 12:17:12 PDT
Sorry, I'm not gonna have enough time to address the comments here.  bjacob might.
Comment 25 :Ehsan Akhgari 2012-05-17 12:19:16 PDT
(In reply to Mike Hommey [:glandium] from comment #22)
> (In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #21)
> > (In reply to Mike Hommey [:glandium] from comment #19)
> > > As a side note, we should make an effort using __thread on non-Windows, when
> > > supported. It has (much) less overhead.
> > 
> > MSVC has __declspec(thread). It sure would be great if we can use these!
> 
> Unfortunately, we can't use __declspec(thread)
> http://msdn.microsoft.com/en-us/library/9w1sdazb%28v=vs.80%29.aspx#1

Isn't __thread basically implemented the same way with similar limitations?
Comment 26 Benoit Jacob [:bjacob] (mostly away) 2012-05-17 13:35:30 PDT
Created attachment 624866 [details] [diff] [review]
patch v2

Updated patch.

I wanted to keep the class definition as free of #ifdefs as possible. Hopefully you agree that this is cleaner this way.

Regarding __thread / declspect(thread), Jeff M tells me that it's not supported on Mac before 10.7, which is another reason why we can't use it in the near future.

Regarding static initializers: while Mike's suggestion on Windows may be a useful short-term fix, the fact that it's only possible on Windows due to the existence of a special reserved TLS key value there, makes me think that the right way to avoid static initializers here, is the same way we do for other existing types with constructors: either let the global variable be only a pointer, initialized to null, e.g.

ThreadLocal<Foo> *sFooPointer = null;

or let them be static local variables in functions so they will only be initialized the first time the function is called.
Comment 27 Benoit Jacob [:bjacob] (mostly away) 2012-05-17 14:18:10 PDT
Also note that the GLContext bits are not needed at the moment, as the GLContext patch using thread_helper has been backed out.
Comment 28 Jeff Walden [:Waldo] (remove +bmo to email) 2012-05-17 15:37:46 PDT
Comment on attachment 624866 [details] [diff] [review]
patch v2

Review of attachment 624866 [details] [diff] [review]:
-----------------------------------------------------------------

Honestly I'm not sure why you're changing the patch this much, the previous thing was fine (and not significantly less clear, if it was less clear at all) and was only a few nits from landing.  :-\  But if you really think this is that much better -- I don't think I care -- so be it.

::: mfbt/ThreadLocal.h
@@ +28,5 @@
> +#include "mozilla/Assertions.h"
> +#include "mozilla/Attributes.h"
> +
> +
> +namespace mozilla {

Only one blank line before this.

@@ +57,5 @@
> + * // Get the TLS value
> + * int value = tlsKey.get();
> + */
> +template <typename T>
> +class ThreadLocal

All the methods in this class should have inline specifiers on their declarations, when the definitions are out-of-line (i.e. everything but initialized()).

@@ +65,5 @@
> +#else
> +    typedef pthread_key_t key_t;
> +#endif
> +
> +    static bool createKey(key_t& k);

With comments below, this should become non-static.

@@ +74,5 @@
> +    T* get() const;
> +
> +    bool set(const T* value);
> +
> +    ThreadLocal() : key(0), inited(false) {}

Remove this -- it causes static initializers as discussed earlier.

The nature of thread-local data is such that ThreadLocal<T> really only makes sense as variables with static duration -- static members of classes, static variables in functions, or as global variables.  All those cases will get zero-initialized.  So I don't believe we need static pointers to ThreadLocal<T> instances -- we'll be just fine with static instances.

@@ +86,5 @@
> +    bool inited;
> +};
> +
> +template <typename T>
> +inline bool ThreadLocal<T>::createKey(key_t& k) {

template<typename T>
inline bool
ThreadLocal<T>::createKey()
{

The outparam always flows into the member field (on success), so you might as well just have this use the field directly, and not have the outparam indirection.  The failure case will be rare, and in that case things won't be inited, so calling other methods would be an API violation that asserts would detect.

@@ +96,5 @@
> +#endif
> +}
> +
> +template <typename T>
> +inline bool ThreadLocal<T>::init() {

Same formatting change as with the previous method (and in other out-of-line inlines after this).

@@ +102,5 @@
> +  if (createKey(newkey)) {
> +    key = newkey;
> +    inited = true;
> +  }
> +  return initialized();

With the createKey change (and asserting not-initialized at start), I think you'd just have this as the body:

{
  MOZ_ASSERT(!initialized());
  inited = createKey();
  return inited;
}

(You could use initialized() too for the last one, I don't really care -- just seems not using it might be clearer here.  Maybe.)

@@ +116,5 @@
> +#endif
> +}
> +
> +template <typename T>
> +inline bool ThreadLocal<T>::set(const T* value) {

Hmm, I somehow missed that set() was fallible when reading previous patches.  Probably in a followup we should mark this as warn-unused-result and make all users handle failure -- or we should crash on failure.  (No need to do anything now on this point, but feel free to say something if you have an opinion which way is better.)
Comment 29 Robert O'Callahan (:roc) (email my personal email if necessary) 2012-05-17 17:08:28 PDT
(In reply to Jeff Walden [:Waldo] (busy, try to prefer other reviewers if possible) from comment #23)
> I'd be interested in a link to details on how this is implemented, such that
> its performance is much better than thread-specific data.

http://www.akkadia.org/drepper/tls.pdf

> As a practical
> matter, however, it seems to me that in the absence of a test where TLS
> access is a demonstrable perf bottleneck, we should stick with the thing
> that works, and doesn't have compiler- and linker- and other system-specific
> dependencies.

TLS gets used at every point where we add SPS instrumentation probes. So reducing its overhead will reduce SPS overhead and let us add more probes.
Comment 30 Robert O'Callahan (:roc) (email my personal email if necessary) 2012-05-17 17:10:07 PDT
(In reply to Ehsan Akhgari [:ehsan] from comment #25)
> > Unfortunately, we can't use __declspec(thread)
> > http://msdn.microsoft.com/en-us/library/9w1sdazb%28v=vs.80%29.aspx#1
> 
> Isn't __thread basically implemented the same way with similar limitations?

Yes. The problem is that, according to that link, __declspec(thread) doesn't work in DLLs loaded with LoadLibrary in Windows XP.
Comment 31 :Ehsan Akhgari 2012-05-17 17:19:08 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #30)
> (In reply to Ehsan Akhgari [:ehsan] from comment #25)
> > > Unfortunately, we can't use __declspec(thread)
> > > http://msdn.microsoft.com/en-us/library/9w1sdazb%28v=vs.80%29.aspx#1
> > 
> > Isn't __thread basically implemented the same way with similar limitations?
> 
> Yes. The problem is that, according to that link, __declspec(thread) doesn't
> work in DLLs loaded with LoadLibrary in Windows XP.

Yeah, which is how we load xul.dll (among others) on that platform.
Comment 32 Jeff Walden [:Waldo] (remove +bmo to email) 2012-05-17 17:24:53 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #29)
> http://www.akkadia.org/drepper/tls.pdf

Thanks, something to read.

> TLS gets used at every point where we add SPS instrumentation probes. So
> reducing its overhead will reduce SPS overhead and let us add more probes.

This is not quite numbers, but it is plausible.  On the other hand, SPS already uses these putatively-slower thread-specific bits, so and we are making things no worse here.  And this bug is not about improving what we have now except in the interface to it.  So __thread is out of scope for this bug, here and now.  We can return to it in another bug.
Comment 33 Benoit Jacob [:bjacob] (mostly away) 2012-05-17 19:41:06 PDT
(In reply to Jeff Walden [:Waldo] (busy, try to prefer other reviewers if possible) from comment #28)
> Honestly I'm not sure why you're changing the patch this much, the previous
> thing was fine (and not significantly less clear, if it was less clear at
> all) and was only a few nits from landing.  :-\  But if you really think
> this is that much better -- I don't think I care -- so be it.

I should have explained the motivation more.

What made me uncomfortable with the previous patch was that the whole class was like

class ThreadLocal {
#ifdef XP_WIN
  ... class in Windows case
#else
  ... class in non-Windows case
#endif
};

In other words, there was little rigidity there guaranteeing that the interface was the same in the Windows and non-Windows cases. I wanted to change it so that the interface would be explicitly independent of the platform, and only the implementation would be platform-ifdef'd.
Comment 34 Benoit Jacob [:bjacob] (mostly away) 2012-05-17 20:11:09 PDT
(In reply to Jeff Walden [:Waldo] (busy, try to prefer other reviewers if possible) from comment #28)
> @@ +57,5 @@
> > + * // Get the TLS value
> > + * int value = tlsKey.get();
> > + */
> > +template <typename T>
> > +class ThreadLocal
> 
> All the methods in this class should have inline specifiers on their
> declarations, when the definitions are out-of-line (i.e. everything but
> initialized()).

Really? I didn't know that |inline| on a function declaration had any effect. Thought that it was purely a function-definition thing. But, OK.

> > +    ThreadLocal() : key(0), inited(false) {}
> 
> Remove this -- it causes static initializers as discussed earlier.
> 
> The nature of thread-local data is such that ThreadLocal<T> really only
> makes sense as variables with static duration -- static members of classes,
> static variables in functions, or as global variables.

OK, the !initialized() assertion in init() will take care of guaranteeing this, but I'll also add a comment about this in the class comment.

>  All those cases will
> get zero-initialized.

Wow! I just learned something huge about C++!

Stackoverflow says you're right:

http://stackoverflow.com/questions/3553559/how-are-local-and-global-variables-initialized-by-default

"3.6.2- "Objects with static storage duration (3.7.1) shall be zero-initialized (8.5) before any other initialization takes place."

Alright then, of course I agree with you, I just had no idea that that was the case.

> So I don't believe we need static pointers to
> ThreadLocal<T> instances -- we'll be just fine with static instances.

Indeed!

> 
> Same formatting change as with the previous method (and in other out-of-line
> inlines after this).
> 
> @@ +102,5 @@
> > +  if (createKey(newkey)) {
> > +    key = newkey;
> > +    inited = true;
> > +  }
> > +  return initialized();
> 
> With the createKey change (and asserting not-initialized at start), I think
> you'd just have this as the body:
> 
> {
>   MOZ_ASSERT(!initialized());

Aha -- yes, I actually thought about making createKey non-static but then like you I thought it would go best if it asserted !initialized... but I didn't know for sure if that were an acceptable API change. OK then.

In fact, then, init simplifies quite a bit (no need for newkey anymore) and so createKey is useless, merging it back into init.

> @@ +116,5 @@
> > +#endif
> > +}
> > +
> > +template <typename T>
> > +inline bool ThreadLocal<T>::set(const T* value) {
> 
> Hmm, I somehow missed that set() was fallible when reading previous patches.
> Probably in a followup we should mark this as warn-unused-result and make
> all users handle failure -- or we should crash on failure.  (No need to do
> anything now on this point, but feel free to say something if you have an
> opinion which way is better.)

I don't have an opinion because I don't know how rare this failure will be. If it is as rare as OOM-during-a-small-malloc then crashing seems like the right approach. If it can fail like a huge malloc can, then fallibility and forcing users to check the status sounds like a good idea.
Comment 35 Benoit Jacob [:bjacob] (mostly away) 2012-05-17 20:20:55 PDT
Created attachment 624999 [details] [diff] [review]
patch v3

Updated patch, on try:
https://tbpl.mozilla.org/?tree=Try&rev=c5660c0786d0
Comment 36 Mike Hommey [:glandium] 2012-05-18 00:42:49 PDT
(In reply to Ehsan Akhgari [:ehsan] from comment #25)
> > Unfortunately, we can't use __declspec(thread)
> > http://msdn.microsoft.com/en-us/library/9w1sdazb%28v=vs.80%29.aspx#1
> 
> Isn't __thread basically implemented the same way with similar limitations?

If you mean __thread on windows, yes, which is why i said non-windows. On Linux, it doesn't have this limitation.
Comment 37 Benoit Jacob [:bjacob] (mostly away) 2012-05-18 05:57:37 PDT
http://hg.mozilla.org/integration/mozilla-inbound/rev/fdac756d438b
Comment 38 Jeff Walden [:Waldo] (remove +bmo to email) 2012-05-18 16:17:52 PDT
(In reply to Benoit Jacob [:bjacob] from comment #33)
> In other words, there was little rigidity there guaranteeing that the
> interface was the same in the Windows and non-Windows cases. I wanted to
> change it so that the interface would be explicitly independent of the
> platform, and only the implementation would be platform-ifdef'd.

Ah, that makes sense -- I agree.

(In reply to Benoit Jacob [:bjacob] from comment #34)
> Really? I didn't know that |inline| on a function declaration had any
> effect. Thought that it was purely a function-definition thing. But, OK.

I'm not sure if its presence on a declaration has much effect, either.  I consider it more of a documentation thing: this function will be defined inline at some later point, possibly in some later file.  (Of course, when the declaration is the definition, marking it inline at that point is redundant and doesn't document anything at all, actually.)

> If it is as rare as OOM-during-a-small-malloc then crashing seems like the
> right approach.

I think it's about this rare.  It sort of depends how many thread-locals will ever be in existence at one time.  The Windows 1k-ish restriction means we're talking 4KB/8KB max allocation that might be needed, that might fail.  That's getting closer to the big side, but I'm pretty sure we won't have near that many, ever -- usually it'll be a small allocation.  So let's just have it MOZ_CRASH() in that case.
Comment 39 Ryan VanderMeulen [:RyanVM] 2012-05-18 18:09:37 PDT
https://hg.mozilla.org/mozilla-central/rev/fdac756d438b
Comment 40 Mike Hommey [:glandium] 2012-05-19 10:03:12 PDT
There is something wrong with the ThreadLocal API: get() and set() handle a T*, which makes the example wrong:
ThreadLocal<int> tlsKey;
tlsKey.set(123); // 123 is not an int*
int value = tlsKey.get(); // tlsKey.get() returns an int*, not an int.
Comment 41 Robert O'Callahan (:roc) (email my personal email if necessary) 2012-05-23 12:57:17 PDT
(In reply to Jeff Walden [:Waldo] (busy, try to prefer other reviewers if possible) from comment #32)
> This is not quite numbers, but it is plausible.  On the other hand, SPS
> already uses these putatively-slower thread-specific bits, so and we are
> making things no worse here.  And this bug is not about improving what we
> have now except in the interface to it.  So __thread is out of scope for
> this bug, here and now.  We can return to it in another bug.

Filed bug 757969.

Note You need to log in before you can comment on or make changes to this bug.