Open Bug 987887 Opened 11 years ago Updated 3 years ago

Add Atomic::load() and Atomic::store(T) member methods to permit atomics uses to be more visible/obvious/explicit, when desired

Categories

(Core :: MFBT, defect)

x86
macOS
defect

Tracking

()

People

(Reporter: ehsan.akhgari, Unassigned)

References

Details

See bug 985878 and bug 987667 for context here. I'm beginning to reach the conclusion that the std::atomic type in C++11 is a bad idea and I think this is a pattern which we should remove from our code base. Here's the case for my proposal: The idea behind std::atomic is to enable one to write generic code which operates either on atomic or non-atomic types. But as these two bugs clearly show, that idea is not really based on reality, since you cannot just switch type Foo to atomic<Foo> and have the resulting code be thread-safe. If anything, having an atomic type makes it very easy for the patch author and the reviewer to not pay attention to this problem when the code seemingly just manipulates a bunch of built-in types. I think a much better model would be to have an atomic type which doesn't expose any operation except for perhaps copy construction and assignment and have explicit functions that you can call on the atomic type (and only on the atomic type) which enable you to access the compiler intrinsics and such in a cross platform way. What do people think about this?
I was thinking similar thoughts after seeing bug 985878, but I wasn't sure if it was because it was actually a good idea or because I'm becoming a grouchy old programmer who doesn't like these new-fangled things. So count me in support.
I also stumbled over this when reading code that used atomic and would be in support.
I like being able to write |myAtomicCounter++| though...
Count me in too; I like convenience, but I like safety more.
(In reply to comment #3) > I like being able to write |myAtomicCounter++| though... Ben, even after reading bug 985878, and the fact that neither me or the three other experienced Gecko hackers who reviewed my patch managed to catch the (obvious!) error there, do you still maintain that the syntactic nicety of the above is worth the potential thread safety hazard?
I'm not convinced the alternative you're proposing would be less error prone.
(In reply to comment #6) > I'm not convinced the alternative you're proposing would be less error prone. Can you please explain why you think that? Note that I don't claim my proposal will make it easier for novice programmers to use this class. My main worry is about hackers who already know the semantics of the compiler specific intrinsics that can currently be used for this purpose. The function names operationg on atomic types with my proposal will have obvious names (for example, mozilla::FetchAndAdd, etc.) That is at least better than the current situation where we have something so error prone that even our experienced hackers would also get it wrong.
(In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment #7) > That is at least better than the current situation where we have something > so error prone that even our experienced hackers would also get it wrong. The flip side, perhaps, is that we didn't previously have experience with this particular mistake, and now we do. That might make the second time different -- or it might not.
(In reply to comment #8) > (In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment > #7) > > That is at least better than the current situation where we have something > > so error prone that even our experienced hackers would also get it wrong. > > The flip side, perhaps, is that we didn't previously have experience with this > particular mistake, and now we do. That might make the second time different > -- or it might not. Yeah, that is a good point. But in these situations, I usually tend to assume the worst. :-)
Given C++11 added pretty much the types we've added, it seems unlikely that people will forever make careless mistakes here, to be honest. And when C++11 <atomic> is available everywhere, I don't see why we wouldn't switch to it, to save ourselves the trouble of maintaining our own copy. (And to expose slightly-more-optimal things like http://stackoverflow.com/questions/10268737/c11-atomics-and-intrusive-shared-pointer-reference-count when well-justified, and there are a bunch of justifications for that particular one floating around the web.) The takeaway to me, given atomic<> isn't going away in the long run, seems to be that people need to pay harder attention when using the types. The use-by-template case seems like it requires particular care. Given both instances are of that flavor, I'm not yet inclined to worry too hard. Throwing more people at reviewing patches using atomics, and telling them they're to look specifically for atomic issues, should catch the sort of simple issues flagged here. (More complex ones, not so much, but I guess we're not considering those in any case.) One thing that would solve the issue for reference counting would be to expose a ReferenceCount<Atomic/NonAtomic> type, once. (This would also make it easier to implement the mini-optimization noted before, everywhere.) I am somewhat disturbed, in retrospect, that there were multiple places that needed a fix for this.
Or wait, no, I'm apparently not quite accurate about it being templates specifically at fault for at least one of the cases. I'm still inclined to say pushing to remind reviewers to be more brain-on about specifically atomic issues is a largely adequate fix.
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #10) > Given C++11 added pretty much the types we've added, it seems unlikely that > people will forever make careless mistakes here, to be honest. And when > C++11 <atomic> is available everywhere, I don't see why we wouldn't switch > to it, to save ourselves the trouble of maintaining our own copy. (And to > expose slightly-more-optimal things like > http://stackoverflow.com/questions/10268737/c11-atomics-and-intrusive-shared- > pointer-reference-count when well-justified, and there are a bunch of > justifications for that particular one floating around the web.) I enjoy the new shiny language/library features as much as the next person, but that being said, I don't really think what C++11 supports should have much weight in what we decide to do here. C++ has a lot of error prone and hard to use features and we have chosen to not use some of them because they are hard to get right. > The takeaway to me, given atomic<> isn't going away in the long run, seems > to be that people need to pay harder attention when using the types. The > use-by-template case seems like it requires particular care. Given both > instances are of that flavor, I'm not yet inclined to worry too hard. > Throwing more people at reviewing patches using atomics, and telling them > they're to look specifically for atomic issues, should catch the sort of > simple issues flagged here. (More complex ones, not so much, but I guess > we're not considering those in any case.) I think you're missing the problematic point here. As soon as I read bug 985878, I knew exactly what's going wrong, and how to fix it. It's not a question of the people who wrote/looked at that code not knowing the atomicity semantics. The reason why I missed this when I was writing the original patch was that nothing in the code I was touching was making it obvious that I'm dealing with an atomic type (neither operator++ or operator--!) so I was just not thinking about the atomicity issue. And it's true that it is the use-by-template case which is the real bad footgun because it makes things even less obvious. Also, the other thing to note here is that we're talking about thread-safety which is already difficult to get right, even for the experienced hacker. I don't think we should make it possible to get things wrong in new ways. > One thing that would solve the issue for reference counting would be to > expose a ReferenceCount<Atomic/NonAtomic> type, once. (This would also make > it easier to implement the mini-optimization noted before, everywhere.) I > am somewhat disturbed, in retrospect, that there were multiple places that > needed a fix for this. We can definitely fix this on a case by case basis by hiding the atomic types behind other abstractions so that people touch the atomic type less often.
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #11) > Or wait, no, I'm apparently not quite accurate about it being templates > specifically at fault for at least one of the cases. I'm still inclined to > say pushing to remind reviewers to be more brain-on about specifically > atomic issues is a largely adequate fix. Really the core of the issue is that we have a type which walks like a duck, quacks like a duck, but oh boy you would be wrong if you thought it was anything close to a duck! :-)
Just because there's one bad pattern that got copied over and over again doesn't mean the type is bad. The reason I made mozilla::Atomic in the first place was in part because that was what std::atomic did, but the main reason was because I thought it prevented errors as well: you cannot do a non-atomic operation on a mozilla::Atomic type without serious type-punning. The conversion of NSPR intrinsics to mozilla::Atomic did catch a case where this was happening, erroneously marked as being "safe." I don't like FetchAndAdd-style named methods, in large part because it's never clear off the top of my head if these correspond to ++x and x++. In truth, you could probably avoid most of these refcount-style bugs with atomic by removing operator T() and operator=(T) methods and requiring .load() and .store() methods instead.
(In reply to Joshua Cranmer [:jcranmer] from comment #14) > Just because there's one bad pattern that got copied over and over again > doesn't mean the type is bad. The reason I made mozilla::Atomic in the first > place was in part because that was what std::atomic did, but the main reason > was because I thought it prevented errors as well: you cannot do a > non-atomic operation on a mozilla::Atomic type without serious type-punning. > The conversion of NSPR intrinsics to mozilla::Atomic did catch a case where > this was happening, erroneously marked as being "safe." You can definitely perform non-atomic operations without *any* type pruning using Atomic: struct Foo { mozilla::Atomic<int> mRefCount; void Release() { --mRefCount; if (mRefCount == 0) { delete this; } } // ... }; Your assertion is definitely true if you only consider each expression individually, but that's not a very useful semantic. I think the semantics of the type should cover the operations over the type not just individual expressions touching the type. > I don't like FetchAndAdd-style named methods, in large part because it's > never clear off the top of my head if these correspond to ++x and x++. How about picking more explicit names, such as FetchAndThenAdd and AddAndThenFetch or something? > In > truth, you could probably avoid most of these refcount-style bugs with > atomic by removing operator T() and operator=(T) methods and requiring > .load() and .store() methods instead. That is definitely one possible improvement. Another one would be marking operations such as ++ and -- MOZ_WARN_UNUSED_RESULT. We could definitely do both if I can't convince Waldo in this bug, but I think taking those improvements will already move us away from std::atomic compatibility and is at least an acknowledgement of the problem here!
(In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment #15) > Your assertion is definitely true if you only consider each expression > individually, but that's not a very useful semantic. I think the semantics > of the type should cover the operations over the type not just individual > expressions touching the type. The case I was thinking about amounts to something like this: int flag; T1: atomic_store(flag, 5); T2: if (flag) This is the kind of idiot-proofing I was originally most concerned about when designing the interface. > > I don't like FetchAndAdd-style named methods, in large part because it's > > never clear off the top of my head if these correspond to ++x and x++. > > How about picking more explicit names, such as FetchAndThenAdd and > AddAndThenFetch or something? The first one doesn't add any useful clarity: the question in my mind is always whether or not the method returns the original or the modified value. > > > In > > truth, you could probably avoid most of these refcount-style bugs with > > atomic by removing operator T() and operator=(T) methods and requiring > > .load() and .store() methods instead. > > That is definitely one possible improvement. Another one would be marking > operations such as ++ and -- MOZ_WARN_UNUSED_RESULT. We could definitely do > both if I can't convince Waldo in this bug, but I think taking those > improvements will already move us away from std::atomic compatibility and is > at least an acknowledgement of the problem here! I think marking ++ as MOZ_WARN_UNUSED_RESULT is driving it too much towards one particular use case. Note that, for example, ::AddRef typically doesn't need to observe the value, and using mozilla::Atomic for thread-safe counter accumulators could easily require just incrementing the value. I'm in the same mind as Waldo right now: the problem to me isn't a problem in mozilla::Atomic, it's a review problem. Thread-safety is extremely difficult and anyone reviewing code that involves multiple threads ought to be taking the time to think hard about possible race conditions. Ditching mozilla::Atomic because people have problems with the use case you are most concerned about doesn't mean it's inadequate for other use cases.
(In reply to Joshua Cranmer [:jcranmer] from comment #16) > difficult and anyone reviewing code that involves multiple threads ought to > be taking the time to think hard about possible race conditions. The issue Ehsan is pointing out, though, is that the current design of mozilla::Atomic helps to hide the fact that code involves multiple threads.
(In reply to comment #17) > (In reply to Joshua Cranmer [:jcranmer] from comment #16) > > difficult and anyone reviewing code that involves multiple threads ought to > > be taking the time to think hard about possible race conditions. > > The issue Ehsan is pointing out, though, is that the current design of > mozilla::Atomic helps to hide the fact that code involves multiple threads. Precisely.
Atomic has never been a magic wand against threading problems. Believing it is is the problem.
Again, that's not what anyone is claiming. The claim is that Atomic *obscures* the fact that the code uses multiple threads.
Atomic or not, there are no markers that code uses multiple threads. There are some code patterns that might indicate thread safety, and I don't see what Atomic would obscure. The only thing i see here is that yes, thread safety is hard, and Atomic is a tool to create thread safety without locks. I'd argue, in fact, that Atomic should only be used for hot paths, and be scrutinized there. I'd also argue too much hand holding at the API level could lead to the opposite problem, being bugs caused by a false sense of security.
I would really appreciate if we can keep focusing on the claim that I'm making here, not the ones that I'm not. Again, here is my contention: Given the two bits of code: // (1) Atomic<int> mRefCnt; // later on... --mRefCnt; if (mRefCnt == 0) { delete this; } // (2) Atomic<int> mRefCnt; // later on FetchAndAdd(mRefCnt, -1); // name to be bikeshedded if (mRefCount == 0) { delete this; } I'm arguing that the current API (1) makes it much more difficult to spot the problem by experienced hackers who know about thread-safety, atomicity, etc. And that the one I'm proposing (2) makes the problem more obvious, hopefully during code review if not during local development. I doubt I can make myself more clear here.
To my eyes the problem is just about as hidden in both cases, except in the second case, one would need to check the API to know how to fix it.
We need a decision here, I doubt anybody has any opinions which has not already been presented here.
Flags: needinfo?(jwalden+bmo)
That you're needinfo'ing the MFBT module owner for "a decision" sounds like you think that the problem at hand here is a MFBT problem, and the question to be resolved here is whether mozilla::Atomic should be removed from MFBT. But I think of this problem as a general Gecko problem, and before asking the MFBT questionb, a deeper question to answer is whether we want such an abstraction to be used in Gecko code in general. This question needs to be resolved first anyway, as you won't be able to remove mozilla::Atomic before you've removed all usage of it ;-)
(In reply to comment #25) > That you're needinfo'ing the MFBT module owner for "a decision" sounds like you > think that the problem at hand here is a MFBT problem, and the question to be > resolved here is whether mozilla::Atomic should be removed from MFBT. But I > think of this problem as a general Gecko problem, and before asking the MFBT > questionb, a deeper question to answer is whether we want such an abstraction > to be used in Gecko code in general. Why do you think mozilla::Atomic should be removed? That is definitely not what I'm proposing.
(In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment #26) > Why do you think mozilla::Atomic should be removed? That is definitely not > what I'm proposing. *points to the title of the bug*
Also, FWIW I'll note that I don't actually believe this is an MFBT module only issue, as this makes it easier to add thready safety bugs to the rest of Gecko. It's just that the code lives in this directory. I still hope to find a satisfactory result here!
(In reply to Joshua Cranmer [:jcranmer] from comment #27) > (In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment > #26) > > Why do you think mozilla::Atomic should be removed? That is definitely not > > what I'm proposing. > > *points to the title of the bug* I thought "operations" on types makes sense? Sorry in case of English fail, using "methods" now. (FWIW comment 0 is still unambiguous AFAICT.
Summary: Remove mozilla::Atomic operations → Remove mozilla::Atomic member methods
How about discussing this on dev-platform?
(In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment #26) > Why do you think mozilla::Atomic should be removed? That is definitely not > what I'm proposing. That nuance is a detail; replace "mozilla::Atomic" by "mozilla::Atomic operations" in my comment if you prefer. My point, which does not depend on such fine detail, is: before asking whether MFBT should offer a feature, ask if Gecko should want to use such a feature. (In reply to Mike Hommey [:glandium] from comment #30) > How about discussing this on dev-platform? I think that that would be an appropriate channel for this.
I mean this: http://dxr.mozilla.org/mozilla-central/search?q=mozilla%3A%3AAtomic&case=true&redirect=true MFBT's own internal usage of mozilla::Atomic operations is only the tip of the iceberg...
I'll post to dev-platform tomorrow, but I'd like to get to a decision very soon before Atomic is used even more, so that won't be an open ended discussion. And yes, I am proposing to fix up those consumers too, but that's easy once we decide what to do here, and the compiler will remind us if we forget anything.
Flags: needinfo?(ehsan)
OK, posted to dev-platform about this.
Flags: needinfo?(ehsan)
It's been awhile since I read the newsgroup threads about this, but my recollection was that this was largely not that big a problem in practice, outside of templated code where a variable might be atomic in one case, non-atomic in another. Flat-out removing atomic methods is a bridge too far, I think. That said, C++11 atomic supports *both* operators, *and* member functions to load/store. We have the former. We don't have the latter. If the latter were present, users who wished to be paranoid *could* use the member functions instead of operator overloads. I don't remember particular discussion of why it would be a *bad* thing to add load() methods and store(T) methods to support this, at the time Atomics.h was added. So we should do that, and users who want explicitness -- for example, like the motivating case here -- can use the methods instead.
Flags: needinfo?(jwalden+bmo)
Summary: Remove mozilla::Atomic member methods → Add Atomic::load() and Atomic::store(T) member methods to permit atomics uses to be more visible/obvious/explicit, when desired
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.