507718 - Provide a lightweight alternative to NSPR for synchronization+threads with a modern C++ API

Andreas Gal :gal

Reporter

Description

•

15 years ago

No description provided.

Andreas Gal :gal

Reporter

Updated

•

15 years ago

Summary: Provide a lightweight alternative to NSPR with a modern C++ API → Provide a lightweight alternative to NSPR for synchronization+threads with a modern C++ API

Andreas Gal :gal

Reporter

Comment 1

•

15 years ago

It would be nice to have some simple lightweight threading and synchronization primitives with a modern C++ API, i.e. Atomic<int> foo; f++ => operator++ => happy times Or virtual Thread::run(), the rest is hidden. Performance-critical parts like Atomic should be all-inline and fast and use compiler intrinsics where available (i.e. gcc 4.1). We can probably borrow code from someone instead of rolling our own from scratch.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 2

•

15 years ago

We chatted about this last week and decided to implement the C++1x specification for threads, mutexes, condvars, and atomics. This will also pull in part of the C++1x time API since some methods allow one to wait for a specified duration of time. I have a nascent pthreads implementation of <thread> and implementing the rest will be quite easy as the API is similar to pthreads'. A Windows implementation will be harder, but bent has volunteered to do this. We can borrow code from boost, Google, and NSPR where it makes sense to do so. This code will live in js/src and be shared between Spidermonkey and Gecko. Once/if our compilers support this part of the C++1x spec, we can just switch to their implementations. It would be nice to take this code from boost wholesale, but boost uses exceptions and implements a superset of the C++1x spec. Instead of throwing exceptions as the spec dictates, we will just abort() on errors. This code is crucial enough that there's not much point in trying to recover from an error, and it might be counterproductive to do so anyway. We briefly discussed how to handle unsupported platforms/architectures. The option we tentatively settled on was #error'ing to force maintainers of other platforms to provide an impl. We also have the option of wrapping the C++1x API around NSPR, as a slow and crufty fallback, and #warning that the fallback is being used.

Brendan Eich [:brendan]

Comment 3

•

15 years ago

This better not take too long and (not or, I'm generous today) retrace too many NSPR footsteps, or I will cry "boondoggle." /be

Brendan Eich [:brendan]

Comment 4

•

15 years ago

Don't get me wrong: C++ sugar won't (ought not!) take long and it will be sweet. Only downside is kids eat too much (everyone wants to be a lock-free threaded programming superstar) and too many get cavities (bugs from lack of barriers or other subtleties -- Mom and Dad pay the bills). /be

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 5

•

15 years ago

Nothing fancy is being planned here, just a (i) lightweight, modern wrapper around system threading primitives, and (ii) well-specified implementation of atomic operations. NSPR has neither. We'd still need to catch abuses of these APIs in review, just like we have to today.

Brendan Eich [:brendan]

Comment 6

•

15 years ago

js/src/threads, then? /be

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 7

•

15 years ago

Aye. Or maybe js/src/c++1x/ since it's going to include atomics and sync primitives too.

Mike Shaver (:shaver emeritus)

Comment 8

•

15 years ago

People will expect to find those in a dir called threads/, I think, and I don't want to really scope-creep this in the way that your directory name choice suggests, ahem!

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 9

•

15 years ago

I'd like to solicit feedback on the mostly-done v0.0, which lives here http://hg.mozilla.org/users/cjones_mozilla.com/cxx1x/file/63a0c5674d42 I hope this medium is preferable to a big patch. Implemented so far is most of C++1x's <thread>, <mutex>, <condition_variable> for pthreads systems, and <cstdatomic> (lock-free) for x86/x86-64/ARM (I also implemented a fallback that uses mutexes). Some parts of their specs seemed useless or worse so I omitted them. I'm going to finish up the lacking feature (timed condvar.wait()) and work on more tests in the mean time.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Updated

•

15 years ago

Blocks: 530636

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 10

•

15 years ago

My WIP repo listed in comment 9 would lead to the layout js/src/threads/ [platform-neutral headers] linux/ osx/ posix/ pthread/ windows/ test/ I prefer this because I can use the same file names for all the platform-specific headers and selectively export them. Is this OK, or does anyone have strong opinions on an alternate layout?

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 11

•

15 years ago

Attached patch part 1, v1: adds js/src/threads and mozstd::thread::hardware_concurrency() — Details — Splinter Review

Not really sure who wants to/should be reviewing what, so I'm going to r? spam.

Assignee: nobody → jones.chris.g

Attachment #414176 - Flags: review?

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 12

•

15 years ago

Comment on attachment 414176 [details] [diff] [review] part 1, v1: adds js/src/threads and mozstd::thread::hardware_concurrency() Probably want to look over the build goop.

Attachment #414176 - Flags: review?(benjamin)

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Updated

•

15 years ago

Attachment #414176 - Flags: review?(gal)

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Updated

•

15 years ago

Attachment #414176 - Flags: review? → review?(brendan)

Andreas Gal :gal

Reporter

Comment 13

•

15 years ago

Comment on attachment 414176 [details] [diff] [review] part 1, v1: adds js/src/threads and mozstd::thread::hardware_concurrency() Brendan, do you want to enforce JS code conventions for this code? If so, we have to adjust indents and such a bit.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 14

•

15 years ago

I'm not sure if the "and such" includes class/function naming conventions, but FWIW I used the same identifiers as the C++1x spec (save s/std/mozstd/). The reason is to allow us to just dump this code when/if C++1x support starts appearing.

Shawn Wilsher :sdwilsh

Comment 15

•

15 years ago

Drive-by nit: mozstd seems weird. Everything else we are adding is in the mozilla namespace, and this adds a new top level one.

Andreas Gal :gal

Reporter

Comment 16

•

15 years ago

Comment on attachment 414176 [details] [diff] [review] part 1, v1: adds js/src/threads and mozstd::thread::hardware_concurrency() >+unsigned >+hardware_concurrency() >+{ indent is 4 spaces (no tabs) >+ int nproc; >+ size_t size = sizeof(nproc); >+ >+ if (sysctlbyname("hw.ncpu", &nproc, &size, NULL, 0)) >+ return 1; >+ >+ return nproc; >+} >+ >+} // namespace mozinternal >+ if (!fp) { Don't overbrace if there is a single consequence. >+ if (line) { >+ free(line); >+ } Those seem to be the most relevant style diffs to JS. However, I think this module should get a style waiver and do whatever chris prefers since its an external component (like nanojit). That's Brendan's call though. I remember the name rationale and no objections there.

Shawn Wilsher :sdwilsh

Comment 17

•

15 years ago

It looks like he's just following the Coding Style docs on MDC: https://developer.mozilla.org/En/Developer_Guide/Coding_Style

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 18

•

15 years ago

In fact, this code was originally in JS style, and I adapted it for Gecko style before posting this patch. But I personally prefer SM style, and Andreas has spoken, so ...

Benjamin Smedberg

Comment 19

•

15 years ago

Comment on attachment 414176 [details] [diff] [review] part 1, v1: adds js/src/threads and mozstd::thread::hardware_concurrency() I'd like brendan to decide about namespace and style before I review... FWIW my opinion would be to use mozilla standard 2-space and either mozilla::thread or mozilla::std::thread

Attachment #414176 - Flags: review?(benjamin)

Julian Seward [:jseward]

Comment 20

•

15 years ago

Chris, are you going to make it so that it is possible to do a debug build using purely POSIX pthreads primitives, and no magic hacks with atomic insns, lock free algorithms, or wotnot? I ask because the option to map this stuff purely onto pthreads will make it much easier to use runtime race detection tools in future. IME the #1 problem such tools have is "seeing" all the relevant inter-thread synchronisation events; failure to do so leads floods of false positives and general uselessness.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 21

•

15 years ago

Could we just compile in some valgrind instrinsics that trigger the right happens-before relations? Regarding style, I think this code should follow Coding_Style since it's not really JS code, it's new code, and new code follows the style guide; that is not optional. Maybe that means it shouldn't live under js/. Namespace "mozilla::thread" or "mozilla::threads" sounds good to me.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 22

•

15 years ago

For future reference, the plan would be to fall back on pthread-synchronized operations across the board, including atomic arithmetic. My user repo already has fallback implementations of atomic arithmetic that use std::mutex, so we would just need to set a configure flag to make this happen.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 23

•

15 years ago

(In reply to comment #21) > Could we just compile in some valgrind instrinsics that trigger the right > happens-before relations? > Julian raised the point that these APIs are in their infancy and likely to change. Using pthread sync would be more robust at first, and reduce annotation garbage.

Andreas Gal :gal

Reporter

Comment 24

•

15 years ago

roc, the reason to have it under js/ is that we would like to kill off the remaining ties to NSPR in JS and replace it with this.

Brendan Eich [:brendan]

Comment 25

•

15 years ago

AIUI, this code was going to live under js for two reasons: first, so standalone SpiderMonkey does not depend on anything (JS_THREADSAFE currently requires NSPR, but this is considered a misfeature); second, to avoid yet more DSO/DLL hell. We can revisit either, but I'm fine with using the Mozilla style guide and not bothering about the variation. js/src/xpconnect uses its own style, and while js/src/nanojit happens to use 4-space c-basic-offset, it is from an upstream repo and its style is not as consistent as SpiderMonkey's anyway. Agree with roc on wanting valgrind no-op hokey-pokey for atomic ops, not ifdef'ed layering on pthreads (which will be so slow and different I worry it won't find the races that matter). /be

Brendan Eich [:brendan]

Comment 26

•

15 years ago

(In reply to comment #23) > (In reply to comment #21) > > Could we just compile in some valgrind instrinsics that trigger the right > > happens-before relations? > > > > Julian raised the point that these APIs are in their infancy and likely to > change. Using pthread sync would be more robust at first, and reduce > annotation garbage. We have been using CAS in JS for over ten years. Perhaps we'll just keep using our jslock.{cpp,h} code and part ways here, but if js/src/threads could host the one portable CAS interface and implementations, that would be better. We can't afford to stop using CAS in production builds, of course. Would valgrind's race detection tools really find the same races if we ran every CAS through a global lock, or address-keyed hashtable of locks? /be

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 27

•

15 years ago

(In reply to comment #25) > Agree with roc on wanting valgrind no-op hokey-pokey for atomic ops, not > ifdef'ed layering on pthreads (which will be so slow and different I worry it > won't find the races that matter). > I don't think it would be /that/ much slower ... in the common case, it would be approximately twice as expensive as a CAS (~two CAS's instead of one). But in the contended case, it would indeed be much more expensive. IIRC, helgrind is using hybrid lockset/happens-before detection (thanks roc!), so I don't think locked CAS will affect the "real" races reported. IOW, I locked CAS shouldn't induce happens-before or add to the lockset for operations that are otherwise racy.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 28

•

15 years ago

(In reply to comment #26) > (In reply to comment #23) > > (In reply to comment #21) > > > Could we just compile in some valgrind instrinsics that trigger the right > > > happens-before relations? > > > > > > > Julian raised the point that these APIs are in their infancy and likely to > > change. Using pthread sync would be more robust at first, and reduce > > annotation garbage. > > We have been using CAS in JS for over ten years. Perhaps we'll just keep using > our jslock.{cpp,h} code and part ways here, but if js/src/threads could host > the one portable CAS interface and implementations, that would be better. We > can't afford to stop using CAS in production builds, of course. The C++1x spec has APIs for a whole host of atomic ops, and the semantics are well defined IMHO. I'd prefer to switch jslock over to the js/threads code (which does use CAS on x86/x86-64/ARM).

Andreas Gal :gal

Reporter

Comment 29

•

15 years ago

I don't really want to switch from NSPR to a yet slower implementation based on pthreads. I think we can wait with switching our CAS code until the proper new implementation is in place.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 30

•

15 years ago

(In reply to comment #29) > I don't really want to switch from NSPR to a yet slower implementation based on > pthreads. I think we can wait with switching our CAS code until the proper new > implementation is in place. Julian was asking about pthread-locked atomics for special DEBUG builds only. Regular builds will use CAS on x86/x86-64/ARM. If jslock has native atomics support for platforms that js/threads doesn't, I can pretty easily add them to js/threads.

Julian Seward [:jseward]

Comment 31

•

15 years ago

(In reply to comment #26) > We have been using CAS in JS for over ten years. Perhaps we'll just keep using > our jslock.{cpp,h} code and part ways here, but if js/src/threads could host > the one portable CAS interface and implementations, that would be better. We > can't afford to stop using CAS in production builds, of course. Would > valgrind's race detection tools really find the same races if we ran every CAS > through a global lock, or address-keyed hashtable of locks? We model CAS by ignoring the write and only observing the read, thereby missing the case where a CAS races against a read. However, that might be preferable to the sequence { pth-mx-lock; x++; pth-mx-unlock; } since the latter induces 2 h-b edges and makes the program look more ordered than it really is. So perhaps an OK compromise is to allow CAS but force all other inter-thread synch to be done via pthreads? Remember, this is for debugging only. For production, there's no proposal at all to use pthreads it's deemed undesirable.

Brendan Eich [:brendan]

Updated

•

15 years ago

Attachment #414176 - Flags: review?(brendan) → review?(lw)

Brendan Eich [:brendan]

Comment 32

•

15 years ago

Comment on attachment 414176 [details] [diff] [review] part 1, v1: adds js/src/threads and mozstd::thread::hardware_concurrency() I think Luke should bless this, esp. the C++[01]x purity but also the SpiderMonkey-ness apart from coding style. I'm ok with it following the Mozilla C++ style guide, as noted earlier. /be

Luke Wagner [:luke]

Updated

•

15 years ago

Attachment #414176 - Flags: review?(lw) → review+

Luke Wagner [:luke]

Comment 33

•

15 years ago

Comment on attachment 414176 [details] [diff] [review] part 1, v1: adds js/src/threads and mozstd::thread::hardware_concurrency() Well, there is only a little to look at, but I like the direction this is heading with cleanly separating the platform/arch-specific stuff. Looking forward to the rest. >+unsigned >+hardware_concurrency() >+{ >+ int nproc; >+ size_t size = sizeof(nproc); >+ >+ if (sysctlbyname("hw.ncpu", &nproc, &size, NULL, 0)) >+ return 1; >+ >+ return nproc; >+} The spec says "if the value is not computable or well defined and implementation should return 0". Is there another reason to return 1? Second, I think new tests in js can be public domain (which I hear has a 3 line license you can copy from newer js reftests), so perhaps you can set the precedent for the new test directory you are growing? (I'm about the farthest thing from a lawyer, so maybe double check that :) Also, a high-level question: are you planning to simulate rvalue references?

Wan-Teh Chang

Comment 34

•

15 years ago

Comment on attachment 414176 [details] [diff] [review] part 1, v1: adds js/src/threads and mozstd::thread::hardware_concurrency() Chris: just curious -- is your sysctlbyname code in js/src/threads/darwin/hardware_concurrency.h a better way to get the number of processors than the _PR_HAVE_SYSCTL code in PR_GetNumberOfProcessors? See http://mxr.mozilla.org/nspr/ident?i=PR_GetNumberOfProcessors I'm wondering if I should copy your code into NSPR. >+ // NSPR doesn't expose this information You can call PR_GetNumberOfProcessors

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 35

•

15 years ago

(In reply to comment #33) > >+unsigned > >+hardware_concurrency() > >+{ > >+ int nproc; > >+ size_t size = sizeof(nproc); > >+ > >+ if (sysctlbyname("hw.ncpu", &nproc, &size, NULL, 0)) > >+ return 1; > >+ > >+ return nproc; > >+} > > The spec says "if the value is not computable or well defined and > implementation should return 0". Is there another reason to return 1? > Nice catch! > Second, I think new tests in js can be public domain (which I hear has a 3 line > license you can copy from newer js reftests), so perhaps you can set the > precedent for the new test directory you are growing? (I'm about the farthest > thing from a lawyer, so maybe double check that :) > Interesting question, seems like a gray area. The tests sort of depend on an open spec, but the implementation is "mozstd" ... probably best to ask a lawyer ;). > Also, a high-level question: are you planning to simulate rvalue references? Not atm, I've just been skipping over those interfaces.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 36

•

15 years ago

(In reply to comment #34) > (From update of attachment 414176 [details] [diff] [review]) > Chris: just curious -- is your sysctlbyname code in > js/src/threads/darwin/hardware_concurrency.h a better > way to get the number of processors than the _PR_HAVE_SYSCTL > code in PR_GetNumberOfProcessors? See > http://mxr.mozilla.org/nspr/ident?i=PR_GetNumberOfProcessors > Don't think so: The sysctlbyname() function accepts an ASCII representation of the name and internally looks up the integer name vector. Apart from that, it behaves the same as the standard sysctl() function. > >+ // NSPR doesn't expose this information > > You can call PR_GetNumberOfProcessors Gah! I looked for this and failed to find it. If I'd known it existed, I would have cribbed from here rather than searching Google ;).

Mike Shaver (:shaver emeritus)

Comment 37

•

15 years ago

(In reply to comment #35) > Interesting question, seems like a gray area. The tests sort of depend on an > open spec, but the implementation is "mozstd" ... probably best to ask a lawyer > ;). The test code itself can by PD, regardless of what specification it implements or what it calls. Please use the minimal PD license text for the tests!

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 38

•

14 years ago

With bug 577899 and bug 642381, code shared between Gecko/SpiderMonkey is starting to happen. The code this bug covers would probably live in mfbt/threads, per current plans. Some lingering issues from past discussion - I think there's value in differentiating code written to follow the C++0x spec from code that's not. The latter kind of code already lives (will live) in mfbt/. I would prefer to put the C++0x stuff in a namespace that includes the letters "std"; I still like mozstd, but I also don't want to hang up on this. - Regardless of the mozstd/mozilla::std/mozilla stuff that follows the C++0x spec, we might still want mozilla::Thread/etc. wrappers in mfbt when we need to break from the C++0x API. One such break might be moving the xpcom/glue deadlock detector into mfbt/, and have it operate on mozilla::Mutex/etc. wrapped around mozstd::mutex/etc. - On valgrind-able platforms, std::mutex/etc. are implemented by pthread types. No hand-rolled broken glorp. I don't believe those impls will affect helgrind. (If xpcom/glue/Mutex.h and PRLock don't affect helgrind on m-c tip, then the new regime wouldn't either.) - On valgrind-able platforms and architectures, the std:: atomics are implemented by inline assembly. The x86/x86_64 instructions used are (of course!) the well-known cmpxchgl, xaddl, the *fences, etc. (It's a bit more complicated on ARM.) I don't fully understand the potential problems in helgrind, but we have many options available for helping out helgrind if it doesn't already understand the semantics of x86 atomics: annotation, fallback impls, calls to magic functions, ... (ARM is more complicated.) Maybe Julian and I should discuss this somewhere else. - C++0x specifies atomics with relaxed-consistency semantics. There are implementations of these in mozstd. I don't expect we'll need to use these in either SpiderMonkey or Gecko, but if we do, they will probably present problems for helgrind. I'd like to cross that bridge when we come to it.

Greg Rubino

Comment 39

•

13 years ago

(In reply to comment #2) > We chatted about this last week and decided to implement the C++1x > specification for threads, mutexes, condvars, and atomics. This will also > pull in part of the C++1x time API since some methods allow one to wait for > a specified duration of time. I have a nascent pthreads implementation of > <thread> and implementing the rest will be quite easy as the API is similar > to pthreads'. A Windows implementation will be harder, but bent has > volunteered to do this. We can borrow code from boost, Google, and NSPR > where it makes sense to do so. This code will live in js/src and be shared > between Spidermonkey and Gecko. Once/if our compilers support this part of > the C++1x spec, we can just switch to their implementations. It would be > nice to take this code from boost wholesale, but boost uses exceptions and > implements a superset of the C++1x spec. Instead of throwing exceptions as > the spec dictates, we will just abort() on errors. This code is crucial > enough that there's not much point in trying to recover from an error, and > it might be counterproductive to do so anyway. I'm not terribly familiar with the code under discussion here, but I thought I'd throw in that boost.lockfree looks like it's going to make it in to boost 1.47.0. Documentation can be found here: http://tim.klingt.org/boost_lockfree/ it might fit your needs. it simulates atomics with available mechanisms (transparently) and provides three lockfree containers (fifo, ring buffer and stack, I believe). HTH! We briefly discussed how to > handle unsupported platforms/architectures. The option we tentatively > settled on was #error'ing to force maintainers of other platforms to provide > an impl. We also have the option of wrapping the C++1x API around NSPR, as > a slow and crufty fallback, and #warning that the fallback is being used.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Updated

•

13 years ago

Attachment #414176 - Flags: review?(gal)

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Updated

•

12 years ago

Assignee: jones.chris.g → nobody

BMO Automation

Updated

•

2 years ago

Severity: normal → S3

Jens Stutte [:jstutte]

Comment 40

•

11 months ago

Is this still a thing? In case: Should it live in some JS component then ?

Flags: needinfo?(jseward)

Julian Seward [:jseward]

Comment 41

•

11 months ago

I think this is long since redundant and hence is no longer A Thing. And so we should close it.

Flags: needinfo?(jseward)

Jens Stutte [:jstutte]

Updated

•

11 months ago

Status: NEW → RESOLVED

Closed: 11 months ago

Resolution: --- → INVALID