Closed Bug 1067414 Opened 10 years ago Closed 10 years ago

Clarify thread-safety of unlocked PreferredSampleRate calls

Categories

(Core :: Audio/Video, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla35

People

(Reporter: jesup, Assigned: kinetik)

References

Details

Attachments

(2 files, 1 obsolete file)

To avoid locking, cubeb uses InitPreferredSampleRate(), which locks, and PreferredSampleRate(), which doesn't. The problem is that PreferredSampleRate can fall afoul of TSAN issues since it will be accessed from multiple threads with no interlocks. The compiler is allowed to do really stupid things and replace (temporarily) your nice preferred rate with nuclear launch codes. Simplest solution is likely to make the static some form of atomic.
Also, I note that the sample rate is stored as a uint32_t, and returned as an int... likely there are some type confusions around this value and uses of it.
I don't understand. I assume each thread will call InitPreferredSampleRate(), or use memory barriers to ensure that another thread has called InitPreferredSampleRate(), before using PreferredSampleRate(). InitPreferredSampleRate() uses a lock, which I assume acts a memory barrier to ensure the set value will be seen by any subsequent PreferredSampleRate(). Where is the opportunity for the compiler replace the value set to sPreferredSampleRate with something else?
https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong See "But what if we modify shared state with atomic operations but read with plain loads? Like in the following example:" In theory: Thread 1: InitPreferredSampleRate() Lock Stores value to sSampleRate UnLock <stall> Thread 2: InitPreferredSampleRate() Lock <compiler spills nuclear launch codes to sSampleRate <stall> Thread 1: x = PreferredSampleRate() x == nuclear launch codes Thread 2: sets sSampleRate (again) UnLock Is it stupid that compilers can do this? Pretty much yeah. Is it wrong? Yes. Can you 99.9% of the time get away with it? Yes, but you never know when the compiler will decide to get "smart" and then you're in timing races. Plus this will cause random racy failures from TSAN runs. Solution: Make it some form of Atomic.
Quick cleanup, no logic changes. Avoids exposing implementation in CubebUtils.h unnecessarily.
Attachment #8489752 - Flags: review?(rjesup)
Fix potential thread-safety issue. Add a bit of documentation to explain what's going on.
Attachment #8489754 - Flags: review?(rjesup)
Attachment #8489752 - Flags: review?(rjesup) → review+
Attachment #8489754 - Flags: review?(rjesup) → review+
(In reply to Randell Jesup [:jesup] from comment #3) > https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what- > could-possibly-go-wrong Thanks for the link. I enjoyed reading the details, and the pictures. The article gives examples of situations where races cause problems because compilers assume there are no races. However, here there is no race in the use of sPreferredSampleRate, and so there is no opportunity for the compiler to break the intended logic. > See "But what if we modify shared state with atomic operations but read with > plain loads? Like in the following example:" That section is an example of a read while a concurrent write may happen on another thread. However, here there is no concurrent write on another thread because the write happens only once and has already happened. > In theory: > Thread 1: > InitPreferredSampleRate() > Lock > Stores value to sSampleRate > UnLock > <stall> > > Thread 2: > InitPreferredSampleRate() > Lock > <compiler spills nuclear launch codes to sSampleRate > <stall> The articles explains that "if a program stores to a variable X, the compiler can legally reuse X’s storage for any temporal data inside of some region of code before the store (e.g. to reduce stack frame size)." However, here, in thread 2, the program will not store to sSampleRate and so the compiler cannot assume that it will, and so cannot use it for temporal data. The argument in this bug report (but I didn't see it in the linked article) seems to be that if a section of code may modify storage under one set of conditions, then the compiler may assume that it modifies the storage under any set of conditions. I don't understand the basis for that argument.
(In reply to Karl Tomlinson (:karlt) from comment #7) > (In reply to Randell Jesup [:jesup] from comment #3) > > https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what- > > could-possibly-go-wrong > > Thanks for the link. I enjoyed reading the details, and the pictures. It's a good article. > The article gives examples of situations where races cause problems because > compilers assume there are no races. However, here there is no race in the > use of sPreferredSampleRate, and so there is no opportunity for the compiler > to break the intended logic. > > > See "But what if we modify shared state with atomic operations but read with > > plain loads? Like in the following example:" > > That section is an example of a read while a concurrent write may happen on > another thread. However, here there is no concurrent write on another thread > because the write happens only once and has already happened. If so, then I likely misread the code... (reads) I did; I missed the "if sSampleRate == 0" part. I've seen the pattern I gave a bunch of times where the programmer thought "So long as all the setters use locks, and the readers don't care if they get value A or value B in a race, it's fine!" (And where people thought "if I set it to the same value, there can't be a race!"
(In reply to Randell Jesup [:jesup] from comment #8) > And where people thought "if I set it to the same value, there can't be a > race!" Yes, I did not know that it was considered a race even if no threads modify a value but one thread writes the existing value.
I would like to echo the part of "But what if we modify shared state with atomic operations but read with plain loads?" IIRC, there is an article talking strict-aliasing which gives an example as below: int a = ...; if (a > 0) { ++a; } It is legal for the compiler to translate the code into something like below: ++a; if (a == 1) { --a; } Therefore, there is no guarantee you will always see 0 in |a| when the initial value is 0 as far as multi-thread and crazy compiler optimization are concerned.
(In reply to JW Wang [:jwwang] from comment #11) See "Speculative code motion involving stores" in "Concurrency memory model compiler consequences" http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2338.html "It is generally not allowable to introduce stores to potentially shared memory locations that might not otherwise have been written between the same synchronization points." There may of course be some broken compilers, but, if compilers ignore this, then I don't know where to stop guessing what the compiler might do.
This seems to have caused bug 1068658. We can't start b2g any more with a debug gecko :(
Flags: needinfo?(kinetik)
Depends on: 1068658
Status: RESOLVED → REOPENED
Flags: needinfo?(kinetik)
Resolution: FIXED → ---
Target Milestone: mozilla35 → ---
per comment 8, this is not actually a TSAN violation
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Resolution: --- → INVALID
I am curious how a seemly innocent change causes bug 1068658.
sPreferredSampleRate was first set from cubeb_get_preferred_sample_rate(), and then to 44100 on subsequent InitPreferredSampleRate() calls.
I see. Using Atomic is not the cause.
The cleanup was still worthwhile, as is adding some explanation around why unlocked sPreferredSampleRate accesses are considered safe.
Status: RESOLVED → REOPENED
Resolution: INVALID → ---
Summary: Almost certain TSAN violation in cubeb PreferredSampleRate() → Clarify thread-safety of unlocked PreferredSampleRate calls
Attachment #8489754 - Attachment is obsolete: true
Attachment #8491235 - Flags: review?(karlt)
Attachment #8491235 - Flags: review?(karlt) → review+
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: