Closed
Bug 1067414
Opened 10 years ago
Closed 10 years ago
Clarify thread-safety of unlocked PreferredSampleRate calls
Categories
(Core :: Audio/Video, defect)
Core
Audio/Video
Tracking
()
RESOLVED
FIXED
mozilla35
People
(Reporter: jesup, Assigned: kinetik)
References
Details
Attachments
(2 files, 1 obsolete file)
9.89 KB,
patch
|
jesup
:
review+
|
Details | Diff | Splinter Review |
4.11 KB,
patch
|
karlt
:
review+
|
Details | Diff | Splinter Review |
To avoid locking, cubeb uses InitPreferredSampleRate(), which locks, and PreferredSampleRate(), which doesn't.
The problem is that PreferredSampleRate can fall afoul of TSAN issues since it will be accessed from multiple threads with no interlocks. The compiler is allowed to do really stupid things and replace (temporarily) your nice preferred rate with nuclear launch codes.
Simplest solution is likely to make the static some form of atomic.
Reporter | ||
Comment 1•10 years ago
|
||
Also, I note that the sample rate is stored as a uint32_t, and returned as an int... likely there are some type confusions around this value and uses of it.
Comment 2•10 years ago
|
||
I don't understand.
I assume each thread will call InitPreferredSampleRate(), or use memory barriers to ensure that another thread has called InitPreferredSampleRate(), before using PreferredSampleRate().
InitPreferredSampleRate() uses a lock, which I assume acts a memory barrier to ensure the set value will be seen by any subsequent PreferredSampleRate().
Where is the opportunity for the compiler replace the value set to sPreferredSampleRate with something else?
Reporter | ||
Comment 3•10 years ago
|
||
https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
See "But what if we modify shared state with atomic operations but read with plain loads? Like in the following example:"
In theory:
Thread 1:
InitPreferredSampleRate()
Lock
Stores value to sSampleRate
UnLock
<stall>
Thread 2:
InitPreferredSampleRate()
Lock
<compiler spills nuclear launch codes to sSampleRate
<stall>
Thread 1:
x = PreferredSampleRate()
x == nuclear launch codes
Thread 2:
sets sSampleRate (again)
UnLock
Is it stupid that compilers can do this? Pretty much yeah. Is it wrong? Yes. Can you 99.9% of the time get away with it? Yes, but you never know when the compiler will decide to get "smart" and then you're in timing races. Plus this will cause random racy failures from TSAN runs.
Solution: Make it some form of Atomic.
Assignee | ||
Comment 4•10 years ago
|
||
Quick cleanup, no logic changes. Avoids exposing implementation in CubebUtils.h unnecessarily.
Attachment #8489752 -
Flags: review?(rjesup)
Assignee | ||
Comment 5•10 years ago
|
||
Fix potential thread-safety issue. Add a bit of documentation to explain what's going on.
Attachment #8489754 -
Flags: review?(rjesup)
Reporter | ||
Updated•10 years ago
|
Attachment #8489752 -
Flags: review?(rjesup) → review+
Reporter | ||
Updated•10 years ago
|
Attachment #8489754 -
Flags: review?(rjesup) → review+
Assignee | ||
Comment 6•10 years ago
|
||
Comment 7•10 years ago
|
||
(In reply to Randell Jesup [:jesup] from comment #3)
> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-
> could-possibly-go-wrong
Thanks for the link. I enjoyed reading the details, and the pictures.
The article gives examples of situations where races cause problems because
compilers assume there are no races. However, here there is no race in the
use of sPreferredSampleRate, and so there is no opportunity for the compiler
to break the intended logic.
> See "But what if we modify shared state with atomic operations but read with
> plain loads? Like in the following example:"
That section is an example of a read while a concurrent write may happen on
another thread. However, here there is no concurrent write on another thread
because the write happens only once and has already happened.
> In theory:
> Thread 1:
> InitPreferredSampleRate()
> Lock
> Stores value to sSampleRate
> UnLock
> <stall>
>
> Thread 2:
> InitPreferredSampleRate()
> Lock
> <compiler spills nuclear launch codes to sSampleRate
> <stall>
The articles explains that
"if a program stores to a variable X, the compiler can legally reuse X’s
storage for any temporal data inside of some region of code before the store
(e.g. to reduce stack frame size)."
However, here, in thread 2, the program will not store to sSampleRate and so
the compiler cannot assume that it will, and so cannot use it for temporal
data.
The argument in this bug report (but I didn't see it in the linked article)
seems to be that if a section of code may modify storage under one set of
conditions, then the compiler may assume that it modifies the storage under
any set of conditions.
I don't understand the basis for that argument.
Reporter | ||
Comment 8•10 years ago
|
||
(In reply to Karl Tomlinson (:karlt) from comment #7)
> (In reply to Randell Jesup [:jesup] from comment #3)
> > https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-
> > could-possibly-go-wrong
>
> Thanks for the link. I enjoyed reading the details, and the pictures.
It's a good article.
> The article gives examples of situations where races cause problems because
> compilers assume there are no races. However, here there is no race in the
> use of sPreferredSampleRate, and so there is no opportunity for the compiler
> to break the intended logic.
>
> > See "But what if we modify shared state with atomic operations but read with
> > plain loads? Like in the following example:"
>
> That section is an example of a read while a concurrent write may happen on
> another thread. However, here there is no concurrent write on another thread
> because the write happens only once and has already happened.
If so, then I likely misread the code... (reads) I did; I missed the "if sSampleRate == 0" part. I've seen the pattern I gave a bunch of times where the programmer thought "So long as all the setters use locks, and the readers don't care if they get value A or value B in a race, it's fine!" (And where people thought "if I set it to the same value, there can't be a race!"
Comment 9•10 years ago
|
||
(In reply to Randell Jesup [:jesup] from comment #8)
> And where people thought "if I set it to the same value, there can't be a
> race!"
Yes, I did not know that it was considered a race even if no threads modify a value but one thread writes the existing value.
https://hg.mozilla.org/mozilla-central/rev/750e7aaf09cb
https://hg.mozilla.org/mozilla-central/rev/a68c9d18ffa5
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
Comment 11•10 years ago
|
||
I would like to echo the part of "But what if we modify shared state with atomic operations but read with plain loads?"
IIRC, there is an article talking strict-aliasing which gives an example as below:
int a = ...;
if (a > 0) {
++a;
}
It is legal for the compiler to translate the code into something like below:
++a;
if (a == 1) {
--a;
}
Therefore, there is no guarantee you will always see 0 in |a| when the initial value is 0 as far as multi-thread and crazy compiler optimization are concerned.
Comment 12•10 years ago
|
||
(In reply to JW Wang [:jwwang] from comment #11)
See "Speculative code motion involving stores" in "Concurrency memory model compiler consequences" http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2338.html
"It is generally not allowable to introduce stores to potentially shared memory locations that might not otherwise have been written between the same synchronization points."
There may of course be some broken compilers, but, if compilers ignore this, then I don't know where to stop guessing what the compiler might do.
Comment 13•10 years ago
|
||
This seems to have caused bug 1068658. We can't start b2g any more with a debug gecko :(
Flags: needinfo?(kinetik)
Comment 14•10 years ago
|
||
Backed out for causing bug 1068658.
https://hg.mozilla.org/mozilla-central/rev/d2c01d77b9d0
Status: RESOLVED → REOPENED
Flags: needinfo?(kinetik)
Resolution: FIXED → ---
Target Milestone: mozilla35 → ---
Reporter | ||
Comment 15•10 years ago
|
||
per comment 8, this is not actually a TSAN violation
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → INVALID
Comment 16•10 years ago
|
||
I am curious how a seemly innocent change causes bug 1068658.
Comment 17•10 years ago
|
||
sPreferredSampleRate was first set from cubeb_get_preferred_sample_rate(), and then to 44100 on subsequent InitPreferredSampleRate() calls.
Comment 18•10 years ago
|
||
I see. Using Atomic is not the cause.
Assignee | ||
Comment 19•10 years ago
|
||
The cleanup was still worthwhile, as is adding some explanation around why unlocked sPreferredSampleRate accesses are considered safe.
Status: RESOLVED → REOPENED
Resolution: INVALID → ---
Summary: Almost certain TSAN violation in cubeb PreferredSampleRate() → Clarify thread-safety of unlocked PreferredSampleRate calls
Assignee | ||
Updated•10 years ago
|
Attachment #8489754 -
Attachment is obsolete: true
Assignee | ||
Comment 20•10 years ago
|
||
Attachment #8491235 -
Flags: review?(karlt)
Updated•10 years ago
|
Attachment #8491235 -
Flags: review?(karlt) → review+
Assignee | ||
Comment 21•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/f5d2ffa0f1a8
https://hg.mozilla.org/mozilla-central/rev/e14e8c4b3021
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
You need to log in
before you can comment on or make changes to this bug.
Description
•