Closed
Bug 1216526
Opened 9 years ago
Closed 3 years ago
TSan: data race nsprpub/pr/src/io/prlog.c:358 PR_NewLogModule (race on logModules)
Categories
(NSPR :: NSPR, defect)
NSPR
NSPR
Tracking
(Not tracked)
RESOLVED
INCOMPLETE
People
(Reporter: jseward, Assigned: wtc, NeedInfo)
References
(Blocks 1 open bug)
Details
(Whiteboard: [tsan])
Attachments
(3 files)
7.94 KB,
text/plain
|
Details | |
1014 bytes,
patch
|
Details | Diff | Splinter Review | |
3.17 KB,
patch
|
Details | Diff | Splinter Review |
The attached logfile shows a thread/data race detected by TSan (ThreadSanitizer). * Specific information about this bug PR_NewLogModule isn't thread-safe. The problem occurs when two uncoordinated threads try to add their newly allocated PRLogModuleInfo objects to |logModules|, the linked list of such objects: lm->next = logModules; logModules = lm; * General information about TSan, data races, etc. Typically, races reported by TSan are not false positives, but it is possible that the race is benign. Even in this case though, we should try to come up with a fix unless this would cause unacceptable performance issues. Also note that seemingly benign races can possibly be harmful (also depending on the compiler and the architecture) [1][2]. If the bug cannot be fixed, then this bug should be used to either make a compile-time annotation for blacklisting or add an entry to the runtime blacklist. [1] http://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong [2] _How to miscompile programs with "benign" data races_: https://www.usenix.org/legacy/events/hotpar11/tech/final_files/Boehm.pdf
Reporter | ||
Comment 1•9 years ago
|
||
I tried the obvious thing, which is to bracket the two offending lines with _PR_LOCK_LOG / _PR_UNLOCK_LOG, thusly: @@ -350,18 +350,20 @@ PR_IMPLEMENT(PRLogModuleInfo*) PR_NewLog lm = PR_NEWZAP(PRLogModuleInfo); if (lm) { lm->name = strdup(name); lm->level = PR_LOG_NONE; + _PR_LOCK_LOG(); lm->next = logModules; logModules = lm; + _PR_UNLOCK_LOG(); _PR_SetLogModuleLevel(lm); } return lm; } That unfortunately causes it to segfault, because |_pr_logLock| is still NULL at that point. This implies that we're calling PR_NewLogModule before _PR_InitLog (which sets up |_pr_logLock|). That seems a bit strange; is that correct/intended?
Reporter | ||
Comment 2•9 years ago
|
||
Here's _a_ fix, that uses a PRLock to sequentialise access to |logModules|. Unfortunately it has to be heap allocated, and hence leaked, which is ungood.
Reporter | ||
Comment 3•9 years ago
|
||
This is an alternative fix. It creates a new global, internal lock that is a peer of _pr_sleeplock and uses that instead. I added code to free it in PR_Cleanup(), like _pr_sleeplock is, but this never appears to get called in a normal FX run.
Reporter | ||
Comment 4•9 years ago
|
||
Would either of those fixes be considered acceptable?
Flags: needinfo?(wtc)
Comment 5•9 years ago
|
||
Pretty sure this is a dup of bug 1073578, there's been a proposed fix for over a year.
Comment 6•3 years ago
|
||
Per bug 1073578 we are no longer using nspr logging. This (or similar) may reappear if we for any reason decided to use it again.
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → INCOMPLETE
You need to log in
before you can comment on or make changes to this bug.
Description
•