Closed Bug 1216526 Opened 9 years ago Closed 3 years ago

TSan: data race nsprpub/pr/src/io/prlog.c:358 PR_NewLogModule (race on logModules)

Categories

(NSPR :: NSPR, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: jseward, Assigned: wtc, NeedInfo)

References

(Blocks 1 open bug)

Details

(Whiteboard: [tsan])

Attachments

(3 files)

Attached file TSan stack trace
The attached logfile shows a thread/data race detected by TSan (ThreadSanitizer).

* Specific information about this bug

PR_NewLogModule isn't thread-safe.  The problem occurs when two
uncoordinated threads try to add their newly allocated PRLogModuleInfo
objects to |logModules|, the linked list of such objects:

        lm->next = logModules;
        logModules = lm;

* General information about TSan, data races, etc.

Typically, races reported by TSan are not false positives, but it is
possible that the race is benign. Even in this case though, we should
try to come up with a fix unless this would cause unacceptable
performance issues. Also note that seemingly benign races can possibly
be harmful (also depending on the compiler and the architecture)
[1][2].

If the bug cannot be fixed, then this bug should be used to either
make a compile-time annotation for blacklisting or add an entry to the
runtime blacklist.

[1] http://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
[2] _How to miscompile programs with "benign" data races_: https://www.usenix.org/legacy/events/hotpar11/tech/final_files/Boehm.pdf
I tried the obvious thing, which is to bracket the two offending lines
with  _PR_LOCK_LOG / _PR_UNLOCK_LOG, thusly:

@@ -350,18 +350,20 @@ PR_IMPLEMENT(PRLogModuleInfo*) PR_NewLog
     lm = PR_NEWZAP(PRLogModuleInfo);
     if (lm) {
         lm->name = strdup(name);
         lm->level = PR_LOG_NONE;
+        _PR_LOCK_LOG();
         lm->next = logModules;
         logModules = lm;
+        _PR_UNLOCK_LOG();
         _PR_SetLogModuleLevel(lm);
     }
     return lm;
 }

That unfortunately causes it to segfault, because |_pr_logLock| is still
NULL at that point.  This implies that we're calling PR_NewLogModule
before _PR_InitLog (which sets up |_pr_logLock|).  That seems a bit
strange; is that correct/intended?
Attached patch A not good fixSplinter Review
Here's _a_ fix, that uses a PRLock to sequentialise access to |logModules|.
Unfortunately it has to be heap allocated, and hence leaked, which is ungood.
This is an alternative fix.  It creates a new global, internal lock
that is a peer of _pr_sleeplock and uses that instead.  I added code
to free it in PR_Cleanup(), like _pr_sleeplock is, but this never
appears to get called in a normal FX run.
Would either of those fixes be considered acceptable?
Flags: needinfo?(wtc)
Pretty sure this is a dup of bug 1073578, there's been a proposed fix for over a year.

Per bug 1073578 we are no longer using nspr logging. This (or similar) may reappear if we for any reason decided to use it again.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: