TSan: Thread data race in JSRuntime::triggerOperationCallback()

NEW
Unassigned

Status

()

Core
General
--
critical
5 years ago
5 years ago

People

(Reporter: posidron, Unassigned)

Tracking

({sec-want})

Trunk
x86_64
Linux
sec-want
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [tsan][tsan-test-blocker])

Attachments

(1 attachment)

(Reporter)

Description

5 years ago
Created attachment 717800 [details]
trace

During Firefox start-up with ThreadSanitizer (LLVM version), we get a data race reported as described in the attached log. Trace was created on mozilla-central with changeset 122820:c233837cce08.

According to the TSan devs, most of the reported traces should be real data races, even though they can be "benign". We need to determine if the race can/should be fixed, or put on the ignore list. Even for benign races, TSan devs suggest to fix them (second priority), as they can also cause problems [1].

[1] http://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
This particular race is intentional --- it is the mechanism used to allow other threads to interrupt the main javascript executing thread.  Its most important use is for showing the slow script dialog and aborting if a script is ilooping or similar.
Thanks for taking a look Brian! :) If there is no way or intention to get rid of the race (using atomic operations), then we'll add this signature to our compile-time ignore list.
Hmm, looking a bit more it is actually complaining about the writes to Ion's native stack limit rather than the interrupt flag.  This can also write/read race freely, but with Ion there are two pieces of data used for interrupting the main thread, and it is at least possible they could have bad interactions.  See the comment at the top here:

void
JSRuntime::triggerOperationCallback()
{
    /*
     * Invalidate ionTop to trigger its over-recursion check. Note this must be
     * set before interrupt, to avoid racing with js_InvokeOperationCallback,
     * into a weird state where interrupt is stuck at 0 but ionStackLimit is
     * MAXADDR.
     */
    mainThread.ionStackLimit = -1;

    /*
     * Use JS_ATOMIC_SET in the hope that it ensures the write will become
     * immediately visible to other processors polling the flag.
     */
    JS_ATOMIC_SET(&interrupt, 1);
}

Does the C++ memory model (ha ha) give any guarantees that the ionStackLimit write won't be reordered after the atomic write to interrupt?  I can never remember, and this happens-before reasoning tends to give me the creeps.  I think the writes on these two fields should just be protected by a lock, would be simple and bulletproof.  The write/read races would remain.  Does TSan report those?
(In reply to Brian Hackett (:bhackett) from comment #3)
> The write/read races would remain.  Does TSan report those?

Yes I think so, but if we can fix anything here as you suggested, that's a good start. We can look at the remaining issue then when it pops up.
You need to log in before you can comment on or make changes to this bug.