Closed
Bug 1266470
Opened 8 years ago
Closed 8 years ago
TSan: multiple data races on JSFunction::flags_ (in JSFunction::maybeRelazify, JSFunction::kind)
Categories
(Core :: JavaScript: GC, defect)
Core
JavaScript: GC
Tracking
()
RESOLVED
FIXED
mozilla50
People
(Reporter: jseward, Assigned: jandem)
Details
Attachments
(2 files)
6.04 KB,
text/plain
|
Details | |
17.03 KB,
patch
|
efaust
:
review+
|
Details | Diff | Splinter Review |
I'm not sure that these are really the same thing -- hence, I'm not sure if it is really wise to put them together in the same bug. But for better or worse, here they are. In any case, it's clear there is a bunch of racing happening on JSFunction::flags_, which seems to be a collection of bitfields. STR: do a TSan-enabled build. Then: TSAN_OPTIONS=suppressions=/home/sewardj/MOZ/SUPPS/tsan-empty.txt \ DISPLAY=:1.0 ./mach xpcshell-test --sequential --log-mach - \ devtools/server/tests/unit/test_nesting-01.js 2>&1 \ | tee spew-23-tsan-3-kind | grep SUMMARY Sometimes the summary is for maybeRelazify, sometimes for kind, but it is always for one or the other. In other words *a* race is always reported.
Reporter | ||
Comment 1•8 years ago
|
||
TSan complaints.
Reporter | ||
Comment 2•8 years ago
|
||
Brian, is this something you can look at? If not, can you suggest who would be a suitable person?
Flags: needinfo?(bhackett1024)
Comment 3•8 years ago
|
||
It's hard for me to tell whether this is a false positive or not. The reported race here is between the main thread relazifying functions at the start of a GC, and a helper thread performing an Ion compilation. We are supposed to be canceling all off thread compilations at the start of the GC, but I can't tell where this happens anymore. Immediately before the relazification stuff there is this block of code: /* For non-incremental GC the following sweep discards the jit code. */ if (isIncremental) { for (GCZonesIter zone(rt); !zone.done(); zone.next()) { gcstats::AutoPhase ap(stats, gcstats::PHASE_MARK_DISCARD_CODE); zone->discardJitCode(rt->defaultFreeOp()); } } The discardJitCode call will cause all off thread compilations to be canceled. But the comment on this block of code is plainly incorrect and doesn't address where jitcode is discarded / off thread compilations canceled for non-incremental GCs.
Flags: needinfo?(bhackett1024) → needinfo?(terrence)
Comment 4•8 years ago
|
||
Jan changed how we do relazification a couple months ago. Maybe he knows how this is supposed to work? If not, I'll try to figure out what should be happening.
Flags: needinfo?(terrence) → needinfo?(jdemooij)
Reporter | ||
Comment 5•8 years ago
|
||
(In reply to comment #3 and comment #4) I am not sure if this was clear from my original comments, but I suspect that this is a bitfield race. That is, there is no logical race, but because JSFunction::flags_ is treated as a grab-bag of bit-fields, and because individual fields cannot be updated atomically, races are reported as resulting from accesses to unrelated parts of the word. For example jsfun.h:252: void setKind(FunctionKind kind) { this->flags_ &= ~FUNCTION_KIND_MASK; this->flags_ |= static_cast<uint16_t>(kind) << FUNCTION_KIND_SHIFT; } assigns to some subfield of flags_ and writes back the other fields unchanged, which potentially overwrites concurrent changes to those other fields made by some other thread. Does that seem plausible?
Assignee | ||
Comment 6•8 years ago
|
||
(In reply to Terrence Cole [:terrence] from comment #4) > Jan changed how we do relazification a couple months ago. Maybe he knows how > this is supposed to work? Before the relazification rewrite we relazified in JSFunction::trace, so I think we had the same issue then (assuming we don't stop stop off-thread compilations before the mark phase)? > If not, I'll try to figure out what should be happening. That would be great! FWIW, there's also a CancelOffThreadIonCompile call in JitCompartment::sweep. Ideally there would be only 1 or 2 places where we have to do that for GC.
Flags: needinfo?(jdemooij) → needinfo?(terrence)
Comment 7•8 years ago
|
||
So I think the way it's supposed to work is that LIR should not have any direct GC pointers: we would not know to keep them rooted and they would race with the main thread. If LIR were allowed to have pointers, we'd have to stop and delete all ongoing compilations at the start of every GC slice, rather than once when we sweep the zone. This seems like it would destroy performance pretty effectively, so we should probably find a way to do whatever we're doing with this function without holding the function itself. Jan, what is this function even doing in LIR?
Flags: needinfo?(terrence) → needinfo?(jdemooij)
Comment 8•8 years ago
|
||
Note that this is not a GC race specifically. GC just happens to touch lots of stuff. Note 2: there are some "template" objects which are used by the BG thread, but we have mechanisms to root these on the main thread and ensure that they are not touched by other code while compilation is ongoing.
Assignee | ||
Comment 9•8 years ago
|
||
I'll try to set up a TSan shell build locally so I can look at these races.
Assignee | ||
Comment 10•8 years ago
|
||
This wraps the JSFunction* pointer in MCall et al in a WrappedFunction class. WrappedFunction has a constructor that copies the relevant data while we're still on the main thread, to avoid races off-thread. This fixes a ton of failures when I run jit-tests with TSan locally.
Assignee: nobody → jdemooij
Status: NEW → ASSIGNED
Flags: needinfo?(jdemooij)
Attachment #8763856 -
Flags: review?(efaustbmo)
Assignee | ||
Comment 11•8 years ago
|
||
Comment on attachment 8763856 [details] [diff] [review] Patch Review of attachment 8763856 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/MIR.h @@ +3841,5 @@ > + nargs_(fun->nargs()), > + isNative_(fun->isNative()), > + isConstructor_(fun->isConstructor()), > + isClassConstructor_(fun->isClassConstructor()), > + isSelfHostedBuiltin_(fun->isSelfHostedBuiltin()) I forgot to mention: an alternative is to add a FunctionFlags class wrapping the function's |uint16_t flags|, and then we can use that class here and in JSFunction. That's a bit more involved though and the compiler should be able to emit pretty efficient bit twiddling code for this.
Comment 12•8 years ago
|
||
Comment on attachment 8763856 [details] [diff] [review] Patch Review of attachment 8763856 [details] [diff] [review]: ----------------------------------------------------------------- This seems fine. I have no objections to this design. ::: js/src/jit/MIR.h @@ +3850,5 @@ > + bool isClassConstructor() const { return isClassConstructor_; } > + bool isSelfHostedBuiltin() const { return isSelfHostedBuiltin_; } > + > + // fun->native() and fun->jitInfo() can safely be called off-thread: these > + // fields never change. This is where it would be so nice to have a typesystem that could enforce that this is never modified in a racy manner... Oh, well.
Attachment #8763856 -
Flags: review?(efaustbmo) → review+
Reporter | ||
Comment 13•8 years ago
|
||
Can this be landed now?
Comment 14•8 years ago
|
||
Pushed by jandemooij@gmail.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/64248267e93e Fix a TSan data race on JSFunction flags. r=efaust
Comment 15•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/64248267e93e
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
status-firefox50:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
You need to log in
before you can comment on or make changes to this bug.
Description
•