TSAN detects data race when background ion compilation reads flag written by foreground wasm module instantiation
Categories
(Core :: JavaScript Engine: JIT, defect, P2)
Tracking
()
People
(Reporter: lth, Unassigned)
Details
(Keywords: csectype-race, sec-moderate)
Attachments
(1 file)
10.38 KB,
text/plain
|
Details |
This run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=4abdc181f48477c92abf51af5b3d09ab9da31c99&selectedJob=219742464 requires the latest patch on bug 1515917. Running with --ion-eager --wasm-gc, there's an apparent race between the foreground wasm subsystem instantiating a module that was already compiled and background ion subsystem jit-compiling JS code: the wasm instantiation is atomizing some strings for a property name while the ion compilation is checking whether something has been atomized. I'm attaching an excerpt from the log that provides details. I don't know if there's any kind of protocol in place that one's supposed to use to avoid this kind of thing... surely if background ion compilation can trip over this then there must be some guarantee that no foreground code performs an atomization like here. (Setting component as "jit" since I'm unconvinced that the jit's behavior is correct, but this is just to have somewhere to start. This could well be a wasm bug.)
Reporter | ||
Comment 1•5 years ago
|
||
Jan, is there a protocol for avoiding this kind of race that wasm is ignoring? I see no obvious guard in CodeGenerator::addSetPropertyCache() that protects against concurrent action by the main thread, so is there some kind of assumption that the main thread will not perform this type of update (atomization) while jit compilation is ongoing?
Updated•5 years ago
|
Comment 2•5 years ago
|
||
The problem comes from the fact that our CodeGenerator was not initially written to be executed out of the main thread. When calling CodeGenerator::addSetPropertyCache. However, while we are not supposed to do these access, I fail to see from where the race is coming from as JSString are not supposed to move unless we have a GC which is expecting to cancel the compilation. Also, I would expect WASM to not touch existing Atoms and only create new one. My guess would be that a WASM module attempt to atomize a JSString which also happen to be a property name used in JS. To work-around this TSan issue we could record the Atom status when reading the String during IonBuilder, but I suspect this would only hide this TSan issue and make the problem even worse. I expect the actual fix to involve some kind of JSString-freezing, as we do with Type inference, but at the time where IonBuilder iterate over the string.
Comment 3•5 years ago
•
|
||
This is interesting: atoms are supposed to be immutable, but what we have here is wasm code calling JS_AtomizeAndPinString and that's an exception to this I guess. It probably shouldn't use that API because pinned atoms are never ever collected... We really need to kill this API :/ For the Ion code, maybe we can get away with ensuring we can only get atoms there during compilation (then we can just assert isAtom in debug builds). That might be an invariant already, worth trying.
Updated•5 years ago
|
Reporter | ||
Comment 4•5 years ago
|
||
I suppose to work around the problem for now I could preallocate atoms _0, _1, ..., and also _0_low, _0_high, ... up to some reasonable number -- this is just a prototype, after all; eventually the "names" will become indexed properties and the low/high hacks will go away. Those predefinitions can be under a suitable ifdef. If we run out of names we could just throw. That would not fix this bug per se but it would unblock landing bug 1515917, and it would reduce the number of users of the bad API by one.
Reporter | ||
Comment 5•5 years ago
|
||
Andrew, I'd like to petition to move this to sec-moderate; to the best of my knowledge the bug is triggered by highly experimental code that is available only in Nightly and then only behind a flag.
Comment 6•5 years ago
|
||
The rating is based on the severity if something was enabled, but I can mark it as disabled.
Reporter | ||
Comment 7•5 years ago
|
||
A fix for the use in Wasm is being submitted on bug 1530360, where it is pitched as a non-security leak fix (which it is); this does not remove the race condition, but it unblocks other wasm work that couldn't land because of the race condition. The race condition is really brought on by the combination of the existence of the API and ion background compilation.
Comment 8•5 years ago
|
||
The first stack shows the JSString
being created on the main thread. setPinned()
is called in the context of this lock. That's the locking scheme that's supposed to make this safe.
So how does the other thread get hold of the same JSString
?
-
If through the atom table, that access should also acquire the lock on the relevant partition of the table.
-
If it's passed to the helper thread as part of the task, the communication channel should be synchronized.
Comment 9•5 years ago
|
||
(In reply to Jason Orendorff [:jorendorff] from comment #8)
The first stack shows the
JSString
being created on the main thread.setPinned()
is called in the context of this lock. That's the locking scheme that's supposed to make this safe.
There are no locks involved on the compilation thread, we just have a constant JSString*, likely an atom. The race is benign in that I think we can only pin atoms and that's not a flag Ion cares about, it just happens to be stored in the same flags word. As far as I know this affects not just Ion compilations but should also be possible to trigger with off-thread parsing etc.
Long-term atom pinning needs to go, but we're not there yet. I'll think about it more.
Comment 10•5 years ago
|
||
I think this is a regression from bug 1445196, it changed atoms from being immutable to having the is-pinned flag that is mutable.
Comment 11•5 years ago
|
||
See comment 9 and comment 10 - I think we need to either revert bug 1445196 (probably not feasible..) or work on static atoms and use them instead of pinned atoms (static atoms we're considering to save memory, too). In practice I think this race isn't an issue because different flags...
Updated•5 years ago
|
Comment 12•5 years ago
|
||
Lars, is this still nightly-only and behind a pref?
Reporter | ||
Comment 13•5 years ago
|
||
Wasm no longer triggers this bug, see comment 7. However, JS probably still has the issue, in principle; Jan would know whether it can actually be triggered. I suspect it's not urgent to resolve it.
Updated•5 years ago
|
Comment 14•5 years ago
|
||
We can make this sec-moderate. I don't think it's actually a security issue; compilation doesn't care about the isPinned flag.
Comment 15•3 years ago
|
||
Bug 1651723 also removed the write. Comment 7 explains that the read is fixed for WASM case. For the JS case, the read is covered by Bug 1623542 (which itself seems to be fixed by Bug 1660798). I'll mark this bug as fixed and we'll double-check Bug 1623542 and probably be able to close that as well.
Updated•3 years ago
|
Updated•2 years ago
|
Description
•