Closed Bug 1517452 Opened 5 years ago Closed 3 years ago

TSAN detects data race when background ion compilation reads flag written by foreground wasm module instantiation

Categories

(Core :: JavaScript Engine: JIT, defect, P2)

defect

Tracking

()

RESOLVED FIXED
Tracking Status
firefox67 --- disabled
firefox69 --- wontfix
firefox70 --- wontfix
firefox71 --- wontfix

People

(Reporter: lth, Unassigned)

Details

(Keywords: csectype-race, sec-moderate)

Attachments

(1 file)

Attached file tsan-error.txt
This run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=4abdc181f48477c92abf51af5b3d09ab9da31c99&selectedJob=219742464
requires the latest patch on bug 1515917.

Running with --ion-eager --wasm-gc, there's an apparent race between the foreground wasm subsystem instantiating a module that was already compiled and background ion subsystem jit-compiling JS code: the wasm instantiation is atomizing some strings for a property name while the ion compilation is checking whether something has been atomized.  I'm attaching an excerpt from the log that provides details.

I don't know if there's any kind of protocol in place that one's supposed to use to avoid this kind of thing... surely if background ion compilation can trip over this then there must be some guarantee that no foreground code performs an atomization like here.

(Setting component as "jit" since I'm unconvinced that the jit's behavior is correct, but this is just to have somewhere to start.  This could well be a wasm bug.)
Jan, is there a protocol for avoiding this kind of race that wasm is ignoring?  I see no obvious guard in CodeGenerator::addSetPropertyCache() that protects against concurrent action by the main thread, so is there some kind of assumption that the main thread will not perform this type of update (atomization) while jit compilation is ongoing?
Flags: needinfo?(jdemooij)
Group: javascript-core-security
The problem comes from the fact that our CodeGenerator was not initially written to be executed out of the main thread.
When calling CodeGenerator::addSetPropertyCache.

However, while we are not supposed to do these access, I fail to see from where the race is coming from as JSString are not supposed to move unless we have a GC which is expecting to cancel the compilation. Also, I would expect WASM to not touch existing Atoms and only create new one.

My guess would be that a WASM module attempt to atomize a JSString which also happen to be a property name used in JS.

To work-around this TSan issue we could record the Atom status when reading the String during IonBuilder, but I suspect this would only hide this TSan issue and make the problem even worse.

I expect the actual fix to involve some kind of JSString-freezing, as we do with Type inference, but at the time where IonBuilder iterate over the string.
This is interesting: atoms are supposed to be immutable, but what we have here is wasm code calling JS_AtomizeAndPinString and that's an exception to this I guess. It probably shouldn't use that API because pinned atoms are never ever collected... We really need to kill this API :/

For the Ion code, maybe we can get away with ensuring we can only get atoms there during compilation (then we can just assert isAtom in debug builds). That might be an invariant already, worth trying.
Flags: needinfo?(jdemooij)
Summary: TSAN failure when background ion compilation reads flag written by foreground wasm module instantiation → TSAN detects data race when background ion compilation reads flag written by foreground wasm module instantiation
I suppose to work around the problem for now I could preallocate atoms _0, _1, ..., and also _0_low, _0_high, ... up to some reasonable number -- this is just a prototype, after all; eventually the "names" will become indexed properties and the low/high hacks will go away.  Those predefinitions can be under a suitable ifdef.  If we run out of names we could just throw.

That would not fix this bug per se but it would unblock landing bug 1515917, and it would reduce the number of users of the bad API by one.

Andrew, I'd like to petition to move this to sec-moderate; to the best of my knowledge the bug is triggered by highly experimental code that is available only in Nightly and then only behind a flag.

Flags: needinfo?(continuation)

The rating is based on the severity if something was enabled, but I can mark it as disabled.

Flags: needinfo?(continuation)

A fix for the use in Wasm is being submitted on bug 1530360, where it is pitched as a non-security leak fix (which it is); this does not remove the race condition, but it unblocks other wasm work that couldn't land because of the race condition. The race condition is really brought on by the combination of the existence of the API and ion background compilation.

No longer blocks: 1515917

The first stack shows the JSString being created on the main thread. setPinned() is called in the context of this lock. That's the locking scheme that's supposed to make this safe.

So how does the other thread get hold of the same JSString?

  • If through the atom table, that access should also acquire the lock on the relevant partition of the table.

  • If it's passed to the helper thread as part of the task, the communication channel should be synchronized.

Flags: needinfo?(jdemooij)

(In reply to Jason Orendorff [:jorendorff] from comment #8)

The first stack shows the JSString being created on the main thread. setPinned() is called in the context of this lock. That's the locking scheme that's supposed to make this safe.

There are no locks involved on the compilation thread, we just have a constant JSString*, likely an atom. The race is benign in that I think we can only pin atoms and that's not a flag Ion cares about, it just happens to be stored in the same flags word. As far as I know this affects not just Ion compilations but should also be possible to trigger with off-thread parsing etc.

Long-term atom pinning needs to go, but we're not there yet. I'll think about it more.

I think this is a regression from bug 1445196, it changed atoms from being immutable to having the is-pinned flag that is mutable.

See comment 9 and comment 10 - I think we need to either revert bug 1445196 (probably not feasible..) or work on static atoms and use them instead of pinned atoms (static atoms we're considering to save memory, too). In practice I think this race isn't an issue because different flags...

Flags: needinfo?(jdemooij)
Assignee: nobody → sdetar

Lars, is this still nightly-only and behind a pref?

Flags: needinfo?(lhansen)

Wasm no longer triggers this bug, see comment 7. However, JS probably still has the issue, in principle; Jan would know whether it can actually be triggered. I suspect it's not urgent to resolve it.

Flags: needinfo?(lhansen) → needinfo?(jdemooij)
Assignee: sdetar → nobody

We can make this sec-moderate. I don't think it's actually a security issue; compilation doesn't care about the isPinned flag.

Flags: needinfo?(jdemooij)
Keywords: sec-highsec-moderate

Bug 1651723 also removed the write. Comment 7 explains that the read is fixed for WASM case. For the JS case, the read is covered by Bug 1623542 (which itself seems to be fixed by Bug 1660798). I'll mark this bug as fixed and we'll double-check Bug 1623542 and probably be able to close that as well.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Group: javascript-core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: