Closed Bug 1131759 Opened 9 years ago Closed 9 years ago

TSan: data race js/src/jsscript.cpp:2205 runtime / MarkScriptData during relazification

Categories

(Core :: JavaScript: GC, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla38
Tracking Status
firefox38 --- fixed
firefox45 --- fixed

People

(Reporter: froydnj, Assigned: shu)

References

(Blocks 1 open bug)

Details

(Whiteboard: [tsan])

Attachments

(2 files)

Attached file js-gc-race.txt
[cribbing from decoder's script, hopefully he won't mind]

The attached logfile shows a thread/data race detected by TSan (ThreadSanitizer).

Typically, races reported by TSan are not false positives, but it is possible that the race is benign. Even in this case though, we should try to come up with a fix unless this would cause unacceptable performance issues. Also note that seemingly benign races can possibly be harmful (also depending on the compiler and the architecture) [1].

If the bug cannot be fixed, then this bug should be used to either make a compile-time annotation for blacklisting or add an entry to the runtime blacklist.

I'm not entirely sure why TSan thinks the function is called "runtime"; it's pretty obviously called MarkScriptData:

http://mxr.mozilla.org/mozilla-central/source/js/src/jsscript.cpp#2205

[1] http://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
Here, let's put the bug in the correct component. >.<
Component: General → JavaScript: GC
Group: core-security
Do you know what you were running to trigger this?  It looks like two different worker threads were racing on something.
I was running dom/archivereader/test/.  The log says it took place just after the test_zip_in_zip.html test.  I've gotten it to happen twice, so...
That test may do something like fire off a bunch of workers at once, and run tests on them in parallel, which I could imagine would end up testing this sort of raciness in the JS engine.
Shu, do you have some idea what is going on here?  It looks like two different DOM workers are relazifying during GC and end up writing to the same location in a racy fashion.
Flags: needinfo?(shu)
Summary: TSan: data race js/src/jsscript.cpp:2205 runtime / MarkScriptData → TSan: data race js/src/jsscript.cpp:2205 runtime / MarkScriptData during relazification
This is racing on setting a flag to true on a shared thing. I'll put up a patch that changes the flag to a release-acquire atomic.
Flags: needinfo?(shu)
Attachment #8564444 - Flags: review?(terrence)
Comment on attachment 8564444 [details] [diff] [review]
Atomicize SharedScriptData::marked.

Review of attachment 8564444 [details] [diff] [review]:
-----------------------------------------------------------------

Makes sense.
Attachment #8564444 - Flags: review?(terrence) → review+
This looks fairly benign, but I don't know what "marked" is relied on for.
Keywords: sec-moderate
It's for making sure something isn't collected by GC. I would say this is sec low at most.
(In reply to Shu-yu Guo [:shu] from comment #10)
> It's for making sure something isn't collected by GC. I would say this is
> sec low at most.

Something isn't collected by GC?  Does that mean that without this patch, something could be collected by GC that wasn't supposed to, and we could have (the GC equivalent of) a UAF?
Flags: needinfo?(shu)
On second thought this isn't s-s at all. Please correct me if I'm wrong Terrence.
(In reply to Nathan Froyd [:froydnj] [:nfroyd] from comment #11)
> (In reply to Shu-yu Guo [:shu] from comment #10)
> > It's for making sure something isn't collected by GC. I would say this is
> > sec low at most.
> 
> Something isn't collected by GC?  Does that mean that without this patch,
> something could be collected by GC that wasn't supposed to, and we could
> have (the GC equivalent of) a UAF?

No, the race is on 2 threads setting a marked field to true. The race, regardless of which thread wins, will result in the thing getting marked. Falsifying the field happens in another phase and isn't racy.
Flags: needinfo?(shu)
Reading the flag doesn't race either as far as I can tell.
forgot to NI Terrence per comment 12.
Flags: needinfo?(terrence)
https://hg.mozilla.org/mozilla-central/rev/4a21032847e0
Assignee: nobody → shu
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla38
Group: core-security
Keywords: sec-moderate
(In reply to Shu-yu Guo [:shu] from comment #15)
> forgot to NI Terrence per comment 12.

Your analysis looks correct. The instance here is acquired by walking the script via the bytecode, so the shared bits can be updated simultaneously. Sweeping appears to be accessed via the runtime though, so should not race with other threads. (Caveat: I'm not at all familiar with SharedScript's ownership model.)
Flags: needinfo?(terrence)
backed out to investigate tsvgx regression:

https://hg.mozilla.org/integration/mozilla-inbound/rev/8d38dc1dcdeb
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
According to :jseward this is the most common race in js/ after bug 1219401 lands.

Can we do a Try push and see if tsvgx is still unhappy? :)
Flags: needinfo?(shu)
I'm assuming tsvgx runs under svgr. Try push here:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=df5ba8fa257f
Flags: needinfo?(shu)
https://hg.mozilla.org/mozilla-central/rev/3d00b054f0b0
Status: REOPENED → RESOLVED
Closed: 9 years ago9 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: