Closed Bug 1131715 Opened 9 years ago Closed 9 years ago

TSan: data race js/src/vm/Runtime.h:963 setNeedsIncrementalBarrier

Categories

(Core :: JavaScript: GC, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla39
Tracking Status
firefox39 --- fixed

People

(Reporter: froydnj, Assigned: terrence)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [tsan])

Attachments

(2 files)

Attached file js-gc-race.txt
[cribbing from decoder's script, hopefully he won't mind]

The attached logfile shows a thread/data race detected by TSan (ThreadSanitizer).

Typically, races reported by TSan are not false positives, but it is possible that the race is benign. Even in this case though, we should try to come up with a fix unless this would cause unacceptable performance issues. Also note that seemingly benign races can possibly be harmful (also depending on the compiler and the architecture) [1].

If the bug cannot be fixed, then this bug should be used to either make a compile-time annotation for blacklisting or add an entry to the runtime blacklist.

[1] http://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
Nathan, thanks for filing all of these!
Blocks: GC.stability
It looks like a DOM Worker thread is looking at the same JSRuntime as the main thread? Main thread should be seeing a different JSRuntime::needsIncrementalBarrier from a DOM worker at all times, so it seems like something is going badly wrong here.

Ben, can you tell if there's anywhere we could be switching to an invalid context from the stack here?
Group: core-security
ni for comment 2
Flags: needinfo?(bent.mozilla)
(In reply to Terrence Cole [:terrence] from comment #2)

No, that worker stack looks normal to me... But minorGCImpl has a lot of stuff in it that mentions the main thread... https://dxr.mozilla.org/mozilla-central/source/js/src/jsgc.cpp?from=minorGCImpl#6374
Flags: needinfo?(bent.mozilla)
MainThread... of the JSRuntime, as opposed to one of the ExclusiveContext compartment threads, PJS threads, or (before they were made global) the helper threads. That usage should be fine. :-(
Is there a static member anywhere here?
In js/src/gc/GCTrace.cpp I see:
  static FILE *gcTraceFile = nullptr;
  static HashSet<const Class *, DefaultHasher<const Class *>, SystemAllocPolicy> tracedClasses;
  static HashSet<const ObjectGroup *, DefaultHasher<const ObjectGroup *>, SystemAllocPolicy> tracedGroups;

  js::gc::TraceTypeNewScript also has a static buffer.
js/src/gc/Memory.cpp has:
  static size_t pageSize = 0;
  static size_t allocGranularity = 0;
which is maybe covered by another bug.

js/src/gc/Statistics.cpp has:
  static ExtraPhaseInfo phaseExtra[PHASE_LIMIT] = { { 0, 0 } };
  static mozilla::Vector<Phase> dagDescendants[Statistics::MAX_MULTIPARENT_PHASES + 1];
  static bool initialized in Statistics::Statistics()

That's all I see in js/src/gc/.
Terrence, could you look into this some more?  Thanks.
Flags: needinfo?(terrence)
(In reply to Andrew McCreight [:mccr8] from comment #9)
> Terrence, could you look into this some more?  Thanks.

Thanks for the reminder! I was meaning to take a deeper look here, but it slipped off my radar.

Static data is a good idea, but I don't think it could be implicated here: the only thing that JSRuntime::needsIncrementalBarrier and JSRuntime::setNeedsIncrementalBarrier can possibly race on is JSRuntime::needsIncrementalBarrier_, as that is the only data they touch. It is not static. Indeed, that jibes with the TSan output with the location description of: "Location is heap block of size 57928 at 0x7dc000040000 allocated by main thread:".

And we are indeed setting JSRuntime::needsIncrementalBarrier_ from a main-thread minor GC. However, the racing access is from JSRuntime::needsIncrementalBarrier() on: "Thread T42 'DOM Worker' (tid=4140, running) created by main thread at:". Expanding the inlining, the racy stack is NativeObject::set -> HeapSlot::set -> BarrieredBase<Value>::pre -> InternalGCMethods<Value>::preBarrier. This method looks like:

        if (v.isMarkable() && shadowRuntimeFromAnyThread(v)->needsIncrementalBarrier())
            preBarrier(ZoneOfValueFromAnyThread(v), v);

Thus, the cross-thread runtime violation must come from shadowRuntimeFromAnyThread(v). This method acquires the runtime pointer via the Chunk trailer, which means that the |v| is indeed allocated in a different runtime. However, there is one case this is actually allowed to happen, and indeed, the zone-taking preBarrier -- called right after the racing access -- has a special check for that case:

        if (v.isString() && StringIsPermanentAtom(v.toString()))
            return;

Thus, this code will always exit before doing the barrier work, regardless of what value it reads via the race; therefore, the security implications here are nil. We should still refactor the code to avoid the race, but we can open it in the meantime.
Group: core-security
Flags: needinfo?(terrence)
Assignee: nobody → terrence
Status: NEW → ASSIGNED
Attachment #8569282 - Flags: review?(jcoppeard)
Attachment #8569282 - Flags: review?(jcoppeard) → review+
https://hg.mozilla.org/mozilla-central/rev/be71e3aadc35
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: