TM: Oracle bit vector allocation not thread-safe

RESOLVED FIXED

Status

()

defect
P1
critical
RESOLVED FIXED
10 years ago
10 years ago

People

(Reporter: jimb, Assigned: graydon)

Tracking

({fixed1.9.1})

unspecified
x86
macOS
Points:
---
Bug Flags:
blocking1.9.1 +

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(1 attachment)

(Reporter)

Description

10 years ago
The static variable 'oracle' in jstracer.cpp is shared by all runtimes, and thus all threads, but the avmplus::BitSet type used by the Oracle type does not use thread-safe techniques to grow its bit vector.  Thus, SpiderMonkey clients that might have multiple threads using the JIT at the same time, even from different runtimes, could encounter race conditions growing the bitvector.

avmplus::BitSet has no fixed size; when one tries to set a bit off the end of the current array of longs, BitSet calls calloc to get a larger array, and then uses free to dispose of the older array.  This could race with other threads writing to the old array.

The oracle should be made per-thread, or should be grown to its maximum size upon construction (although the type as defined in Nanojit/avmplus.h doesn't provide a neat way to do that).
Flags: blocking1.9.1?
(Reporter)

Updated

10 years ago
Priority: -- → P1
(Assignee)

Comment 1

10 years ago
Posted patch BandaidSplinter Review
This should fix matters for now; I'll file a new one about removing imprecise oracles altogether and perhaps localizing to a single thread.
Attachment #376329 - Flags: review?(gal)

Updated

10 years ago
Attachment #376329 - Flags: review?(gal) → review+

Updated

10 years ago
Flags: blocking1.9.1? → blocking1.9.1+
(Assignee)

Comment 3

10 years ago
This is currently making the ARM tinderbox orange due to a stats-mismatch in trace-tests. I have attempted to reproduce in scratchbox but cannot. Rather, I get quite a lot of *different* stats mismatches (different ones, in fact, depending on whether I disable the mandelbrot tests or not) and I get different stats-mismatch failures still if I back up to one of the former revisions.

IOW I have no idea how to diagnose this. I'd love some time from vlad to help. Alternatively the test should possibly be disabled on ARM? I don't see anything incorrect about the patch. Any guesses?

If there's no other option it can be backed out, but I think it's actually rather important to making the worker threads safe.
(Assignee)

Updated

10 years ago
Whiteboard: fixed-in-tracemonkey
(Assignee)

Comment 4

10 years ago
The orange ARM box spontaneously fixed itself due to an unrelated checkin, but I also landed bug 492124 which should prevent the confounding state in the JIT cache(s) / oracle from causing such things in the future.

Comment 5

10 years ago
http://hg.mozilla.org/mozilla-central/rev/1ae1672f6749
Status: NEW → RESOLVED
Last Resolved: 10 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.