Closed Bug 587288 Opened 10 years ago Closed 10 years ago

TM: Fix RegExp ExecutablePool race

Categories

(Core :: JavaScript Engine, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- betaN+

People

(Reporter: cdleary, Assigned: cdleary)

References

Details

(Whiteboard: [hardblocker] [fixed-in-tracemonkey])

Attachments

(2 files, 1 obsolete file)

In bug 587277 we fixed a release race by instroducing atomic ops for ExecutablePool refcounting. The real solution, as mentioned in that bug, is to put the ExecutableAllocator per-compartment so that no two threads may be changing the refcount for a given regexp object simultaneously.

This also means that when refcounts cross compartment boundaries they need to be duplicated, similar to scripts, dissimilar to JSObjects. A cross-compartment copy routine must be made, which gal said he would help hook up.
Moves the regexp executable allocator to the JSCompartment and gets rid of js::RegExp atomic refcounts -- js::RegExps must not be referenced across compartment boundaries. We also don't need to refcount during regexp_exec_sub.

WIP because it's failing xpcshell tests I have to look into.
Making a note that Dave created the concept of a JaegerCompartment, per bug 611589 comment 1.
Per-compartment is makes the live population of ExecutablePools large. To handle thread data races we need only put the ExecutablePool in question in JSThreadData (IINM -- please correct me if I'm missing something).

We should use JSThreadData for anything that does not strictly need to be per compartment, but that must be single-threaded and fully lock-free.

/be
I see bug 587277 comment 1 about JSThreadData not being appropriate because of migration, though. Is this for sure? Igor had pointed out other problems in any embedding that migrates OS threads without migrating JSThreadData.

How exactly do compartments migrate across threads? Workers can do that but then they'd need their ThreadData to migrate too.

/be
Multiple threads can touch compartments, but by design only one thread is supposed to be associated with a compartment at a time (you associate yourself with a compartment by doing AutoEnterCompartment ac; and such). So over the lifetime of a compartment it can wander between threads, but once a thread associates with it, no other thread can touch it until it de-associates again first.
(this is not enforced with asserts yet, but we have a bug filed for this for b8)
#4: I played around with per-compartment property trees this morning. I was worried about memory use but turns out this is actually nice because compartments are often short-lived (you go to cnn.com, compartment is born, you navigate away, it goes away after its out of the BF cache). So it might not be too bad that we move such stuff into compartment, in particular if we teach the BF cache to purge per-compartment cache structures.
I'm asking concretely about how compartments migrate in code we're getting ready to ship. If workers do it by migrating from one OS thread to another, don't we have problems if JSThreadData is not migrated too?

Per-compartment property tree does not in fact reduce sharing, since shapes depend on prototype object identities. So we have the memory hit already. Atoms are a different story but want a truly lock-free solution (bug 611589).

It is not at all clear that multithreaded embeddings would mind loss of atom sharing in a runtime full of threads. But compartments in our world are many while the main thread is one, and there's little cross-thread sharing outside of reserved word atoms.

This |compartments| >> |threads| > |runtimes| concern would seem to epply to at least atoms, ExecutablePools, TraceMonitors, PropertyCaches, StackSpaces.

/be
(In reply to comment #8)
> I'm asking concretely about how compartments migrate in code we're getting
> ready to ship. If workers do it by migrating from one OS thread to another,
> don't we have problems if JSThreadData is not migrated too?

Since there is not a 1-to-1 association between components and thread data, it seems that there would be a problem migrating a thread data if any one of its compartments changes OS thread.  This is the origin of the "zone" proposal mentioned in bug 611589 comment 1.
(In reply to comment #9)
> (In reply to comment #8)
> > I'm asking concretely about how compartments migrate in code we're getting
> > ready to ship. If workers do it by migrating from one OS thread to another,
> > don't we have problems if JSThreadData is not migrated too?
> 
> Since there is not a 1-to-1 association between components

"compartments", right? (We have component too ;-)

> and thread data, it
> seems that there would be a problem migrating a thread data if any one of its
> compartments changes OS thread.  This is the origin of the "zone" proposal
> mentioned in bug 611589 comment 1.

I'd like to shave close with Occam's razor. Do we really need more entities, especially if there are many compartments per thread/thread-data but they'd all migrate at once, to another OS thread, along with their thread-data?

If we have a hard use-case for compartments A, B, and C using thread-data T on OS thread #1 migrating such that A and B migrate to OS thread #2 and C to #3, I'd like to hear it. Otherwise let's not overdesign for this possibility and at least for now (if not forever) enforce a strict hierarchy.

/be
Hehe, yes, "compartment" :)

I am going on just my basic understanding of browser content + web workers: we have hundreds of compartments living on 1 OS thread and tying ExecutablePools to the JSThreadData is appropriate (you clearly agree).  And we have individual compartments for web workers that bounce around OS threads, making it initially seem like ExecutablePools should be tied to the compartment.  To unify and always store on the JSThreadData, though, we'd have to give each worker compartment its own JSThreadData that moves with it as you stated above (otherwise we'd have the problem I mentioned above).  If this is what you were suggesting, I'm sorry for misunderstanding, because JSThreadData would effectively become the "zone" I was asking for (and stop being the "data associated with a HW thread" that it is now -- perhaps deserving a rename).
Right, I'm suggesting whatever we call it (JSThreadData, Zone), we don't need two such concepts.

/be
When running pbiggar's multithreaded test (bug 619595) on helgrind, I
saw some racing within Yarr's code allocator.  cdleary suggested on
irc that it's basically the same bug as this one.

Error report is below.  There's a bunch of very similar ones, not just
this one, and they appear pretty consistently in my test runs.  It
appears to be a race on JSC::ExecutablePool::m_freePtr, maybe on other
fields too.

I noticed that there were no reports of races from the methodjit
calling into ExecutableAllocator, only from YARR.  So perhaps the
methodjit uses some other locking scheme to serialise its accesses to
the allocator (or uses one allocator per thread).  Chris mentioned
that the MJ allocator lives in the JSRuntime, so perhaps that's the
reason.

Anyway, possible crasher?  If the code for two regexps winds up
overlapping?



Possible data race during read of size 8 at 0x618f910 by thread #6
   at 0x5642AC: JSC::ExecutablePool::available() const
                (ExecutableAllocator.h:171)
   by 0x5A16B2: JSC::Yarr::RegexGenerator::compile
                (ExecutableAllocator.h:242)
   by 0x59963B: JSC::Yarr::jitCompileRegex (RegexJIT.cpp:1573)
   by 0x4B2F4A: js::RegExp::createObjectNoStatics (jsregexpinlines.h:442)
   by 0x4AC705: js::Parser::primaryExpr(js::TokenKind, int)
                (jsregexpinlines.h:391)
   by 0x4A7263: js::Parser::memberExpr(int) (jsparse.cpp:7360)
   by 0x4A7B54: js::Parser::unaryExpr() (jsparse.cpp:6733)
   by 0x4A7C2D: js::Parser::mulExpr() (jsparse.cpp:6567)
   by 0x4A7D97: js::Parser::addExpr() (jsparse.cpp:6553)
   by 0x4A7EFF: js::Parser::shiftExpr() (jsparse.cpp:6542)
   by 0x4A7FFC: js::Parser::relExpr() (jsparse.cpp:6520)
   by 0x4A81E7: js::Parser::eqExpr() (jsparse.cpp:6501)

 This conflicts with a previous write of size 8 by thread #3
   at 0x56536B: JSC::ExecutablePool::alloc(unsigned long)
                (ExecutableAllocator.h:155)
   by 0x5A184E: JSC::Yarr::RegexGenerator::compile (AssemblerBuffer.h:148)
   by 0x59963B: JSC::Yarr::jitCompileRegex (RegexJIT.cpp:1573)
   by 0x4CAD41: regexp_compile_sub_tail (jsregexpinlines.h:442)
   by 0x4CB34D: regexp_compile_sub (jsregexp.cpp:742)
   by 0x4CB7FA: regexp_construct (jsregexp.cpp:878)
   by 0x5885B7: CallCompiler::generateNativeStub() (jscntxtinlines.h:685)
   by 0x5859ED: js::mjit::ic::NativeNew (MonoIC.cpp:906)

 Address 0x618f910 is 16 bytes inside a block of size 88 alloc'd
   at 0x4C2841A: operator new(unsigned long) (vg_replace_malloc.c:261)
   by 0x421B4F: JSC::ExecutablePool::create(unsigned long)
                (ExecutableAllocator.h:135)
   by 0x418834: JSRuntime::init(unsigned int) (ExecutableAllocator.h:211)
   by 0x419011: JS_Init (jsapi.cpp:771)
   by 0x40A1F0: main (js.cpp:5565)
(In reply to comment #13)
> Chris mentioned
> that the MJ allocator lives in the JSRuntime, so perhaps that's the
> reason.

I may have misspoken, the MJ allocator lives in the JaegerCompartment (one per JSCompartment) and the regexp allocator lives in the JSRuntime. 

This looks like a missed race condition to me, should be fixed with this move to the compartment space. Marking as blocker.
blocking2.0: --- → ?
Summary: TM: Fix ExecutablePool release race properly (avoid bus locking) → TM: Fix RegExp ExecutablePool race
blocking2.0: ? → betaN+
Whiteboard: hardblocker
Moves the allocator to the compartment, remove ExecutablePool atomic refcounts, and no regexps ever have their guts traded.
Attachment #486662 - Attachment is obsolete: true
Attachment #501549 - Flags: review?(gal)
Remove the atomic refcounts from js::RegExp, add a bunch of assertSameCompartment checks, and clean up the surrounding code that uses those refcounts. (Adds quite a few comments.)
Attachment #501550 - Flags: review?(gal)
Comment on attachment 501549 [details] [diff] [review]
0. Compartmentalize regexp allocator.

># HG changeset patch
>+    void addRef()
>+    {
>+        ++m_refCount;
>+    }

Assert on it.

> 
>     //static PassRefPtr<ExecutablePool> create(size_t n)

?

---

How to we flush this cache? Maybe we could add an inactiity timer like bill added for traces?
Attachment #501549 - Flags: review?(gal) → review+
As a side note, the |RegExp.compile| builtin method is insidious! Swaps out the guts of a regexp object, so you have to be really careful when holding js::RegExp temporaries, a la bug 327170, because further code execution can drop your refcount underneath you.

I looked at all the current uses of RegExp::extractFrom and they seem okay. It's used in few enough places that I didn't think it was important to return an AutoRefCount<RegExp> from RegExp::extractFrom all the time.

@brendan: Any chance we can rip that deprecated method out for 4.0?
Comment on attachment 501550 [details] [diff] [review]
1. Nonatomic js::RegExp refcount.

>+    /* Note: should only be used when neither c1 nor c2 may be the default compartment. */
>+    static void check(JSCompartment *c1, JSCompartment *c2) {
>+        if (c1 != c2)
>+            fail(c1, c2);
>+    }

Assert that. c1->runtime->defaultCompartment.

>-/* Defined in the inlines header to avoid Yarr dependency includes in main header. */
>+/*
>+ * The "meat" of the builtin regular expression objects: it contains the
>+ * mini-program that represents the source of the regular expression. Excepting
>+ * refcounts, this is an immutable datastructure after compilation.
>+ *
>+ * Non-atomic refcounting is used, so single-thread invariants must be
>+ * maintained: we check regexp operations are performed in a single
>+ * compartment.
>+ *
>+ * @note    Defined in the inlines header to avoid Yarr dependency includes in
>+ *          main header.
>+ * @note    refCount cannot overflow because that would require more referring
>+ *          regexp objects than there is space for in addressable memory.

We don't really do this @note style elsewhere. Check with Brendan.
(In reply to comment #17)
> >     //static PassRefPtr<ExecutablePool> create(size_t n)
> 
> ?

I think that was kept in to keep merging easier -- an artifact from the macroassembler import, where their refcount templates were GPL'd. I can probably remove it now without any issues.

> How to we flush this cache? Maybe we could add an inactiity timer like bill
> added for traces?

We don't flush the regexp JIT code because they're tied to regexp object lifetimes. Flushing those code blocks would mean that we'd use a lazily compiled regexp JIT code policy. Not sure that that is a win on any real benchmarks, although you can imagine lots of microbenchmarks where it would be.
cdleary: I bet you can rip out compile.

No JavaDoc @note noise if you please :-P.

/be
Attachment #501550 - Flags: review?(gal) → review+
Pushed to try because of possible chrome concurrency issues that may pop up: http://hg.mozilla.org/try/rev/bb9264049093
The style you want, rather than @note, is NB:.  I don't remember seeing anything else ever used.  Note: works fine, but it's just not idiomatic.

Don't be hating too much on javadocs, Brendan.  :-)  They're adequate to the task, giving minimal structure to comments with a few extra readable characters.  It could be much worse -- you could be contending with comments containing full-blown XML a la C#.  :-P
Try looked good, pushing to tracemonkey:

http://hg.mozilla.org/tracemonkey/rev/f75602c6e521
http://hg.mozilla.org/tracemonkey/rev/b5ccae65b44e
Whiteboard: hardblocker → [hardblocker] [fixed-in-tracemonkey]
(In reply to comment #24)
> Try looked good, pushing to tracemonkey:

Now runs Helgrind-clean; +1 from here.
Ok, JavaDoc tokens are better than XML!

But they are noisy (it's not hateful to say that). The @ sign calls attention but to the keyword. It's for tools, not humans, not based on any manual of style people use.

/be
No longer blocks: 614155
You need to log in before you can comment on or make changes to this bug.