Closed Bug 938124 Opened 6 years ago Closed 6 years ago

Add --ion-check-thread-safety mode to shell

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla29

People

(Reporter: bhackett1024, Assigned: bhackett1024)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

Attached patch patchSplinter Review
Now that IonBuilder doesn't have access to a JSContext its ability to interact with the rest of the VM is pretty limited.  It still can, however, read (and even modify) heap state and it would be nice to have controls on this behavior.

The attached patch adds a new mode to Ion, --ion-check-thread-safety, which does the following things in debug builds:

- Before compiling with Ion, the entire GC heap is mprotected, so that IonBuilder will segv when accessing any GC thing.

- Adds classes AutoUnprotectCell and AutoUnprotectCellUnderCompilationLock for temporarily allowing access to a GC thing (and anything else in its arena/page).  For now these classes are identical, since there is no compilation lock, but for now these allow distinguishing between accesses IonBuilder makes to memory which is either immutable or will only be modified under the compilation lock.

Ideally these two categories are the only memory that IonBuilder would access, which would prevent all races when compiling scripts.  There are some, however that are tolerated; AutoUnprotectCell doesn't help at all in preventing races, just in identifying and looking for them more easily.  The races that are currently tolerated are:

- Reading script use counts.

- Reading script bitfields.  There are a ton of these and some change after the script is created (one, |uninlineable|, is even changed within IonBuilder itself) and there could be races when the main thread writes back modified bitmasks.

- Reading shape slots.  The number of linear searches on a shape is stored in the same word as the slot, and can be modified at any time by the main thread.

- Testing obj->isBoundFunction.  This accesses the shape and is an obscure spot I'd like to fix.
Attachment #831498 - Flags: review?(jdemooij)
Comment on attachment 831498 [details] [diff] [review]
patch

Review of attachment 831498 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/gc/Heap.h
@@ +1108,5 @@
>  }
>  
>  } /* namespace gc */
> +
> +class AutoUnprotectCell

We should have a comment here explaining what this class is for and when it's safe to use it.

::: js/src/jit/IonBuilder.cpp
@@ +5847,5 @@
>                  existingValue = baselineFrame_->thisValue();
>              } else if (arg < info().nargs()) {
> +                if (info().needsArgsObj()) {
> +                    // Note: baseline frame contents need to be copied into temporary space
> +                    // when triggering an OSR off thread compilation.

Good catch, please file a follow-up bug. When we enter a script via OSR, we already copy the BaselineFrame to the heap (see PrepareOsrTempData), it would be nice if we could use the same buffer here. The only issue is that since bug 907187 we no longer copy the args/this, seems like for this case we do need access to |this| + formals.

(Also note that IonOsrTempData::baselineFrame is not a BaselineFrame*, it corresponds to the frame pointer, but should be easy to add a method that returns a BaselineFrame* by subtracting from baselineFrame.).

@@ -6291,5 @@
>  
>      MDefinition *value = current->peek(-1);
>  
> -    if (staticObject->watched())
> -        return jsop_setprop(name);

Why can we remove this?

::: js/src/vm/Runtime.cpp
@@ +835,5 @@
> +    }
> +
> +    arena = base;
> +
> +    if (mprotect(arena, sizeof(Arena), PROT_READ | PROT_WRITE))

Can we make this PROT_READ, and use PROT_WRITE only for AutoUnprotectCellForWrite?
Attachment #831498 - Flags: review?(jdemooij) → review+
(In reply to Jan de Mooij [:jandem] from comment #1)
> ::: js/src/jit/IonBuilder.cpp
> @@ +5847,5 @@
> >                  existingValue = baselineFrame_->thisValue();
> >              } else if (arg < info().nargs()) {
> > +                if (info().needsArgsObj()) {
> > +                    // Note: baseline frame contents need to be copied into temporary space
> > +                    // when triggering an OSR off thread compilation.
> 
> Good catch, please file a follow-up bug. When we enter a script via OSR, we
> already copy the BaselineFrame to the heap (see PrepareOsrTempData), it
> would be nice if we could use the same buffer here. The only issue is that
> since bug 907187 we no longer copy the args/this, seems like for this case
> we do need access to |this| + formals.
> 
> (Also note that IonOsrTempData::baselineFrame is not a BaselineFrame*, it
> corresponds to the frame pointer, but should be easy to add a method that
> returns a BaselineFrame* by subtracting from baselineFrame.).

I think we'll need a different mechanism for compatibility with GGC.  The baseline frame that we triggered the OSR compilation from could contain nursery pointers, and we don't let the compiler have any access to nursery pointers so that minor collections can happen on the main thread without regard for what off thread compilation is doing.  Instead, I think we need a new structure that mainly just records the types of this/formals/stack in the baseline frame.

> @@ -6291,5 @@
> >  
> >      MDefinition *value = current->peek(-1);
> >  
> > -    if (staticObject->watched())
> > -        return jsop_setprop(name);
> 
> Why can we remove this?

When objects are watched we already mark the associated properties as configured (baseops::Watch) which will also cause setStaticName to fall back to jsop_setprop.

> ::: js/src/vm/Runtime.cpp
> @@ +835,5 @@
> > +    }
> > +
> > +    arena = base;
> > +
> > +    if (mprotect(arena, sizeof(Arena), PROT_READ | PROT_WRITE))
> 
> Can we make this PROT_READ, and use PROT_WRITE only for
> AutoUnprotectCellForWrite?

I'd like to just remove AutoUnprotectCellForWrite --- there's only one use of it and it should be easy enough to refactor so that scripts are only marked as uninlineable on the main thread.
Blocks: 939438
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/8fefcd2bb154 for permaorange on debug 10.6 (https://tbpl.mozilla.org/php/getParsedLog.php?id=30807856&tree=Mozilla-Inbound#error19) and permared (by way of making the run take over the maximum allowed time) on debug 10.7 (https://tbpl.mozilla.org/php/getParsedLog.php?id=30807982&tree=Mozilla-Inbound#error19) and intermittent on Linux (https://tbpl.mozilla.org/php/getParsedLog.php?id=30801437&tree=Mozilla-Inbound#error19) timeouts in the devtools browser_gcli_ tests.

Probably also going to wind up being the cause of permaorange timeouts in debug ASan jit-tests in TypedObject/jit-complex.js and TypedObject/jit-prefix.js (https://tbpl.mozilla.org/php/getParsedLog.php?id=30803607&tree=Mozilla-Inbound), but that's not done doing retriggers yet - the proof will be if https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=957acbe8ca22&jobname=debug%20asan winds up green and https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=00644e4b067d&jobname=debug%20asan winds up timing out in those two tests.
Or, we may never know: there's one green above you now, making it either intermittent or started by something vanishingly unlikely to have started it.
Filed the ASan jit-test thing as bug 940868, because I doubt my ability to get any kind of idea what caused it by retriggering four hour builds.
Push with just the actual VM tweaks, and not any of the AutoUnprotectCell stuff:

https://hg.mozilla.org/integration/mozilla-inbound/rev/35c62ee3a3f8

Try:

https://tbpl.mozilla.org/?tree=Try&rev=353cc65a96d1

This this try push works and the earlier inbound one didn't, it seems pretty clear there isn't anything in the overall patch at fault and adding some inactive debug code just made these tests start timing out.  I'm not quite sure yet what the plan is about this but for now I'm going to table it and do the (small) remaining work to get off thread IonBuilder working --- IonBuilder should be pretty stable again now and it isn't super critical to get this analysis working ASAP.
Whiteboard: [leave open]
(In reply to Brian Hackett (:bhackett) from comment #7)
> I'm not quite
> sure yet what the plan is about this but for now I'm going to table it and
> do the (small) remaining work to get off thread IonBuilder working ---
> IonBuilder should be pretty stable again now and it isn't super critical to
> get this analysis working ASAP.

Can we get this in while I review bug 785905, at least for the shell? Making the AutoUnprotectCell constructor/destructor code inlinable (at least the analysis-disabled part) should probably help.

Without this analysis it's a matter of days before somebody calls numFixedSlots instead of numFixedSlotsForCompilation :)
Add the protection classes by themselves, with no unprotect uses:

https://hg.mozilla.org/integration/mozilla-inbound/rev/06f844b81f3d
(In reply to Jan de Mooij [:jandem] from comment #9)
> Can we get this in while I review bug 785905, at least for the shell? Making
> the AutoUnprotectCell constructor/destructor code inlinable (at least the
> analysis-disabled part) should probably help.
> 
> Without this analysis it's a matter of days before somebody calls
> numFixedSlots instead of numFixedSlotsForCompilation :)

Oops, missed this.  The try run I did after this bounced rejiggered things so that the AutoUnprotect ctor/dtor just checked a static variable and didn't do any uninlined calls.  So this bogus test failure has pretty much shot down the bigger patch.  I'm going to land things piecemeal and see what happens, which has the added benefit of doing things cleaner, by using proper accessors for threadsafe data (e.g. script->code() instead of both script->code and script->getCode()).
Should we fuzz with --ion-check-thread-safety now or wait until this bug is marked as fixed?
(In reply to Jesse Ruderman from comment #13)
> Should we fuzz with --ion-check-thread-safety now or wait until this bug is
> marked as fixed?

No, fuzzing should wait until this is marked as fixed (when the rest of the unprotects go in and --ion-check-thread-safety --ion-eager passes jit-tests).
Depends on: 944946
Depends on: 947782
Depends on: 948230
Depends on: 949747
This push does the rest of the patch, so that IonBuilder passes the thread safety analysis when running with --ion-eager.  While the OS X timeouts went away some Linux x86 browser-chrome ones arrived, so I just enabled the thread safety analysis only on x64, non-windows builds.

https://hg.mozilla.org/integration/mozilla-inbound/rev/3eb853546cff

https://tbpl.mozilla.org/?tree=Try&rev=2cbdc6bf9912
Whiteboard: [leave open]
Duplicate of this bug: 951957
https://hg.mozilla.org/mozilla-central/rev/3eb853546cff
Assignee: nobody → bhackett1024
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
(In reply to Brian Hackett (:bhackett) from comment #14)
> (In reply to Jesse Ruderman from comment #13)
> > Should we fuzz with --ion-check-thread-safety now or wait until this bug is
> > marked as fixed?
> 
> No, fuzzing should wait until this is marked as fixed (when the rest of the
> unprotects go in and --ion-check-thread-safety --ion-eager passes jit-tests).

Since this has been marked fixed, I've added support for this flag in fuzzing rev 6d44953d8704.
(In reply to Gary Kwong [:gkw] [:nth10sd] from comment #18)
> Since this has been marked fixed, I've added support for this flag in
> fuzzing rev 6d44953d8704.

needinfo from decoder to make sure LangFuzz knows about this too.
Flags: needinfo?(choller)
I've enabled this in LangFuzz for threadsafe builds a while ago.
Flags: needinfo?(choller)
You need to log in before you can comment on or make changes to this bug.