Closed Bug 904133 Opened 9 years ago Closed 9 years ago

IonMonkey: Store spilled float regs in safepoints

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla26

People

(Reporter: jandem, Assigned: jandem)

References

Details

Attachments

(1 file)

Attached patch PatchSplinter Review
We need this for try-catch to work in Ion. This fixes the 3rd testcase decoder found in bug 902978.

We will also need this for bug 716647, to trigger a bailout from a VM call when the debugger is activated with scripts on the stack.
Attachment #789039 - Flags: review?(nicolas.b.pierron)
Comment on attachment 789039 [details] [diff] [review]
Patch

Review of attachment 789039 [details] [diff] [review]:
-----------------------------------------------------------------

Safepoints have no way to safely support bailouts, Bug 850068 was intended to address this problem in the context of the debugger.

The model implemented as part of Bug 866888 part 5 is bogus. A Safepoint describes a state which is valid during the execution of an instruction.  This means, that all allocations captured by a Safepoint must remain as-is untouched during the execution of the Safepoint.  Adding more information into a Safepoint implies that we prevent the instruction to manipulate any of the registers, except with saveLive, restoreLive & restoreLiveIgnore which modifications are acknowledged by the machineState function.

Doing a bailout with a Safepoint implies that we are able to recover the state of all registers, but this is not correct.  Register values can be moved around after a call, even if they are captured by a Safepoint.  The only safe place to do a bailout is with a snapshot with a machine state extracted from a dump made by the bailout path.  So the only safe location for a bailout is an OSI point.  We cannot make Safepoint safe for bailouts without adding too many constraints 

We should back out https://hg.mozilla.org/mozilla-central/rev/383ccc9eb488 And re-implement this JSTRY_CATCH (in HandleExceptionIon) by resuming the execution and using the OSI point's bailout path.
Attachment #789039 - Flags: review?(nicolas.b.pierron) → review-
Nicolas, can you give an example of an actual LIR instruction where my try-catch implementation would fail? Preferably with a testcase. --ion-compile-try-catch seems to work fine other than a small number of fuzz bugs, so it can't be entirely "bogus".

It's true that my patch uses a combination of the call's safepoint + the OsiPoint that follows the call instruction, but I don't think that's really a problem. The register allocator will not insert move groups between an instruction and its OsiPoint (other than for the result of the LIR instruction maybe, which doesn't matter for try-catch). Furthermore, we do *exactly* the same thing in other places (see the frame iterator's thisObject method for instance).

(In reply to Nicolas B. Pierron [:nbp] from comment #1)
> We should back out https://hg.mozilla.org/mozilla-central/rev/383ccc9eb488
> And re-implement this JSTRY_CATCH (in HandleExceptionIon) by resuming the
> execution and using the OSI point's bailout path.

I think that's more complicated. One reason is that we'd have to patch and throw away the Ion code just to handle an exception. Another is that we're returning to JIT code from a VM call even though it threw an exception, so outparams may be bogus.
(In reply to Jan de Mooij [:jandem] from comment #2)
> Nicolas, can you give an example of an actual LIR instruction where my
> try-catch implementation would fail? Preferably with a testcase.
> --ion-compile-try-catch seems to work fine other than a small number of fuzz
> bugs, so it can't be entirely "bogus".

For the same reason why it is unsafe to read arguments out of a Safepoint with f.arguments, see Bug 757428.  It is challenging to make such test case, but this might happen.

As mentioned, more you add things in your Safepoint less freedom you have for optimizing. In fact dvander mentionned it in Bug 759312 comment 2, when the feature has been de-optimized, with the idea that we could come back to it.

Safepoint::spillRegs was just the beginning, even if it does not matter much for OOL paths, this could be interesting for inlined paths which are using callVM, where the register allocator can avoid spilling, and can allocate non-volatile registers for non-GC objects.  Sadly, nobody took time to avoid spilling non-volatile non-GC registers around LCallInstructions.  I don't want to give up on that.

Mainly, I just don't want to consider things be assumed to be as they are today, as if we start relying on the fact that this is working, we are going to prevent any optimization on the call-sites.  Such as avoiding spilling everything on call instructions. 

If you want some special CodeGen in which are doing some crazy things, I can give you
 - CodeGenerator::isInstanceOfV
 - CodeGenerator::visitValueToInt32

But I don't think these will cause anything which would only be wrong with the try-catch, as long as we do not have the same register allocation for the payload input of isInstanceOfV and its output.

> It's true that my patch uses a combination of the call's safepoint + the
> OsiPoint that follows the call instruction, but I don't think that's really
> a problem. The register allocator will not insert move groups between an
> instruction and its OsiPoint (other than for the result of the LIR
> instruction maybe, which doesn't matter for try-catch).

You need to be aware that an instruction can mutate its input registers.  A Safepoint account for the state which is at the 3/4 of the instruction execution.  And the Snapshot of the OSI Point account for the interpreter state after the instruction.

As the OSI point snapshots account for the state after the instruction, using it to interpret anything in between might lead to potentially bad interpretation of non-GC objects.  Only GC-objects should be considered as reliable in the Safepoints.

What strikes me as being extremely bad, is that one maps the allocation to their values, and the other maps their stack location to their allocations, taken at 2 different time and we rely on *a feature which cannot be enforced* to guarantee that they will match, except for the return value.

> Furthermore, we do
> *exactly* the same thing in other places (see the frame iterator's
> thisObject method for instance).

Be careful, all iterators are GC objects, and thus the safepoint must capture them for GCs in order for them to be marked.  But reading a snapshot with a safepoint is not garantee, only if you iterate on GC objects, which is for this exact reasons that the code for iterating frames skip over everything except the JSFunction which must be accessible, but this is still a GC object.

> (In reply to Nicolas B. Pierron [:nbp] from comment #1)
> > We should back out https://hg.mozilla.org/mozilla-central/rev/383ccc9eb488
> > And re-implement this JSTRY_CATCH (in HandleExceptionIon) by resuming the
> > execution and using the OSI point's bailout path.
> 
> I think that's more complicated. One reason is that we'd have to patch and
> throw away the Ion code just to handle an exception. Another is that we're
> returning to JIT code from a VM call even though it threw an exception, so
> outparams may be bogus.

I do not see the issue, we can do something like gdb and re-patch back the OSI Point with the original code.  We have to use only one OSI Point at a time.
The problem with OsiPoint patching is that it doesn't work if you have:

Ion frame 1 - try-catch
Baseline frame 2
Ion frame 3
Ion frame 4 - throws an exception

Currently the exception handler just unwinds all frames and resumes in frame 1. If you patch the OsiPoint in frame 1, how do you cause frames 2-4 to return immediately or be discarded?

(In reply to Nicolas B. Pierron [:nbp] from comment #3)
> As mentioned, more you add things in your Safepoint less freedom you have
> for optimizing. In fact dvander mentionned it in Bug 759312 comment 2, when
> the feature has been de-optimized, with the idea that we could come back to
> it.
>
> Safepoint::spillRegs was just the beginning, even if it does not matter much
> for OOL paths, this could be interesting for inlined paths which are using
> callVM, where the register allocator can avoid spilling, and can allocate
> non-volatile registers for non-GC objects.  Sadly, nobody took time to avoid
> spilling non-volatile non-GC registers around LCallInstructions.  I don't
> want to give up on that.
> 
> Mainly, I just don't want to consider things be assumed to be as they are
> today, as if we start relying on the fact that this is working, we are going
> to prevent any optimization on the call-sites.  Such as avoiding spilling
> everything on call instructions. 

We could still optimize that if the code does not contain try-catch. Deoptimizing only in try-blocks seems pretty reasonable to me.

> But I don't think these will cause anything which would only be wrong with
> the try-catch, as long as we do not have the same register allocation for
> the payload input of isInstanceOfV and its output.

It won't be the same register.

Given that (1) my patch works fine right now, (2) there's no good alternative and (3) we haven't found any real problems so far, I really think we should do what we can to make this work. I can audit all VM calls if you want..
(In reply to Jan de Mooij [:jandem] from comment #4)
> The problem with OsiPoint patching is that it doesn't work if you have:
> 
> Ion frame 1 - try-catch
> Baseline frame 2
> Ion frame 3
> Ion frame 4 - throws an exception
> 
> Currently the exception handler just unwinds all frames and resumes in frame
> 1. If you patch the OsiPoint in frame 1, how do you cause frames 2-4 to
> return immediately or be discarded?

HandleIonException would be patched at the same location, but not for doing a bailout, but for resuming into the OSI Point after the call.  When we enter the bailout from the OSI Point, we can just patch the OSI Point back to restore the original code.

> (In reply to Nicolas B. Pierron [:nbp] from comment #3)
> > As mentioned, more you add things in your Safepoint less freedom you have
> > for optimizing. In fact dvander mentionned it in Bug 759312 comment 2, when
> > the feature has been de-optimized, with the idea that we could come back to
> > it.
> >
> > Safepoint::spillRegs was just the beginning, even if it does not matter much
> > for OOL paths, this could be interesting for inlined paths which are using
> > callVM, where the register allocator can avoid spilling, and can allocate
> > non-volatile registers for non-GC objects.  Sadly, nobody took time to avoid
> > spilling non-volatile non-GC registers around LCallInstructions.  I don't
> > want to give up on that.
> > 
> > Mainly, I just don't want to consider things be assumed to be as they are
> > today, as if we start relying on the fact that this is working, we are going
> > to prevent any optimization on the call-sites.  Such as avoiding spilling
> > everything on call instructions. 
> 
> We could still optimize that if the code does not contain try-catch.
> Deoptimizing only in try-blocks seems pretty reasonable to me.
> 
> > But I don't think these will cause anything which would only be wrong with
> > the try-catch, as long as we do not have the same register allocation for
> > the payload input of isInstanceOfV and its output.
> 
> It won't be the same register.
> 
> Given that (1) my patch works fine right now, (2) there's no good
> alternative and (3) we haven't found any real problems so far, I really
> think we should do what we can to make this work. I can audit all VM calls
> if you want..

(3) I already did so, and I have not found anything.

(2) What about the double-patching of the OSI Point?

(1) That's does not mean that this is a sane approach.  We used to have this one assertion in the bailout code, which is checking that the stack depth was identical to the resume point size.  Sadly, the function was never meant to be used like that, and this assertion was wrong in the first place.  That's was not sane, and you can see the result in bug 799185 comment 8.

My point is that is seems harder to break the current bailout mechanism by patching the OSI Point twice than doing a bailout with information which is not guarantee to be in-sync.
Comment on attachment 789039 [details] [diff] [review]
Patch

Review of attachment 789039 [details] [diff] [review]:
-----------------------------------------------------------------

Asking for re-review, see bug 905148 and bug 905091.
Attachment #789039 - Flags: review- → review?(nicolas.b.pierron)
Comment on attachment 789039 [details] [diff] [review]
Patch

Review of attachment 789039 [details] [diff] [review]:
-----------------------------------------------------------------

r=me, when Bug 905148 has landed.
Attachment #789039 - Flags: review?(nicolas.b.pierron) → review+
https://hg.mozilla.org/mozilla-central/rev/3b4b734eaa1a
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla26
You need to log in before you can comment on or make changes to this bug.