Closed Bug 970252 Opened 6 years ago Closed 6 years ago

Assertion failure: size_before == *profiler->size_, at vm/SPSProfiler.cpp:289

Categories

(Core :: JavaScript Engine, defect, critical)

x86_64
Linux
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla32

People

(Reporter: decoder, Assigned: djvj)

References

(Blocks 1 open bug)

Details

(Keywords: assertion, regression, testcase, Whiteboard: [jsbugmon:update,testComment=15,origRev=d7c07694f339])

Attachments

(3 files, 3 obsolete files)

The following testcase asserts on mozilla-central revision ecf20a2484b6 (run with --fuzzing-safe --ion-eager):


enableSPSProfilingAssertions(true);
function testcase() {
  function f(o) {
    function innerf(o, x) {
      try {
        f('');
      } catch (e) {}
    }
    return innerf(o, 42);
  }
  new f(testcase);
}
testcase();
Whiteboard: [jsbugmon:update,bisect]
Whiteboard: [jsbugmon:update,bisect] → [jsbugmon:update]
JSBugMon: Bisection requested, failed due to error (try manually).
Attachment #8373251 - Attachment is obsolete: true
autoBisect shows this is probably related to the following changeset:

The first bad revision is:
changeset:   http://hg.mozilla.org/mozilla-central/rev/e473c952d233
user:        Jan de Mooij
date:        Wed Aug 28 13:13:11 2013 +0200
summary:     Bug 909389 - Enable IonMonkey try-catch compilation by default. r=djvj

Jan, is bug 909389 a possible regressor?
Blocks: 909389
Flags: needinfo?(jdemooij)
Keywords: regression
Attached patch WIP v0Splinter Review
The problem is that we bailout, then hit the BAILOUT_RETURN_OVERRECURSED case. Then EnsureExitFrame effectively "pops" the Ion frame without popping SPS frames and the SPS stack becomes unbalanced. This patch fixes that problem and the test no longer asserts.

However, I realized the BAILOUT_RETURN_FATAL_ERROR case has the same problem, and that's a lot harder to get right because there are many cases where we return that :(

Kannan/Nicolas, any ideas?
Flags: needinfo?(nicolas.b.pierron)
Flags: needinfo?(kvijayan)
Flags: needinfo?(jdemooij)
An alternative is to not call probes::ExitScript and instead add SPSProfiler::popN(numFrames) and call that from Bailout and InvalidationBailout. Then we don't need the script+callee Vector and it's a bit simpler.

probes::ExitScript has this DTraceExitJSFun/cx->doFunctionCallback stuff. I doubt anyone uses that atm, it's not compatible with the JITs at all...
From what I understand this should not be needed anymore if we disable profiling of inlined frames as Kannan started.  From what I understand we want to pop SPS frames as part of HandleException, right?

Another option would be to store the number of SPS frame that we need to pop as part of the snapshot/recover-data, thus we can read the snapshot (before allocating anything) and pop the right number of SPS frames in case of failures.
Flags: needinfo?(nicolas.b.pierron)
(In reply to Nicolas B. Pierron [:nbp] from comment #7)
> From what I understand this should not be needed anymore if we disable
> profiling of inlined frames as Kannan started.

This works already but it's behind a pref (js_JitOptions.profileInlineFrames), so we'll still have to handle the case where it's turned on.

> From what I understand we
> want to pop SPS frames as part of HandleException, right?

In this case we have a normal (non-exception) bailout, then we call EnsureExitFrame in jit::Bailout and this means this bailing Ion frame is no longer "visited" in HandleException, because we turned it into an exit frame...

> Another option would be to store the number of SPS frame that we need to pop
> as part of the snapshot/recover-data, thus we can read the snapshot (before
> allocating anything) and pop the right number of SPS frames in case of
> failures.

The recover data also stores the number of frames, right? Could we use that here if profileInlineFrames is true?
(In reply to Jan de Mooij [:jandem] from comment #8)
> (In reply to Nicolas B. Pierron [:nbp] from comment #7)
> > Another option would be to store the number of SPS frame that we need to pop
> > as part of the snapshot/recover-data, thus we can read the snapshot (before
> > allocating anything) and pop the right number of SPS frames in case of
> > failures.
> 
> The recover data also stores the number of frames, right? Could we use that
> here if profileInlineFrames is true?

No, it contains the number of instructions (>= nbFrames), but if needed we can add it back.
(In reply to Jan de Mooij [:jandem] from comment #8)
> (In reply to Nicolas B. Pierron [:nbp] from comment #7)
> > From what I understand this should not be needed anymore if we disable
> > profiling of inlined frames as Kannan started.
> 
> This works already but it's behind a pref
> (js_JitOptions.profileInlineFrames), so we'll still have to handle the case
> where it's turned on.

I am open to removing that flag and just simplifying the code entirely.  BenWa would be onboard with that too.
(In reply to Jan de Mooij [:jandem] from comment #5)
> Created attachment 8403150 [details] [diff] [review]
> 
> However, I realized the BAILOUT_RETURN_FATAL_ERROR case has the same
> problem, and that's a lot harder to get right because there are many cases
> where we return that :(
> 
> Kannan/Nicolas, any ideas?

Can we just fix EnsureExitFrame to accept a JSContext, and have it pop the appropriate pseudo-stack frames as needed?

EnsureExitFrame should be able to get at the IonScript for any ion-compiled code on stack, which will inform it about the pseudo-stack status of any given ion frame.  It can use that to pop conditionally.

All the places where EnsureExitFrame is called have easy access to a JSContext.

This code will not be pretty though.  Lots of sadness, sadness everywhere.
Flags: needinfo?(kvijayan)
(In reply to Kannan Vijayan [:djvj] from comment #11)
> Can we just fix EnsureExitFrame to accept a JSContext, and have it pop the
> appropriate pseudo-stack frames as needed?

No EnsureExitFrame is not poping the frames, it just replace the parent frame type to mention that the current frame is being recycled.  We should not pop anything as in a normal process baseline stack is still valid and it should pop the right set of SPS frames.

Also, this is not the right location for manipulating the SPS stack.
Kannan can you take this? I've a pretty long list of needinfo/review requests :(

try-catch compilation exposed this somehow but the bug is not there. It's just a normal bailout where we hit the over-recursion check during bailout.

The best fix seems to be removing the profile-inline-frames flag, after that we can maybe pop the SPS frame in jit::Bailout and jit::InvalidationBailout where we call EnsureExitFrame.
Flags: needinfo?(kvijayan)
Assignee: nobody → kvijayan
Flags: needinfo?(kvijayan)
Blocks: 1002795
Whiteboard: [jsbugmon:update] → [jsbugmon:update,ignore]
JSBugMon: The testcase found in this bug no longer reproduces (tried revision b7062df8c7c3).
enableSPSProfilingWithSlowAssertions();
function testcase() {
  function f(o) {
    function innerf(o, x) {
      try {
        f('');
      } catch (e) {}
    }
    return innerf(o, 42);
  }
  new f(testcase);
}
testcase();
Whiteboard: [jsbugmon:update,ignore] → [jsbugmon:update,testComment=15,origRev=d7c07694f339]
After a ping from gkw, I started looking at this again today.  Inline frame profiling has been removed now, so the really hairy parts should be OK.

Took me a while to re-trace and confirm Jandem's diagnosis.  Here's what happens:

1. We hit an invalidation bailout.  The bailout expects to replace the ion frame on stack with a baseline frame.  It doesn't pop any SPS pseudo-stack frames for the replaced ion frame since we expect to still be in the script on resume.

2. The invalidation bailout starts constructing the baseline frame, but errors in the process, and returns with a non-BAILOUT_RETURN_OK status.

3. The trampoline code executed afterward removes the invalidated ion frame, expecting to replace it with a baseline frame.  But instead of jumping to the replacement code, it notices the error and jumps to the exception code.

4. The exception handler starts its frame iter at the caller frame (since the invalidated frame was removed from the C stack).  At this point, any SPS entry for the invalidated frame is still on stack, and the exception handler doesn't have any idea that it needs to be popped.


My tentative fix for this:

We note that BailoutIonToBaseline is the only place where this BAILOUT_RETURN_FATAL_ERROR and BAILOUT_RETURN_OVERRECURSED can occur.  In both cases, if we are not handling an exception, then we need to pop any SPS entry associated with the frame being bailed out.

So I wrapped BailoutIonToBaseline in a wrapper that examines its results, and if it's returning an error code and there's no exception pending, then we pop any SPS frame needed before returning.

Fixes bug on my build.  Running through try.
Attached patch fix-bug-970252.patch (obsolete) — Splinter Review
After thinking again about the problem, I do not think the problem is in the bailout, but in HandleException.

The bailout should not change the number of active frames on the stack, the HandleException code does.  There is no reason to pop SPS frames during the bailout, as this will hide the frame after the bailout.  In fact, we might even want to restore the SPS frame of the inlined frames during the bailout, if we did not had them before.

The only thing is that HandleException should look at the newly constructed baseline frame and pop the corresponding frame if the frame has the BaselineFrame::HAS_PUSHED_SPS_FRAME flag [1].  But to look for this flag, we might have to restore it on inline frames too [2].

Also, when we do OSR, we do not replace the current SPS frame by a new one, we have only one SPS frame for both Baseline and Ion.  Unless we distinguish the Jit frames now, in which case I am not aware of the code yet.

[1] http://dxr.mozilla.org/mozilla-central/source/js/src/jit/BaselineFrame.cpp#188
[2] http://dxr.mozilla.org/mozilla-central/source/js/src/jit/BaselineBailouts.cpp#561
That won't fix the issue here.  The problem is that the by the time we get to the exception handler, the frame has already been "popped" (in preparation for replacing it).  The exception handler can't check the frame against the pseudostack because the frame is not there anymore (since the exception occurred in the middle of clobbering the frame).

The only place this check can be done is in the bailout code, as it's the only code with access to both pieces of information: 1) the old frame script whose bailout caused the exception and which might have an SPS entry, and 2) the knowledge that it is an exception caused by frame unpacking during bailout.

If we wanted to put this check in the exception handling code, we'd have to thread #2 through to the exception handler (through various bits of trampoline code).  Seems like a lot of work that would make the design more complicated and less easy to understand.
Attached patch fix-bug-970252.patch (obsolete) — Splinter Review
Patch needed some minor changes to avoid try build failures and style-check errors.

New one works on try: https://tbpl.mozilla.org/?tree=Try&rev=1740d2efd554
Attachment #8426565 - Attachment is obsolete: true
Attachment #8427081 - Flags: review?(jdemooij)
Comment on attachment 8427081 [details] [diff] [review]
fix-bug-970252.patch

Review of attachment 8427081 [details] [diff] [review]:
-----------------------------------------------------------------

Wouldn't it also work to change the code in jit::Bailout and jit::InvalidationBailout:

    if (retval != BAILOUT_RETURN_OK)
        EnsureExitFrame(iter.jsFrame());

to:

    if (retval != BAILOUT_RETURN_OK) {
        JSScript *script = iter.script();
        probes::ExitScript(cx, script, script->functionNonDelazifying(), iter.ionScript()->hasSPSInstrumentation);
        EnsureExitFrame(iter.jsFrame());

    }

I think it'd be nice to have the ExitScript() right before the EnsureExitFrame, and this way we don't need to split up BailoutIonToBaseline.

::: js/src/jit/BaselineBailouts.cpp
@@ +1258,5 @@
>      return true;
>  }
>  
>  uint32_t
> +jit::BailoutIonToBaselineImpl(JSContext *cx, JitActivation *activation, IonBailoutIterator &iter,

static uint32_t
BailoutIonToBaselineImpl

::: js/src/jit/BaselineJIT.h
@@ +414,5 @@
>                       bool invalidate, BaselineBailoutInfo **bailoutInfo,
>                       const ExceptionBailoutInfo *exceptionInfo = nullptr);
>  
> +uint32_t
> +BailoutIonToBaselineImpl(JSContext *cx, JitActivation *activation, IonBailoutIterator &iter,

This doesn't need to be in the header file.
Attachment #8427081 - Flags: review?(jdemooij)
(In reply to Jan de Mooij [:jandem] from comment #21)
> Comment on attachment 8427081 [details] [diff] [review]
> fix-bug-970252.patch
> 
> Review of attachment 8427081 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Wouldn't it also work to change the code in jit::Bailout and
> jit::InvalidationBailout:
> 
>     if (retval != BAILOUT_RETURN_OK)
>         EnsureExitFrame(iter.jsFrame());
> 
> to:
> 
>     if (retval != BAILOUT_RETURN_OK) {
>         JSScript *script = iter.script();
>         probes::ExitScript(cx, script, script->functionNonDelazifying(),
> iter.ionScript()->hasSPSInstrumentation);
>         EnsureExitFrame(iter.jsFrame());
> 
>     }
> 
> I think it'd be nice to have the ExitScript() right before the
> EnsureExitFrame, and this way we don't need to split up BailoutIonToBaseline.

I was thinking that it would be nice to not duplicate the SPS pop logic, but this does seem simpler.
Blocks: 994406
Updated fix.
Attachment #8427081 - Attachment is obsolete: true
Attachment #8427799 - Flags: review?(jdemooij)
Comment on attachment 8427799 [details] [diff] [review]
fix-bug-970252.patch

Review of attachment 8427799 [details] [diff] [review]:
-----------------------------------------------------------------

Looks great.
Attachment #8427799 - Flags: review?(jdemooij) → review+
Added: https://hg.mozilla.org/integration/mozilla-inbound/rev/7809455a72a7

Will wait for uplift before adding testcase.
Flags: in-testsuite?
This seems good on the try run: https://tbpl.mozilla.org/?tree=Try&rev=eb8a724ec936

Crossing fingers and re-trying push:

https://hg.mozilla.org/integration/mozilla-inbound/rev/2e06119fd268
Flags: needinfo?(kvijayan)
https://hg.mozilla.org/mozilla-central/rev/2e06119fd268
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
Duplicate of this bug: 994406
You need to log in before you can comment on or make changes to this bug.