Jit test failures with GVN off.

RESOLVED FIXED in Firefox 52

Status

()

P3
normal
RESOLVED FIXED
4 years ago
2 years ago

People

(Reporter: dougc, Unassigned)

Tracking

unspecified
mozilla52
x86_64
Linux
Points:
---

Firefox Tracking Flags

(firefox52 fixed)

Details

Attachments

(5 attachments, 2 obsolete attachments)

(Reporter)

Description

4 years ago
While exploring GVN, some jit test failures were noted with '--ion-gvn=off'.

./jit_test.py --show-failed-cmd --tbpl --args --ion-gvn=off js

fails on the following commands:

js --ion-gvn=off -f lib/prolog.js --js-cache .js-cache --ion-eager --ion-check-range-analysis --no-sse3 --no-threads --ion-offthread-compile=off -e "const platform='linux2'; const libdir='lib/'; const scriptdir='tests/asm.js/'" -f tests/asm.js/bug1126251.js

js --ion-gvn=off -f lib/prolog.js --js-cache .js-cache -e "const platform='linux2'; const libdir='lib/'; const scriptdir='tests/asm.js/'" -f tests/asm.js/testControlFlow.js

js --ion-gvn=off -f lib/prolog.js --js-cache .js-cache --ion-eager --ion-offthread-compile=off -e "const platform='linux2'; const libdir='lib/'; const scriptdir='tests/asm.js/'" -f tests/asm.js/testControlFlow.js

js --ion-gvn=off -f lib/prolog.js --js-cache .js-cache --baseline-eager -e "const platform='linux2'; const libdir='lib/'; const scriptdir='tests/asm.js/'" -f tests/asm.js/testControlFlow.js

js --ion-gvn=off -f lib/prolog.js --js-cache .js-cache --baseline-eager --no-fpu -e "const platform='linux2'; const libdir='lib/'; const scriptdir='tests/asm.js/'" -f tests/asm.js/testControlFlow.js

js --ion-gvn=off -f lib/prolog.js --js-cache .js-cache --no-baseline --no-ion -e "const platform='linux2'; const libdir='lib/'; const scriptdir='tests/asm.js/'" -f tests/asm.js/testControlFlow.js

The failure on bug1126251.js is avoided if --ion-check-range-analysis is removed which might be a clue.

Is the compiler expected to work with GVN disabled?

Might it be possible to move GVN and LICM after range analysis.
Created attachment 8564274 [details] [diff] [review]
no-gvn.patch

The failure on bug 1126251.js is likely the same as bug 1130618.

Several of the others are caused by a codegen optimization to skip "trivial" blocks -- blocks containing just a goto (which occur because of split critical edges, etc.). If the function entry itself is a trivial block, the optimization gets confused because it assumes blocks in a sequence of trivial blocks are never explicitly entered. GVN optimizes code like this away. The attached patch fixes it, though it's a little rough.
(Reporter)

Comment 2

4 years ago
(In reply to Dan Gohman [:sunfish] from comment #1)
> Created attachment 8564274 [details] [diff] [review]
> no-gvn.patch

Thank you, the patch fixes the jit-test failures, except test bug1126251.js and I'll follow bug 1130618.

Fwiw moving GVN and LICM after range analysis seems to work with this patch and the workaround in bug 1133203. The large asm.js demos I use for testing still run fine.
New results:

    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache --ion-eager --ion-offthread-compile=off -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/basic/'" -f js/src/jit-test/tests/basic/bug1296249.js
    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/ion/'" -f js/src/jit-test/tests/ion/dce-with-rinstructions.js
    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache --ion-eager --ion-offthread-compile=off -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/ion/'" -f js/src/jit-test/tests/ion/dce-with-rinstructions.js
    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache --baseline-eager -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/ion/'" -f js/src/jit-test/tests/ion/dce-with-rinstructions.js
    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/ion/'" -f js/src/jit-test/tests/ion/recover-arrays.js
    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache --ion-eager --ion-offthread-compile=off -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/ion/'" -f js/src/jit-test/tests/ion/recover-arrays.js
    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache --baseline-eager -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/ion/'" -f js/src/jit-test/tests/ion/recover-arrays.js
    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/ion/'" -f js/src/jit-test/tests/ion/recover-lambdas.js
    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache --ion-eager --ion-offthread-compile=off -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/ion/'" -f js/src/jit-test/tests/ion/recover-lambdas.js
    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache --baseline-eager -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/ion/'" -f js/src/jit-test/tests/ion/recover-lambdas.js
    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache --ion-pgo=on -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/ion/'" -f js/src/jit-test/tests/ion/recover-objects.js
    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache --baseline-eager --ion-pgo=on -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/ion/'" -f js/src/jit-test/tests/ion/recover-objects.js
    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache --ion-eager --ion-pgo=on --ion-offthread-compile=off -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/ion/'" -f js/src/jit-test/tests/ion/recover-objects.js
    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache --no-unboxed-objects --ion-pgo=on -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/ion/'" -f js/src/jit-test/tests/ion/recover-objects.js
    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache --ion-eager --no-unboxed-objects --ion-pgo=on --ion-offthread-compile=off -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/ion/'" -f js/src/jit-test/tests/ion/recover-objects.js
    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache --baseline-eager --no-unboxed-objects --ion-pgo=on -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/ion/'" -f js/src/jit-test/tests/ion/recover-objects.js
    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/ion/'" -f js/src/jit-test/tests/ion/recover-typed-array.js
    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache --ion-eager --ion-offthread-compile=off -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/ion/'" -f js/src/jit-test/tests/ion/recover-typed-array.js
    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache --baseline-eager -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/ion/'" -f js/src/jit-test/tests/ion/recover-typed-array.js
    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache --ion-eager --ion-offthread-compile=off -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/SIMD/'" -f js/src/jit-test/tests/SIMD/recover.js
    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/SIMD/'" -f js/src/jit-test/tests/SIMD/recover.js
    js --ion-gvn=off -f js/src/jit-test/lib/prologue.js --js-cache js/src/jit-test/.js-cache --baseline-eager -e "const platform='linux2'; const libdir='js/src/jit-test/lib/'; const scriptdir='js/src/jit-test/tests/SIMD/'" -f js/src/jit-test/tests/SIMD/recover.js

Mostly tests with recover instructions, with the following assertion:
Assertion failure: input()->isRecoveredOnBailout() == mustBeRecovered_ (assertRecoveredOnBailout failed during compilation), at js/src/jit/Recover.cpp:1564

Redirecting to Nicolas, because I am not sure it's safe to just add an `if (!JitOptions.disableGvn)`.
Flags: needinfo?(bbouvier) → needinfo?(nicolas.b.pierron)
I was tracking some of them last week.

Apparently the issue comes from the fact that we eagerly remove NOPs, even if they have a resume point, which is the case of all the Nop instructions added by AssertRecoveredOnBailout.  So, as we have no check to ensure that AssertRecoveredOnBailout infra works, we basically disabled our test suite.

Then, branch pruning basically did not removed the branches as expected, when the heuristics changed back in June.  Thus causing these failures.

I have some-what of a patch, fixing gvn, and modifying the test cases to heat-up a bit more for branch pruning to kick-in.
Blocks: 1276955
Created attachment 8804719 [details] [diff] [review]
part 1 - Re-add recover instruction for recovering typed array allocations.

This fix an issue raised in Bug 1276955, which is that the recover
instruction code path got removed after disabling the code for
assertRecoveredOnBailout.
Attachment #8804719 - Flags: review?(hv1989)
Created attachment 8804720 [details] [diff] [review]
part 2 - Prevent spewers from consuming almost all the ballast space.

Not related, but let's fix it as it caused me trouble while debugging.
Attachment #8804720 - Flags: review?(hv1989)
Created attachment 8804722 [details] [diff] [review]
part 3 - Fix assertRecoveredOnBailout optimization assertion.

Re-enable the test cases, by bumping the number of iteration needed for
branch pruning to remove some unused branches.

Fix some of the expectations, in case of object with a large number of
dynamic slots.
Attachment #8804722 - Flags: review?(hv1989)
Created attachment 8804724 [details] [diff] [review]
part 4 - Test that we do not disable assertRecoveredOnBailout assertion.

This is a meta test case to ensure that this does not happen again.
This check that assertRecoveredOnBailout will fail as expected.
Attachment #8804724 - Flags: review?(hv1989)
Comment on attachment 8804720 [details] [diff] [review]
part 2 - Prevent spewers from consuming almost all the ballast space.

Review of attachment 8804720 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/JitSpewer.cpp
@@ +306,5 @@
> +    // Do not use ensureBallast here, as this function is used for debugging,
> +    // and as such we should not attempt to report OOM failures when we are
> +    // debugging.
> +    if (!graph_->alloc().lifoAlloc()->ensureUnusedApproximate(TempAllocator::BallastSize))
> +        MOZ_CRASH("[OOM] Buy more RAM!");

Shouldn't this use AutoEnterOOMUnsafeRegion ?
And can you fix the message "Could not ensure enough ballast space after spewing graph information."
Attachment #8804720 - Flags: review?(hv1989)
Comment on attachment 8804722 [details] [diff] [review]
part 3 - Fix assertRecoveredOnBailout optimization assertion.

Review of attachment 8804722 [details] [diff] [review]:
-----------------------------------------------------------------

Those testcases are really fragile sadly. Having specific tests per pass would really help here.
Attachment #8804722 - Flags: review?(hv1989) → review+
Comment on attachment 8804724 [details] [diff] [review]
part 4 - Test that we do not disable assertRecoveredOnBailout assertion.

Review of attachment 8804724 [details] [diff] [review]:
-----------------------------------------------------------------

I don't really understand the need for the "-11" exitstatus. Can't we just return early without doing anything?
Attachment #8804724 - Flags: review?(hv1989)
(In reply to Hannes Verschore [:h4writer] from comment #12)
> Comment on attachment 8804724 [details] [diff] [review]
> part 4 - Test that we do not disable assertRecoveredOnBailout assertion.
> 
> Review of attachment 8804724 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I don't really understand the need for the "-11" exitstatus. Can't we just
> return early without doing anything?

The SEGV is not catcheable, the "exitstatus: -11" is used to ensure that the jit-test harness consider this SEGV, from assertRecoveredOnBailout (Recover.cpp) as the expected value.

We have to replicate this exit code, if the jit are disabled, otherwise we would never enter assertRecoveredOnBailout, and inIon() will return a string as an error code.

This test case is here to ensure that we no longer disable assertRecoveredOnBailout by accident.
Flags: needinfo?(nicolas.b.pierron)
Comment on attachment 8804724 [details] [diff] [review]
part 4 - Test that we do not disable assertRecoveredOnBailout assertion.

Review of attachment 8804724 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit-test/tests/self-test/assertRecoveredOnBailout-1.js
@@ +1,1 @@
> +// |jit-test| exitstatus: -11

Instead of exitstatus, we should have "crash" here as the expected result.
Can you add that instead of relying on -11 that might be platform specific?

@@ +1,5 @@
> +// |jit-test| exitstatus: -11
> +
> +var opts = getJitCompilerOptions();
> +if (!opts['ion.enable'] || !opts['baseline.enable'])
> +  quit(-11);

We should add a helper function "crash('foo');"
Attachment #8804719 - Flags: review?(hv1989) → review+
Created attachment 8805612 [details] [diff] [review]
part 2 - Prevent spewers from consuming almost all the ballast space.

Delta:
 - Apply nit.
 - Add MIRGraph header.
Attachment #8805612 - Flags: review?(hv1989)
Attachment #8804720 - Attachment is obsolete: true
Created attachment 8805617 [details] [diff] [review]
part 4 - Test that we do not disable assertRecoveredOnBailout assertion.

Delta:
 - Use the existing crash builtin of the shell.
 - Add "|jit-test| crash" support, with Windows error code (reported on try)
 - Add --ion-check-range-analysis flag to getJitCompilerOptions (as
     MAssertRecoveredOnBailout is not produced if it is enabled)
Attachment #8805617 - Flags: review?(hv1989)
Attachment #8804724 - Attachment is obsolete: true
Comment on attachment 8805612 [details] [diff] [review]
part 2 - Prevent spewers from consuming almost all the ballast space.

Review of attachment 8805612 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!
Attachment #8805612 - Flags: review?(hv1989) → review+
Comment on attachment 8805617 [details] [diff] [review]
part 4 - Test that we do not disable assertRecoveredOnBailout assertion.

Review of attachment 8805617 [details] [diff] [review]:
-----------------------------------------------------------------

Nice! I assume you did a try run to be sure?
Attachment #8805617 - Flags: review?(hv1989) → review+
(In reply to Hannes Verschore [:h4writer] from comment #18)
> Comment on attachment 8805617 [details] [diff] [review]
> part 4 - Test that we do not disable assertRecoveredOnBailout assertion.
> 
> Review of attachment 8805617 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Nice! I assume you did a try run to be sure?

Reading your comment that is a yes. Overlooked that. Thanks!
Yes, I did push to try, and but at the time I had submitted the ASan and ARM64 where not yet red.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=221b9349be3f5b7fbb1bcba6b91bbbcb59542b0a

I made a few changes, to fix this minor issues and double-checking with try again.

Comment 21

2 years ago
Pushed by npierron@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e4b1a95a7f39
part 1 - Re-add recover instruction for recovering typed array allocations. r=h4writer
https://hg.mozilla.org/integration/mozilla-inbound/rev/5e2739dedc3a
part 2 - Prevent spewers from consuming almost all the ballast space. r=h4writer
https://hg.mozilla.org/integration/mozilla-inbound/rev/f596578a143e
part 3 - Fix assertRecoveredOnBailout optimization assertion. r=h4writer
https://hg.mozilla.org/integration/mozilla-inbound/rev/cd3ff163021b
part 4 - Test that we do not disable assertRecoveredOnBailout assertion. r=h4writer
I had to bafck this out for Spidermonkey failures like https://treeherder.mozilla.org/logviewer.html#?job_id=38666405&repo=mozilla-inbound

https://hg.mozilla.org/integration/mozilla-inbound/rev/9c43bca082b7
Flags: needinfo?(nicolas.b.pierron)
(In reply to Wes Kocher (:KWierso) from comment #22)
> I had to bafck this out for Spidermonkey failures like
> https://treeherder.mozilla.org/logviewer.html#?job_id=38666405&repo=mozilla-
> inbound
> 
> https://hg.mozilla.org/integration/mozilla-inbound/rev/9c43bca082b7

It hit twice out of 9 total runs, if that helps you in your attempts to reproduce.
I tried "rr record -h" on an optimized build of the JS shell, without the root-analysis.
I did not managed to get this assertion while running a JS shell compiled by clang 3.7 (~2000 runs), but I did managed to get this assertion while running a JS shell compiled by gcc 4.8 (after ~275 runs).

I will investigate.
Ok, the problem faced with this intermittent is that we are compiling "rceil_number" later than expected, which gives some times to monitor the return type of "Math.ceil" as potentially being a double.

The reason this can now be a double is that we only measured in Baseline, and to accommodate from the branch pruning modifications, I added some extra cycle in the negative.  As other test failed when using negative numbers, I had to wrap around the values with Math.abs.  "rceil_number" does the ceiling of (-i -0.12…) which can be rounded to -0 when i is 0.

I've changed the wrap-around to add a positive immediate value to avoid the i == 0 case.

If the following try-push works, I will squash the simple fixup patch and re-push the changes to inbound.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=26a51f90f990eb7dbc50116f92816f8bb9d6f134
Flags: needinfo?(nicolas.b.pierron)

Comment 26

2 years ago
Pushed by npierron@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/968aa5ce9ed4
part 1 - Re-add recover instruction for recovering typed array allocations. r=h4writer
https://hg.mozilla.org/integration/mozilla-inbound/rev/40ae0b40fd73
part 2 - Prevent spewers from consuming almost all the ballast space. r=h4writer
https://hg.mozilla.org/integration/mozilla-inbound/rev/5c1b90a1926d
part 3 - Fix assertRecoveredOnBailout optimization assertion. r=h4writer
https://hg.mozilla.org/integration/mozilla-inbound/rev/8daddd6054f2
part 4 - Test that we do not disable assertRecoveredOnBailout assertion. r=h4writer

Comment 27

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/968aa5ce9ed4
https://hg.mozilla.org/mozilla-central/rev/40ae0b40fd73
https://hg.mozilla.org/mozilla-central/rev/5c1b90a1926d
https://hg.mozilla.org/mozilla-central/rev/8daddd6054f2
Status: NEW → RESOLVED
Last Resolved: 2 years ago
status-firefox52: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
You need to log in before you can comment on or make changes to this bug.