Closed Bug 731683 Opened 12 years ago Closed 9 years ago

IonMonkey: Improve Ion compilation/bailouts checks

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: nbp, Assigned: layus, Mentored)

References

(Blocks 1 open bug)

Details

(Whiteboard: [good first bug][lang=c++][ion:t])

Attachments

(4 files, 8 obsolete files)

(Good as first bug)

Bug 730111 already introduced modifications in AssertJit, but these modifications cause some new error because IonMonkey may not compile the function directly.

This bug goal is to add AssertIon and getIon, to ensure that if we enter the function, at the beginning, we are still in it at the end.  In addition, to this patch we can update ion test suite to avoid arbitrary constant which are waiting for compilation and change the behaviour for eager builds.

no ion flags:
- getIon returns true
- assertIon(…) does not fail

with non eager ion:
- getIon returns the value of runningInIon flag
- assertIon(true) fails if runningInIon is false.

with eager compilation:
- getIon fails if runningInIon is false.
- assertIon(true) fails if runningInIon is false.

Many loops are looking like:


function test() {
  …
  AssertEq(…, …);
}

for (var i = 0; i < …; i++)
  test();


If possible, make the condition of the loop depends on getIon result, such as:




function test() {
  …
  AssertEq(…, …);
  return getIon();
}

var inIon = false;
while(!inIon)
  inIon = test();


May be add an extra function to check for OSR when we run with/without it or add an argument to getIon to select the compilation mode (OSR, Inlined, …).

PS: assertIon(x) = AssertEq(getIon(), x)
Marking as [good first bug] based on comment #0.
Whiteboard: [good first bug][lang=c++]
Whiteboard: [good first bug][lang=c++] → [good first bug][lang=c++][mentor=pierron]
Nicolas: Could you add a list of what files will need to be updated to this bug?
(In reply to Jared Wein [:jaws] from comment #2)
> Nicolas: Could you add a list of what files will need to be updated to this
> bug?

Just adding the test case for bug 730111 (attachment 600587 [details] [diff] [review]) would be fine.

Most of the modification exists in the attachement of bug 730111, but they are made to the wrong function.  This implies adding a new native function, and documenting it (otherwise it does not compile …).  And making sure that the case is working as expected.
Whiteboard: [good first bug][lang=c++][mentor=pierron] → [good first bug][lang=c++][mentor=pierron][ion:t]
Is this 'good first bug' still unassigned? And can I work on it? 

I have gone through the general building rules for mozilla. Was able to set up environment and build the nightly build. 
Can I start working on this bug? 
What should be a good place to understand the context? 

-Manish
Yup, this bug is still unassigned. As I understand it, the goal is to be able to simplify performance testing by asserting that certain features stay JIT'd. I suspect it's actually hard to get that working though.

Exactly when we JIT depends on both whether the baseline compiler and/or parallel compilation (TBD) are in play, and when we JIT can produce very different results because of type profiling. So we'd have to write tests that work around that, and our experience with checking bailout stats in the TraceMonkey days was that the stats changed frequently and were a pain to keep updated.

Maybe there is a better way, but I suspect performance tests are probably easier served by using existing infrastructure (like arewefastyet) and spotting regressions through that.

Though they don't always have "good first bug" in the whiteboard, you might want to look at higher impact bugs, like ones attached to bug 705294 (all performance stuff). Folks in #ionmonkey or #jsapi will be glad to mentor :)
Whiteboard: [good first bug][lang=c++][mentor=pierron][ion:t] → [good first bug][lang=c++][mentor=nbp][ion:t]
Assignee: general → nobody
Mentor: nicolas.b.pierron
Whiteboard: [good first bug][lang=c++][mentor=nbp][ion:t] → [good first bug][lang=c++][ion:t]
I think we should have this.

To me, it's not about speed (who cares about speed) but correctness. Say I write a test exercising a particular kind of bailout, and a month later stuff gets shifted around in the implementation so that the test isn't testing that anymore. Currently we just lose test coverage. The investment in correctness embodied by that test is lost and we never hear about it.

The hard thing here is figuring out an API that isn't prone to the easy mistake of writing a test that runs beautifully under default settings, but fails or iloops under --no-ion or --no-baseline.
Part 1
------

I have started investigating this, ant the state of the tests is indeed worrying.
Loops intended to ensure that the code is compiled by ion range from a few iterations to 100000 (http://dxr.mozilla.org/mozilla-central/source/js/src/jit-test/tests/ion/bug764432.js#2)

Also, some comments show that running the tests is not sufficient to ensure that the intended checks are run. (eg: http://dxr.mozilla.org/mozilla-central/source/js/src/jit-test/tests/ion/mathFloor.js#2)

I agree that it would be nice to have some way to access the state of the interpreter.

Part 2
------

(In reply to Jason Orendorff [:jorendorff] from comment #6)
> The hard thing here is figuring out an API that isn't prone to the easy
> mistake of writing a test that runs beautifully under default settings, but
> fails or iloops under --no-ion or --no-baseline.

If I understand well, you mean that we cannot have a function 'inIon()' and use it in a while loop, like :

while(!inIon()) {
    // Continue, the code is not yet compiled
    // ...
}

Because that would loop forever if run with the --no-ion flag.
What are the risks with the --no-baseline flag ?


By reading the tests in js/src/jit-test/tests/ion, it appears that we need the following concepts :

1/ Run some code in Ion (i.e. test if the compiled version of the code behaves correctly).
This could be provided via a method runInIon(foo) that calls foo repeatedly until it is compiled, and runs once the compiled version.

In this case, the common practice seems to call the function once with the --ion-eager option

2/ Run some code in Ion after some profiling has been done.
Common practice is to use something like:
setJitCompilerOption("baseline.usecount.trigger", 10);
setJitCompilerOption("ion.usecount.trigger", 20);
and custom loops that call the test function more than 20 times.

3/ Check that some instructions are optimized out, as in http://dxr.mozilla.org/mozilla-central/source/js/src/jit-test/tests/ion/dce-with-rinstructions.js.
This cannot be automated and requires manual inspection of the debug graphs.

4/ Ensure that a bailout did/didn't occur by checking that i) we are in Ion at the beginning of a function, and ii) that we are still/no more in Ion at the end, of in some branch.

Part 3
------

The proposed implementation (getIon()/AssertIon()) in comment 1 seems good for point 4/ and may be used for point 1/ and 2/.
But it has an iloop problem if a bailout occurs within the function.
In the following example, the code would run forever. 

function test() {
  …
  // Bailout.
  …
  return getIon(); // return false, because of bailout.
}

var inIon = false;
while(!inIon)
  inIon = test();

Of course a fix exists with


function test() {
  var wasInIon = getIon();
  …
  // Bailout.
  …
  AssertEq(wasInIon, getIon());
  return wasInIon; // obviously equivalent to 'return getIon();'
}

var inIon = false;
while(!inIon)
  inIon = test();

This shows that the API is indeed a bit cumbersome to use, and that a good one is difficult to design.

for point 3/, I was thinking that a dead code barrier would be a good idea.
Something in the line of 

...
startDeadCodeBarrier();
var x = 1;
endDeadCodeBarrier();
...

The two methods should fail if ever executed within Ion, and should only be optimized out by DCE if they are consecutive, meaning any intermediate code as already been eliminated by DCE.
If the code executes within Ion without failing, this means that all the intermediary instruction have been optimized out.
Is this a good idea ?

Part 4
------

Strange thing is, I have found no such api in other projects, like in the JVM.
I wonder how other JIT implementations manage testing... Any idea ?

I will continue to work on this issue, but would welcome some advice(s).
First, does this (long) comment make any sense, or am I looking in the wrong direction ?
Second, should I implement the methods presented in comment 1, or continue to search for a nice API ?

Thanks in advance.
Flags: needinfo?(nicolas.b.pierron)
Flags: needinfo?(jorendorff)
(In reply to Guillaume Maudoux from comment #7)
> Also, some comments show that running the tests is not sufficient to ensure
> that the intended checks are run. (eg:
> http://dxr.mozilla.org/mozilla-central/source/js/src/jit-test/tests/ion/
> mathFloor.js#2)

Note that this test highlight another missing thing, which is the empty try-catch statement was used (at the time when it was not supported by IonMonkey) to prevent the compilation of the outer loop.

> Part 2
> ------
> 
> (In reply to Jason Orendorff [:jorendorff] from comment #6)
> > The hard thing here is figuring out an API that isn't prone to the easy
> > mistake of writing a test that runs beautifully under default settings, but
> > fails or iloops under --no-ion or --no-baseline.
> 
> If I understand well, you mean that we cannot have a function 'inIon()' and
> use it in a while loop, like :
> 
> while(!inIon()) {
>     // Continue, the code is not yet compiled
>     // ...
> }
> 
> Because that would loop forever if run with the --no-ion flag.
> What are the risks with the --no-baseline flag ?

Exactly.  If the compiler is disabled, which is expected while testing, then such test might just do infinite loops.  So maybe we should guard these tests with:

  if (!isIonEnabled())
      quit();

> By reading the tests in js/src/jit-test/tests/ion, it appears that we need
> the following concepts :
> 
> 1/ Run some code in Ion (i.e. test if the compiled version of the code
> behaves correctly).
> This could be provided via a method runInIon(foo) that calls foo repeatedly
> until it is compiled, and runs once the compiled version.
> 
> In this case, the common practice seems to call the function once with the
> --ion-eager option

Sadly --ion-eager might not be enough as a standalone thing.  We often use --ion-eager to check the transition of states where it might be harder otherwise.  For example a function might be compiled the first time, then invalidated once/twice before hitting the useful case that we want to check.

So running once in Ion is not always enough.

> 2/ Run some code in Ion after some profiling has been done.
> Common practice is to use something like:
> setJitCompilerOption("baseline.usecount.trigger", 10);
> setJitCompilerOption("ion.usecount.trigger", 20);
> and custom loops that call the test function more than 20 times.

This is useful, but tests cases might have to be adapted too.  For example the inlining rules might not comply well with these reduced triggers.

> 3/ Check that some instructions are optimized out, as in
> http://dxr.mozilla.org/mozilla-central/source/js/src/jit-test/tests/ion/dce-
> with-rinstructions.js.
> This cannot be automated and requires manual inspection of the debug graphs.

I think the approach taken by Benjamin on Float32 might be a good fit for recover instructions.  The goal being not to change the semantic of the code which is being analyzed.  We could add assertIsRecovered(…), but then we need a good control over the compilation rules of the script, as recover instruction tests are using Inlining and UCE to induce the Recover On Bailout state.

> 4/ Ensure that a bailout did/didn't occur by checking that i) we are in Ion
> at the beginning of a function, and ii) that we are still/no more in Ion at
> the end, of in some branch.

Another way for checking bailout, if this is our concern, is to have them aggregated somewhere such as we can compare it with a list if the CLI is matching.

AssertBailoutsIfCLI("--ion-eager", [
  "Function foo, line:4, reason: out-of-bound",
  "Function foo, line:5, reason: invalidation",
]);

> Part 3
> ------
> 
> The proposed implementation (getIon()/AssertIon()) in comment 1 seems good
> for point 4/ and may be used for point 1/ and 2/.
> But it has an iloop problem if a bailout occurs within the function.
> In the following example, the code would run forever. 

If the bailout is unexpected, then having an iloop is desirable as the test suite would fail because of this timeout.

> function test() {
>   …
>   // Bailout.
>   …
>   return getIon(); // return false, because of bailout.
> }
> 
> var inIon = false;
> while(!inIon)
>   inIon = test();
> 
> Of course a fix exists with
> 
> 
> function test() {
>   var wasInIon = getIon();
>   …
>   // Bailout.
>   …
>   AssertEq(wasInIon, getIon());
>   return wasInIon; // obviously equivalent to 'return getIon();'
> }
> 
> var inIon = false;
> while(!inIon)
>   inIon = test();
> 
> This shows that the API is indeed a bit cumbersome to use, and that a good
> one is difficult to design.
> 
> for point 3/, I was thinking that a dead code barrier would be a good idea.
> Something in the line of 
> 
> ...
> startDeadCodeBarrier();
> var x = 1;
> endDeadCodeBarrier();
> ...
> 
> The two methods should fail if ever executed within Ion, and should only be
> optimized out by DCE if they are consecutive, meaning any intermediate code
> as already been eliminated by DCE.
> If the code executes within Ion without failing, this means that all the
> intermediary instruction have been optimized out.
> Is this a good idea ?
> 

As stated before this is highly dependent on other optimizations being triggered.
The question, is do we want to write JavaScript code to test the compilation, or should we aim at making better unit tests for the compilation?

v8 has a way for exporting graphs, such as they can import them in their unit tests.  Currently we are already exporting the graph in some incomplete forms, such as /tmp/ion.json and /tmp/ion.cfg.  If we could re-import these graph and compare them, then we should be able to have somekind of unit tests.

> Part 4
> ------
> 
> Strange thing is, I have found no such api in other projects, like in the
> JVM.
> I wonder how other JIT implementations manage testing... Any idea ?

I have no idea, but I would think they are more likely to have unit tests for each compilation stages.

> I will continue to work on this issue, but would welcome some advice(s).
> First, does this (long) comment make any sense, or am I looking in the wrong
> direction ?

I think this makes sense, but this is a complex problem to handle all different kind of CLI context.

> Second, should I implement the methods presented in comment 1, or continue
> to search for a nice API ?

I would say, as you prefer.  The comment 0 describes a first step to get some control back, but as you remarked, it has a limited capabilities.
Flags: needinfo?(nicolas.b.pierron)
hello, is somebody working on this bug because i would like to work on this bug, i know c++ this would be my first bug so i might need some extra guidance debugging it. thanks!
Attached patch inJit.patch (obsolete) — Splinter Review
Hi,

It has been a long time, but I am back with something (at least).

As a small first step, here is a patch introducing inJit(), a function that tells if we should continue waiting for jit to happen.

The provided test case should halt in any CLI context. (tested with --no-ion, --ion-eager and --baseline)
With --no-ion, it ends at the first invocation of inJit().
With --baseline (default) it ends after the tenth call because the code has been compiled by ion.

Next step is to provide smarter APIs, but I wanted to check that this first implementation is correct before doing something more complex.
Flags: needinfo?(jorendorff)
Attachment #8517011 - Flags: feedback?(nicolas.b.pierron)
Attached patch inJit.patch (obsolete) — Splinter Review
Attachment #8517011 - Attachment is obsolete: true
Attachment #8517011 - Flags: feedback?(nicolas.b.pierron)
Attachment #8517013 - Flags: feedback?(nicolas.b.pierron)
Comment on attachment 8517013 [details] [diff] [review]
inJit.patch

Review of attachment 8517013 [details] [diff] [review]:
-----------------------------------------------------------------

This sounds good :)
Does it work with --no-baseline?

::: js/src/shell/js.cpp
@@ +1715,5 @@
> +InJit(JSContext *cx, unsigned argc, jsval *vp)
> +{
> +    CallArgs args = CallArgsFromVp(argc, vp);
> +    
> +    if (!jit::IsIonEnabled(cx) || cx->currentlyRunningInJit()) {

This sounds good, while I think we might be interested in checking baseline / Ion separately, or by adding an argument to this function.

Also, such function is better place in the TestingFunction builtins, this way we could use it in the browser as well.
Attachment #8517013 - Flags: feedback?(nicolas.b.pierron) → feedback+
(In reply to Guillaume Maudoux from comment #10)
> It has been a long time, but I am back with something (at least).

Welcome back :)
Assignee: nobody → layus.on
Attached patch inJit.patch (obsolete) — Splinter Review
1/ Moved the code to TestingFunctions.cpp
2/ Added inIon to distinguish baseline from ion

Specific review items :
1/ Location of test files (jit-test/tests/self-test/in{Ion,Jit}.js)
2/ Usage of inlining in ion for inIon()
3/ in{Jit,Ion} really mean shouldWaitFor{Jit,Ion}. Is the name in{Jit,Ion} right ?

I have tested the implementation with the following code and several flags.

>var i = 1;
>while(!inJit()) i++;
>print(i);
>
>var j = 1;
>while(!inIon()) j++;
>print(j);

The ouptut is 

>$ <options parallel -j1  "echo {}; " eval "build_OPT.OBJ/dist/bin/js {} jit-test/tests/tools.js"
>--baseline --ion --ion-offthread-compile=on
>10
>34026
>--baseline --ion --ion-offthread-compile=on --ion-eager
>1
>177834
>--baseline --ion --ion-offthread-compile=on --baseline-eager
>1
>32851
>--baseline --ion --ion-offthread-compile=on --ion-eager --baseline-eager
>1
>182978
>--baseline --ion --ion-offthread-compile=off
>10
>1088
>--baseline --ion --ion-offthread-compile=off --ion-eager
>1
>2
>--baseline --ion --ion-offthread-compile=off --baseline-eager
>1
>1097
>--baseline --ion --ion-offthread-compile=off --ion-eager --baseline-eager
>1
>2
>--baseline --no-ion --ion-offthread-compile=on
>10
>1
>--baseline --no-ion --ion-offthread-compile=on --ion-eager
>1
>1
>--baseline --no-ion --ion-offthread-compile=on --baseline-eager
>1
>1
>--baseline --no-ion --ion-offthread-compile=on --ion-eager --baseline-eager
>1
>1
>--baseline --no-ion --ion-offthread-compile=off
>10
>1
>--baseline --no-ion --ion-offthread-compile=off --ion-eager
>1
>1
>--baseline --no-ion --ion-offthread-compile=off --baseline-eager
>1
>1
>--baseline --no-ion --ion-offthread-compile=off --ion-eager --baseline-eager
>1
>1
>--no-baseline --ion --ion-offthread-compile=on
>1
>1
>--no-baseline --ion --ion-offthread-compile=on --ion-eager
>1
>1
>--no-baseline --ion --ion-offthread-compile=on --baseline-eager
>1
>1
>--no-baseline --ion --ion-offthread-compile=on --ion-eager --baseline-eager
>1
>1
>--no-baseline --ion --ion-offthread-compile=off
>1
>1
>--no-baseline --ion --ion-offthread-compile=off --ion-eager
>1
>1
>--no-baseline --ion --ion-offthread-compile=off --baseline-eager
>1
>1
>--no-baseline --ion --ion-offthread-compile=off --ion-eager --baseline-eager
>1
>1
>--no-baseline --no-ion --ion-offthread-compile=on
>1
>1
>--no-baseline --no-ion --ion-offthread-compile=on --ion-eager
>1
>1
>--no-baseline --no-ion --ion-offthread-compile=on --baseline-eager
>1
>1
>--no-baseline --no-ion --ion-offthread-compile=on --ion-eager --baseline-eager
>1
>1
>--no-baseline --no-ion --ion-offthread-compile=off
>1
>1
>--no-baseline --no-ion --ion-offthread-compile=off --ion-eager
>1
>1
>--no-baseline --no-ion --ion-offthread-compile=off --baseline-eager
>1
>1
>--no-baseline --no-ion --ion-offthread-compile=off --ion-eager --baseline-eager
>1
>1

So my implementation seems to be correct.
Attachment #8517013 - Attachment is obsolete: true
Attachment #8567861 - Flags: review?(nicolas.b.pierron)
Comment on attachment 8567861 [details] [diff] [review]
inJit.patch

Review of attachment 8567861 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/builtin/TestingFunctions.cpp
@@ +1451,5 @@
> +{
> +    CallArgs args = CallArgsFromVp(argc, vp);
> +
> +    args.rval().setBoolean(!IsBaselineEnabled(cx) ||
> +            cx->currentlyRunningInJit());

nit: unwrap this line.

::: js/src/jit/MCallOptimize.cpp
@@ +252,5 @@
>      if (native == testingFunc_assertFloat32)
>          return inlineAssertFloat32(callInfo);
> +    if (native == testingFunc_inIon ||
> +        native == testingFunc_inJit)
> +    {

nit: unwrap and remove the braces.

@@ +2462,5 @@
> +    callInfo.setImplicitlyUsedUnchecked();
> +
> +    MConstant *res = MConstant::New(alloc(), BooleanValue(true));
> +    current->add(res);
> +    current->push(res);

nit: current->push(constant(BooleanValue(true)));
Attachment #8567861 - Flags: review?(nicolas.b.pierron) → review+
(In reply to Guillaume Maudoux from comment #14)
> Specific review items :
> 1/ Location of test files (jit-test/tests/self-test/in{Ion,Jit}.js)
> 2/ Usage of inlining in ion for inIon()

Clever idea, this avoid scanning the stack to figure out.
Also, I do not expect us to write down test cases similar to:

| var x = [inJit, inIon];
| var i = 0;
| while (i < x.length)
|   if (x[i]())
|     i++

Which I guess might work, because we inline polymorphic call-sites.

Note, if the inlining is disabled, inIon might not work as expected.

> 3/ in{Jit,Ion} really mean shouldWaitFor{Jit,Ion}. Is the name in{Jit,Ion}
> right ?

inJit is fine, if you interpret it as a predicate.  On the other hand shouldWaitForJit, sounds like it would be blocking the execution.

> I have tested the implementation with the following code and several flags.
> […]
> The ouptut is 
> 

Nice!
Attached patch inJit.patch (obsolete) — Splinter Review
> ::: js/src/jit/MCallOptimize.cpp
> @@ +2462,5 @@
> > +    callInfo.setImplicitlyUsedUnchecked();
> > +
> > +    MConstant *res = MConstant::New(alloc(), BooleanValue(true));
> > +    current->add(res);
> > +    current->push(res);
> 
> nit: current->push(constant(BooleanValue(true)));

nit: pushConstant(BooleanValue(true)); 
;)

> Note, if the inlining is disabled, inIon might not work as expected.

The implementation is the same as for bailout() (which I mostly copied) and it did not bother the author of that function. So this should be fine in most places.
Of course, failing to bailout when inlining is disabled is not a big deal, where looping forever might be a problem.

Up to you !
Attachment #8567861 - Attachment is obsolete: true
Attachment #8568162 - Flags: review?(nicolas.b.pierron)
Comment on attachment 8568162 [details] [diff] [review]
inJit.patch

Review of attachment 8568162 [details] [diff] [review]:
-----------------------------------------------------------------

Sounds good to me :)
Attachment #8568162 - Flags: review?(nicolas.b.pierron) → review+
Flags: needinfo?(nicolas.b.pierron)
I have a pending request to get try access.

If I ever get access to it, how can I push this patch to try, and what tests should I select ?
Flags: needinfo?(nicolas.b.pierron)
(In reply to Guillaume Maudoux [:layus] from comment #19)
> I have a pending request to get try access.
> 
> If I ever get access to it, how can I push this patch to try, and what tests
> should I select ?

You can make commit subjects by visiting http://trychooser.pub.build.mozilla.org/ , otherwise, I often use 

   try: -b do -p linux,linux64,linux64-asan,linux64-st-an,linux64-valgrind,linux64-sh-haz -u jsreftest,jittests,jittest-1,jittest-2 -t none

As this build and tests the JavaScript shell for the tests which are the most likely to break after JS shell changes.
Flags: needinfo?(nicolas.b.pierron)
Testing uncovered a problem with the design of inIon.
It is difficult to reliably detect if ion will eventually compile the code or not.
With JS_GC_ZEAL=14, ion is enable but will never inline a test function like

| function test(i) {
|    print(i);
|    return !inIon();
| }
| var i=0;
| while(test(i)) i++;

Simply stated, it seems that baseline-compiled code gets garbage-collected before ion could possibly inline and run it.
So while ion is enabled, it is effectively prevented to run.

I see three fixes :

1/ (Hacky) We could simply detect zealous gc's and act as if ion was disabled.
This is hacky because we will prevent ion to run where it coul theoretically have,
and because we fix inIon by in a non-generic way.

2/ (Complex) We could fix cgc's to ensure that they do not prevent ion compilation and inlining.
This is the cleanest fix. It preserves the intuition that a code that runs infinitely often will eventually get inlined/optimized by ion.
Maybe that invariant is not intended to be true, but I find it intuitive and worth maintaining.

3/ (Lazy) We could drop inIon() completely. This makes sense if we decide that it is not possible to reliably detect if a code will eventually be inlined or not.

4/ (Fatalist) Finally, we can keep inIon() as is, and combine it with a loop counter to avoid infinite loops when ion does not kick in :
| while(test(i) && i < 1000) i++;
In this case, inIon only serves as an early success/failure detector, avoiding useless iterations once the code has been inlined.
We loose the nice feat. that the code timeouts if the method could not be inlined.
Going further, we could also maintain a counter of calls to each inIon() instruction and return true when it reaches 10000.
(In reply to Guillaume Maudoux [:layus] from comment #21)
> Simply stated, it seems that baseline-compiled code gets garbage-collected
> before ion could possibly inline and run it.

How many times do we try to compile it, do we flag the script as not being compilable after too many attempts?  If so, we could just make  inIon  return true when the script cannot be compiled with IonMonkey.
Flags: needinfo?(nicolas.b.pierron)
I found the culprit : http://hg.mozilla.org/mozilla-central/file/993eb76a8bd6/js/src/gc/Zone.cpp#l196

As far as I understand, resetting the warmUpCounter is a bad move.
We discard the counter of valid and active scripts !

After the execution of the previous line (jit::FinishDiscardBaselineScript(fop, script);), the script can either be truly discarded, or remain in use because it is in use (marked explicitly as active).
The comment is quite clear : Discard baseline script if it's not marked as active.

BUT, we reset the warmUpCounter nonetheless...

I do not see where resetting that counter makes sense but my knowledge is limited.
Can you confirm that removing the 
| script->resetWarmUpCounter();
instruction does not break anything ?
If it is not the case, then we will need to be clever about when to reset the warmUpCounter.
Flags: needinfo?(nicolas.b.pierron)
(In reply to Guillaume Maudoux [:layus] from comment #23)
> As far as I understand, resetting the warmUpCounter is a bad move.

We reset the warmup counter, because at the garbage collection collects the information needed for making sounds optimizations in Ion.  So, we do want to warm-up again.  The active exception is a hack to ensure that we keep code which is currently running in Ion, when we are in a requestAnimationFrame callback.
Flags: needinfo?(nicolas.b.pierron)
Attached patch inIon.diff (obsolete) — Splinter Review
Clearing review+ as the patch is not complete.
Adding a test ace failing with JS_GC_ZEAL=14.
Attachment #8568162 - Attachment is obsolete: true
Attached patch warmUpResetCounter.diff (obsolete) — Splinter Review
This is an implementation of my idea of a warmUpResetCounter, that counts the number of resets of the warmUpCounter that occurred since the last jit compilation (either ion or baseline).

This counter exposes the side effect of garbage collection. When garbage-collecting a script, all type data needed by ion to compile are deleted.

If the garbage collector is too zealous, it will prevent ion to collect enough type data to ever compile the script.
This is not unlike a livelock where ion, while enabled, is forever denied access to a resource it needs to compile.
A zealous gc breaks my assumption that an hot script will eventually get (ion) compiled, and therefore breaks my implementation of inIon().

With the warmUpResetCounter, inIon is now able to detect such cases, and act upon it. The current implementation just returns false from the builtin, which translates in an error being thrown in js.
This behavior is not good, but just making inIon() return true prevents from detecting such cases.

The best proposal so far is to throw a "Gc discards type data repeatedly, giving up on ion compilation." when warmUpResetCount() hits 10.
This behavior could also be used when ion is disabled (throw "inIon() will never be true because ion is disabled!")
This would ensure that inIon never returns true when not in ion, which is way more intuitive! :-D


*Expected feedback* : Do you agree with the introduction of the warmUpResetCounter and the usage of exceptions in corner cases ?


Note: It seems that with JS_GC_ZEAL=14, garbage collection always occur two times consecutively, increasing the warmUpResetCounter by steps of two... any idea why this happens ?
Flags: needinfo?(nicolas.b.pierron)
Flags: needinfo?(jdemooij)
Comment on attachment 8574848 [details] [diff] [review]
warmUpResetCounter.diff

Review of attachment 8574848 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsscript.h
@@ +871,5 @@
>                                    * ion, also increased for any inlined scripts.
>                                    * Reset if the script's JIT code is forcibly
>                                    * discarded. */
> +    uint32_t        warmUpResetCount; /* Number of times the |warmUpCount| was 
> +                                       * forcibly discarded. */

Can we add 32 bits without increasing the size of the JSScript?

@@ +1325,5 @@
>      void setIonScript(JSContext *maybecx, js::jit::IonScript *ionScript) {
>          if (hasIonScript())
>              js::jit::IonScript::writeBarrierPre(zone(), ion);
>          ion = ionScript;
> +        resetWarmUpResetCounter();

Nice :)
Attachment #8574848 - Flags: feedback+
(In reply to Guillaume Maudoux [:layus] from comment #26)
> Note: It seems that with JS_GC_ZEAL=14, garbage collection always occur two
> times consecutively, increasing the warmUpResetCounter by steps of two...
> any idea why this happens ?

I guess this might be because a moving GC has to update the pointers, thus it traverse the heap a first time to index, and a second time to update the pointers.  Unless this is just a debug thing?

Jon might know better.
Flags: needinfo?(nicolas.b.pierron) → needinfo?(jcoppeard)
(In reply to Nicolas B. Pierron [:nbp] from comment #27)
> ::: js/src/jsscript.h
> @@ +871,5 @@
> > +    uint32_t        warmUpResetCount; /* Number of times the |warmUpCount| was 
> > +                                       * forcibly discarded. */
> 
> Can we add 32 bits without increasing the size of the JSScript?

I suppose not ? how could it be ?
(In reply to Guillaume Maudoux [:layus] from comment #29)
> (In reply to Nicolas B. Pierron [:nbp] from comment #27)
> > ::: js/src/jsscript.h
> > @@ +871,5 @@
> > > +    uint32_t        warmUpResetCount; /* Number of times the |warmUpCount| was 
> > > +                                       * forcibly discarded. */
> > 
> > Can we add 32 bits without increasing the size of the JSScript?
> 
> I suppose not ? how could it be ?

Either by doing some bit-packing or by using holes in the structure.
Attached patch inIon.diff (obsolete) — Splinter Review
I merged the two patches, and made inIon() throw an exception when it is not worth waiting for compilation.

Tests seem okay: https://treeherder.mozilla.org/#/jobs?repo=try&revision=3f1cbd12b7c0

I am not fond of throwing an exception for a corner case that any normal code will want to ignore, but I agree that it is better to distinguish cases where we are in ion from cases where we give up on waiting.
Maybe a different return value would be better ?
Attachment #8574835 - Attachment is obsolete: true
Attachment #8574848 - Attachment is obsolete: true
Attachment #8576635 - Flags: feedback?(nicolas.b.pierron)
Comment on attachment 8576635 [details] [diff] [review]
inIon.diff

Review of attachment 8576635 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit-test/tests/self-test/inIon.js
@@ +14,5 @@
> +
> +try {
> +    f();
> +} catch (err) {
> +    print(err);

Do not use "print" in test cases, use assertEq instead, to make sure that if there is any exception, then that's the exception that we expected to see.

::: js/src/jit-test/tests/self-test/inJit.js
@@ +12,5 @@
> +}
> +
> +try {
> +    f();
> +} catch (err) {

This try catch seems that it would be a common pattern, I am not sure if we want to keep it in every test case or just handle it as part of the jit-test.
Maybe it would be interesting to have a jit-test/lib function to catch such errors.

function safeInJit() {
  try {
    return inJit();
  } catch (err) {
    return false;
  }
}


function wrapInJit(f) {
  with ({}) {}; // prevent compilation.
  try {
    return f();
  } catch (err) {
    if ("" + err == "A zealous GC repeatedly prevents ion from running. Giving up.")
       return true;
    if ("" + err == "Ion is disabled.")
       return true;
    throw err;
  }
}
Attachment #8576635 - Flags: feedback?(nicolas.b.pierron) → feedback+
Attached file MozReview Request: bz://731683/layus (obsolete) —
/r/5307 - Bug 731683 - IonMonkey: Improve Ion compilation/bailouts checks. r=nbp

Pull down this commit:

hg pull review -r 9445927f9272370ea6bcc7de57c3bb9512b530b7
I should definitely stop asking for review of patches that are not complete.
My problem with the previous one was that I was not "fond of" exceptions, but I did not bring any answer.
It is just that I am not too sure about design decisions.

So my new patch in review board is complete and does not come with more questions.
It is exactly what I think is the best solution to the current problem.

Indeed, it seems to me that relying on the truthiness of non empty strings is a smart solution.

If you do not care about errors (most common use case), then just wait for inIon() to return a truthy value. Otherwise, test inIon() == true.

The previous behaviour is still accessible :

unsafeInIon() {
    var res = inIon();
    if (res && res != true) {
        throw res;
    }
    return res;
}
Flags: needinfo?(nicolas.b.pierron)
https://reviewboard.mozilla.org/r/5305/#review4297

::: js/src/jit-test/tests/basic/bug908915.js
(Diff revision 1)
> +        print(e.name);

Do not add print in test cases.

::: js/src/builtin/TestingFunctions.cpp
(Diff revision 1)
> +        JSString *answer = JS_NewStringCopyZ(cx, "Ion is disabled.");
> +        args.rval().setString(answer);
> +        return true;

I was not sure about it at the beginning, but I think this make sense as long as we do not use this function in asserEq arguments.

The reason why I think this is better, is that we would exit Ion if we return a non-empty reason string.

::: js/src/jsscript.h
(Diff revision 1)
> +    uint16_t        warmUpResetCount; /* Number of times the |warmUpCount| was
> +                                       * forcibly discarded. */

Also mention when is this field reset.  Is there any perf differences?
Comment on attachment 8577190 [details]
MozReview Request: bz://731683/layus

This patch looks good, please fix the comment left in comment 35 before landing.
Flags: needinfo?(nicolas.b.pierron)
Attachment #8577190 - Flags: review+
(In reply to Nicolas B. Pierron [:nbp] from comment #35)
> > +        JSString *answer = JS_NewStringCopyZ(cx, "Ion is disabled.");
> > +        args.rval().setString(answer);
> > +        return true;

You should check the return value of JS_NewStringCopyZ and return false on OOM.
(In reply to Jan de Mooij [:jandem] from comment #37)
> (In reply to Nicolas B. Pierron [:nbp] from comment #35)
> > > +        JSString *answer = JS_NewStringCopyZ(cx, "Ion is disabled.");
> > > +        args.rval().setString(answer);
> > > +        return true;
> 
> You should check the return value of JS_NewStringCopyZ and return false on
> OOM.

Got it !

Is there no way to obtain directly a JSString from the static c string in the bytecode in an infallible way ? I mean, the C string is part of the static data, could we store a static JSString instead ? It would also avoid creating the JSString on the fly...
https://reviewboard.mozilla.org/r/5305/#review4299

> Also mention when is this field reset.  Is there any perf differences?

I did not observe differences, but how to prove that ?

I can tell that
1. The JSScript does not increase in size (still 200 bytes).
2. The counter is used only by inIon
3. Th counter is incremented only on GC of the script.
   The impact of resetting a counter must be invisible compared to a gc run.
4. The counter is reset on compilation of the script.
   Again, the assignment of a counter is incomparable to running ion on a script.
   
So I think this counter does not impact performance, and does not increase memory usage.
(In reply to Guillaume Maudoux [:layus] from comment #39)
> https://reviewboard.mozilla.org/r/5305/#review4299
> 
> > Also mention when is this field reset.  Is there any perf differences?
> 
> I did not observe differences, but how to prove that ?

Use some benchmarks, such as sunspider / kraken / octane.

> I can tell that
> 1. The JSScript does not increase in size (still 200 bytes).

Good to know :)

> 2. The counter is used only by inIon
> 3. Th counter is incremented only on GC of the script.
>    The impact of resetting a counter must be invisible compared to a gc run.
> 4. The counter is reset on compilation of the script.
>    Again, the assignment of a counter is incomparable to running ion on a
> script.

What I am worried is that this field might be shifting the addresses of the 16 bits fields, but this should not have a large impact.
Comment on attachment 8577190 [details]
MozReview Request: bz://731683/layus

/r/5307 - Bug 731683 - IonMonkey: Improve Ion compilation/bailouts checks. r=nbp

Pull down this commit:

hg pull review -r ae8bc7c54664b8db3baa3b00dcadeaef98f0d2cc
Attachment #8577190 - Flags: review+
I think I have finished the patch.

There are
1/ A patch where all comments have been addressed an fixed (see Attachment 8577190 [details])
2/ A try run that is all green: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f66f4db7a1dd
3/ A comparison of octane performance with and without the patch, based on ten runs on each build. If higher score is better, it seems that my patch improves a bit the performance on the test, but nothing significant with a 95% confidence interval :). (This attachment; compare.png)

PS: I finally found how to deterministically reproduce a zealous gc:
| while(!inIon()) gc();
and added the test case in the patch.

AFAIK, this is ready to get merged ! (at last :-)
Attachment #8576635 - Attachment is obsolete: true
Flags: needinfo?(nicolas.b.pierron)
Attachment #8577736 - Flags: review+
(In reply to Nicolas B. Pierron [:nbp] from comment #28)
> (In reply to Guillaume Maudoux [:layus] from comment #26)
> > Note: It seems that with JS_GC_ZEAL=14, garbage collection always occur two
> > times consecutively, increasing the warmUpResetCounter by steps of two...
> > any idea why this happens ?
> 
> I guess this might be because a moving GC has to update the pointers, thus
> it traverse the heap a first time to index, and a second time to update the
> pointers.  Unless this is just a debug thing?

Yes, we call Zone::discardJitCode() both when we sweep a zone and after we compact it (this can be in two separate GC slices), and that calls warmUpResetCounter().
Flags: needinfo?(jcoppeard)
(In reply to Guillaume Maudoux [:layus] from comment #42)
> Created attachment 8577736 [details]
> Analysis of the influence of the patch on octane score.
> 
> I think I have finished the patch.
> 
> There are
> 1/ A patch where all comments have been addressed an fixed (see Attachment
> 8577190 [details])
> 2/ A try run that is all green:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=f66f4db7a1dd
> 3/ A comparison of octane performance with and without the patch, based on
> ten runs on each build. If higher score is better, it seems that my patch
> improves a bit the performance on the test, but nothing significant with a
> 95% confidence interval :). (This attachment; compare.png)
> 
> PS: I finally found how to deterministically reproduce a zealous gc:
> | while(!inIon()) gc();
> and added the test case in the patch.
> 
> AFAIK, this is ready to get merged ! (at last :-)

Sounds good to me :)
I have no idea how review board patches are submitted.
Flags: needinfo?(nicolas.b.pierron)
Comment on attachment 8577190 [details]
MozReview Request: bz://731683/layus

The last version (revision 2) sounds good.
Attachment #8577190 - Flags: review+
Sorry, I have no idea how to push patches stored in review board from git.
I see that we can download a diff, but this will not keep the original meta-data of the commit.

 => setting checkin-needed.
Keywords: checkin-needed
Attachment #8577190 - Flags: review+
Comment on attachment 8577190 [details]
MozReview Request: bz://731683/layus

/r/5307 - Bug 731683 - IonMonkey: Improve Ion compilation/bailouts checks. r=nbp

Pull down this commit:

hg pull review -r 58dd315b8fd03db41bf9f7c9d9968a5d52c82091
Could this have introduced a regression?:

AWFY detected a regression/improvement on:
- slave: Mac OS X 10.10 32-bit (Mac Pro, shell)
- mode: Ion

Regression(s):
- kraken: 0.66% (regression)

Recorded range:
- http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=84f7fa96af4f&tochange=1aa2afb3de8f
Flags: needinfo?(nicolas.b.pierron)
It could have introduced a regression, yes.

The patch adds a new field to the JSScript structure. (See comment 40)
I made some tests with octane on linux, but I have no Mac machine available for testing. (See comment 42)

If it is the cause of the regression, then we could move the field around to avoid shifting the other 16bits fields or make it an 8bit-field.

@h4writer As I said, I can not run the test suites on Mac.
How can I be sure that this patch is the cause of the regression ?
Flags: needinfo?(hv1989)
(In reply to Guillaume Maudoux [:layus] from comment #50)
> It could have introduced a regression, yes.
> 
> The patch adds a new field to the JSScript structure. (See comment 40)
> I made some tests with octane on linux, but I have no Mac machine available
> for testing. (See comment 42)
> 
> If it is the cause of the regression, then we could move the field around to
> avoid shifting the other 16bits fields or make it an 8bit-field.
> 
> @h4writer As I said, I can not run the test suites on Mac.
> How can I be sure that this patch is the cause of the regression ?

As a first step it might be good to test kraken on Linux. And see if you can see something. It is quite noisy and the regression isn't that big. (So use more runs). Also it is always good to close other applications (esp. browser) and to set cpu-scaling to performance. If you can see the regression locally also, you have answered your question. If not you can needinfo me again and we look for another solution.
Note: make sure you are compiling a 32bit build (if you are on a 64bit OS). Since I only saw the regression on 32bit IIRC.
Flags: needinfo?(hv1989)
Thanks Guillaume for looking into this issue :)

(In reply to Hannes Verschore [:h4writer] from comment #49)
> Regression(s):
> - kraken: 0.66% (regression)

Is there any specific benchmark to look for?  I cannot spot the bump caused by this regression when zooming on the curve while browsing AWFY.

Also, a 0.66% regression is a first for me!  It seems to me that just a C++ compiler would cause much more random regression bigger than that.
Flags: needinfo?(nicolas.b.pierron)
I think this might just be a random noise, which just appear to be continuous with a regression which appear in the next dot, as kraken-parse-financial curve suggests:

http://arewefastyet.com/#machine=28&view=single&suite=kraken&subtest=parse-financial&start=1427126510&end=1427244772

Which would mean that this patch does not cause any regression.
Flags: needinfo?(hv1989)
This patch has landed on MC: https://hg.mozilla.org/mozilla-central/rev/0d4e6b99f0c4
Should it be marked as fixed?
The detailed information is available here:
http://arewefastyet.com/regressions/#/regression/17098

(In reply to Nicolas B. Pierron [:nbp] from comment #52)
> Thanks Guillaume for looking into this issue :)
> 
> (In reply to Hannes Verschore [:h4writer] from comment #49)
> > Regression(s):
> > - kraken: 0.66% (regression)
> 
> Is there any specific benchmark to look for?  I cannot spot the bump caused
> by this regression when zooming on the curve while browsing AWFY.

I have no sub benchmark tests. So cannot give a more specific subtests. Working on that.
But if you view the above data, you can click on the benchmark and get a nice graph.
It seems that if you look at the graph and disable Chrome (v8-turbofan) you get a sense of the regression.
Also click to view 25 datapoints.

It seems to be around 1035.2 before the regression (ignoring the platform regression).
And around 1042 after this landed.

> 
> Also, a 0.66% regression is a first for me!  It seems to me that just a C++
> compiler would cause much more random regression bigger than that.

So there is indeed noise on the benchmark. But looking at a view points before/after the regression it seems that the average over multiple points before/after are consistently slower.

Now I'm still not saying this is definitely a regression. The data just makes that I cannot ignore there is a potential regression. That is why I requested to try to reproduce this locally. If he is unable to reproduce, I will probably try to reproduce on the Mac slave and see if I can manually reproduce. Now I prefer to first let the author do this, since it doesn't scale for me to test all regressions locally!
Flags: needinfo?(hv1989)
(In reply to Guilherme Lima from comment #54)
> This patch has landed on MC:
> https://hg.mozilla.org/mozilla-central/rev/0d4e6b99f0c4
> Should it be marked as fixed?

Indeed, thanks.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Flags: needinfo?(jdemooij)
(In reply to Hannes Verschore [:h4writer] from comment #51)
> (In reply to Guillaume Maudoux [:layus] from comment #50)
> > @h4writer As I said, I can not run the test suites on Mac.
> > How can I be sure that this patch is the cause of the regression ?
> 
> As a first step it might be good to test kraken on Linux. And see if you can
> see something. It is quite noisy and the regression isn't that big. (So use
> more runs).

Guillaume, did you manage to reproduce this issue locally?
Flags: needinfo?(layus.on)
Not yet.

Kraken is not included in ion source,
and I am still struggling with NixOS.

Still on my TODO list...
Flags: needinfo?(layus.on)
(In reply to Guillaume Maudoux [:layus] from comment #58)
> Kraken is not included in ion source,

You can find the benchmark here (which is also the version on awfy):
https://github.com/h4writer/arewefastyet/tree/master/benchmarks
Regression set as 'wontfix': The introduction of a new "uint16_t" in JSScript causes some small ?cpu cache? changes. Not-actionable.
Attachment #8577190 - Attachment is obsolete: true
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: