Closed Bug 1019983 Opened 5 years ago Closed 5 years ago

Predict types of MCompare better

Categories

(Core :: JavaScript Engine: JIT, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla33

People

(Reporter: h4writer, Assigned: h4writer)

Details

Attachments

(1 file, 1 obsolete file)

Currently the types of MCompare are predicted by looking at the comparestubs we saw during baseline. This has two issues:

1) What about having "undefined/boolean/int32 == int32". Baseline will say there is a int32==int32 stub. But nothing more. Since the other paths take the fallback path. IonMonkey sees the int32==int32 stub and eagerly only supports int32 == int32. For all other types (which we have observed) it will bail.

2) What about places where we don't have type information yet. Looking at the baseline stubs will also fail and don't give use a better compareType.

I propose to not look at Baseline stubs, since we can make wrong predictions due of it. We should look at the resultTypeSet() of the inputs and if the type information is missing, we can make a prediction based on type of the other input. Else we don't, since in that case the reason we haven't found a better compareType is because there flows more inputs type through.

This fixes some constant bailing found in octane-pdfjs (due to point 1).
Attached patch Patch (obsolete) — Splinter Review
This doesn't cause a regression for me. Even though I don't support
Compare_XXXXXMaybeCoerceXXX. We can always add this again if needed lateron.
Assignee: nobody → hv1989
Attachment #8433685 - Flags: review?(bhackett1024)
With this patch, we can predict all compares in deltablue/crypto/raytrace/earleyboyer which we didn't do before. Also it removes the bailing in pdfjs due to undefined/boolean/int32 == int32 being specialized to MCompare_Int32
The way we usually do this is by setting a bit on the Baseline fallback stub if we had operands we didn't attach a stub for. Ion can check that and ignore the Baseline info in that case.
(In reply to Jan de Mooij [:jandem] from comment #3)
> The way we usually do this is by setting a bit on the Baseline fallback stub
> if we had operands we didn't attach a stub for. Ion can check that and
> ignore the Baseline info in that case.

I can totally do that. But after the new guessing code (to support more cases), the expectedCompareType is almost never used and when it is it are cases that expectedCompareType doesn't help. So for simplicity I removed that code...
Can you do some digging to find out why it was added in the first place? To make sure we don't regress that...
(In reply to Jan de Mooij [:jandem] from comment #5)
> Can you do some digging to find out why it was added in the first place? To
> make sure we don't regress that...

It was added for "move typeanalysis to ionbuilder". To fix some regressions in kraken/octane. Brian didn't provide which subtests. But running locally I see no regression with this patch. 

An examplory testcase that benifits a lot is:

> function f(y) {
>     return (y > 0) == y;
> 
>     // < Make sure f doesn't get inlined
>     for (var i=0; i<1000; i++) {
>       print(1); print(1); print(1); print(1); print(1); print(1)
>     }
>     // >
> }
> for (var i=0; i<10000; i++) {
>   assertEq(f(0), true);
> }
> for (var i=0; i<10000; i++) {
>   assertEq(f(null), false);
> }

This would bail every iteration in the second loop. Since we didn't emit code to handle the "null" case.
(In reply to Hannes Verschore [:h4writer] from comment #6)
> It was added for "move typeanalysis to ionbuilder". To fix some regressions
> in kraken/octane. Brian didn't provide which subtests. But running locally I
> see no regression with this patch. 

Great!

> This would bail every iteration in the second loop. Since we didn't emit
> code to handle the "null" case.

I see, nice find.
Comment on attachment 8433685 [details] [diff] [review]
Patch

Review of attachment 8433685 [details] [diff] [review]:
-----------------------------------------------------------------

Well, I have some concerns about this approach:

- Re: point #1, if the comparison is between a boolean and int32 then baseline should say there is a Compare_Int32WithBoolean stub and we should generate code which can coerce the non-int32 operand.  So it sounds more like something in this mechanism is broken, and less like it needs to be removed entirely.

- Is the problem with pdf.js and the microbenchmark that we eventually invalidate the IonScript but recompile generating the same bailing-out code?  If so then that sounds like it could be easily fixed per comment 3.

- This change leaves us unable to specialize comparisons when IonBuilder doesn't have good type information about either operand.  Even if this doesn't help us on octane or kraken I don't think that warrants removing this optimization entirely.

::: js/src/jit/MIR.cpp
@@ +1946,5 @@
> +static bool
> +TypeInformationIsMissing(MDefinition *input)
> +{
> +    return (input->resultTypeSet() && input->resultTypeSet()->empty()) ||
> +           (!input->resultTypeSet() && input->type() == MIRType_Value);

This test is kind of strange, and looks pretty overspecialized --- if the input has a result type set but it is unknown then this won't match, and what's wrong with type sets that have multiple types but still end up as MIRType_Value?  What is the property you're looking for here?
Attachment #8433685 - Flags: review?(bhackett1024)
(In reply to Brian Hackett (:bhackett) from comment #8)
> Comment on attachment 8433685 [details] [diff] [review]
> Patch
> 
> Review of attachment 8433685 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Well, I have some concerns about this approach:
> 
> - Re: point #1, if the comparison is between a boolean and int32 then
> baseline should say there is a Compare_Int32WithBoolean stub and we should
> generate code which can coerce the non-int32 operand.  So it sounds more
> like something in this mechanism is broken, and less like it needs to be
> removed entirely.
> 
> - Is the problem with pdf.js and the microbenchmark that we eventually
> invalidate the IonScript but recompile generating the same bailing-out code?
> If so then that sounds like it could be easily fixed per comment 3.
> 
> - This change leaves us unable to specialize comparisons when IonBuilder
> doesn't have good type information about either operand.  Even if this
> doesn't help us on octane or kraken I don't think that warrants removing
> this optimization entirely.

Answering the three points together:
So we are bailing (not invalidating, but invalidating would indeed create the same code again and we will bail again.). We decided to look at the stubs present in the compare. Now comment 3 would indeed solve the issue (the bails). But it also renders that code as good as useless, since it almost never triggers anymore ...
Since it isn't really needed and also complex and we can get the same behaviour (without bails) with this code, I think it might be a good idea to remove the code.
If you think it will still give benefits and we should definitely keep that code. I definitely also include the idea in comment 3. But I feel (since it will happen almost never) it would be a place where bugs can creep which are hard to find.

> 
> ::: js/src/jit/MIR.cpp
> @@ +1946,5 @@
> > +static bool
> > +TypeInformationIsMissing(MDefinition *input)
> > +{
> > +    return (input->resultTypeSet() && input->resultTypeSet()->empty()) ||
> > +           (!input->resultTypeSet() && input->type() == MIRType_Value);
> 
> This test is kind of strange, and looks pretty overspecialized --- if the
> input has a result type set but it is unknown then this won't match, and
> what's wrong with type sets that have multiple types but still end up as
> MIRType_Value?  What is the property you're looking for here?

TypeInformationIsMissing returns if there is absolute no indication about the type.

i.e. for "string == x" where x is missing we can speculate the type is string. Now if there is any indication the x is not string, we shouldn't speculate. So "string == Value", where value is "string/int", we shouldn't speculate "string == string".
So this function reports if we can speculate (== if there is no reason why we shouldn't speculate).
In this case that is when resultTypeSet() is empty or type is MIRType_Value, but no resultTypeSet is set...
(In reply to Hannes Verschore [:h4writer] from comment #9)
> Answering the three points together:
> So we are bailing (not invalidating, but invalidating would indeed create
> the same code again and we will bail again.). We decided to look at the
> stubs present in the compare. Now comment 3 would indeed solve the issue
> (the bails). But it also renders that code as good as useless, since it
> almost never triggers anymore ...

I'm not following here, it seems like there are plenty of cases where baseline could still provide information that lets the compare be specialized.  Comment 3 is just saying that baseline should better distinguish when it is providing complete vs. incomplete information about the types that it has seen at the opcode.  If there was a int32/boolean vs. int32 compare then baseline could provide that information without requiring accurate types in IonBuilder, but we just don't want baseline to pretend it has information about e.g. an int32/null vs. int32 comparison when it didn't generate stubs for all combinations of input types it saw.

> TypeInformationIsMissing returns if there is absolute no indication about
> the type.
> 
> i.e. for "string == x" where x is missing we can speculate the type is
> string. Now if there is any indication the x is not string, we shouldn't
> speculate. So "string == Value", where value is "string/int", we shouldn't
> speculate "string == string".

OK, but why not?  If the value is 'string|int' then ideally here we could look at the stubs which baseline generated as a hint that maybe this is always a string comparison.  If baseline says it is then we generate string code; if an int comes in then we should bail and (immediately or eventually) invalidate, with a new baseline stub (or flag on the fallback stub) that prevents us from specializing on string when we re-ion-compile.  If we keep bailing out but never actually invalidate the code, that is a separate bug that we shouldn't be papering over.

Generally, if we have better type information (i.e. 'string|int' is better than 'unknown') we shouldn't end up generating worse code.  A value with 'unknown' type is definitely an indication the value might not be a string, at least as much as if the type set is 'string|int'.  Treating that 'unknown' value as being better than 'string|int' goes against the grain and to me means we are overspecializing on the code that happens to appear in these benchmarks we're optimizing for, and will make it harder to improve this code when optimizing for other code bases.
(In reply to Brian Hackett (:bhackett) from comment #10)
> Generally, if we have better type information (i.e. 'string|int' is better
> than 'unknown') we shouldn't end up generating worse code.  A value with
> 'unknown' type is definitely an indication the value might not be a string,
> at least as much as if the type set is 'string|int'.  Treating that
> 'unknown' value as being better than 'string|int' goes against the grain and
> to me means we are overspecializing on the code that happens to appear in
> these benchmarks we're optimizing for, and will make it harder to improve
> this code when optimizing for other code bases.

We aren't generating worse code by doing this ?!

- string/int == string => Should use a MCompare_Unkown, since that handles all cases. Which is the optimal solution.
- string == string => Should use a MCompare_String, since that will be the optimal solution.
- missing == string => Currently does a MCompare_Unknown, untill we hopefully notice a new type. I would change this to use MCompare_String. Since it would be faster if it is indeed a string==string. If it isn't we will bail and see the new type, which should make us invalidate and optimize the compare to the right type...
(In reply to Hannes Verschore [:h4writer] from comment #11)
> We aren't generating worse code by doing this ?!
> 
> - string/int == string => Should use a MCompare_Unkown, since that handles
> all cases. Which is the optimal solution.
> - string == string => Should use a MCompare_String, since that will be the
> optimal solution.
> - missing == string => Currently does a MCompare_Unknown, untill we
> hopefully notice a new type. I would change this to use MCompare_String.
> Since it would be faster if it is indeed a string==string. If it isn't we
> will bail and see the new type, which should make us invalidate and optimize
> the compare to the right type...

The idea here is that there are two lattices, one for the quality of type information and one for the quality (i.e. speed) of generated code.  If we have better type information we should generate better code [1].

For type information, {} (aka missing) is better than {string}, which is better than {string,int}, which is better than the set of all types (aka unknown).

For code quality, generating MCompare_String is better than generating MCompare_Unknown.

So if you generate MCompare_String when a type set is unknown but MCompare_Unknown when a type set is {string,int} then you're going against principle [1] above; code quality will actually decrease if we are able to improve type information and determine that the type set isn't unknown but rather can only contain strings and ints.  This wouldn't be the end of the world but having non-monotonic behavior like this can make further optimization and perf analysis difficult and imo should be avoided whenever possible.

btw one source of confusion I think is that the TypeInformationIsMissing function in the patch conflates 'missing' type sets with 'unknown' type sets.  These are at opposite ends of the lattice; you can generate whatever comparison kind you want for 'missing' type sets since we will need to invalidate before the associated code runs anyways.  (In some cases we try to avoid this invalidation by speculatively degrading type information, e.g. see PropertyReadNeedsTypeBarrier after the 'If this access has never executed ...' comment.  This is an exception to principle [1]).

string == string => MCompare_String for sure, yeah.

string/int == string => There's not enough static type information to really say that MCompare_String or MCompare_Unknown is the right choice here, but we can get additional type information from baseline info.  If the lhs has only ever seen strings then I think we should use MCompare_String, since that's faster than MCompare_Unknown and there really isn't solid information that in practice this won't be a string.  Per the baseline info, if the lhs has ever seen an integer we should use MCompare_Unknown.  If there isn't any baseline information due to the compare not executing then that is better (if maybe not as helpful) information so we should generate MCompare_String.  If an int comparison executes in the future we will invalidate and add a baseline stub/info which prevents us from using MCompare_String after recompiling the function.

unknown == string => There isn't enough static type information here either to determine whether we should use MCompare_String or MCompare_Unknown, so I think we should do the same thing that we do for string/int == string.
That is not correct.

I agree about:
For type information, {} (aka missing) is better than {string}, which is better than {string,int}.

I don't agree about:
For code quality, generating MCompare_String is better than generating MCompare_Unknown.

MCompare_String is only better if the only types that flow in are {string}. If {string,int} flows in MCompare_Unknown is code quality-wise the best/most performant compare!

So given: "If we have better type information we should generate better code" ... We still honor this!

{missing} == {string} => MCompare_String [1]
{string} == {string} => MCompare_String
{string,int} == {string} => MCompare_Unknown

[1] if {missing} becomes {string} we have better type information and still generate the optimal solution
[1] if {missing} becomes {string,int} we have better type information and decide to generate a compare that fits the better type information. Which is MCompare_Unknown.

{unknown} == {string} => still stays MCompare_Unknown! That's what the TypeInformationIsMissing tries to capture. We only do this optimization when the typeset is missing. If it is unknown, we treat it as "all values can flow through". So the best compare (code-wise) is MCompare_Unknown...
Since missing and unknown are opposite ends of the spectrum, perhaps s/TypeInformationIsMissing/TypeInformationIsUnknown/?
(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #14)
> Since missing and unknown are opposite ends of the spectrum, perhaps
> s/TypeInformationIsMissing/TypeInformationIsUnknown/?

Missing and unknown are indeed on opposite ends of the spectrum. But I'm deliberate choosing "Missing". So "IsMissing" is the correct term here. (I don't want to adjust behaviour with Unknown)
(In reply to Hannes Verschore [:h4writer] from comment #15)
> Missing and unknown are indeed on opposite ends of the spectrum. But I'm
> deliberate choosing "Missing". So "IsMissing" is the correct term here. (I
> don't want to adjust behaviour with Unknown)

Okay, I misunderstood what you meant in comment #15 then. Sorry about the noise! :)
To finally get this landed, just do the easy part, instead of making better predictions.

Decreases the amount of bailing on octane-pdfjs from 158 to 85. I see no improvement on score though.
Attachment #8433685 - Attachment is obsolete: true
Attachment #8448189 - Flags: review?(jdemooij)
Comment on attachment 8448189 [details] [diff] [review]
NoteUnoptimizableAccess

Review of attachment 8448189 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/BaselineInspector.cpp
@@ +227,5 @@
>          return MCompare::Compare_Unknown;
>  
> +    ICStub *fallback = (second) ? second->next() : first->next();
> +    JS_ASSERT_IF(fallback, fallback->isFallback());
> +    if (fallback) {

Nit: you could write this as:

if (ICStub *fallback = second ? second->next() : first->next()) {
    JS_ASSERT(fallback->isFallback());
    ...
}
Attachment #8448189 - Flags: review?(jdemooij) → review+
(You don't need the fallback->isFallback() assert; toCompare_Fallback() will assert if it's a different kind of stub.)
https://hg.mozilla.org/integration/mozilla-inbound/rev/0f85fbed35dc

Oh I only read your comment that the assert is not needed. Guess it can't hurt?
https://hg.mozilla.org/mozilla-central/rev/0f85fbed35dc
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
Depends on: 1066659
No longer depends on: 1066659
You need to log in before you can comment on or make changes to this bug.