Closed Bug 939835 Opened 11 years ago Closed 11 years ago

tcanvasmark regression detected on osx and windows platforms

Categories

(Core :: JavaScript Engine, defect)

28 Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla28
Tracking Status
firefox27 --- unaffected
firefox28 + fixed

People

(Reporter: jmaher, Assigned: bzbarsky)

References

Details

(Keywords: perf, regression, Whiteboard: [talos_regression][qa-])

Attachments

(1 file, 2 obsolete files)

A regression was posted to dev.tree-management this weekend:
https://groups.google.com/forum/#!topic/mozilla.dev.tree-management/cUL_uzUZjbI

This only shows the worse offender, actually this is the case on osx 10.6, 10.7, 10.8 and windows xp, 7, and 8.

You can see the graph server view of this regression:
http://graphs.mozilla.org/graph.html#tests=[[297,131,25]]&sel=none&displayrange=30&datatype=running

You can also see the datazilla view of this (all platforms):
https://datazilla.mozilla.org/?start=1384184903&stop=1384789703&product=Firefox&repository=Mozilla-Inbound&test=tcanvasmark&page=3D%20Rendering%20-%20Maths-%20polygons-%20image%20transforms&graph_search=7a33656ad047&tr_id=3564135&graph=3D%20Rendering%20-%20Maths-%20polygons-%20image%20transforms&x86=true&x86_64=true&error_bars=false&project=talos

and for windows 7, you can see this details of all the subtests for tcanvasmark:
https://datazilla.mozilla.org/?start=1384184903&stop=1384789703&product=Firefox&repository=Mozilla-Inbound-Non-PGO&os=win&os_version=6.1.7601&test=tcanvasmark&graph_search=7a33656ad047&tr_id=3564135&graph=3D%20Rendering%20-%20Maths-%20polygons-%20image%20transforms&x86=true&x86_64=true&error_bars=false&project=talos

We only fail on 1 test, but it is noticeable. 

There was a series of changes landed during this one push, I wasn't sure which one caused the regression.

bz, can you help pinpoint the root cause of this and get a fix in?
Flags: needinfo?(bzbarsky)
I don't see how those changes could possibly have affected canvas performance for the worse, unless the test is doing a bunch of cloneNode/importNode() calls....

How confident are we in the changeset regression range?  Where does the source for this test live?  Specifically, I'd need the "3D Rendering - Maths- polygons- image transforms" test, right?
Flags: needinfo?(bzbarsky) → needinfo?(jmaher)
The range is pretty clear pre/post the commit to inbound- that is the only data I have to work with.

For the sources, it is all mashed up in a big file:
http://hg.mozilla.org/build/talos/file/55fd35f46f08/talos/page_load_test/canvasmark/scripts/canvasmark_v6.js#l994

It looks like we end up calling:
http://hg.mozilla.org/build/talos/file/55fd35f46f08/talos/page_load_test/canvasmark/scripts/canvasmark_v6.js#l1059

which appears to call the k3d library.  snorp, do you know more about this test?
Flags: needinfo?(jmaher) → needinfo?(snorp)
It looks to be some sort of library for drawing 3d stuff to the 2d canvas. I'd wager there is a bunch of math in there, maybe this is just a JS perf regression?
Flags: needinfo?(snorp)
Maybe, but the patches involved here should not have caused a JS perf regression (and on the contrary, should have made us output better code in JS that does DOM calls).

Are there any instructions for running this test locally?
talos is easy to run:
https://wiki.mozilla.org/Buildbot/Talos/Running

You can run canvasmark:
http://www.kevs3d.co.uk/dev/canvasmark/

ping me on irc if you have any problems running it or getting the non talos version to run.
Thanks.  Going to try to reproduce the regression locally.
Assignee: nobody → bzbarsky
Version: Trunk → 28 Branch
Hrm.  The non-talos version doesn't show the per-test scores.

The instructions in comment 5 don't seem to give me an option to run just canvasmark, or better yet just the "Maths- polygons- image transforms" test, right?  I have to run all of "other"?
revision,cmark1, cmark2, cmark3,try link
baseline,5698,5741,5755,https://tbpl.mozilla.org/?tree=Try&rev=a396d912eb06
e8e777ff1e47,5711,5758,5731,https://tbpl.mozilla.org/?tree=Try&rev=308e850a316f
adf943f1879e,5639,5754,5775,https://tbpl.mozilla.org/?tree=Try&rev=9f1647ce1dba
a90375867a74,5734,5778,5705,https://tbpl.mozilla.org/?tree=Try&rev=30b6e9cc0ea7
b1e54e176909,5682,5663,5651,https://tbpl.mozilla.org/?tree=Try&rev=2314f920eb01
7a33656ad047,5661,5635,5665,https://tbpl.mozilla.org/?tree=Try&rev=56d8f630c4bd

I retriggered each run so I could have 3 data points per try server.  Right now the results haven't come in via datazilla (a backlog in processing, some are in though), but all the tests are done.

It appears the last 2 try pushes are causing the benchmark overall to slow down and report lower scores.

This translates to:
Bug 937772. Make better use of our out-of-band type information for unboxing object-valued return values of DOM getters and methods. r=h4writer
Bug 932501. Drop nextElementSibling/previousElementSibling from DocumentType. r=smaug

At least this is narrowing down the patches a bit.
That would point to bug 937772, then.

Which is odd, since the whole point of that patch was to remove some jitcode that's not needed...

Hannes, any idea what could be going wrong there?
Blocks: 937772
Flags: needinfo?(hv1989)
So yes, comparing the run before that change:

09:22:31     INFO -  |6;3D Rendering - Maths- polygons- image transforms;755;737;750;746;747

with the run after:

10:24:51     INFO -  |6;3D Rendering - Maths- polygons- image transforms;632;623;631;633;627

the regression is unmistakable.  I'm going to see if I can somehow extract that test to run locally after all so I can narrow down more which code is responsible....
I can only think of one thing responsible for this:

IonAnalysis.cpp TryEliminateTypeBarrierFromTest and TryEliminateTypeBarrier

It checks if an "if" filters undefined or null and does some optimization for that.
Now it is hardcoded to test for MUnbox::TypeBarrier and disregards MUnbox::Infallible. Can you also let MUnbox::Infallible through and check if that removes the regression?
Flags: needinfo?(hv1989)
Hmm.  So I tried that, and it seems to help a bit, but not nearly enough (maybe removes 1/3 of the regression.... maybe).

I've definitely determined that the change responsible is that unbox inserted in getPropTryCommonGetter.  If I just comment that bit out, the performance is where it should be.

I'm going to see if I can get a better idea of which getter might be responsible here...
There are two changes here.

1) When trying to eliminate type barriers, skip over infallible
unboxes, not just barriered ones.  For this bit, should I just be
testing barrier->input()->toUnbox()->mode() != Fallible instead?

2) When we statically know a value is double or int but dynamically
we've only observed integers, keep allowing TI to barrier and
specialize to integer instead of forcing the value to be treatd as a
double.  I considered adding another JSValueType argument to
EnsureDefiniteType and passing in the types->getKnownTypeTag() so I
wouldn't have to duplicate the logic, but wasn't quite sure what to
call the argument and in what sense EnsureDefiniteType would ensure
things if it would sometimes explicitly fall back on barriering.  If
you'd prefer we consolidated the logic somehow, let me know?
Attachment #8334826 - Flags: review?(hv1989)
Joel, thank you very much for catching this!
Component: Talos → JavaScript Engine
Product: Testing → Core
Whiteboard: [talos_regression] → [need review][talos_regression]
Comment on attachment 8334826 [details] [diff] [review]
Fix up performance regressions from bug 937772.

Review of attachment 8334826 [details] [diff] [review]:
-----------------------------------------------------------------

For (1) I would remove the check for mode(). I'm pretty sure that is allowed. But please do a try build for test coverage.
For (2) it is getting really dense there. I think using pushDOMTypeBarrier would clarify the code a lot...

::: js/src/jit/IonAnalysis.cpp
@@ +1573,5 @@
>      // Disregard the possible unbox added before the Typebarrier for checking.
>      MDefinition *input = barrier->input();
> +    if (input->isUnbox() &&
> +        (input->toUnbox()->mode() == MUnbox::TypeBarrier ||
> +         input->toUnbox()->mode() == MUnbox::Infallible || true))

I added this check just to be sure. But I actually think you can remove the mode() checking. It should work for all modes. So:
if (input->isUnbox())

@@ +1578,5 @@
>          input = input->toUnbox()->input();
>  
>      if (test->getOperand(0) == input && direction == TRUE_BRANCH) {
>          *eliminated = true;
>          barrier->replaceAllUsesWith(barrier->input());

Can you add a follow-up bug to make barrier->input() Infallible if we arrive here and barrier->input() is a Unbox operation? Should give us some more performance improvement.

@@ +1625,5 @@
>  
>      // Disregard the possible unbox added before the Typebarrier.
>      if (barrier->input()->isUnbox() &&
> +        (barrier->input()->toUnbox()->mode() == MUnbox::TypeBarrier ||
> +         barrier->input()->toUnbox()->mode() == MUnbox::Infallible || true))

Here also no need to check the mode(). Should be fine without.

::: js/src/jit/IonBuilder.cpp
@@ +5268,1 @@
>          }

What about stuffing all this into a function called:

pushDOMTypeBarrier(call, types, target);

That way we don't need this code on two different locations.
Attachment #8334826 - Flags: review?(hv1989)
Comment on attachment 8334826 [details] [diff] [review]
Fix up performance regressions from bug 937772.

Review of attachment 8334826 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/IonAnalysis.cpp
@@ +1573,5 @@
>      // Disregard the possible unbox added before the Typebarrier for checking.
>      MDefinition *input = barrier->input();
> +    if (input->isUnbox() &&
> +        (input->toUnbox()->mode() == MUnbox::TypeBarrier ||
> +         input->toUnbox()->mode() == MUnbox::Infallible || true))

When thinking more about this, we shouldn't do this for MUnbox::Fallible:

if (input->isUnbox() && input->toUnbox()->mode() != MUnbox::Fallible)

@@ +1625,5 @@
>  
>      // Disregard the possible unbox added before the Typebarrier.
>      if (barrier->input()->isUnbox() &&
> +        (barrier->input()->toUnbox()->mode() == MUnbox::TypeBarrier ||
> +         barrier->input()->toUnbox()->mode() == MUnbox::Infallible || true))

When thinking more about this, we shouldn't do this for MUnbox::Fallible:

if (input->isUnbox() && input->toUnbox()->mode() != MUnbox::Fallible)
Filed bug 940686 as a followup
Attachment #8334878 - Flags: review?(hv1989)
Attachment #8334826 - Attachment is obsolete: true
Er, I missed comment 17.  Fixing that up.
With that fixup
Attachment #8334886 - Flags: review?(hv1989)
Attachment #8334878 - Attachment is obsolete: true
Attachment #8334878 - Flags: review?(hv1989)
Comment on attachment 8334886 [details] [diff] [review]
Fix up performance regressions from bug 937772.

Review of attachment 8334886 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks. Much better.

::: js/src/jit/IonAnalysis.cpp
@@ +1622,5 @@
>      const types::TemporaryTypeSet *inputTypes = barrier->input()->resultTypeSet();
>  
>      // Disregard the possible unbox added before the Typebarrier.
>      if (barrier->input()->isUnbox() &&
> +        barrier->input()->toUnbox()->mode() != MUnbox::Fallible)

This now fits on a single line. So do this + remove {}

::: js/src/jit/IonBuilder.cpp
@@ +5247,3 @@
>      if (call->isDOMFunction()) {
> +        return pushDOMTypeBarrier(call, types, call->getSingleTarget());
> +    }

no {}

@@ +6125,5 @@
>  
> +bool
> +IonBuilder::pushDOMTypeBarrier(MInstruction *ins, types::TemporaryTypeSet *observed, JSFunction* func)
> +{
> +    JS_ASSERT(func && func->isNative() && func->jitInfo());

newline

@@ +6145,5 @@
> +            current->push(replace);
> +        }
> +    } else {
> +        JS_ASSERT(barrier);
> +    }

newline
Attachment #8334886 - Flags: review?(hv1989) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/887f595b0abb
Whiteboard: [need review][talos_regression] → [talos_regression]
Target Milestone: --- → mozilla28
https://hg.mozilla.org/mozilla-central/rev/887f595b0abb
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
datazilla seems to say this fixed the problem.
thanks for fixing this!
Thanks for filing it!
Whiteboard: [talos_regression] → [talos_regression][qa-]
You need to log in before you can comment on or make changes to this bug.