Closed Bug 1242692 · Opened 8 years ago · Closed 8 years ago

16-92% Linux 64 * regression on Mozilla-Inbound on Jan 21, 2016 from push 8cd99e9b6034

Categories: Core :: Graphics, defect
Status: RESOLVED WONTFIX
Tracking Status: firefox47 --- affected
People: Reporter: jmaher; Unassigned
Keywords: perf, regression
Whiteboard: [talos_regression]
Attachments: (1 file)

Talos has detected a Firefox performance regression from your commit 8cd99e9b6034 in bug 1180942.  We need you to address this regression.

This is a list of all known regressions and improvements related to your bug:
http://alertmanager.allizom.org:8080/alerts.html?rev=8cd99e9b6034a6417ba17b426e6537ccb8aa9bf2&showAll=1

On the page above you can see the Talos alert for each affected platform, as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test, please see: https://wiki.mozilla.org/Buildbot/Talos/Tests#tpaint

Reproducing and debugging the regression:
If you would like to re-run this Talos test on a potential fix, use try with the following syntax:
try: -b o -p linux64 -u none -t other  # add "mozharness: --spsProfile" to generate profile data

To run the test locally and do a more in-depth investigation, first set up a local Talos environment:
https://wiki.mozilla.org/Buildbot/Talos/Running#Running_locally_-_Source_Code

Then run the following command from the directory where you set up Talos:
talos --develop -e <path>/firefox -a tpaint
Here is a compare view:
https://treeherder.allizom.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=16bb246db642&newProject=mozilla-inbound&newRevision=8cd99e9b6034&framework=1

tsvgx on e10s sees a win for the hixie* tests :)

My understanding is that this is expected and that we need to consider these regressions a new baseline.  :nical, please confirm, and either outline any future work or state that this is the new standard and there is no work left to do.
Flags: needinfo?(nical.bugzilla)
(In reply to Joel Maher (:jmaher) from comment #1)
> My understanding is this is expected and we need to consider these
> regressions a new baseline.

Yes. I explained the situation in bug 1180942. With asynchronous backends like xrender we tend to measure something different than what we measure with synchronous backends, since the actual painting work is done outside of gecko and, in the best case, is not part of the measurement.
This also means that what we measure is different from what users see, because the xserver can spend a long time rendering a frame after gecko has moved on.
On the systems that I tested, the user experience is actually generally improved, and this is the main reason for this change (there are other reasons to want to get rid of xrender in gecko). I am sure we'll find things that some versions of xrender do faster than the cairo-image backend we ship, but unless we see catastrophic regressions on the user-experience side of things, we'll have to take them as trade-offs, and I am confident that for performance this is actually a good one.

Switching to a synchronous cpu backend indeed sets a new baseline, one which we are comfortable with because it's the drawing backend we currently use in other platforms such as android, b2g and windows xp.

There is no future work planned that is related to or caused by this particular change. Eventually we'll switch from cairo-image to skia, and at that point we'll have a more apples-to-apples comparison on talos, but the move to skia has its own motivations, separate from the switch from cairo-xrender to cairo-image.
Flags: needinfo?(nical.bugzilla)
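As an aside, the measurement difference nical describes can be illustrated with a small simulation (this is not Gecko code; the backend names and costs are stand-ins): timing a synchronous paint call captures the rasterization cost, while timing an asynchronous one captures only the cost of handing the command to the server.

```python
import time
import threading
import queue

PAINT_COST = 0.05  # hypothetical per-frame rasterization cost, in seconds

def paint_sync():
    """Synchronous backend (like cairo-image): rasterization happens
    inside the call, so a timer around the call includes the paint."""
    time.sleep(PAINT_COST)  # stand-in for actual CPU rasterization

class AsyncPaintServer:
    """Asynchronous backend (like cairo-xrender): paint() only enqueues
    a command; a separate "server" thread does the real work later,
    after the caller's timer has already stopped."""
    def __init__(self):
        self.commands = queue.Queue()
        self.done = threading.Event()
        threading.Thread(target=self._serve, daemon=True).start()

    def _serve(self):
        self.commands.get()      # receive the paint command
        time.sleep(PAINT_COST)   # the server renders; the client has moved on
        self.done.set()

    def paint(self):
        self.commands.put("paint")  # returns almost immediately

# Time one "frame" with each backend.
t0 = time.perf_counter()
paint_sync()
sync_measured = time.perf_counter() - t0

server = AsyncPaintServer()
t0 = time.perf_counter()
server.paint()
async_measured = time.perf_counter() - t0

server.done.wait()  # the real work still happened, just off the clock
print(f"sync:  {sync_measured * 1000:.1f} ms")   # includes the paint cost
print(f"async: {async_measured * 1000:.1f} ms")  # near zero
```

With the synchronous backend the measured time includes the paint; with the asynchronous one it does not, which is why switching backends shifts the Talos baseline without the underlying work necessarily getting slower.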
(In reply to Nicolas Silva [:nical] from comment #2)
> With asynchronous backends like xrender we tend to measure something different
> than what we measure with synchronous backends

So are the tests still valid to detect regressions with the new backend? e.g. let's say in a week from now there's a new user visible regression, would the tests still detect it?

If not, should we drop those (which?) tests for linux? or modify them in a way such that they're still relevant? (recall, the main goal of the tests is to detect regressions between close enough revisions - not from a year ago when we had different technologies).

> On the systems that I tested, the user experience is actually generally
> improved, and this is the main reason for this change (there are other
> reasons to want to get rid of xrender in gecko)

Would you be able to make this comparison for test runs of the tests which supposedly regressed?
Flags: needinfo?(nical.bugzilla)
(In reply to Avi Halachmi (:avih) from comment #3)
> So are the tests still valid to detect regressions with the new backend?
> e.g. let's say in a week from now there's a new user visible regression,
> would the tests still detect it?

Our tests are now more valid to detect regressions than they were before, because if something happens to be awfully slow in the painting phase it will be measured, and as a bonus we'll see what is slow in cleopatra.

> Would you be able to make this comparison for test runs of the tests which
> supposedly regressed?

You mean look at the talos test suite running before and after?
What I tried was to switch the pref and browse around for a week or two. I also asked :padenot to do the same for a few days and he had the same feedback. browsing "felt" smoother generally and some specific web pages went from incredibly awful to the normal performance we would expect. Off the top of my head the difference was spectacular on treeherder and on pages that use some css filters like blurs. Tab switching felt generally better but I might be biased by the fact that I was often switching to treeherder which would take forever.
Anyway, looking (as in with my eyes) at the talos tests themselves didn't help much because it's hard to get a good feel when the framerate is already high enough before and after a patch. The tests don't react to inputs, and time between input and visual feedback is usually a good way to "feel" the difference.
Flags: needinfo?(nical.bugzilla)
(In reply to Nicolas Silva [:nical] from comment #4)
> (In reply to Avi Halachmi (:avih) from comment #3)
> Our tests are now more valid to detect regressions than they were before,
> because if something happens to be awfully slow in the painting phase it
> will be measured, and as a bonus we'll see what is slow in cleopatra.
> ...
> What I tried was to switch the pref and browse around for a week or two. I
> also asked :padenot to do the same for a few days and he had the same
> feedback. browsing "felt" smoother ...

These are both great.

> > Would you be able to make this comparison for test runs of the tests which
> > supposedly regressed?
> 
> You mean look at the talos test suite running  before and after?

Yes.

> Anyway, looking (as in with my eyes) at the talos tests themselves didn't
> help much because it's hard to get a good feel when the framerate is already
> high enough before and after a patch.

So nothing you could notice as meaningful diff during the test runs, for better or worse? (just collecting info, not suggesting anything).
(In reply to Avi Halachmi (:avih) from comment #5)
> So nothing you could notice as meaningful diff during the test runs, for
> better or worse? (just collecting info, not suggesting anything).

I didn't dig much because I (perhaps too) quickly decided that staring at talos runs was not giving me useful data. From our discussion on irc, I guess I didn't spend enough time looking at talos running to find the things you are thinking about.
reading the last comment, I am unclear if there is additional work here- is there something we should look into more- subtests, larger data sets, specific tests?
(In reply to Joel Maher (:jmaher) from comment #7)
> reading the last comment, I am unclear if there is additional work here- is
> there something we should look into more- subtests, larger data sets,
> specific tests?

I don't think there is any additional work there. I would keep xrender disabled even if it were causing user-visible performance regressions (and I am confident it's actually the other way around).
For example, this is what treeherder looks like on a certain computer with stable versions of xorg and the open source (nouveau) driver for my nvidia gpu.
(In reply to Nicolas Silva [:nical] from comment #9)
> For example, this is what treeherder looks like on a certain computer with
> stable versions of xorg and the open source (nouveau) driver for my nvidia
> gpu.

I meant with xrender enabled. Disabling xrender is more than a win-win situation at this point if you add up all the good things that we got from disabling it (more like win-win-win-win-win).
:avih, please confirm you are ok with this and mark this as resolved/wontfix :)
Flags: needinfo?(avihpit)
Yeah, can't beat win-win-win-win :)
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(avihpit)
Resolution: --- → WONTFIX