Bug 1500101 (Closed) - Opened 6 years ago, Closed 6 years ago

2.65 - 5.88% sessionrestore / sessionrestore_no_auto_restore / tart / tp5o_webext responsiveness / ts_paint / ts_paint_webext (linux64-qr, windows10-64-qr) regression on push a1f350ca24173934b0633bfff039d672d0187bbb (Wed Oct 17 2018)

Categories

(Core :: Graphics: WebRender, defect, P2)

64 Branch, x86_64, All

Tracking

Status: VERIFIED FIXED
Target Milestone: mozilla65
Tracking Status
firefox-esr60 --- unaffected
firefox63 --- unaffected
firefox64 --- wontfix
firefox65 + fixed

People

(Reporter: igoldan, Assigned: gw)

References

Details

(Keywords: perf, regression, talos-regression)

Attachments

(1 file)

Talos has detected a Firefox performance regression from push:

https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=a1f350ca24173934b0633bfff039d672d0187bbb

Since you are the author of one of the patches included in that push, we need your help to address this regression.

Regressions:

  6%  tart windows10-64-qr opt e10s stylo                          3.85 -> 4.07
  6%  tp5o_webext responsiveness linux64-qr opt e10s stylo         1.76 -> 1.85
  5%  sessionrestore linux64-qr opt e10s stylo                     841.50 -> 881.00
  5%  sessionrestore_no_auto_restore linux64-qr opt e10s stylo     860.17 -> 899.00
  4%  ts_paint linux64-qr opt e10s stylo                           857.25 -> 895.17
  3%  ts_paint_webext linux64-qr opt e10s stylo                    867.92 -> 890.92


You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=16918

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests

For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
Component: General → Graphics: WebRender
Product: Testing → Core
Flags: needinfo?(jmuizelaar)
I took a look at some of the profiles.

(In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #1)
> Here are the Gecko profiles on Linux 64bit QR:
> 
> 
> for tp5o_webext:

I didn't look at the tp5o_webext profiles: they are split up per page, but the Talos regression report doesn't break the results down by page, so there's no way to tell which pages actually got slower (and therefore which page profiles to compare).

> for ts_paint_webext:

before: https://perfht.ml/2RYomuj
after: https://perfht.ml/2RYosSH

_init in i965_dri.so has an increase in self time from 230ms to 265ms. The FlushRendering call on the main thread blocks for 40ms longer than before.

(In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #3)
> Also, these are the Gecko profiles for tart, on Windows 10 QR:

before: https://perfht.ml/2RXhmOh
after: https://perfht.ml/2RWMgWV

Composite times increase on average from 1.3ms to 1.5ms, roughly.
Here's the range of webrender commits:

https://github.com/servo/webrender/compare/98d507003c07c003ef0e0297dc4d29ee896a5868...a0a36d9b416ca3295f8def384814ffef60903a60

I guess we need to bisect the regression across these changes.
Flags: needinfo?(jmuizelaar)
Kats, do you want to do the bisection?
Flags: needinfo?(kats)
It's at least plausible that it could be related to the shared depth targets patch [1] - either because creating the extra set of FBOs is more expensive than we thought, or because the sharing causes contention somehow. But it'd be somewhat surprising.

[1] https://github.com/servo/webrender/commit/f2c94a6ff1a6e62421992e84bf8e64391631ddf8
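For context, "shared depth targets" here means one depth renderbuffer attached to several FBOs, instead of each render target owning its own depth buffer. Below is a minimal sketch of that pattern, using the raw `gl` crate bindings rather than WebRender's actual device code, and assuming an already-current GL context and pre-created color textures (the function name is made up for illustration):

```rust
use gl::types::GLuint;

// Create one shared depth renderbuffer and attach it to two FBOs, so both
// render targets reuse the same depth storage. The hypothesis above is that
// either the extra FBO setup or contention on the shared attachment could
// cost more than expected on some drivers.
unsafe fn create_fbos_with_shared_depth(color_tex: [GLuint; 2], w: i32, h: i32) -> [GLuint; 2] {
    let mut depth_rb: GLuint = 0;
    gl::GenRenderbuffers(1, &mut depth_rb);
    gl::BindRenderbuffer(gl::RENDERBUFFER, depth_rb);
    gl::RenderbufferStorage(gl::RENDERBUFFER, gl::DEPTH_COMPONENT24, w, h);

    let mut fbos: [GLuint; 2] = [0; 2];
    gl::GenFramebuffers(2, fbos.as_mut_ptr());
    for (i, &fbo) in fbos.iter().enumerate() {
        gl::BindFramebuffer(gl::FRAMEBUFFER, fbo);
        gl::FramebufferTexture2D(gl::FRAMEBUFFER, gl::COLOR_ATTACHMENT0,
                                 gl::TEXTURE_2D, color_tex[i], 0);
        // The same renderbuffer is attached to every FBO: this is the sharing.
        gl::FramebufferRenderbuffer(gl::FRAMEBUFFER, gl::DEPTH_ATTACHMENT,
                                    gl::RENDERBUFFER, depth_rb);
    }
    fbos
}
```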
Yup I can do the bisection.
Flags: needinfo?(kats)
  6%  tp5o_webext responsiveness linux64-qr opt e10s stylo         1.76 -> 1.85
  5%  sessionrestore linux64-qr opt e10s stylo                     841.50 -> 881.00
  5%  sessionrestore_no_auto_restore linux64-qr opt e10s stylo     860.17 -> 899.00

These at least seem to be coming from servo/webrender#3208.

Still waiting on the Windows results.
  6%  tart windows10-64-qr opt e10s stylo                          3.85 -> 4.07

Also from servo/webrender#3208. Glenn, any thoughts?
Flags: needinfo?(gwatson)
Yep, that patch is likely to make sites with a lot of borders a bit slower, for now. It can result in more work being done per border - this is unlikely to be a problem on real-world sites, but I can imagine it showing up if those tests have a lot of borders. Is that the case here?

I expect this regression to go away once we take advantage of interning border primitives. I'm hoping to have that done in the next week or so; the change referenced above is a prerequisite for making that happen, which is why I wanted to land it incrementally.

Are we happy to live with that regression for a couple of weeks and revisit once the border interning lands?
Flags: needinfo?(gwatson)
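To illustrate what "opaque border segments" means here: a segment whose brush color is fully opaque can be drawn in the opaque pass with blending disabled (and benefit from z-rejection), while translucent segments still go through the blended alpha pass. This is a simplified sketch, not WebRender's actual batching code; `BorderSegment` and `is_opaque` are made-up names for illustration.

```rust
// Hypothetical, simplified segment type; WebRender's real structures differ.
struct BorderSegment {
    color: [f32; 4], // RGBA
}

fn is_opaque(seg: &BorderSegment) -> bool {
    seg.color[3] >= 1.0
}

fn main() {
    let segments = vec![
        BorderSegment { color: [0.2, 0.2, 0.2, 1.0] }, // solid edge
        BorderSegment { color: [0.2, 0.2, 0.2, 0.5] }, // translucent edge
    ];

    // Split segments between the opaque and alpha batch lists.
    let (opaque, alpha): (Vec<_>, Vec<_>) =
        segments.iter().partition(|s| is_opaque(s));

    println!("opaque pass: {} segment(s), alpha pass: {} segment(s)",
             opaque.len(), alpha.len());
}
```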
(In reply to Glenn Watson [:gw] from comment #12)
> Are we happy to live with that regression for a couple weeks and revisit
> once the border interning lands?

I am, assuming you're willing to take an action item to verify that we get a corresponding decrease in this metric once your followup bits land.
Assignee: nobody → gwatson
Both the initialization time regression and the compositing time regression affect a browser window that is mostly empty, so it's not a "site with a lot of borders" - it's the browser UI that's affected.

And the additional 30ms in browser initialization seems unexpected to me.
Are we certain it is https://github.com/servo/webrender/issues/3208?

I misread the original patch and thought it was the patch that introduced multiple render tasks per border segment. In fact, it's a patch that should strictly be an optimization (adding more border pixels to the opaque pass). Mea culpa.

So, on closer inspection, that patch seems unlikely to cause this regression.
Flags: needinfo?(kats)
The try pushes in comment 9 are based on the same inbound revision that bug 1499494 landed on (i.e. inbound revision 16ee6006e57c). On that base I then applied a WR update for each merged PR in the range and pushed to try. The Talos numbers can be seen in the try pushes; it seems clear that all the numbers increase on the last (topmost) try push, which corresponds to PR 3208. I'm fairly certain I did it correctly, but please feel free to double-check.
Flags: needinfo?(kats)
Yep, your analysis seems correct to me. I'll do some investigation today to see if I can find out what's going on.
Assuming that the patch itself isn't completely broken (I just double-checked myself, but perhaps someone could do a sanity check to make sure I didn't break all opacity calculations!), I wonder if it might be a driver thing where the border shader gets rebuilt / re-linked, as it's probably now used both with blend enabled in the transparent pass and blend disabled in the opaque pass.

Matt, Dan, is there any easy way to check if this regression is related to driver time?
Flags: needinfo?(matt.woodrow)
Flags: needinfo?(dglastonbury)
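For reference, the state pattern :gw describes looks roughly like the sketch below: the same program object is drawn once with blending disabled (opaque pass) and once with blending enabled (alpha pass). This is a hedged illustration using raw `gl` crate bindings rather than WebRender's renderer code; some drivers can re-specialize or re-link a program internally the first time it runs under a new fixed-function state combination, which would show up as driver time rather than in our own code.

```rust
// Sketch: one program used in both passes. If the driver specializes compiled
// programs on blend state, the first draw after toggling GL_BLEND may trigger
// extra work inside the driver. Assumes an already-current GL context and
// bound vertex data.
unsafe fn draw_borders(program: gl::types::GLuint,
                       opaque_vertex_count: i32,
                       alpha_vertex_count: i32) {
    gl::UseProgram(program);

    // Opaque pass: blending off, depth test on (front-to-back, z-rejection).
    gl::Disable(gl::BLEND);
    gl::Enable(gl::DEPTH_TEST);
    gl::DrawArrays(gl::TRIANGLES, 0, opaque_vertex_count);

    // Alpha pass: same program, blending now enabled (premultiplied alpha).
    gl::Enable(gl::BLEND);
    gl::BlendFunc(gl::ONE, gl::ONE_MINUS_SRC_ALPHA);
    gl::DrawArrays(gl::TRIANGLES, 0, alpha_vertex_count);
}
```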
(In reply to Glenn Watson [:gw] from comment #18)
> Assuming that the patch itself isn't completely broken (I just double
> checked myself, but perhaps someone could do a sanity check to make sure I
> didn't break all opacity calculations!), I wonder if it might be a driver
> thing where the border shader gets rebuilt / re-linked as it's probably now
> used both with blend enabled in transparent pass, and blend disabled in the
> opaque pass.
> 
> Matt, Dan, is there any easy way to check if this regression is related to
> driver time?

The callstacks in the Linux profiles are a bit broken, but it looks like the first composite time went up by ~60ms. That's mostly time in _init in i965_dri.so; it's hard to know exactly what that time is being spent on, though.

The TART profiles above just seem to contain perf-html for me. The regressions do affect Windows, though, and TART explicitly discards results from the first few frames, only measuring frame times once the animation is running.

That suggests that there's an actual throughput regression, as well as the startup one.
Flags: needinfo?(matt.woodrow)
I'll run the test on my Windows box to see what I can see.
Flags: needinfo?(dglastonbury)
I did some experiments / sanity checks on talos:

(1) WR git:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=5be6e54ed2c6fadc9fcdcfec07a85a6a1a6c6e66
 -> slow (as expected)

(2) WR git + revert the opaque border patch:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=11dab5a3e0b5745fb6b661f73ae33f026ce2fd65
 -> fast (as expected)

(3) WR git + disable one line in the patch that allows opaque border segments:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1d2a5c28d051f6d96f92b12da5a50a7ac857bd61
 -> slow

---

This suggests that there is nothing wrong with the patch per se (or at least, it's doing what's intended), and that the regression is occurring due to the change in drawing some border segments as opaque and some as transparent.

I would not be too surprised if drivers end up having to do something internally to handle the shader being used both with and without blending, but it's surprising to me that this seems to actually affect compositor time. It can introduce a small number of extra batches, but I'd expect that cost to be small, and outweighed by the GPU time gains from drawing borders as opaque. Certainly, in the synthetic test cases I came up with, this is indeed a GPU time win.

I don't know much about talos - is it possible to run some subset of those tests locally, so I can try to get some GPU profile numbers in each case?
Errr, of course I meant *fast* for (3)
It seems that this AWSY regression is related to this bug. However, this one got fixed after one day.

== Change summary for alert #16917 (as of Wed, 17 Oct 2018 13:53:18 GMT) ==

Regressions:

  3%  Heap Unclassified linux64-qr opt stylo     275,262,815.12 -> 284,412,126.63

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=16917
Yes, you should be able to run all these tests locally. Dan can probably help; ping me if y'all get stuck.
Blocks: 1500846
No longer blocks: 1488324
Priority: -- → P2
(In reply to Glenn Watson [:gw] from comment #21)
> I don't know much about talos - is it possible to run some subset of those
> tests locally, so I can try to get some GPU profile numbers in each case?

Yes, it is possible. For any of these tests:

tp5o_webext
sessionrestore
sessionrestore_no_auto_restore
ts_paint
ts_paint_webext

run the command below:

| ./mach talos-tests --activeTests <test_name> |
You can do the same for tart, by using | ./mach talos-tests --activeTests tart |.
:gw any updates here?
Flags: needinfo?(gwatson)
(Glenn is swamped, might make sense to hand off to Dan. Glenn, feel free to redirect the NI if so.)
Dan, have you got cycles to look at this? Feel free to ping me for further information / context.
Flags: needinfo?(gwatson) → needinfo?(dglastonbury)
I'll take a look.
Flags: needinfo?(dglastonbury)
I did some investigation: with the opaque/alpha split enabled, one extra shader is created at startup. This might explain the regression in ts_paint_webext.
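For what it's worth, one extra shader at startup costing measurable time is plausible: each additional program means an extra compile per stage plus a link, all inside the driver during renderer initialization. A rough, hedged sketch with raw `gl` crate bindings (not WebRender's shader cache code; assumes a current GL context):

```rust
use std::ffi::CStr;
use std::ptr;

// Compile and link one program. LinkProgram in particular is where drivers
// tend to spend most of the time, which is why an extra shader created during
// startup can move a startup-paint benchmark.
unsafe fn compile_program(vs_src: &CStr, fs_src: &CStr) -> gl::types::GLuint {
    let vs = gl::CreateShader(gl::VERTEX_SHADER);
    gl::ShaderSource(vs, 1, &vs_src.as_ptr(), ptr::null());
    gl::CompileShader(vs);

    let fs = gl::CreateShader(gl::FRAGMENT_SHADER);
    gl::ShaderSource(fs, 1, &fs_src.as_ptr(), ptr::null());
    gl::CompileShader(fs);

    let program = gl::CreateProgram();
    gl::AttachShader(program, vs);
    gl::AttachShader(program, fs);
    gl::LinkProgram(program);
    program
}
```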
In discussion on irc, :gw said he would revert this change.
Flags: needinfo?(gwatson)
Flags: needinfo?(gwatson)
Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8c20f4551080
Update webrender to commit 347e66c2aa117724ac6b0f391b346f9c6898ad11 (WR PR 3259). r=kats
https://hg.mozilla.org/mozilla-central/rev/8c20f4551080
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla65
Status: RESOLVED → VERIFIED