5.62 - 5.77% tcanvasmark (osx-10-10) regression on push 9beaa75894056714aefe9737fbafc19afa3e8df7 (Wed Aug 10 2016)

VERIFIED FIXED in Firefox 50

Status

()

VERIFIED FIXED
2 years ago
2 years ago

People

(Reporter: ashiue, Assigned: nical)

Tracking

({perf, regression, talos-regression})

51 Branch
Firefox 51
perf, regression, talos-regression
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox50 fixed, firefox51 fixed)

Details

Attachments

(3 attachments)

(Reporter)

Description

2 years ago
Talos has detected a Firefox performance regression from push 9beaa75894056714aefe9737fbafc19afa3e8df7. As author of one of the patches included in that push, we need your help to address this regression.

Summary of tests that regressed:

  tcanvasmark summary osx-10-10 opt: 6325.46 -> 5969.71 (5.62% worse)
  tcanvasmark summary osx-10-10 opt e10s: 6281.52 -> 5919.33 (5.77% worse)


You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=2368

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests

For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
(Assignee)

Comment 2

2 years ago
Wow, I didn't expect that.
Assignee: nobody → nical.bugzilla
Flags: needinfo?(nical.bugzilla)
(Assignee)

Comment 3

2 years ago
> *** Please let us know your plans within 3 business days, or the offending
> patch(es) will be backed out! ***

I'll be away tomorrow and Monday so it will take a little longer for me to figure this out.
(Assignee)

Comment 4

2 years ago
It looks like the regression is caused by the extra book-keeping we do to serialize and replay clips and transforms every frame, which was not previously done with skiagl. A workaround would be to have a specific code path for the non-shared persistent buffer provider, which would make us not pop all clips at the end of the frame and not re-apply them at the beginning of the next frame (we can't do that with the shared buffer provider unfortunately, because we get a different DrawTarget every frame).
This is probably going to be quite the ugly hack.
I would like to avoid ugly hacks if possible- is this a regression we should accept?
(Assignee)

Comment 6

2 years ago
Created attachment 8781586 [details] [diff] [review]
Skip the push/pop clip dance if we can

While conceptually ugly it's not that bad in terms of added code, so I guess it'll do. Hopefully we will eventually get to a place where enough configurations use PersistentBufferProviderShared and the performance of the basic one don't matter (and we and remove this hack), although it's not clear whether it will be possible for skiagl in all cases.
Attachment #8781586 - Flags: review?(bas)
Attachment #8781586 - Flags: review?(bas) → review+

Comment 7

2 years ago
Pushed by nsilva@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/775fa3ede160
Avoid removing and re-applying the drawing states with PersistentBufferProviderBasic. r=Bas
There is also a mochitest on Linux which failed twice (at the moment):

03:52:04     INFO -  1843 INFO TEST-UNEXPECTED-FAIL | dom/canvas/test/test_canvas.html | pixel 20,20 of c628 is 0,0,0,0; expected 0,255,0,255 +/- 0
03:52:04     INFO -      isPixel@dom/canvas/test/test_canvas.html:140:5
03:52:04     INFO -      test_initial_reset_clip@dom/canvas/test/test_canvas.html:19953:1
03:52:04     INFO -      runTests@dom/canvas/test/test_canvas.html:25437:3
03:52:04     INFO -      SimpleTest._newCallStack/rval@SimpleTest/SimpleTest.js:144:17
03:52:04     INFO -      EventHandlerNonNull*this.addLoadEvent@SimpleTest/SimpleTest.js:169:13
03:52:04     INFO -      @SimpleTest/SimpleTest.js:1369:5

https://treeherder.mozilla.org/logviewer.html#?job_id=34093932&repo=mozilla-inbound

Please check the push for more failures when it's complete.
Lots of things broke from this patch. I would recommend a full try run before relanding this.
(Assignee)

Comment 11

2 years ago
try run with fixed patch: https://treeherder.mozilla.org/#/jobs?repo=try&revision=2b324f0a4ea2e54bb0bd53fd7acb66541387a75b

The result is full of intermittents but none of them are actually canvas2d related, except test_streams_capture_origin.html which I am confident is not related to my patch.
Flags: needinfo?(nical.bugzilla)

Comment 12

2 years ago
Pushed by nsilva@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d35d1cbc59e8
Avoid removing and re-applying the drawing states with PersistentBufferProviderBasic. r=Bas

Comment 13

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/d35d1cbc59e8
Status: NEW → RESOLVED
Last Resolved: 2 years ago
status-firefox51: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 51
I looked at the graphs linked in comment 0 and I don't see any change in the values.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Comment 15

2 years ago
I have bisected the problem and I have a patch that restores the performance back ton where it was before the regression. I am looking into why it does, but in any case, a patch is coming soon.
Attachment #8784254 - Flags: review?(bas) → review+

Comment 17

2 years ago
Pushed by nsilva@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ac16b2ccd7a1
Move restoring the canvas clip stack to its own method and early return form EnsureTarget. r=Bas

Comment 18

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/ac16b2ccd7a1
Status: REOPENED → RESOLVED
Last Resolved: 2 years ago2 years ago
Resolution: --- → FIXED
I have verified this via the graphs!
Status: RESOLVED → VERIFIED
(Assignee)

Comment 20

2 years ago
Comment on attachment 8781586 [details] [diff] [review]
Skip the push/pop clip dance if we can

Approval Request Comment
[Feature/regressing bug #]:
[User impact if declined]: There are patches that we need to uplift, and are risky to uplift without uplifting this one as well.
[Describe test coverage new/current, TreeHerder]: None.
[Risks and why]: low, this has been in central fo a little while without causing issues.
[String/UUID change made/needed]: None.
Attachment #8781586 - Flags: approval-mozilla-aurora?
(Assignee)

Comment 21

2 years ago
Comment on attachment 8784254 [details] [diff] [review]
Return early from EnsureTarget in the common case.

Review of attachment 8784254 [details] [diff] [review]:
-----------------------------------------------------------------

Uplift request for the same reason as the other patch (this is in the middle of a series of patch that need to be uplifted, and skipping this makes it risky to uplift the others). This patch also fixes a 5% talos regression.
Attachment #8784254 - Flags: approval-mozilla-aurora?
(Assignee)

Comment 22

2 years ago
Comment on attachment 8784254 [details] [diff] [review]
Return early from EnsureTarget in the common case.

I'm trying to find the smallest set of patches that I need uplift if copy-on-write canvas is disabled on aurora. This one should not be needed.
Attachment #8784254 - Flags: approval-mozilla-aurora?
(Assignee)

Comment 23

2 years ago
(In reply to Nicolas Silva [:nical] from comment #20)
> Comment on attachment 8781586 [details] [diff] [review]
> Skip the push/pop clip dance if we can
> [User impact if declined]: There are patches that we need to uplift, and are
> risky to uplift without uplifting this one as well.

Now that I've changed my aurora plans, this patch is more important than I originally explained because it makes the canvas code avoid some risky code paths and lets us skip uplifting the numerous patches that fix these code paths.

Updated

2 years ago
status-firefox50: --- → affected
Comment on attachment 8781586 [details] [diff] [review]
Skip the push/pop clip dance if we can

Fixes a perf regression, fix was verified on Nightly, Aurora50+
Attachment #8781586 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
This hit conflicts on uplift, can we get a rebase patch?
Flags: needinfo?(nical.bugzilla)
(Assignee)

Comment 26

2 years ago
Created attachment 8788810 [details] [diff] [review]
Aurora patch

This should be applied after the patch in bug 1292870.
Flags: needinfo?(nical.bugzilla)

Comment 27

2 years ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-aurora/rev/3cad1bde8677
status-firefox50: affected → fixed

Updated

2 years ago
Depends on: 1339762
You need to log in before you can comment on or make changes to this bug.