Closed Bug 1557704 Opened 5 years ago Closed 5 years ago

2.55 - 5.05% tsvgr_opacity (linux64-shippable-qr, windows10-64-shippable-qr) regression on push d02c7d06cc798c961d8ee82d490ada2603881fd8 (Fri May 17 2019)

Categories

(Core :: CSS Parsing and Computation, defect, P3)

Desktop
All
defect

Tracking

()

RESOLVED WONTFIX
Tracking Status
firefox-esr60 --- unaffected
firefox-esr68 --- unaffected
firefox67 --- unaffected
firefox68 --- unaffected
firefox69 --- wontfix
firefox70 --- wontfix
firefox71 --- wontfix

People

(Reporter: igoldan, Unassigned)

References

(Regression)

Details

(4 keywords)

Talos has detected a Firefox performance regression from push:

https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=c110061df4c16715a491ead8cab815a259ae7846&tochange=d02c7d06cc798c961d8ee82d490ada2603881fd8

As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

5% tsvgr_opacity windows10-64-shippable-qr opt e10s stylo 91.22 -> 95.82

You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=21245

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Performance_sheriffing/Talos/Tests

For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Performance_sheriffing/Talos/Running

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Performance_sheriffing/Talos/RegressionBugsHandling

Component: General → CSS Parsing and Computation
Product: Testing → Core
Flags: needinfo?(emilio)

It's likely this affects Linux 64bit QR also, but I need more data to confirm.

Are there any profiles I could look at?

Flags: needinfo?(emilio) → needinfo?(igoldan)

(In reply to Emilio Cobos Álvarez (:emilio) from comment #2)

Are there any profiles I could look at?

At the moment, we cannot provide them on Windows. There's a bug for this on file.

Flags: needinfo?(igoldan)

Also, it seems to me those tests don't have a single transform, are you sure it's blamed to the right bug?

(In reply to Emilio Cobos Álvarez (:emilio) from comment #4)

Also, it seems to me those tests don't have a single transform, are you sure it's blamed to the right bug?

Not entirely, as the graph is very noisy. But that could be confirmed via a before/after Try push.

Could you do that? If not I'm happy to try giving that a shot.

Flags: needinfo?(igoldan)

(In reply to Emilio Cobos Álvarez (:emilio) from comment #6)

Could you do that? If not I'm happy to try giving that a shot.

Yes, will report back. In the meantime, seems like Linus got affected as well, by this regression.

3% tsvgr_opacity linux64-shippable-qr opt e10s stylo 94.83 -> 97.25

Flags: needinfo?(igoldan)
Hardware: Unspecified → Desktop
OS: Unspecified → All

I also noticed bug 1552329 causing some extra regressions. I included it also in the Try bisection.

Regressed by: 1552329

(In reply to Ionuț Goldan [:igoldan], Performance Sheriff from comment #8)

I also noticed bug 1552329 causing some extra regressions. I included it also in the Try bisection.

Wait is that right? That's a build-system-only change, makes no sense to regress anything (other than build-times I guess)

Baseline vs patch which includes bug 1549593, bug 1550554 and bug 1551991

Baseline vs patch which include all 4 bugs (bug 1549593, bug 1550554, bug 1551991 and bug 1552329)

Some extra retriggers will need to be issued.

Priority: -- → P3

Any updates here, Ionut?

:igoldan is pto and will be back on 22-july. :bebe can you help here?

Flags: needinfo?(fstrugariu)

did some retriggers to generate more data for the bisection

Flags: needinfo?(fstrugariu)

Any result on those retriggers?

Assignee: nobody → fstrugariu
Flags: needinfo?(fstrugariu)

From the bisection results we can see that:
Baseline vs patch which includes bug 1549593, bug 1550554 and bug 1551991
6.64% tsvgr_opacity opt e10s stylo on windows10-64-shippable-qr 86.00 -> 91.71

Baseline vs patch which include all 4 bugs (bug 1549593, bug 1550554, bug 1551991 and bug 1552329)
8.70% tsvgr_opacity opt e10s stylo on windows10-64-shippable-qr 86.00 -> 93.48

So Ionut was right bug 1552329 is introducing further regressions

Flags: needinfo?(fstrugariu)

Emilio, Sean, is this regression something you're concerned about? Is there any way to mitigate it?

Flags: needinfo?(svoisen)
Flags: needinfo?(emilio)

So this is quite strange. The only thing bug 1552329 does is fixing some symbol mangling that is relevant for LTO, but I wouldn't expect it to affect performance, it should be a build-time change only. Michael, do you have any clue of how the bindgen link-name fix could have affected WebRender performance? At first I thought some PGO information could've started or stopped working, but it doesn't seem to be possible, and WebRender doesn't use bindgen itself.

Regarding the previous changes, I fail to see how they can make webrender-alone slower. All-in-all I'd expect them to make stuff faster, but it seems to me none of the changes in those bugs affect the two trsvg_opacity test-case. The only properties those use is stroke, opacity, width and height. The patches affect:

I thought of some sort of cache-locality effect, but that cannot be the case since in most cases the size of the struct ended up being the same, and in the cases where it ended up being slightly larger, it should be faster to look at lists of empty properties, since we store the size inline rather than a pointer away.

In any case, I don't think this is terribly concerning. It's a small-ish regression, qr-only, and looking at the graph it doesn't seem that in the big picture there has been a regression over-all.

I'll give a shot at profiling a release build on this test-case a bit and see if I spot something particularly concerning, but I don't think I can figure out the bindgen change without a deeper understanding of our windows rust build setup.

Flags: needinfo?(emilio) → needinfo?(mwoerister)

Hm, nothing comes to mind immediately. If anything the code generated by bindgen after the fix should be less of a corner case for the compilers involved than before.

Flags: needinfo?(mwoerister)

Doesn't look like we're going to make progress on this in time for 69. TBH, I'm wondering if we should just resolve this as wontfix at this point.

(In reply to Ryan VanderMeulen [:RyanVM] from comment #11)

Any updates here, Ionut?

Removing ni? from myself, as Florin provided the response.

Flags: needinfo?(igoldan)

:RyanVM any progress on this?

Flags: needinfo?(ryanvm)

Not that I know of - was there a reason you're asking me?

Flags: needinfo?(fstrugariu)

Ryan we are trying to get some traction on old performance regressions that have no activity. As you stated in comment #19 I'm wondering if we should just resolve this as wontfix at this point
I was thinking you have more information on this. If not and we are not planing to fix this maybe we should just makit as wontfix.

Assignee: fstrugariu → nobody
Flags: needinfo?(fstrugariu)

That's a decision for the Layout team, not me. I was just making a suggestion :)

Flags: needinfo?(ryanvm)

Given Emilio's comment 17 and also comment 18, I don't think we're going to get anywhere with this, and the regression is not too significant. Let's mark wontfix.

Status: NEW → RESOLVED
Closed: 5 years ago
Flags: needinfo?(svoisen)
Resolution: --- → WONTFIX
Has Regression Range: --- → yes
You need to log in before you can comment on or make changes to this bug.