Closed Bug 1503794 Opened 6 years ago Closed 5 years ago

3.1 - 9.83% displaylist_mutate (linux64-qr, windows10-64-qr) regression on push 68aa8f295bd5bd25ad05b003972cedb977e5a570 (Wed Oct 31 2018)

Categories

(Core :: Graphics: WebRender, defect, P3)

65 Branch
Unspecified
All
defect

Tracking

VERIFIED FIXED
mozilla65
Tracking Status
firefox-esr60 --- unaffected
firefox63 --- unaffected
firefox64 --- unaffected
firefox65 --- fixed

People

(Reporter: igoldan, Assigned: gw)

Details

(Keywords: perf, regression, talos-regression)

Talos has detected a Firefox performance regression from push:

https://hg.mozilla.org/integration/autoland/pushloghtml?changeset=68aa8f295bd5bd25ad05b003972cedb977e5a570

As you are the author of one of the patches included in that push, we need your help to address this regression.

Regressions:

 10%  displaylist_mutate windows10-64-qr opt e10s stylo     5,069.52 -> 5,567.95
  3%  displaylist_mutate linux64-qr opt e10s stylo          5,064.31 -> 5,221.44
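
For reference, the percentages in the summary (3.1 - 9.83%) can be recomputed from the before/after scores listed above. This is only a minimal sketch of the arithmetic; the helper name `percent_change` is an assumption for illustration and is not part of Talos or Perfherder.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Before/after displaylist_mutate scores copied from the alert table.
results = {
    "windows10-64-qr": (5069.52, 5567.95),
    "linux64-qr": (5064.31, 5221.44),
}

for platform, (before, after) in results.items():
    print(f"{platform}: {percent_change(before, after):.2f}%")
```

Rounded to two decimals this gives about 9.83% for windows10-64-qr and about 3.10% for linux64-qr, which matches the range in the bug title.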


You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=17277

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Performance_sheriffing/Talos/Tests

For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Performance_sheriffing/Talos/Running

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Performance_sheriffing/Talos/RegressionBugsHandling
Flags: needinfo?(kats)
:kats, can you also mention which of the following bugs is most closely related to these regressions?

bug 1503528
bug 1502585
bug 1503442

Thanks!
Component: General → Graphics: WebRender
Product: Testing → Core
I'll post the Gecko profiles soon.
Based on the try pushes in bug 1503442, it is a regression from servo/webrender#3244 (i.e. bug 1503442).
Blocks: 1503442
Flags: needinfo?(kats) → needinfo?(gwatson)
I tried to look at the profiles, but they seem to only contain profile information for the content process. I expect this regression is probably in the GPU process inside WR code. Is it possible to get a profile with threads from the GPU process? Or is there a way to show that in the existing profiles? (I don't know the profiler very well).

I'm not particularly surprised by this regression. That patch adds a significant amount of extra calculation work by pre-calculating the bounding rects of primitives and building them into clusters during scene building. However, the follow-up work that actually takes advantage of this is not yet implemented (the follow-up work will use these clusters to remove the need for most per-primitive bounding rect and visibility calculations during frame building).
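
To illustrate the general idea described above (not WebRender's actual code, which is written in Rust), here is a rough sketch of pre-computing cluster bounds once and later using them to skip whole groups of primitives. All names here (`Rect`, `Cluster`, `build_clusters`, `visible_prims`) and the fixed-size grouping rule are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rect:
    x0: float
    y0: float
    x1: float
    y1: float

    def union(self, other: "Rect") -> "Rect":
        return Rect(min(self.x0, other.x0), min(self.y0, other.y0),
                    max(self.x1, other.x1), max(self.y1, other.y1))

    def intersects(self, other: "Rect") -> bool:
        return (self.x0 < other.x1 and other.x0 < self.x1 and
                self.y0 < other.y1 and other.y0 < self.y1)


@dataclass
class Cluster:
    bounds: Rect       # union of member rects, computed once up front
    prims: List[Rect]  # primitive bounding rects in this cluster


def build_clusters(prims: List[Rect], max_size: int = 2) -> List[Cluster]:
    """'Scene building' step: extra work paid once per scene."""
    clusters = []
    for i in range(0, len(prims), max_size):
        group = prims[i:i + max_size]
        bounds = group[0]
        for rect in group[1:]:
            bounds = bounds.union(rect)
        clusters.append(Cluster(bounds, group))
    return clusters


def visible_prims(clusters: List[Cluster], viewport: Rect) -> List[Rect]:
    """'Frame building' step: one test can reject a whole cluster."""
    visible = []
    for cluster in clusters:
        if not cluster.bounds.intersects(viewport):
            continue  # skips every per-primitive test in this cluster
        visible.extend(p for p in cluster.prims if p.intersects(viewport))
    return visible
```

In this sketch, `build_clusters` is pure overhead until something like `visible_prims` consumes the clusters, which mirrors why landing only the first half would be expected to show up as a regression.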

Because this is a fairly large set of code changes, I'm landing it in incremental parts.

So, ideally we'd take this hit for now, and follow up on it when the rest of the picture caching work lands (I could take this bug and comment on it when that arrives, if that suits).

Does that seem reasonable?
Flags: needinfo?(gwatson)
(In reply to Glenn Watson [:gw] from comment #5)
> So, ideally we'd take this hit for now, and follow up on it when the rest of
> the picture caching work lands (I could take this bug and comment on it when
> that arrives, if that suits).
> 
> Does that seem reasonable?

Yes, sounds good and I agree. Will be waiting for the next related bugs.
Priority: -- → P3
(In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #6)
> (In reply to Glenn Watson [:gw] from comment #5)
> > So, ideally we'd take this hit for now, and follow up on it when the rest of
> > the picture caching work lands (I could take this bug and comment on it when
> > that arrives, if that suits).
> > 
> > Does that seem reasonable?
> 
> Yes, sounds good and I agree. Will be waiting for the next related bugs.

:gw have you filed any bugs for this matter?
Flags: needinfo?(gwatson)
Not specifically for this regression - https://bugzilla.mozilla.org/show_bug.cgi?id=1494775 is the metabug for the picture caching work which we expect to resolve the regressions here, when complete. If / when I am able to land incremental patches I'll reference them here too.
Flags: needinfo?(gwatson)
See Also: → picture-caching
Blocks: stage-wr-next
No longer blocks: stage-wr-trains
I think this can be marked fixed and closed now?
Flags: needinfo?(igoldan)
(In reply to Glenn Watson [:gw] from comment #9)
> I think this can be marked fixed and closed now?

It looks that way. But please confirm whether bug 1511042 is the one that won back the performance.
Flags: needinfo?(igoldan) → needinfo?(gwatson)
It's not bug 1511042 that won back the performance from this regression, but it's also not easy to point at a single improvement and say "it was this change that won back the perf from this particular regression". If I had to pick one based on the graph I'd say bug 1509302, since that was the one that brought the displaylist_mutate perf back up to better than the original levels.
Status: NEW → RESOLVED
Closed: 5 years ago
Depends on: 1509302
Flags: needinfo?(gwatson)
Resolution: --- → FIXED
Assignee: nobody → gwatson
Target Milestone: --- → mozilla65
Status: RESOLVED → VERIFIED