Closed Bug 1463397 Opened 2 years ago Closed 2 years ago

7.67% Resident Memory (osx-10-10) regression on push 9c10d9a7198331b7a650cef79ed37f36d23a2c88 (Tue May 22 2018)

Categories

(Core :: Graphics: Layers, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla62
Tracking Status
firefox-esr52 --- unaffected
firefox-esr60 --- unaffected
firefox60 --- unaffected
firefox61 --- unaffected
firefox62 --- fixed

People

(Reporter: igoldan, Assigned: dthayer)

References

(Blocks 1 open bug)

Details

(Keywords: perf, regression, Whiteboard: gfx-noted)

We have detected an awsy regression from push:

https://hg.mozilla.org/integration/autoland/pushloghtml?changeset=9c10d9a7198331b7a650cef79ed37f36d23a2c88

As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

  8%  Resident Memory osx-10-10 opt stylo     710,401,289.13 -> 764,896,204.03

Improvements:

  8%  Images linux64-qr opt stylo     6,896,641.37 -> 6,314,073.83


You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=13354

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/AWSY/Tests
Component: General → Graphics: Layers
Product: Testing → Core
:dthayer This is the final question: are we going to accept this memory regression?
Flags: needinfo?(dothayer)
This is currently limited to Nightly via a pref. The plan is to measure the impact of bug 1176019 on tab switch spinner telemetry numbers in Nightly[1] before deciding how we want to go forward with this. If the results are significant, we have a few options we want to talk about, such as capping the memory footprint of our cache at a specific size, and/or flattening the layer tree for cached tabs to reduce the per-tab size.

(Leaving my needinfo until we have more information from telemetry.)

[1] https://mikeconley.github.io/bug1310250/
We noticed yet another set of regressions. From Talos, this time:

== Change summary for alert #13370 (as of Tue, 22 May 2018 16:03:27 GMT) ==

Regressions:

 12%  tps windows7-32 pgo e10s stylo     12.51 -> 14.06
 11%  tps windows10-64 pgo e10s stylo    12.97 -> 14.40
 11%  tps windows10-64-qr opt e10s stylo 11.32 -> 12.56

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=13370
Hmm. That's certainly interesting. Ultimately tab switch time should be improved regardless, and what tps is now measuring is the time before the second, forced paint comes out, so if I understood why this was happening I think it would be acceptable, but I can't say I do understand. Any hunches, mconley?
Flags: needinfo?(mconley)
(In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #3)
Very interesting. Do you have easy access to before and after profiles for this?
Flags: needinfo?(mconley) → needinfo?(igoldan)
(In reply to Mike Conley (:mconley) (:⚙️) (Catching up on needinfos / reviews) from comment #5)
> (In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #3)
> Very interesting. Do you have easy access to before and after profiles for
> this?

Yes, I can generate them.
Flags: needinfo?(igoldan)
Whiteboard: gfx-noted
Clearing the needinfo - this will go away while we disable the tab layer cache in bug 1465106. I don't know if you want to leave this open for tracking purposes or not given that.
Flags: needinfo?(dothayer)
Ah, for the tps regression, I have a theory.

tps was modified in bug 1451460 to make it so that we use the MozLayerTreeReady event in the parent process to "stop" the Talos stopwatch when measuring tab switch time.

I suspect that when we have cached layers, we don't actually get that event - that event is what's used to transition something from STATE_LOADING to STATE_LOADED.

I suspect we'll either have to:

1) Disable caching for tps, and make it explicit that tps measures the "worst case scenario", where a tab is not cached or warm
2) Modify tps to account for cached layers.
(In reply to Doug Thayer [:dthayer] (PTO on June 4) from comment #8)
> Clearing the needinfo - this will go away while we disable the tab layer
> cache in bug 1465106. I don't know if you want to leave this open for
> tracking purposes or not given that.

I'll close this bug.
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Assignee: nobody → dothayer
Target Milestone: --- → mozilla62
You need to log in before you can comment on or make changes to this bug.