Closed Bug 1381804 Opened 2 years ago Closed 2 years ago

3.49 - 16.91% Explicit Memory / Heap Unclassified / Images / JS / Resident Memory (linux64, osx-10-10, windows10-64, windows7-32) regression on push 87e8ccdd8aa6f6b549922b4debc19a4126b7d940 (Tue Jul 18 2017)

Categories

(Firefox :: New Tab Page, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
Firefox 56
Tracking Status
firefox-esr52 --- unaffected
firefox54 --- unaffected
firefox55 --- unaffected
firefox56 --- fixed

People

(Reporter: igoldan, Assigned: Mardak)

References

(Depends on 1 open bug)

Details

(Keywords: perf, regression, Whiteboard: [MemShrink:P2])

Attachments

(1 file)

We have detected an awsy regression from push:

https://hg.mozilla.org/integration/autoland/pushloghtml?changeset=87e8ccdd8aa6f6b549922b4debc19a4126b7d940

As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

 17%  Heap Unclassified summary osx-10-10 opt      77,016,564.00 -> 90,037,712.94
 15%  Heap Unclassified summary linux64 opt        55,789,392.80 -> 64,082,094.29
 13%  Explicit Memory summary osx-10-10 opt        337,052,939.74 -> 381,367,079.62
 13%  Explicit Memory summary linux64 opt          309,269,000.94 -> 348,551,130.50
 13%  Heap Unclassified summary windows7-32 opt    43,159,108.62 -> 48,630,002.11
 12%  Explicit Memory summary windows7-32 opt      249,614,815.22 -> 278,800,233.07
 12%  Resident Memory summary windows7-32 opt      324,037,806.09 -> 361,579,509.99
 11%  Images summary windows7-32 opt               5,179,007.74 -> 5,752,930.05
 11%  Heap Unclassified summary windows10-64 opt   49,339,847.70 -> 54,672,534.45
 10%  Explicit Memory summary windows10-64 opt     319,720,719.22 -> 353,055,828.50
 10%  JS summary osx-10-10 opt                     119,265,412.05 -> 131,332,908.21
 10%  JS summary linux64 opt                       115,662,161.37 -> 127,096,421.66
  9%  Resident Memory summary linux64 opt          464,586,620.65 -> 507,798,764.70
  9%  JS summary windows7-32 opt                   96,840,000.77 -> 105,586,174.61
  9%  JS summary windows10-64 opt                  130,129,768.85 -> 141,843,690.92
  8%  Resident Memory summary windows10-64 opt     482,051,675.76 -> 518,390,295.84
  3%  Resident Memory summary osx-10-10 opt        676,904,722.87 -> 700,508,634.93


You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=8023

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://developer.mozilla.org/en-US/docs/Mozilla/Performance/AWSY
:Mardak Please review the changes, due to the memory regressions that got in. Can you estimate a fix time for them?
Flags: needinfo?(edilee)
I've filed https://github.com/mozilla/activity-stream/issues/2894 with some info on the memory info and tooling we've already got.
gabor, do you know how to estimate much additional memory usage should be expected with activity-stream with enabled and dom.ipc.processPrelaunch.enabled true/false?
Flags: needinfo?(gkrizsanits)
(In reply to Ed Lee :Mardak from comment #3)
> gabor, do you know how to estimate much additional memory usage should be
> expected with activity-stream with enabled and
> dom.ipc.processPrelaunch.enabled true/false?

I really don't know, that depends mainly on the activity-streams memory footprint I guess, since we don't have any extra content processes (-1 from disabling the process prelaunch, +1 for running the preloaded browser in its own process).
Flags: needinfo?(gkrizsanits)
I manually pushed the patch to try as mozreview try doesn't seem to be working right now ?

https://treeherder.mozilla.org/perf.html#/compare?originalProject=autoland&originalRevision=87e8ccdd8aa6f6b549922b4debc19a4126b7d940&newProject=try&newRevision=b3636c388a2cd3d6029ee8d99637c71c98c9a69a&framework=4&showOnlyImportant=0

Only one run of Linux64 in, but looks like a slight improvement?

(In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #1)
> Can you estimate a fix time for them?
It's our top priority right now with some initial leads on fixing it now that we know it's better to rely on `try` instead of our `pine` repository. The primary lead is switching ipc prelaunch to false with bug 1376895 making that logic smarter. But if going false only reduces it by a small amount, there will need to be additional slimming work.
Flags: needinfo?(edilee)
Comment on attachment 8887524 [details]
Bug 1381804 - Disable process prelaunch while activity-stream is enabled.

https://reviewboard.mozilla.org/r/158390/#review163678

LGTM. Let's see how much this eats out of the AWSY regression...
Attachment #8887524 - Flags: review?(mconley) → review+
Expanding the perfherder view to include the explicit subtests it looks like the primary regression is for the "tabs closed" portion of the test. The about:memory reports show that before landing there was 1 web content process, after landing there are now 2 web content processes.
erahm, thanks for taking a closer look. The intended behavior of bug 1376895 is to reuse existing web content processes, so I believe that would result in activity-stream reusing the 1 available web content process.

mconley, how do we make sure bug 1376894 is fixed "in time"?

Initial results from disabling prelaunch appears to improve Linxu64 and Windows64 "Heap Unclassified Fresh start opt" by 6-13% while "Heap Unclassified Tabs closed" only improves by 1-2%. Should we go ahead and land that for now, keeping this bug open blocked on bug 1376894?
Depends on: 1376894
Flags: needinfo?(mconley)
Whiteboard: [MemShrink]
(In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #0)
> 15% Heap Unclassified summary linux64 opt 55,789,392.80 -> 64,082,094.29
I did a few try pushes with various general changes to measure the behavior of awsy results for linux64 Heap Unclassified summary:

 ~0.1% regression on exporting latest activity-stream that had a number of new content-space changes [1]
 ~2.0% improvement with the prelaunch=false patch from this bug (matches comment 10) [2]
 ~3.0% improvement with an empty activity-stream.bundle.js (disables content logic) and prelaunch=false [3]
 ~3.9% improvement with an empty bootstrap.js (disables add-on) and prelaunch=false [4]
~15.5% improvement with about:newtab in main process (and prelaunch=false) [5]

So basically it's the same results that erahm said in comment 8 of having the additional web content process.

Depending on the urgency of fixing this awsy regression, the "simple" fix is to back out bug 1365643 that forced activity-stream about:newtab to be in the child process. Although it's not so simple as we did try to back it out before, and there were new test failures…

And it sounds like the actual fix is bug 1376895 of not creating new content processes just for about:newtab.

[1] https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&originalRevision=dece50457378&newProject=try&newRevision=c822acc36fe621856a1e1c7ae8479871d46666e8&framework=4&showOnlyImportant=0
[2] https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&originalRevision=dece50457378&newProject=try&newRevision=4d90f5bbf8236dd783559616e51f63c8cb9bbd10&framework=4&showOnlyImportant=0
[3] https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&originalRevision=dece50457378&newProject=try&newRevision=dfb40c63d672698017c9dd1bda6118b3c9f2395c&framework=4&showOnlyImportant=0
[4] https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&originalRevision=dece50457378&newProject=try&newRevision=f5b74b72c35d6c8ebe83d12e02589c8b37194f9c&framework=4&showOnlyImportant=0
[5] https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&originalRevision=dece50457378&newProject=try&newRevision=ef694bd5da065c9d3b2fc129450dbac5796d54fa&framework=4&showOnlyImportant=0
Drilling into the subtests of Heap Unclassified:

(In reply to Ed Lee :Mardak from comment #11)
> ~2.0% improvement with the prelaunch=false patch from this bug (matches comment 10)
The improvement here is primarily 8.8% improvement of "Fresh start" and 12.5% improvement of "Fresh start [+30s]"

> ~3.0% improvement with an empty activity-stream.bundle.js (disables content logic) and prelaunch=false
The additional improvement here is from ~1% improvement of the various "Tab closed" subtests.

> ~3.9% improvement with an empty bootstrap.js (disables add-on) and prelaunch=false
Even more additional "Tab closed" improvement ranging from ~2-5%.

> ~15.5% improvement with about:newtab in main process (and prelaunch=false)
And the big "Tab closed" improvement of 27-31%.
Pushed by edilee@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/4f418e6b759c
Disable process prelaunch while activity-stream is enabled. r=mconley
Depends on: 1382079
(In reply to Ed Lee :Mardak from comment #12)
> > ~15.5% improvement with about:newtab in main process (and prelaunch=false)
> And the big "Tab closed" improvement of 27-31%.

That sounds like you're turning off a feature (prelaunch=false) to counter the memory cost of activity stream. If we move about:newtab into the child process, we get an extra content process for the preloaded browser, so I'm fine with temporarily disabling the preallocated process (so we don't have two cached processes). But if the about:newtab is in the parent, and we turn off prelaunch then it just means we won't have any cached content processes, which means you sacrifice another feature for activiy stream. Is that correct? Because I don't think we agreed on this version.
Flags: needinfo?(edilee)
Looks like bug 1381569 landed some improvements, which just showed up:

== Change summary for alert #8033 (as of July 17 2017 23:19 UTC) ==

Improvements:

 13%  tart summary windows10-64 opt e10s     6.25 -> 5.43
 13%  tart summary windows7-32 opt e10s      7.71 -> 6.74
 11%  tart summary windows7-32 pgo e10s      5.58 -> 4.96
  8%  tart summary linux64 opt e10s          5.94 -> 5.45
  7%  tart summary linux64 pgo e10s          5.05 -> 4.68
  3%  sessionrestore_many_windows linux64 pgo e10s1,854.33 -> 1,801.33
  2%  sessionrestore_many_windows linux64 opt e10s2,167.29 -> 2,116.92
  2%  sessionrestore_many_windows linux64 opt e10s2,165.62 -> 2,117.83

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=8033
Push from comment 13 landed a couple of small improvements:

== Change summary for alert #8064 (as of July 19 2017 01:38 UTC) ==

Improvements:

  2%  Heap Unclassified summary windows10-64 opt      54,676,310.57 -> 53,340,925.84
  2%  JS summary windows7-32 opt                      105,667,548.61 -> 103,210,541.82
  2%  Explicit Memory summary windows7-32 opt         277,747,218.50 -> 271,434,429.71
  2%  JS summary windows10-64 opt                     141,382,101.68 -> 138,319,200.90

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=8064
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #14)
> (In reply to Ed Lee :Mardak from comment #12)
> > > ~15.5% improvement with about:newtab in main process (and prelaunch=false)
> > And the big "Tab closed" improvement of 27-31%.
> That sounds like you're turning off a feature (prelaunch=false) to counter
> the memory cost of activity stream.
The prelaunch=false that just landed is intended to be temporary. That's why I put the (prelaunch=false) in parenthesis because it wasn't intentional that I ran those numbers with main thread + prelaunch=false -- it just happened to be at the tail end of a series of increasing removal of activity-stream functionality as I was pushing to try. I've started a new try build with main thread and prelaunch=true to confirm the awsy improvement.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=ccb07a59b8f993632bf4974fbcd147cea5d23268
Flags: needinfo?(edilee)
Following up on comment 18 with just moving activity stream to main process and restoring prelaunch=true:
Linux64 Heap Unclassified improves ~12.9% [1] with the specific subtests of "Tabs closed" improving 27-32% [2].

But it sounds like reverting to main process is undesired and should only be used as a last resort.

I attempted to make awsy happier by aggressively removing preloaded browsers when tabs are closed [3], and that appears to have a 13.5% Linux 64 Heap Unclassified improvement [4] with the "Tabs closed" improving 28-32% [5].

However, this is also undesired as it effectively negates the benefit of preloading (and naively breaks a bunch of tests).

[1] https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&originalRevision=dece50457378&newProject=try&newRevision=ccb07a59b8f993632bf4974fbcd147cea5d23268&framework=4&showOnlyImportant=0
[2] https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=mozilla-central&originalRevision=dece50457378&newProject=try&newRevision=ccb07a59b8f993632bf4974fbcd147cea5d23268&originalSignature=cf8c73114da80e660331e208dcca846248e060f9&newSignature=cf8c73114da80e660331e208dcca846248e060f9&framework=4
[3] https://hg.mozilla.org/try/rev/a39b99dd0797d4ea08e7d2ebceff7f47c868bf4d
[4] https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&originalRevision=dece50457378&newProject=try&newRevision=27afe04f26c7f349cb1c678327ba75b5a8ce49f5&framework=4&showOnlyImportant=0
[5] https://treeherder.mozilla.org/perf.html#/comparesubtest?framework=4&newProject=try&newRevision=27afe04f26c7f349cb1c678327ba75b5a8ce49f5&newSignature=cf8c73114da80e660331e208dcca846248e060f9&originalProject=mozilla-central&originalRevision=dece50457378&originalSignature=cf8c73114da80e660331e208dcca846248e060f9
So I just had to turn off the preallocated process manager on beta because of some crashes that are probably related. So I'm not sure it will be on for 56 at all. (I will have to fix the crash and somehow make it work together with the preloaded browser).
Flags: needinfo?(edilee)
Depends on: 1382746
I see some talos tp5 memory improvements from this bug:
== Change summary for alert #8068 (as of July 19 2017 01:40 UTC) ==

Improvements:

 20%  tp5o Private Bytes linux64 opt e10s     1,211,844,140.19 -> 971,596,721.00
 20%  tp5o Private Bytes linux64 pgo e10s     1,212,104,562.66 -> 971,949,670.05
 19%  tp5o_webext Private Bytes linux64 opt e10s1,216,135,665.10 -> 979,209,780.22
 19%  tp5o_webext Private Bytes linux64 pgo e10s1,217,831,731.05 -> 981,703,823.05

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=8068

this is specifically for this push:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=f190bd0e38174e810d6347d1e8a88f4be4b60d55&tochange=4f418e6b759c5bdbc85f21559c528dfe9d2791c9
No longer depends on: 1382079
Component: Activity Streams: General → Activity Streams: Newtab
I believe we've sorted out a plan for the remaining regression. As I understand, we're going to modify the AWSY test to clear out the preloaded browser before running the tabs closed checkpoints. We're also adding a checkpoint before the preloaded browser goes away.
Flags: needinfo?(mconley)
Just to document the related bugs:

Bug 1382746 added the helper function to remove the preloaded browser
Bug 1382889 will update AWSY to use that helper function

Bug 1363601 turned off the preallocated process manager across all channels
Flags: needinfo?(edilee)
it looks like we are just waiting on bug 1382889, it will be good to see this resolved prior to merge day next Monday.
Whiteboard: [MemShrink] → [MemShrink:P2]
Depends on: 1382889
Resolving fixed with both bug 1382746 and bug 1382889 landing.
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Assignee: nobody → edilee
Keywords: leave-open
Target Milestone: --- → Firefox 56
Component: Activity Streams: Newtab → New Tab Page
You need to log in before you can comment on or make changes to this bug.