Closed Bug 1361808 Opened 7 years ago Closed 7 years ago

25.14 - 25.23% tp5o Private Bytes (linux64) regression on push b62eed78de3d901b449021c43401fcd6ac4bd060 (Wed May 3 2017)

Categories

(Core :: DOM: Content Processes, defect)

53 Branch
defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: jmaher, Unassigned)

References

Details

(Keywords: perf, regression, talos-regression)

Talos has detected a Firefox performance regression from push b62eed78de3d901b449021c43401fcd6ac4bd060. As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

 25%  tp5o Private Bytes linux64 opt e10s     948722307.93 -> 1188088123.24
 25%  tp5o Private Bytes linux64 pgo e10s     949199290.64 -> 1187866705.13


You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=6332

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests

For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
:gabor, it looks like your recent change is causing a regression in the private bytes measured for linux.  Can you take a look at this and determine what we need to do (fix this, back out, accept it)?
Flags: needinfo?(gkrizsanits)
Component: Untriaged → DOM: Content Processes
Product: Firefox → Core
Presumably this is because we now have 2 content processes loaded rather than one, if I'm understanding how the preallocated process manager works (and how tp5o works, I'm assuming it's one page at a time, not multitab). So this is somewhat expected.

Also note that private bytes is questionable as a metric, particularly on Linux. See bug 1279396.
(In reply to Eric Rahm [:erahm] (PTO May 4 - 8) from comment #2)
> Presumably this is because we now have 2 content processes loaded rather
> than one, if I'm understanding how the preallocated process manager works
> (and how tp5o works, I'm assuming it's one page at a time, not multitab). So
> this is somewhat expected.

Compared to this huge number (1188088123) one additional content process without much content should not make such a drastic difference... And it's also weird that it only happens on linux.

> 
> Also note that private bytes is questionable as a metric, particularly on
> Linux. See bug 1279396.

Thanks a lot for the link I was freaking out a bit before reading this... So if I understand it correctly what happens is that since we have one more process alive we reserve more address space (probably we had 3 processes before my patch and 4 after?). OR am I misinterpret what these map files do on linux? It feels to me that we should just ignore this regression and fix bug 1279396...
Flags: needinfo?(gkrizsanits)
some regressions from AWSY:
== Change summary for alert #6453 (as of May 08 2017 16:06 UTC) ==

Regressions:

  4%  JS summary linux32 opt      134,258,029.81 -> 139,853,996.38
  3%  Explicit Memory summary linux32 opt 125,156,689.33 -> 129,273,691.54

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=6453
(In reply to Joel Maher ( :jmaher) from comment #4)
> some regressions from AWSY:
> == Change summary for alert #6453 (as of May 08 2017 16:06 UTC) ==
> 
> Regressions:
> 
>   4%  JS summary linux32 opt      134,258,029.81 -> 139,853,996.38
>   3%  Explicit Memory summary linux32 opt 125,156,689.33 -> 129,273,691.54
> 
> For up to date results, see:
> https://treeherder.mozilla.org/perf.html#/alerts?id=6453

Those numbers make a lot more sense. I think we can live with that memory overhead in exchange for the added benefits.
Setting needInfo for Eric about Comment 3 to make sure I'm interpreting this right.
Flags: needinfo?(erahm)
(In reply to Joel Maher ( :jmaher) from comment #4)
> some regressions from AWSY:
> == Change summary for alert #6453 (as of May 08 2017 16:06 UTC) ==
> 
> Regressions:
> 
>   4%  JS summary linux32 opt      134,258,029.81 -> 139,853,996.38
>   3%  Explicit Memory summary linux32 opt 125,156,689.33 -> 129,273,691.54
> 
> For up to date results, see:
> https://treeherder.mozilla.org/perf.html#/alerts?id=6453

In both cases the regression is limited to the "Fresh Start" subtests, which makes sense since we preallocate a process on startup now. While it's sizeable, it's not unexpected and I'm okay with it.
Flags: needinfo?(erahm)
resolved/wontfix then?  I will let :gabor make the call
(In reply to Joel Maher ( :jmaher) from comment #8)
> resolved/wontfix then?  I will let :gabor make the call

I would gladly wontfix it if I could find the wontfix option on this wonderful new UI... But the options
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.