4.58 - 13.41% installer size (windows2012-32, windows2012-64) regression on push 01380eb6a6d072f212f8c2f7d194428e2746f0a4 (Tue Jul 10 2018)

Status

RESOLVED WONTFIX
8 months ago
5 months ago

People

(Reporter: igoldan, Unassigned)

Tracking

({regression})

Firefox Tracking Flags

(Not tracked)

Details

We have detected a build metrics regression from push:

https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=01380eb6a6d072f212f8c2f7d194428e2746f0a4

Since you are the author of one of the patches included in that push, we need your help to address this regression.

Regressions:

 13%  installer size windows2012-32 opt      53,822,911.38 -> 61,042,618.75
  9%  installer size windows2012-32 pgo      56,290,022.67 -> 61,373,612.92
  7%  installer size windows2012-64 opt      59,263,384.46 -> 63,366,772.08
  5%  installer size windows2012-64 pgo      60,893,806.00 -> 63,680,140.08

Improvements:

 27%  build times windows2012-32-noopt debug taskcluster-c4.4xlarge     2,120.04 -> 1,544.56
 27%  build times windows2012-64-noopt debug taskcluster-c4.4xlarge     2,157.27 -> 1,579.23
 20%  build times windows2012-64 pgo taskcluster-c4.4xlarge             4,559.76 -> 3,668.79
 17%  build times windows2012-64 debug plain taskcluster-c4.4xlarge     1,989.48 -> 1,660.96
 15%  build times windows2012-32 pgo taskcluster-c4.4xlarge             4,250.08 -> 3,612.18
 15%  build times windows2012-64 opt plain taskcluster-c4.4xlarge       2,028.71 -> 1,725.02
 11%  build times windows2012-32 debug taskcluster-c4.4xlarge           2,345.17 -> 2,078.53
  9%  build times windows2012-64 debug taskcluster-c4.4xlarge           2,330.82 -> 2,114.59
  3%  build times windows2012-64 opt taskcluster-c4.4xlarge             2,175.09 -> 2,108.32
  3%  build times windows2012-32 opt taskcluster-c4.4xlarge             2,127.05 -> 2,067.04


You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=14277

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Automated_Performance_Testing_and_Sheriffing/Build_Metrics
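The percentages in the regression table above are plain relative changes between the before and after values. A minimal sketch of that arithmetic follows; the function and variable names are illustrative assumptions, not part of Perfherder, and only the byte counts are taken from the table.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (platform, build type) -> (before, after) installer size in bytes,
# copied from the regression table in this report.
INSTALLER_SIZES = {
    ("windows2012-32", "opt"): (53_822_911.38, 61_042_618.75),
    ("windows2012-32", "pgo"): (56_290_022.67, 61_373_612.92),
    ("windows2012-64", "opt"): (59_263_384.46, 63_366_772.08),
    ("windows2012-64", "pgo"): (60_893_806.00, 63_680_140.08),
}


def summarize(sizes):
    """Return (platform, build, percent change, growth in bytes) per row."""
    rows = []
    for (platform, build), (before, after) in sizes.items():
        rows.append(
            (platform, build, round(percent_change(before, after), 2), after - before)
        )
    return rows


if __name__ == "__main__":
    for platform, build, pct, delta in summarize(INSTALLER_SIZES):
        print(f"{platform} {build}: +{pct}% (+{delta:,.2f} bytes)")
```

The smallest and largest results (windows2012-64 pgo and windows2012-32 opt) correspond to the 4.58 - 13.41% range quoted in the bug title.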
Component: General → General
Product: Testing → Firefox Build System
Version: Version 3 → unspecified
Flags: needinfo?(dmajor)
This regression is not surprising, unfortunately. The size increases are the price we pay for the performance improvements. clang-cl's generated code is generally larger and faster than MSVC's.

I made some attempts in bug 1465633 to mitigate this but have mostly given up as I couldn't get any material gains without regressing Talos benchmarks. Down the road, ThinLTO and PGO are going to change the optimization landscape anyway, so it may be worth revisiting the size/speed balance once those optimizations are enabled.

Anthony, what do you think?
Flags: needinfo?(dmajor) → needinfo?(ajones)
We should look into the size issue once we get PGO support working. I'd like to keep this bug open until we decide when clang-cl rides the trains.
Almost all our perf tests on Windows improved big time. We only have 2 very strange regressions that don't make sense at all, and I consider them invalid. Even if they were real, the sheer number of wins is staggering.

== Change summary for alert #14271 (as of Tue, 10 Jul 2018 17:22:52 GMT) ==

Regressions:

  3%  speedometer windows10-64 pgo e10s stylo     69.06 -> 66.67
  3%  ts_paint_webext windows10-64 pgo e10s stylo 321.96 -> 331.58

Improvements:

 45%  stylebench windows7-32 opt e10s stylo                         29.33 -> 42.41
 40%  displaylist_mutate windows7-32 opt e10s stylo                 5,054.33 -> 3,037.00
 40%  a11yr windows7-32 opt e10s stylo                              305.54 -> 184.29
 29%  tp6_youtube windows7-32 opt e10s stylo                        327.48 -> 233.25
 26%  displaylist_mutate windows10-64 opt e10s stylo                3,965.85 -> 2,918.32
 26%  tp5o responsiveness windows7-32 opt e10s stylo                0.64 -> 0.47
 23%  tp6_amazon windows7-32 opt e10s stylo                         296.90 -> 228.04
 23%  displaylist_mutate windows7-32 pgo e10s stylo                 3,818.06 -> 2,957.17
 21%  speedometer windows7-32 opt e10s stylo                        58.48 -> 71.00
 20%  a11yr windows10-64 opt e10s stylo                             238.39 -> 190.08
 20%  a11yr windows10-64-qr opt e10s stylo                          237.39 -> 189.89
 18%  tsvgx windows7-32 opt e10s stylo                              228.77 -> 187.08
 18%  tp6_youtube windows10-64-qr opt e10s stylo                    290.17 -> 239.12
 17%  tsvgr_opacity windows7-32 opt e10s stylo                      141.77 -> 117.63
 16%  displaylist_mutate windows10-64 pgo e10s stylo                3,383.95 -> 2,834.30
 16%  stylebench windows10-64 opt e10s stylo                        35.70 -> 41.38
 16%  tps windows7-32 opt e10s stylo                                15.38 -> 12.99
 15%  tp6_google windows7-32 opt e10s stylo                         479.04 -> 405.29
 15%  stylebench windows10-64-qr opt e10s stylo                     37.44 -> 42.94
 15%  tscrollx windows7-32 opt e10s stylo                           0.80 -> 0.68
 15%  tp6_youtube windows10-64 opt e10s stylo                       279.42 -> 238.79
 14%  about_preferences_basic windows7-32 opt e10s stylo            167.48 -> 143.67
 14%  tp6_facebook windows7-32 opt e10s stylo                       175.10 -> 150.50
 13%  tp5o_scroll windows7-32 opt e10s stylo                        0.80 -> 0.70
 13%  tart windows7-32 opt e10s stylo                               3.05 -> 2.66
 12%  dromaeo_css windows7-32 opt e10s stylo                        12,207.58 -> 13,732.37
 12%  stylebench windows7-32 pgo e10s stylo                         37.95 -> 42.62
 12%  tpaint windows7-32 opt e10s stylo                             154.64 -> 135.81
 11%  sessionrestore_no_auto_restore windows7-32 opt e10s stylo     316.62 -> 280.83
 11%  tscrollx windows10-64 opt e10s stylo                          0.73 -> 0.65
 11%  damp windows7-32 opt e10s stylo                               100.62 -> 89.52
 11%  tp6_amazon windows10-64 opt e10s stylo                        260.81 -> 232.29
 11%  tp6_amazon windows10-64-qr opt e10s stylo                     262.98 -> 234.46
 11%  sessionrestore windows7-32 opt e10s stylo                     272.33 -> 243.08
 11%  tabpaint windows10-64-qr opt e10s stylo                       55.49 -> 49.66
 10%  ts_paint windows7-32 opt e10s stylo                           371.42 -> 334.00
 10%  ts_paint_webext windows7-32 opt e10s stylo                    374.92 -> 337.67
 10%  tsvgr_opacity windows10-64 opt e10s stylo                     116.72 -> 105.37
 10%  ts_paint_heavy windows7-32 opt e10s stylo                     369.42 -> 334.00
  9%  speedometer windows10-64 opt e10s stylo                       60.14 -> 65.76
  9%  tabpaint windows7-32 opt e10s stylo                           56.35 -> 51.11
  9%  perf_reftest_singletons windows7-32 pgo e10s stylo            53.85 -> 49.01
  9%  cpstartup content-process-startup windows7-32 opt e10s stylo  167.92 -> 153.00
  9%  tscrollx windows10-64 pgo e10s stylo                          0.70 -> 0.64
  8%  a11yr windows7-32 pgo e10s stylo                              192.48 -> 176.48
  8%  speedometer windows10-64-qr opt e10s stylo                    60.44 -> 65.42
  8%  displaylist_mutate windows10-64-qr opt e10s stylo             4,320.47 -> 3,966.04
  8%  tsvgr_opacity windows10-64-qr opt e10s stylo                  112.58 -> 103.62
  8%  tp6_facebook windows10-64 opt e10s stylo                      167.29 -> 154.21
  8%  tps windows10-64 opt e10s stylo                               15.39 -> 14.20
  8%  tps windows10-64-qr opt e10s stylo                            12.43 -> 11.48
  8%  tp6_google windows10-64 opt e10s stylo                        463.15 -> 428.08
  7%  tresize windows7-32 opt e10s stylo                            8.49 -> 7.87
  7%  dromaeo_css windows7-32 pgo e10s stylo                        12,823.17 -> 13,752.37
  7%  tp6_facebook windows10-64-qr opt e10s stylo                   169.12 -> 157.21
  7%  tp6_google windows10-64-qr opt e10s stylo                     456.29 -> 424.12
  7%  tsvgx windows10-64 opt e10s stylo                             150.41 -> 140.19
  7%  tp5o_scroll windows10-64 opt e10s stylo                       0.77 -> 0.72
  7%  damp windows10-64-qr opt e10s stylo                           91.55 -> 85.41
  7%  sessionrestore_many_windows windows7-32 opt e10s stylo        2,435.83 -> 2,273.08
  7%  tsvgx windows10-64-qr opt e10s stylo                          417.68 -> 390.00
  6%  about_preferences_basic windows10-64 opt e10s stylo           155.94 -> 146.02
  6%  about_preferences_basic windows10-64-qr opt e10s stylo        156.44 -> 146.72
  6%  tp6_facebook windows7-32 pgo e10s stylo                       159.46 -> 149.67
  6%  tart windows10-64 opt e10s stylo                              2.83 -> 2.67
  5%  sessionrestore windows10-64 opt e10s stylo                    247.62 -> 234.92
  5%  dromaeo_css windows10-64 opt e10s stylo                       12,908.23 -> 13,563.62
  5%  damp windows10-64 opt e10s stylo                              96.59 -> 91.80
  5%  rasterflood_svg windows7-32 opt e10s stylo                    10,547.70 -> 10,048.67
  5%  sessionrestore_no_auto_restore windows10-64 opt e10s stylo    286.12 -> 272.58
  4%  tp6_google windows7-32 pgo e10s stylo                         421.52 -> 403.58
  4%  ts_paint_webext windows10-64-qr opt e10s stylo                355.25 -> 340.67
  4%  ts_paint_heavy windows10-64-qr opt e10s stylo                 350.71 -> 337.92
  4%  tps windows7-32 pgo e10s stylo                                13.16 -> 12.68
  3%  tart windows10-64-qr opt e10s stylo                           1.53 -> 1.49
  3%  sessionrestore_many_windows windows10-64 opt e10s stylo       2,342.33 -> 2,276.25
  2%  rasterflood_svg windows10-64 opt e10s stylo                   10,941.81 -> 10,717.82

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=14271eeherder.mozilla.org/perf.html#/alerts?id=14271
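Whether a change in the alert summary above counts as a regression or an improvement depends on the direction of the metric: some tests report scores where higher is better, others report times where lower is better. The sketch below illustrates this. The `higher_is_better` flag and the function name are assumptions made for the example, and the direction of each test is inferred from how the summary itself classifies it, not from Perfherder's configuration.

```python
def classify(before: float, after: float, higher_is_better: bool):
    """Return (rounded percent magnitude, label) for a before/after pair."""
    change = (after - before) / before * 100.0
    if change == 0:
        return 0, "unchanged"
    improved = change > 0 if higher_is_better else change < 0
    return round(abs(change)), "improvement" if improved else "regression"


if __name__ == "__main__":
    # Values copied from the alert summary; directions are inferred from it.
    samples = [
        ("stylebench windows7-32 opt", 29.33, 42.41, True),
        ("displaylist_mutate windows7-32 opt", 5054.33, 3037.00, False),
        ("speedometer windows10-64 pgo", 69.06, 66.67, True),
        ("ts_paint_webext windows10-64 pgo", 321.96, 331.58, False),
    ]
    for name, before, after, higher in samples:
        pct, label = classify(before, after, higher)
        print(f"{pct:3d}%  {label:<11}  {name}")
```

With these inputs the two PGO rows are labeled regressions at 3% each, and the two opt rows are labeled improvements at 45% and 40%, matching the summary.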
The regressions are real and were expected. They should go away with work on LTO/PGO for the clang-cl builds.

It's impressive, though, how many (large!) improvements over the PGO(!) MSVC builds there are, considering those builds are not even using LTO or PGO yet!
(In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #3)
> For up to date results, see:
> https://treeherder.mozilla.org/perf.html#/alerts?id=14271eeherder.mozilla.
> org/perf.html#/alerts?id=14271

Sorry for broken link above. Correct one is: https://treeherder.mozilla.org/perf.html#/alerts?id=14271
For the record, I'm shocked at how much better Clang currently is than MSVC PGO. I thought MSVC would be much better than this. This is shaping up to be much more promising on the performance front than I anticipated! I was content with the Clang transition just being about unifying around an open source toolchain. But now the performance benefits look to be just as significant a win. This is amazing!

Comment 7

8 months ago
I'm curious what are the MSVC build flags used by the opt configs (without pgo), such as for "displaylist_mutate windows10-64 opt e10s stylo". Is it -O1 without LTCG, which is the default with ./mach build?
(In reply to lgratian from comment #7)
> I'm curious what are the MSVC build flags used by the opt configs (without
> pgo), such as for "displaylist_mutate windows10-64 opt e10s stylo". Is it
> -O1 without LTCG, which is the default with ./mach build?

The Windows optimization flags are -O1 -Oi for Gecko proper and -O2 for the JS engine. It's possible that compiling Gecko with -O2 (-Oi?) would have been a fairer comparison, as we had to bump up the optimization level for clang to get performance that was remotely comparable, IIRC.
Enabling PGO in bug 1341525 partially mitigated this regression: we won back 2.8MB/4% of installer size. I don't know why Perfherder didn't raise an alert to this.
(In reply to David Major [:dmajor] from comment #9)
> Enabling PGO in bug 1341525 partially mitigated this regression: we won back
> 2.8MB/4% of installer size. I don't know why Perfherder didn't raise an
> alert to this.

Yeah, there may be an issue with Perfherder. I had to manually create the improvement:

== Change summary for alert #14720 (as of Thu, 02 Aug 2018 09:39:51 GMT) ==

Improvements:

  3%  installer size windows2012-64 pgo      70,330,409.83 -> 68,115,181.08

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=14720
I think our users prefer to have a much faster Firefox than a smaller Firefox.
I think we should accept this trade-off.
Status: NEW → RESOLVED
Last Resolved: 7 months ago
Flags: needinfo?(ajones)
Resolution: --- → WONTFIX
Duplicate of this bug: 1503387