Closed Bug 1553049 Opened 5 years ago Closed 5 years ago

27.17 - 44.44% build times (windows-mingw32) regression noticed around Fri May 17 2019

Categories

(Firefox Build System :: General, defect)

All
Windows
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: igoldan, Unassigned)

References

(Regression)

Details

(Keywords: regression)

We have detected the following build metrics regressions:

Regressions:

44% build times windows-mingw32 all 32 clang opt taskcluster-c4.4xlarge 1,639.94 -> 2,368.69
27% build times windows-mingw32 all 64 clang opt taskcluster-c5.4xlarge 1,540.12 -> 1,958.65
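For reference, the percentages in the alert can be recomputed from the before/after values. This is a minimal sketch, not Perfherder's own calculation: the helper name `percent_change` is made up for illustration, and the unit (presumably seconds of build time) is an assumption.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed in percent."""
    return (after - before) / before * 100.0


# Before/after values copied from the alert lines above.
# Assumption: these are build times in seconds.
alerts = {
    "windows-mingw32 all 32 clang opt": (1639.94, 2368.69),
    "windows-mingw32 all 64 clang opt": (1540.12, 1958.65),
}

for name, (before, after) in alerts.items():
    print(f"{name}: {percent_change(before, after):.1f}%")
```

The results land at roughly 44% and 27%, matching the alert summary; small differences in the second decimal are expected because the displayed values are rounded.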

You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=20988

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Automated_Performance_Testing_and_Sheriffing/Build_Metrics

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Product: Testing → Firefox Build System

The regression above appeared first on mozilla-inbound. I presented the autoland metrics because they're more obvious.

Investigating build_metrics regressions is pretty hard, so I need your help identifying the bug that caused them.
Skimming the commits around the time of regression, I noticed 3 bugs which seem related to Firefox's build system.
If you think your bug isn't properly linked, remove it from the Regressed by field. Thanks!

Flags: needinfo?(nfroyd)
Flags: needinfo?(bbouvier)

What bug 1547682 does is build more things during the build step (it builds more parts of the Cranelift meta language in Rust than it does in Python). I don't expect this to cause such a large regression (it should be a matter of seconds), but it has happened in the past that Cranelift build steps slowed down the build in unexpected ways. Any chance to back out the patches from bug 1547682 in a build and get before/after measurements, please?

Flags: needinfo?(bbouvier)

To be precise: back out in a try build, not from central please :)

Since bbouvier's patch was the only one that landed on mozilla-inbound, and comment 1 says that the regression first appeared on mozilla-inbound, I'm clearing my bugs from Regressed by.

Flags: needinfo?(nfroyd)
No longer regressed by: 1547196, 1550868

(In reply to Benjamin Bouvier [:bbouvier] from comment #3)

To be precise: back out in a try build, not from central please :)

I made that Try push here.

Thanks! If I read the results correctly, there's a big regression on the sccache hit numbers. I don't know how sccache works at all. Are there any chances that the Cargo prebuild steps aren't cached? Or that the files created by the build steps aren't cached?

(In reply to Benjamin Bouvier [:bbouvier] from comment #6)

Thanks! If I read the results correctly, there's a big regression on the sccache hit numbers. I don't know how sccache works at all. Are there any chances that the Cargo prebuild steps aren't cached? Or that the files created by the build steps aren't cached?

:kmoir, could you forward this question to someone who knows these details?

Flags: needinfo?(kmoir)

Redirecting to chmanchester

Flags: needinfo?(kmoir) → needinfo?(cmanchester)

The sccache hit rate in that try push seems to be within a normal range if you look at the graph zoomed out. Looking at some before and after logs and seeing the time cargo is reporting (searching for "Finished release [optimized] target(s) in"), that seems to be where the extra time is being spent.
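The log comparison described above could be scripted roughly as follows. This is a sketch under assumptions: the duration format after "in" (either `Nm SSs` or `S.SSs`) reflects how cargo usually prints elapsed time, the function name `cargo_seconds` is hypothetical, and the two log excerpts are invented placeholders rather than lines from the actual try push.

```python
import re

# Matches the cargo summary line quoted in the comment above.
# Assumption: the duration is printed as "12m 34s" or "45.67s".
FINISHED_RE = re.compile(
    r"Finished release \[optimized\] target\(s\) in "
    r"(?:(?P<min>\d+)m )?(?P<sec>\d+(?:\.\d+)?)s"
)


def cargo_seconds(log_text: str) -> list[float]:
    """Return the duration, in seconds, of every matching cargo summary line."""
    durations = []
    for match in FINISHED_RE.finditer(log_text):
        minutes = int(match.group("min") or 0)
        durations.append(minutes * 60 + float(match.group("sec")))
    return durations


# Hypothetical excerpts, for illustration only (not real data from this bug).
before_log = "... Finished release [optimized] target(s) in 9m 05s\n"
after_log = "... Finished release [optimized] target(s) in 16m 40s\n"

for label, text in (("before", before_log), ("after", after_log)):
    print(label, cargo_seconds(text))
```

Running this over the full before and after build logs would show whether the cargo step accounts for the extra build time.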

Flags: needinfo?(cmanchester)

(In reply to Chris Manchester (:chmanchester) from comment #9)

The sccache hit rate in that try push seems to be within a normal range if you look at the graph zoomed out. Looking at some before and after logs and seeing the time cargo is reporting (searching for "Finished release [optimized] target(s) in"), that seems to be where the extra time is being spent.

Can we do something about it or should we mark this as wontfix?

Flags: needinfo?(cmanchester)

Tom, is this something worth investigating? Are these build times getting in people's way?

Flags: needinfo?(cmanchester) → needinfo?(tom)

I think the only reason to investigate this is if we want to save the processing time/money. The wall clock time for building this is not likely to impede anyone's development.

Flags: needinfo?(tom)

Sounds like we're taking this regression.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX

(Note this should now be addressed by the landing of bug 1555894.)

Has Regression Range: --- → yes