Closed Bug 1361436 Opened 3 years ago Closed 3 years ago

"summary" build metrics measuring `make check`

Categories

(Firefox Build System :: General, defect, P1)

defect

Tracking

(firefox-esr52 unaffected, firefox53 unaffected, firefox54 unaffected, firefox55 fixed)

RESOLVED FIXED
mozilla55
Tracking Status
firefox-esr52 --- unaffected
firefox53 --- unaffected
firefox54 --- unaffected
firefox55 --- fixed

People

(Reporter: gps, Assigned: gps)

References

(Blocks 1 open bug)

Details

(Keywords: regression)

Attachments

(1 file)

https://hg.mozilla.org/mozilla-central/rev/4e61e69a383ca79a9b84c78eb2eb53e5958333e3 (part of bug 1304508) broke the build metrics reporting for the "summary" metric. This metric is essentially end-to-end time.

When we switched from a raw `make check` to `mach build check`, that meant that mach's metrics reporting for `mach build` kicked in and started reporting the `mach build check` execution time in the "summary" metric.

There is a quick hack to fix this. The more involved hack is to run the "check" target during the build itself. But only on TaskCluster, since there is an optimization in buildbot to run "check" after some artifacts have been uploaded so tests start executing sooner.
ted, mshal, KWierso: I'd like your opinion on merging "check" into the main `mach build` invocation.

According to the numbers at https://treeherder.mozilla.org/perf.html#/alerts?id=6298, it appears `make check` is now ~1 minute on Linux, ~2 minutes on Windows, ~3 minutes on OS X. We've moved the bulk of work out of `make check` already. But there's still more we can do. And, as we move to TC, we don't have the BB hack of uploading artifacts to trigger test jobs before running `make check`. So in TC it doesn't make sense to continue running `make check` separately.

I'm really tempted to do away with `make check` being its own invocation from mozharness and to move it into the main `mach build`. If we did so:

* It could be run concurrently with other automation tasks, such as test packaging. This could result in a net speed-up of the overall build task.
* We'd lose the BB speed-up hack (which is going away with the move off BB anyway).
* We may lose --keep-going during `make check`, failing the build and preventing tests from running if anything in `make check` barfed. This means that intermittent failures would short circuit tests. I'm not sure if this would be a problem in practice. Although we may be able to hack around this with $(MAKEFLAGS) adjustment or something.
Flags: needinfo?(wkocher)
Flags: needinfo?(ted)
Flags: needinfo?(mshal)
I'm not going to bother consolidating `make check` into the "build" mozharness action at this time. Too much of a rabbit hole.
Flags: needinfo?(wkocher)
Flags: needinfo?(ted)
Flags: needinfo?(mshal)
(In reply to Gregory Szorc [:gps] from comment #1)
> * We may lose --keep-going during `make check`, failing the build and
> preventing tests from running if anything in `make check` barfed. This means
> that intermittent failures would short circuit tests. I'm not sure if this
> would be a problem in practice. Although we may be able to hack around this
> with $(MAKEFLAGS) adjustment or something.

We do a similar thing for l10n-check since it is the only post-automation step that is not parallel-safe: https://dxr.mozilla.org/mozilla-central/rev/a748acbebbde373a88868dc02910fb2bc5e6a023/build/moz-automation.mk#72

So we could just do a 'AUTOMATION_EXTRA_CMDLINE-check = -k' and set a MOZ_AUTOMATION_CHECK = 1 somewhere to wire it up to the existing automation rules. I agree that once we don't have buildbot doing a sendchange in between the build & check, there's no reason to have check run by mozharness, though I think we should try to finish moving things out to separate tasks if we can. Maybe a good project for a work-week sprint?
Comment on attachment 8863906 [details]
Bug 1361436 - Emit build stats immediately after build;

https://reviewboard.mozilla.org/r/135632/#review138930

It sucks that you have to fix this in so many places, but OK!
Attachment #8863906 - Flags: review?(ted) → review+
Pushed by gszorc@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2877aa1e02c8
Emit build stats immediately after build; r=ted
https://hg.mozilla.org/mozilla-central/rev/2877aa1e02c8
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Blocks: 1362148
Blocks: buildmetrics
I see a build times regression which seems to point to this revision:
== Change summary for alert #6334 (as of May 03 2017 18:06 UTC) ==

Regressions:

4340%  build times summary linux32 pgo taskcluster-c4.4xlarge     64.67 -> 2,871.10
4035%  build times summary linux64 pgo taskcluster-c4.4xlarge     59.92 -> 2,477.21
3441%  build times summary windows2012-64 pgo taskcluster-c4.4xlarge122.92 -> 4,352.14
3233%  build times summary windows2012-32 pgo taskcluster-c4.4xlarge121.10 -> 4,036.21
2612%  build times summary windows8-64 opt buildbot-c4.4xlarge    75.67 -> 2,052.01
2185%  build times summary windows8-64 debug buildbot-c4.4xlarge  88.63 -> 2,025.43
2036%  build times summary windowsxp debug buildbot-c4.4xlarge    85.97 -> 1,836.04
1989%  build times summary linux64-stylo opt taskcluster-c4.4xlarge60.08 -> 1,254.94
1949%  build times summary windowsxp opt buildbot-c4.4xlarge      93.70 -> 1,919.83
1735%  build times summary osx-10-7 debug buildbot-unknown        166.44 -> 3,054.15
1716%  build times summary osx-10-7 opt buildbot-unknown          163.72 -> 2,972.67
1215%  build times summary linux64 asan taskcluster-c4.4xlarge    81.73 -> 1,074.73
1025%  build times summary linux32 debug taskcluster-c4.4xlarge   65.92 -> 741.53
982%  build times summary windows2012-64 opt taskcluster-c4.4xlarge120.87 -> 1,308.11
982%  build times summary linux64-stylo debug taskcluster-c4.4xlarge66.59 -> 720.18
944%  build times summary windows2012-32 opt taskcluster-c4.4xlarge119.54 -> 1,248.29
899%  build times summary linux32 opt taskcluster-c4.4xlarge     65.67 -> 656.31
895%  build times summary linux64 opt taskcluster-c4.4xlarge     64.00 -> 636.48
882%  build times summary linux64 debug taskcluster-c4.4xlarge   68.18 -> 669.30
873%  build times summary windows2012-64 debug taskcluster-c4.4xlarge120.96 -> 1,176.59
856%  build times summary windows2012-32 debug taskcluster-c4.4xlarge122.34 -> 1,169.61

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=6334

not sure if it is really related or coincidence
Those alerts were expected. We broke the reporting for a few days and this alert is the values going back to their expected point.
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.