Closed Bug 1621764 Opened 6 years ago Closed 6 years ago

reduce build-plain to either every 10th push or m-c tier-2 only

Tracking

(firefox77 fixed)

Status:

RESOLVED FIXED

Milestone:

mozilla77

Tracking Flags:

firefox77: tracking ---, status fixed

People

(Reporter: jmaher, Assigned: bc)

References

Details

(Whiteboard: [ci-costs-2020:done])

Attachments

(2 files, 2 obsolete files)

Bug 1621764 - Define PushIntervalStrategy optimization strategy, r=ahal. (obsolete) 6 years ago Bob Clary [:bc] (inactive) 47 bytes, text/x-phabricator-request
Bug 1621764 - Apply push-interval strategies for linux, windows plain and aarch64 builds, r=jmaher. 6 years ago Bob Clary [:bc] (inactive) 47 bytes, text/x-phabricator-request
Bug 1621764 - Define push-interval-{10,25} Backstop optimization strategies, r=ahal 6 years ago Bob Clary [:bc] (inactive) 47 bytes, text/x-phabricator-request
Bug 1621764 - add debug output for Backstop optimizations (obsolete) 6 years ago Bob Clary [:bc] (inactive) 47 bytes, text/x-phabricator-request

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Description

•

6 years ago

We run the linux/windows debug build-plain jobs on every push to autoland. Although these builds do not cost many CPU hours, running them on every push is unnecessary.

In the ~6 months of data we have in BigQuery, there are 272 revisions where Bp jobs fail (most fail on both linux and windows), and only 6 of those do not have another build failing at the same time. In all 6 of those revisions the failures are intermittent (a failure to download something).

The risk of reducing the frequency here is low, as we are not finding plain-only regressions.

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 1

•

6 years ago

We should consider valgrind builds as well.

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 2

•

6 years ago

Valgrind build jobs have found 3 regressions, all in the month of January; those are the only ones in the 7 months of data I looked at.

Geoff Brown [:gbrown]

Updated

•

6 years ago

Priority: -- → P3

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 3

•

6 years ago

In addition, we should reduce the win/aarch64* builds and the linux64/aarch64 builds to every 10th push, as we are not running tests for those on autoland.

Icing on the cake would be the opt builds (as we only test on shippable), once we move the few remaining tests that depend on regular opt builds over to shippable.

Bob Clary [:bc] (inactive)

Assignee

Updated

•

6 years ago

Assignee: nobody → bob

Status: NEW → ASSIGNED

Bob Clary [:bc] (inactive)

Assignee

Comment 4

•

6 years ago

jmaher: Can you clarify the meaning of "reduce ... to either every 10th push or m-c tier-2 only" for me?

Do we want these builds on every 10th push on autoland, and on every push on mozilla-central but as tier 2 there?

Flags: needinfo?(jmaher)

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 5

•

6 years ago

We should treat all the builds referenced here like fuzzing builds: forced SETA, every 10th push. We might want to reconsider later and treat them like some tier-2 perf tests, running every 25th push, but we don't have that implemented yet. Changing to every 10th push would be a boost in the short term with no risk.

Flags: needinfo?(jmaher)

Bob Clary [:bc] (inactive)

Assignee

Comment 6

•

6 years ago

I initially looked into this with the idea that it would be very much like the seta fuzzing changes I made earlier, but that requires simultaneous changes to treeherder to support using seta on the specific builds, and it is a pain to test. This morning I came to a different idea: I could just use a normal schedule without involving seta at all. The advantage is that no treeherder changes are required and testing is very easy. The question is whether we want every 10th push on all trees (projects) or just on autoland. I'll assume autoland and allow builds on every push on mozilla-central. I'll put up a phab in a bit to show what I'm talking about.

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 7

•

6 years ago

Yeah, every build on m-c, beta, release, esr, and try; this would only apply to autoland.

Bob Clary [:bc] (inactive)

Assignee

Comment 8

•

6 years ago

Attached file Bug 1621764 - Define PushIntervalStrategy optimization strategy, r=ahal. (obsolete) — Details

PushIntervalStrategy is modeled on seta's approach to schedule tasks on
every Nth push. It is restricted to the autoland project.

Two strategies "push-interval-10" and "push-interval-25" are defined for
scheduling tasks for every 10th and every 25th push respectively.

Debugging output is available via the --verbose option to mach taskgraph optimized.
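
As a rough illustration of the every-Nth-push idea described in this comment, here is a minimal Python sketch. It is not the patch itself: the class name, the should_remove_task method, the dict-shaped params, and the STRATEGIES registry are all hypothetical names chosen for illustration. Only the two strategy names, the restriction to autoland, and the use of the pushlog_id come from this bug.

```python
class PushIntervalSketch:
    """Hypothetical stand-in for an every-Nth-push optimization strategy."""

    def __init__(self, push_interval, project="autoland"):
        self.push_interval = push_interval
        self.project = project

    def should_remove_task(self, label, params):
        # Assumption: params is a plain dict carrying "project" and "pushlog_id".
        if params["project"] != self.project:
            # Other projects (e.g. mozilla-central) keep the task on every push.
            return False
        if int(params["pushlog_id"]) % self.push_interval == 0:
            # Every Nth push on the restricted project keeps the task.
            return False
        # All other pushes on the restricted project drop the task.
        return True


# Hypothetical registry mirroring the two strategy names from this comment.
STRATEGIES = {
    "push-interval-10": PushIntervalSketch(10),
    "push-interval-25": PushIntervalSketch(25),
}
```

With this sketch, a task on autoland is kept only when the pushlog_id is a multiple of the interval, and is never removed on any other project.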

Bob Clary [:bc] (inactive)

Assignee

Comment 9

•

6 years ago

Attached file Bug 1621764 - Apply push-interval strategies for linux, windows plain and aarch64 builds, r=jmaher. — Details

This patch uses the new push-interval-10 to schedule the linux, windows plain and aarch64
builds on autoland every 10th push.

Tested locally with a checkout whose pushlog_id was not divisible
by 10, with parameters.yml downloaded from the Gecko Decision Task, using

./mach taskgraph optimized --verbose --parameters /tmp/parameters.yml

parameters.yml from autoland showed the following optimizations.

0:56.13 PushIntervalStrategy: Removing task build-linux64-aarch64/opt interval 10
0:56.13 PushIntervalStrategy: Removing task build-linux64-plain/debug interval 10
0:56.13 PushIntervalStrategy: Removing task build-signing-win64-aarch64/opt interval 10
0:56.13 PushIntervalStrategy: Removing task build-win64-aarch64/debug interval 10
0:56.13 PushIntervalStrategy: Removing task build-win64-plain/debug interval 10
0:56.18 PushIntervalStrategy: Removing task valgrind-linux64-valgrind/opt interval 10

while parameters.yml from mozilla-central did not show any PushIntervalStrategy
optimizations.
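
For anyone repeating this check, the removed task labels can be pulled out of the verbose output with a few lines of Python. This is only a sketch: the line format is assumed from the sample output quoted in this comment, and the removed_tasks helper and SAMPLE text are hypothetical names, not part of mach or taskgraph.

```python
import re

# Pattern assumed from the sample output in this comment: elapsed time,
# "PushIntervalStrategy: Removing task", the task label, then "interval N".
LINE_RE = re.compile(
    r"^\s*\d+:\d+\.\d+\s+PushIntervalStrategy: Removing task (\S+) interval (\d+)\s*$"
)


def removed_tasks(log_text):
    """Return (label, interval) pairs for each matching line of log_text."""
    removed = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            removed.append((match.group(1), int(match.group(2))))
    return removed


# Two lines copied from the output quoted in this comment.
SAMPLE = """\
0:56.13 PushIntervalStrategy: Removing task build-linux64-plain/debug interval 10
0:56.18 PushIntervalStrategy: Removing task valgrind-linux64-valgrind/opt interval 10
"""
```

An empty result for a mozilla-central parameters.yml would correspond to the "no PushIntervalStrategy optimizations" observation.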

Depends on D70181

Bob Clary [:bc] (inactive)

Assignee

Comment 10

•

6 years ago

Feedback appreciated on the approach and the results before I formally ask for review.

Flags: needinfo?(jmaher)

Phabricator Automation

Updated

•

6 years ago

Attachment #9139124 - Attachment description: Bug 1621764 - Define PushIntervalStrategy optimization strategy. → Bug 1621764 - Define PushIntervalStrategy optimization strategy, r=ahal.

Phabricator Automation

Updated

•

6 years ago

Attachment #9139125 - Attachment description: Bug 1621764 - Apply push-interval strategies for linux, windows plain and aarch64 builds. → Bug 1621764 - Apply push-interval strategies for linux, windows plain and aarch64 builds, r=jmaher.

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Updated

•

6 years ago

Flags: needinfo?(jmaher)

Phabricator Automation

Updated

•

6 years ago

Attachment #9139124 - Attachment is obsolete: true

Bob Clary [:bc] (inactive)

Assignee

Updated

•

6 years ago

Depends on: 1625200

Bob Clary [:bc] (inactive)

Assignee

Comment 11

•

6 years ago

Attached file Bug 1621764 - Define push-interval-{10,25} Backstop optimization strategies, r=ahal — Details

Bob Clary [:bc] (inactive)

Assignee

Comment 12

•

6 years ago

Attached file Bug 1621764 - add debug output for Backstop optimizations (obsolete) — Details

Depends on D70182

Phabricator Automation

Updated

•

6 years ago

Attachment #9142487 - Attachment description: Bug 1621764 - Define PushIntervalStrategy optimization strategy, r=ahal → Bug 1621764 - Define push-interval-{10,25} Backstop optimization strategies, r=ahal

Phabricator Automation

Updated

•

6 years ago

Attachment #9142488 - Attachment description: Bug 1621764 - add debug output for Backstop optimizations, r=ahal. → Bug 1621764 - add debug output for Backstop optimizations

Pulsebot

Comment 13

•

6 years ago

Pushed by bclary@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/8808cb9cbff2 Define push-interval-{10,25} Backstop optimization strategies, r=ahal

Pulsebot

Comment 14

•

6 years ago

Pushed by bclary@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/2001c1f52aa0 Apply push-interval strategies for linux, windows plain and aarch64 builds, r=jmaher.

Razvan Maries

Comment 15

•

6 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/8808cb9cbff2
https://hg.mozilla.org/mozilla-central/rev/2001c1f52aa0

Status: ASSIGNED → RESOLVED

Closed: 6 years ago

status-firefox77: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla77

Mike Hommey [:glandium]

Updated

•

6 years ago

Regressions: 1633927

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Updated

•

6 years ago

Whiteboard: [ci-costs-2020:todo] → [ci-costs-2020:done]

(Away)

Comment 16

•

6 years ago

•

Edited

Does this mean that an unfortunately-timed patch may make it to m-c only to find that a push-interval job breaks later? Or do we have some way of letting the interval jobs catch up before selecting merge candidates?

Flags: needinfo?(jmaher)

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 17

•

6 years ago

Thanks for asking!

We already run most jobs only every 10th push, so these builds are as safe as any other job. If we were running them every 25th push, that would be closer to what we view as tier-2: a failure would most likely be caught before the merge, but it could miss the timing window and a regression could get through. All the builds in this bug have been set to every 10th push, and that push is required to be green before merging to m-c.
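
To make the timing window concrete, here is a small Python sketch of how many later pushes can land before an interval job runs again. It assumes, as a simplification, that the job runs exactly on pushes whose pushlog_id is a multiple of the interval; the function names are hypothetical and any additional scheduling rules are ignored.

```python
def pushes_until_next_run(pushlog_id, interval):
    """Pushes after this one before the interval job runs; 0 if it runs now.

    Assumption: the job runs only when pushlog_id % interval == 0.
    """
    return (-pushlog_id) % interval


def worst_case_gap(interval):
    """Largest number of consecutive pushes that skip the interval job."""
    return interval - 1
```

Under that assumption a regression can go unnoticed for at most 9 pushes with an interval of 10, and for at most 24 pushes with an interval of 25, which is why the larger interval is more likely to miss the window before a merge.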

Flags: needinfo?(jmaher)

Phabricator Automation

Updated

•

6 years ago

Attachment #9142488 - Attachment is obsolete: true
