Closed Bug 1639164 Opened 5 years ago Closed 4 years ago

"mach try auto" should select the best platforms to run manifests on

Tracking

(firefox90 fixed)

Status: RESOLVED FIXED

Milestone: 90 Branch

Tracking Flags:

firefox90: tracking ---, status fixed

People

(Reporter: marco, Assigned: marco)

References

(Blocks 1 open bug)

Details

Attachments

(10 files)

Bug 1639164 - Add an option to the BugbugPushSchedules optimization strategy to select configurations on which to run manifests based on bugbug decisions. r=ahal 4 years ago Marco Castelluccio [:marco] 47 bytes, text/x-phabricator-request
Bug 1639164 - Define a shadow scheduler that uses bugbug's platform selection. r=ahal 4 years ago Marco Castelluccio [:marco] 47 bytes, text/x-phabricator-request
Bug 1639164 - Translate groups returned by bugbug in the config_groups dict too. r=ahal 4 years ago Marco Castelluccio [:marco] 47 bytes, text/x-phabricator-request
Bug 1639164 - Ignore chunk number when matching task labels with configurations returned by bugbug. r=jmaher 4 years ago Marco Castelluccio [:marco] 48 bytes, text/x-phabricator-request
Bug 1639164 - Rename mock task names to prevent the number being considered as a chunk number. r=jmaher 4 years ago Marco Castelluccio [:marco] 48 bytes, text/x-phabricator-request
Bug 1639164 - Add a new strategy using bugbug's config selection and a low confidence threshold. r=ahal 4 years ago Marco Castelluccio [:marco] 48 bytes, text/x-phabricator-request
Bug 1639164 - Add a shadow scheduler using the new strategy with bugbug's config selection and a low confidence threshold. r=ahal 4 years ago Marco Castelluccio [:marco] 48 bytes, text/x-phabricator-request
Bug 1639164 - Use bugbug's config selection results by default when using 'mach try auto'. r=ahal 4 years ago Marco Castelluccio [:marco] 48 bytes, text/x-phabricator-request
Bug 1639164 - Filter out shippable tasks for try auto target. r=ahal 4 years ago Marco Castelluccio [:marco] 48 bytes, text/x-phabricator-request
Bug 1639164 - Mark mach try auto test that ensures no build signing task is scheduled as failing. r=ahal 4 years ago Marco Castelluccio [:marco] 48 bytes, text/x-phabricator-request

Marco Castelluccio [:marco]

Assignee

Description

•

5 years ago

Currently, the bugbug reduction strategy doesn't care about the number of platforms selected (since on autoland we have all of them anyway) but just about the "cost" of the tasks.

For try, we should take into account not only the cost, but also the number of selected platforms.

Marco Castelluccio [:marco]

Assignee

Updated

•

5 years ago

Assignee: nobody → mcastelluccio

Status: NEW → ASSIGNED

Marco Castelluccio [:marco]

Assignee

Comment 1

•

5 years ago

We are switching to bugbug_disperse in bug 1638945 (which suffers from the same problem of scheduling too many platforms). There is some work needed in bugbug to support selecting platforms for the manifest-level model.

Marco Castelluccio [:marco]

Assignee

Updated

•

4 years ago

Andrew Halberstadt [:ahal]

Comment 2

•

4 years ago

I'm not convinced the model should optimize for this. Builds are relatively cheap (and getting cheaper). Plus getting a bit of coverage on more platforms would likely find more regressions than limiting coverage to a few.

Marco Castelluccio [:marco]

Assignee

Comment 3

•

4 years ago

(In reply to Andrew Halberstadt [:ahal] from comment #2)

> I'm not convinced the model should optimize for this. Builds are relatively cheap (and getting cheaper). Plus getting a bit of coverage on more platforms would likely find more regressions than limiting coverage to a few.

We can tune it however we want: e.g. run a manifest on a single configuration only if we are absolutely sure it is configuration-independent, and run it on multiple configurations if we know it isn't or if we don't know yet (and, if we don't know yet, we can select the configurations so as to collect more information for the future). Also, we can make sure that, if we select the same manifest across multiple pushes, it runs on a different configuration on each push.

The objective is not only to reduce the number of builds, but mainly to reduce the number of test executions for tests that are likely platform-independent. E.g. if you have 30 platform-independent manifests and run them on 3 platforms, you needlessly increase the total runtime by 3x.

Then, we'll also be able to use this work for bug 1637555.
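To make the tuning idea above concrete, here is a minimal sketch of the policy described in this comment. It is not bugbug's implementation: the function names, the three-state `independent` flag and the use of a push counter for rotation are all assumptions made for illustration.

```python
from typing import List, Optional


def select_configs(
    all_configs: List[str],
    independent: Optional[bool],
    push_index: int,
) -> List[str]:
    """Pick the configurations on which to run one manifest.

    independent is True when the manifest is known to be
    configuration-independent, False when it is known not to be,
    and None when we do not know yet (hypothetical encoding).
    """
    if not all_configs:
        return []
    if independent is True:
        # One configuration is enough; rotate it so that successive
        # pushes exercise a different configuration each time.
        return [all_configs[push_index % len(all_configs)]]
    # Known-dependent or unknown: keep every configuration.
    return list(all_configs)


def total_runs(manifest_count: int, configs_per_manifest: int) -> int:
    """Number of manifest executions scheduled."""
    return manifest_count * configs_per_manifest


configs = ["linux", "macosx", "windows"]
print(select_configs(configs, True, 0))   # ['linux']
print(select_configs(configs, True, 1))   # ['macosx']
print(select_configs(configs, None, 0))   # ['linux', 'macosx', 'windows']
print(total_runs(30, 3), total_runs(30, 1))  # 90 30
```

The last line is the 30-manifest example from the comment: 90 executions on three platforms versus 30 on one, i.e. the 3x factor.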

Summary: "mach try auto" should select the smallest number of platforms → "mach try auto" should select the best platforms to run manifests on

•

4 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/6431da98cb37
https://hg.mozilla.org/mozilla-central/rev/07ff82216f70

Marco Castelluccio [:marco]

Assignee

Comment 10

•

4 years ago

Attached file Bug 1639164 - Translate groups returned by bugbug in the config_groups dict too. r=ahal — Details

Otherwise there is a mismatch between the group names in "groups" and "config_groups".
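A rough sketch of the mismatch and the fix, under assumptions: the response shape (a "groups" dict keyed by group name and a "config_groups" dict mapping a configuration to a list of group names) and the `translate_group` rule shown here are hypothetical stand-ins, not the actual patch.

```python
from typing import Any, Dict


def translate_group(group: str) -> str:
    # Hypothetical normalisation: keep only the part before the first ':'.
    return group.split(":")[0]


def translate_response(data: Dict[str, Any]) -> Dict[str, Any]:
    """Apply the same group translation to both dicts so their names agree."""
    out = dict(data)
    out["groups"] = {
        translate_group(group): confidence
        for group, confidence in data.get("groups", {}).items()
    }
    out["config_groups"] = {
        config: [translate_group(group) for group in groups]
        for config, groups in data.get("config_groups", {}).items()
    }
    return out


response = {
    "groups": {"a/b.ini:c/d.ini": 0.9},
    "config_groups": {"test-linux/opt": ["a/b.ini:c/d.ini"]},
}
translated = translate_response(response)
print(translated["groups"])         # {'a/b.ini': 0.9}
print(translated["config_groups"])  # {'test-linux/opt': ['a/b.ini']}
```

If only "groups" were translated, the names listed under "config_groups" would no longer be found in "groups", which is the mismatch mentioned above.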

Pulsebot

Comment 12

•

4 years ago

Pushed by mcastelluccio@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/76a3adc2c6d8 Translate groups returned by bugbug in the config_groups dict too. r=ahal

Narcis Beleuzu [:NarcisB]

Comment 13

•

4 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/76a3adc2c6d8

Marco Castelluccio [:marco]

Assignee

Comment 14

•

4 years ago

Attached file Bug 1639164 - Ignore chunk number when matching task labels with configurations returned by bugbug. r=jmaher — Details

Marco Castelluccio [:marco]

Assignee

Comment 15

•

4 years ago

Attached file Bug 1639164 - Rename mock task names to prevent the number being considered as a chunk number. r=jmaher — Details
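The two patches above can be illustrated with a small sketch. It assumes that a chunk number appears as a trailing "-N" on the task label and that configurations are compared as plain strings; the labels and helper names are made up for the example and are not taken from the patches.

```python
import re
from typing import Iterable

# Assumption: a chunk number is a trailing "-<digits>" on the label.
CHUNK_SUFFIX = re.compile(r"-\d+$")


def strip_chunk(label: str) -> str:
    """Return the label without its trailing chunk number, if any."""
    return CHUNK_SUFFIX.sub("", label)


def label_matches(label: str, selected: Iterable[str]) -> bool:
    """True if the label, ignoring its chunk number, was selected."""
    return strip_chunk(label) in set(selected)


selected = {"test-linux1804-64/debug-xpcshell-e10s"}
print(label_matches("test-linux1804-64/debug-xpcshell-e10s-3", selected))  # True
print(label_matches("test-linux1804-64/opt-xpcshell-e10s-3", selected))    # False

# A mock task called "task-1" would lose its "-1" as if it were a chunk
# number, which is why the mock names in the tests had to be renamed.
print(strip_chunk("task-1"))  # task
```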

Pulsebot

Comment 16

•

4 years ago

Pushed by mcastelluccio@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7a1b37db8c74 Ignore chunk number when matching task labels with configurations returned by bugbug. r=jmaher DONTBUILD
https://hg.mozilla.org/integration/autoland/rev/fd47ecdfed71 Rename mock task names to prevent the number being considered as a chunk number. r=jmaher DONTBUILD

Natalia Csoregi [:nataliaCs]

Comment 17

•

4 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/7a1b37db8c74
https://hg.mozilla.org/mozilla-central/rev/fd47ecdfed71

Marco Castelluccio [:marco]

Assignee

Comment 18

•

4 years ago

Attached file Bug 1639164 - Add a new strategy using bugbug's config selection and a low confidence threshold. r=ahal — Details

Marco Castelluccio [:marco]

Assignee

Comment 19

•

4 years ago

Attached file Bug 1639164 - Add a shadow scheduler using the new strategy with bugbug's config selection and a low confidence threshold. r=ahal — Details

Depends on D114363

Marco Castelluccio [:marco]

Assignee

Comment 20

•

4 years ago

Attached file Bug 1639164 - Use bugbug's config selection results by default when using 'mach try auto'. r=ahal — Details

This makes the new strategy, which uses bugbug's config selection with a low confidence threshold, the default for 'mach try auto'.

Depends on D114364

Marco Castelluccio [:marco]

Assignee

Updated

•

4 years ago

Keywords: leave-open

Pulsebot

Comment 21

•

4 years ago

Pushed by mcastelluccio@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f9c73976484d Add a new strategy using bugbug's config selection and a low confidence threshold. r=ahal DONTBUILD
https://hg.mozilla.org/integration/autoland/rev/f1377ee7e2d2 Add a shadow scheduler using the new strategy with bugbug's config selection and a low confidence threshold. r=ahal DONTBUILD
https://hg.mozilla.org/integration/autoland/rev/7a4401a358e8 Use bugbug's config selection results by default when using 'mach try auto'. r=ahal DONTBUILD

Alexandru Michis [:malexandru]

Comment 22

•

4 years ago

Backed out 3 changesets (Bug 1639164) for causing python flake8 and ci failures.
Backout link: https://hg.mozilla.org/integration/autoland/rev/00b9154f4315be73495d44b18af4d4b1a1a5b5ec
Push with failures, ci failure, flake failure, cram try failure.

Flags: needinfo?(mcastelluccio)

Marco Castelluccio [:marco]

Assignee

Comment 23

•

4 years ago

(In reply to Alexandru Michis [:malexandru] from comment #22)

> Backed out 3 changesets (Bug 1639164) for causing python flake8 and ci failures.
> Backout link: https://hg.mozilla.org/integration/autoland/rev/00b9154f4315be73495d44b18af4d4b1a1a5b5ec
> Push with failures, ci failure, flake failure, cram try failure.

I fixed the flake8 and tryselect test failures, which were trivial, but I still need to fix the test_mach_try_auto test.
The test is failing because we can now possibly schedule shippable builds. We were relying on bugbug_debug_disperse to filter out all non-debug platforms (shippable included), but the config selection alternative is not restricting to debug platforms anymore.
:ahal, any idea how to fix that?

Flags: needinfo?(mcastelluccio) → needinfo?(ahal)

Marco Castelluccio [:marco]

Assignee

Comment 24

•

4 years ago

I've added a filter in https://searchfox.org/mozilla-central/rev/3151f97de27730793c2e298716df760999423f26/taskcluster/taskgraph/target_tasks.py#373 to exclude shippable tasks.
There is still one problem left: we are now scheduling some build-signing tasks because they depend on xpcshell tasks. I don't know why this wasn't the case with bugbug_debug_disperse.
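For readers without the tree at hand, the shippable filter mentioned above amounts to something like the sketch below. The `Task` class and the "shippable" attribute lookup are simplified assumptions for illustration; the real filter is the one at the searchfox link.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Task:
    # Simplified stand-in for a taskgraph task (hypothetical).
    label: str
    attributes: Dict[str, Any] = field(default_factory=dict)


def is_not_shippable(task: Task) -> bool:
    """Keep a task only if it is not marked as shippable."""
    return not task.attributes.get("shippable", False)


def try_auto_targets(tasks: List[Task]) -> List[str]:
    """Labels of the tasks that survive the shippable filter."""
    return [task.label for task in tasks if is_not_shippable(task)]


tasks = [
    Task("build-linux64/debug"),
    Task("build-linux64-shippable/opt", {"shippable": True}),
    Task("test-linux1804-64/debug-xpcshell-e10s-1", {"shippable": False}),
]
print(try_auto_targets(tasks))
# ['build-linux64/debug', 'test-linux1804-64/debug-xpcshell-e10s-1']
```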

Marco Castelluccio [:marco]

Assignee

Comment 25

•

4 years ago

The build-signing test was marked as failing until https://hg.mozilla.org/mozilla-central/rev/f07222b728fa426f56e269116c14b2c1e2987104.
I think it was passing with bugbug_debug_disperse only by luck in how the dispersion was done (and f07222b728fa426f56e269116c14b2c1e2987104 changed it). Indeed, if I increase the dispersion across more platforms, the test fails.

So, I'll just mark it as failing.

Flags: needinfo?(ahal)

Marco Castelluccio [:marco]

Assignee

Comment 26

•

4 years ago

Attached file Bug 1639164 - Filter out shippable tasks for try auto target. r=ahal — Details

Depends on D114365

Marco Castelluccio [:marco]

Assignee

Comment 27

•

4 years ago

Attached file Bug 1639164 - Mark mach try auto test that ensures no build signing task is scheduled as failing. r=ahal — Details

Depends on D114489

Pulsebot

Comment 28

•

4 years ago

Pushed by mcastelluccio@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3053350a4a94 Add a new strategy using bugbug's config selection and a low confidence threshold. r=ahal DONTBUILD
https://hg.mozilla.org/integration/autoland/rev/99b4cc4801ae Add a shadow scheduler using the new strategy with bugbug's config selection and a low confidence threshold. r=ahal DONTBUILD
https://hg.mozilla.org/integration/autoland/rev/90f0b462690c Use bugbug's config selection results by default when using 'mach try auto'. r=ahal DONTBUILD
https://hg.mozilla.org/integration/autoland/rev/fff1d447ffd1 Filter out shippable tasks for try auto target. r=ahal DONTBUILD
https://hg.mozilla.org/integration/autoland/rev/7b6be863db85 Mark mach try auto test that ensures no build signing task is scheduled as failing. r=ahal DONTBUILD

Sandor Molnar[:smolnar]

Comment 29

•

4 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/3053350a4a94
https://hg.mozilla.org/mozilla-central/rev/99b4cc4801ae
https://hg.mozilla.org/mozilla-central/rev/90f0b462690c
https://hg.mozilla.org/mozilla-central/rev/fff1d447ffd1
https://hg.mozilla.org/mozilla-central/rev/7b6be863db85

Status: ASSIGNED → RESOLVED

Closed: 4 years ago

status-firefox90: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 90 Branch

BMO Automation

Updated

•

2 years ago

Product: Firefox Build System → Developer Infrastructure
