Closed Bug 1639164 Opened 4 years ago Closed 3 years ago

"mach try auto" should select the best platforms to run manifests on

Categories

(Developer Infrastructure :: Try, enhancement)

Tracking

(firefox90 fixed)

RESOLVED FIXED
90 Branch
Tracking Status
firefox90 --- fixed

People

(Reporter: marco, Assigned: marco)

References

(Blocks 1 open bug)

Attachments

(10 files)


Currently, the bugbug reduction strategy doesn't care about the number of platforms selected (since on autoland we have all of them anyway) but just about the "cost" of the tasks.

For try, we should take into account not only the cost, but also the number of selected platforms.

Assignee: nobody → mcastelluccio
Status: NEW → ASSIGNED

We are switching to bugbug_disperse in bug 1638945 (which suffers from the same problem of scheduling too many platforms). There is some work needed in bugbug to support selecting platforms for the manifest-level model.

See Also: → 1642985

I'm not convinced the model should optimize for this. Builds are relatively cheap (and getting cheaper). Plus getting a bit of coverage on more platforms would likely find more regressions than limiting coverage to a few.

(In reply to Andrew Halberstadt [:ahal] from comment #2)

> I'm not convinced the model should optimize for this. Builds are relatively cheap (and getting cheaper). Plus getting a bit of coverage on more platforms would likely find more regressions than limiting coverage to a few.

We can tune it however we want: e.g. run a manifest on a single configuration only if we are absolutely sure it is configuration-independent, and run it on multiple configurations if we know it isn't or if we don't know yet (in which case we can select the configurations so as to collect more information for the future). Also, we can make sure that, if we select the same manifest across multiple pushes, it runs on a different configuration on each push.

The objective is not only to reduce the number of builds, but mainly to reduce the number of test executions for tests that are likely platform-independent. E.g. if you have 30 platform-independent manifests and run them on 3 platforms, you needlessly increase the total runtime by 3x.
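
To make the policy above concrete, here is a minimal sketch. It is not bugbug's actual implementation: `select_configs`, `total_runs`, and the `independent` mapping are hypothetical names introduced only for illustration, and the rotation is a simple round-robin on an assumed push counter.

```python
from typing import Dict, List, Optional


def select_configs(
    manifest: str,
    all_configs: List[str],
    independent: Dict[str, Optional[bool]],
    push_index: int = 0,
) -> List[str]:
    """Pick the configurations on which to run one manifest.

    independent[manifest] is True when the manifest is known to be
    configuration-independent, False when it is known not to be, and
    None (or missing) when we don't know yet. Hypothetical data shape.
    """
    if not all_configs:
        return []
    if independent.get(manifest) is True:
        # Known independent: one configuration is enough. Rotate by push so
        # that repeated selections land on a different configuration.
        return [all_configs[push_index % len(all_configs)]]
    # Known dependent or unknown: keep coverage on every configuration.
    return list(all_configs)


def total_runs(
    manifests: List[str],
    all_configs: List[str],
    independent: Dict[str, Optional[bool]],
    push_index: int = 0,
) -> int:
    """Count manifest executions scheduled for a single push."""
    return sum(
        len(select_configs(m, all_configs, independent, push_index))
        for m in manifests
    )
```

With 30 manifests and 3 configurations, `total_runs` gives 90 executions when nothing is known about the manifests and 30 when all of them are marked independent, which is the 3x difference mentioned above.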

Then, we'll also be able to use this work for bug 1637555.

Summary: "mach try auto" should select the smallest number of platforms → "mach try auto" should select the best platforms to run manifests on
See Also: → 1648724
Pushed by mcastelluccio@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6431da98cb37
Add an option to the BugbugPushSchedules optimization strategy to select configurations on which to run manifests based on bugbug decisions. r=ahal
https://hg.mozilla.org/integration/autoland/rev/07ff82216f70
Define a shadow scheduler that uses bugbug's platform selection. r=ahal

Adding "leave-open" until we switch "mach try auto" to use the new strategy (which might require some tuning first).

Keywords: leave-open

Otherwise there is a mismatch between the group names in "groups" and "config_groups".

Pushed by mcastelluccio@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/76a3adc2c6d8
Translate groups returned by bugbug in the config_groups dict too. r=ahal
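
A rough sketch of the idea behind this fix, for readers who don't want to open the patch. The response shape is an assumption (both "groups" and "config_groups" are taken to be dicts keyed by group name), and `translate_response` and the `translate` callable are hypothetical; the real translation lives in the tree.

```python
from typing import Any, Callable, Dict


def translate_keys(
    mapping: Dict[str, Any], translate: Callable[[str], str]
) -> Dict[str, Any]:
    """Return a copy of mapping with every key passed through translate."""
    return {translate(key): value for key, value in mapping.items()}


def translate_response(
    data: Dict[str, Any], translate: Callable[[str], str]
) -> Dict[str, Any]:
    """Apply the same group-name translation to both dicts.

    Translating only "groups" would leave "config_groups" keyed by the
    untranslated names, so lookups between the two would miss.
    """
    out = dict(data)  # shallow copy; the input is left untouched
    out["groups"] = translate_keys(data["groups"], translate)
    if "config_groups" in data:
        out["config_groups"] = translate_keys(data["config_groups"], translate)
    return out
```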
Pushed by mcastelluccio@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7a1b37db8c74
Ignore chunk number when matching task labels with configurations returned by bugbug. r=jmaher DONTBUILD
https://hg.mozilla.org/integration/autoland/rev/fd47ecdfed71
Rename mock task names to prevent the number being considered as a chunk number. r=jmaher DONTBUILD
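
For illustration only, the two changes above can be sketched as follows. This assumes a chunk number appears as a trailing "-<digits>" on the task label; the regex and the `strip_chunk` helper are hypothetical and not the code that landed.

```python
import re

# Assumed convention: a chunked task label ends in "-<chunk number>".
_CHUNK_SUFFIX = re.compile(r"-\d+$")


def strip_chunk(label: str) -> str:
    """Drop a trailing chunk number so all chunks of a task compare equal."""
    return _CHUNK_SUFFIX.sub("", label)


def same_task_ignoring_chunk(label_a: str, label_b: str) -> bool:
    """True when two labels differ at most by their chunk number."""
    return strip_chunk(label_a) == strip_chunk(label_b)
```

The same rule explains the mock rename: under this assumption a mock label such as "task-1" would be reduced to "task", so mock names that end in a number collide unless they are renamed to something like "task-a".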

By making the new strategy with bugbug's config selection and a low confidence threshold the default for 'mach try auto'.

Depends on D114364

Keywords: leave-open
Pushed by mcastelluccio@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f9c73976484d
Add a new strategy using bugbug's config selection and a low confidence threshold. r=ahal DONTBUILD
https://hg.mozilla.org/integration/autoland/rev/f1377ee7e2d2
Add a shadow scheduler using the new strategy with bugbug's config selection and a low confidence threshold. r=ahal DONTBUILD
https://hg.mozilla.org/integration/autoland/rev/7a4401a358e8
Use bugbug's config selection results by default when using 'mach try auto'. r=ahal DONTBUILD
Flags: needinfo?(mcastelluccio)

(In reply to Alexandru Michis [:malexandru] from comment #22)

> Backed out 3 changesets (Bug 1639164) for causing python flake8 and ci failures.
> Backout link: https://hg.mozilla.org/integration/autoland/rev/00b9154f4315be73495d44b18af4d4b1a1a5b5ec
> Push with failures, ci failure, flake failure, cram try failure.

I fixed the flake8 and tryselect test failures, which were trivial, but I still need to fix the test_mach_try_auto test.
The test is failing because we can now possibly schedule shippable builds. We were relying on bugbug_debug_disperse to filter out all non-debug platforms (shippable included), but the config selection alternative is not restricting to debug platforms anymore.
:ahal, any idea how to fix that?

Flags: needinfo?(mcastelluccio) → needinfo?(ahal)

I've added a filter in https://searchfox.org/mozilla-central/rev/3151f97de27730793c2e298716df760999423f26/taskcluster/taskgraph/target_tasks.py#373 to exclude shippable tasks.
There is still one problem left: we are now scheduling some build-signing tasks because they depend on xpcshell tasks. IDK why this wasn't the case with bugbug_debug_disperse.
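
The shippable filter mentioned above amounts to something like the sketch below. It assumes each task carries an attributes dict with a boolean "shippable" entry; the `Task` class and function names here are stand-ins for illustration, not the taskgraph code at the linked revision.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Task:
    """Stand-in for a task: a label plus free-form attributes."""
    label: str
    attributes: Dict[str, object] = field(default_factory=dict)


def is_not_shippable(task: Task) -> bool:
    """Keep a task unless it is explicitly marked as shippable."""
    return not task.attributes.get("shippable", False)


def filter_try_auto(tasks: List[Task]) -> List[Task]:
    """Drop shippable tasks from the candidate set for the try auto target."""
    return [task for task in tasks if is_not_shippable(task)]
```

Note that a filter like this only removes tasks that are themselves shippable. It does nothing about tasks pulled in through dependencies, which is the separate build-signing problem described above.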

The build-signing test was marked as failing until https://hg.mozilla.org/mozilla-central/rev/f07222b728fa426f56e269116c14b2c1e2987104.
I think it is passing with bugbug_debug_disperse just by luck in the way the dispersion was done (and f07222b728fa426f56e269116c14b2c1e2987104 changed it). Indeed, if I increase the dispersion across more platforms, the test fails.

So, I'll just mark it as failing.

Flags: needinfo?(ahal)
Pushed by mcastelluccio@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3053350a4a94
Add a new strategy using bugbug's config selection and a low confidence threshold. r=ahal DONTBUILD
https://hg.mozilla.org/integration/autoland/rev/99b4cc4801ae
Add a shadow scheduler using the new strategy with bugbug's config selection and a low confidence threshold. r=ahal DONTBUILD
https://hg.mozilla.org/integration/autoland/rev/90f0b462690c
Use bugbug's config selection results by default when using 'mach try auto'. r=ahal DONTBUILD
https://hg.mozilla.org/integration/autoland/rev/fff1d447ffd1
Filter out shippable tasks for try auto target. r=ahal DONTBUILD
https://hg.mozilla.org/integration/autoland/rev/7b6be863db85
Mark mach try auto test that ensures no build signing task is scheduled as failing. r=ahal DONTBUILD
Product: Firefox Build System → Developer Infrastructure