Closed Bug 1656465 Opened 4 years ago Closed 4 years ago

Consolidate all `push-interval-*` optimization strategies into one of the two backstops

Categories

(Firefox Build System :: Task Configuration, defect)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: aryx, Assigned: ahal)

Details

Attachments

(10 files)

47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review

Today, Windows debug fuzzing and AArch symbols tasks were broken on central.

During the investigation, it got noticed the fuzzing builds didn't run for the merge candidate because they exactly run every 10th job while test task in general run every 10th push or every 2h, whatever applies first.

This means code which breaks builds can be merged to central.

Quoting decoder: "we need to ensure that merges for fuzzing builds are always green. that was the initial requirement for adding those builds to the 10th push list".

There are two options:

  • we stop scheduling everything every 2 hours, and only rely on push interval;
  • we drop the "push-interval-10", "push-interval-20" and "push-interval-25" strategies and use alternative strategies that take into account the time interval too.

I think option 2 is best. Having half backstop and full backstop, where half backstop is guaranteed green, and full backstop is more tier-2 jobs, makes the most sense. If we do this, the (half|full) backstop will have perf, so we would run shippable builds there (which then makes backfilling a regression require shippable builds which will take hours) which will duplicate cost or add some confusion.

If we did this we would need some time based interval for half|full, maybe half=2hours|10pushes, full=4hours|20pushes ?

Assignee: nobody → mcastelluccio
Status: NEW → ASSIGNED

I've submitted a patch to solve the problem for fuzzing specifically, but the generic problem still applies.

Yet another option would be to have a "merge candidate" indicator on Treeherder for pushes for which we are sure everything was run and there are no failures.

Assignee: mcastelluccio → nobody
Status: ASSIGNED → NEW
Keywords: leave-open

(In reply to Marco Castelluccio [:marco] from comment #4)

Yet another option would be to have a "merge candidate" indicator on Treeherder for pushes for which we are sure everything was run and there are no failures.

I filed bug 1657921 for that.

Assignee: nobody → mcastelluccio
Status: NEW → ASSIGNED

My understanding is that this happens to any task that has a push-interval-* optimization strategy. After some discussion on Matrix, I think we'd like to deprecate all of those strategies and force all the tasks that use them to use either the optimized-backstop (every 10th push or 2 hours) or the full-backstop (every 20th push or 2 hours). Other intervals should be disallowed.

Adjusting the title here to reflect, please tweak if I got something wrong.

Summary: tier 1 backstop pushes need to run on same pushes like full test set → Consolidate all `push-interval-*` optimization strategies into one of the two backstops

(In reply to Andrew Halberstadt [:ahal] from comment #6)

My understanding is that this happens to any task that has a push-interval-* optimization strategy. After some discussion on Matrix, I think we'd like to deprecate all of those strategies and force all the tasks that use them to use either the optimized-backstop (every 10th push or 2 hours) or the full-backstop (every 20th push or 2 hours). Other intervals should be disallowed.

Adjusting the title here to reflect, please tweak if I got something wrong.

Agreed! Option 2 from comment 1.

I only submitted a patch to fix fuzzing, as it's probably the most important one (fuzzing needs green mozilla-central builds), but I won't be able to fix the rest of the builds any time soon, so I'm unassigning myself.

Assignee: mcastelluccio → nobody
Status: ASSIGNED → NEW
Pushed by mcastelluccio@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/cd425862bc50 Run fuzzing builds on backstop pushes. r=ahal DONTBUILD
Assignee: nobody → mcastelluccio
Status: NEW → ASSIGNED
Assignee: mcastelluccio → nobody
Status: ASSIGNED → NEW
Assignee: nobody → ahal
Status: NEW → ASSIGNED

Under this model, each test file should correspond to a static set of parameters. Each
set of parameters will only ever generate the taskgraph a single time (so adding a new
file will have a perf hit, but adding new test functions to an existing file will not).

Each file represents a new taskgraph generation in the tests. But now that these exist, we
can add new assertions to the existing files without worry.

Depends on D89054

We want to try to align 'push-interval' tasks to the 'backstop'. This way
we have greater confidence in our backstop pushes, and it will allow us to
simplify a lot of our backstop logic.

Depends on D89055

Pushed by ahalberstadt@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/a06084e5fb5f [ci] Move fixtures from test_mach_try_auto.py to conftest to share with new tests, r=taskgraph-reviewers,aki https://hg.mozilla.org/integration/autoland/rev/4f3292b37d8a [ci] Add some taskgraph integration tests to ensure fuzzing builds run where we expect them to, r=taskgraph-reviewers,aki https://hg.mozilla.org/integration/autoland/rev/2e30486edbb3 Move all 'push-interval-25' optimizations to 'push-interval-20', r=jmaher

The intent of a "backstop" push, is to run everything so we can be absolutely sure that
the push in question does not cause any regressions.

Previously, backstops were thought to be only something that ran on autoland.
This was because the other branches already ran everything so the concept of
a "backstop" didn't make much sense.

But going by the above definition, it would make more sense to say that every
non autoland (or try) push is a backstop. Since the intent there is to run
everything to avoid regressions.

This change will allow us to simplify our optimization algorithms.

This still assumes that the bugbug-based strategy is last however.

Depends on D89499

This is a nomenclature change + refactoring. Now there is only a single
"backstop" push. Which is currently set to every 20th push on autoland (or
every push on non-autoland branches).

Now there is also a concept of an "expanded" push. These are pushes that run
more stuff than usual, but not as much as a backstop normally would. These are
currently set to run at half the interval of a backstop.

Concretely, here are the strategy changes:

  • Introduced the 'expanded' strategy which has 'backstop' baked in
  • Renamed 'optimized-backstop' -> 'test-expanded'
  • Baked 'expanded' strategy into the 'test-expanded' strategy (therefore 'backstop' is also baked in)
  • Baked 'test-expanded' strategy into the 'test' strategy

Depends on D89500

This removes the last uses of the 'push-interval-10' and 'push-interval-20' strategies.
They are being removed because they are dangerous in that its easy to accidentally not run
tasks when they should.

Instead, task authors should decide whether they want their tasks to run on
"backstop" pushes (run everything) or "expanded" pushes (run more than usual,
but still not as much as a backstop). Note that using "expanded" means the task
will also run on backstop pushes. It'll just additionally run on "expanded"
pushes.

In practice 'backstop' pushes will be every 20th push and 'expanded' pushes
will be every 10th push. Though this may vary due to the time component in
backstops.

Depends on D89501

In the past, the 'backstop' optimization was applied to tasks by default across
all projects, even though it only really made sense on autoland. To choose what
would happen on non-autoland branches, we invented this 'remove_on_projects'
concept.

These days, we only apply the backstop optimization in the first place for
autoland. So 'remove_on_projects' is no longer necessary.

Depends on D88149

Attachment #9174503 - Attachment description: Bug 1656465 - [taskgraph] Consider all pushes to release branches as 'backstops', marco → Bug 1656465 - [taskgraph] Consider all pushes to release branches as 'backstops', r?marco

It turns out that 'Not' is needed to negate "backstops". E.g, we normally
we want to use a pattern like so:

All("skip-unless-backstop", "test")

Since 'skip-unless-backstop' returns False on backstop pushes, it disables
the test strategy there.

However, suppose we wanted to run a special optimization, only on backstop
pushes. I.e, the opposite of the above example. Then we need to use:

All(Not("skip-unless-backstop"), "test-backstop")

Depends on D89500

Pushed by ahalberstadt@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/64d8b17b920e [taskgraph] Consider all pushes to release branches as 'backstops', r=marco https://hg.mozilla.org/integration/autoland/rev/fe9bb34b3bae [taskgraph.optimize] Allow 'split_bugbug_arg' to work with arbitrary number of substrategies, r=marco https://hg.mozilla.org/integration/autoland/rev/cf06909a30d8 [taskgraph.optimize] Implement a 'Not' composite strategy, r=marco https://hg.mozilla.org/integration/autoland/rev/1020ae833cdb [taskgraph.optimize] Refactor "optimized-backstop" pushes into "expanded" pushes, r=marco https://hg.mozilla.org/integration/autoland/rev/e4f78c4b8a81 [taskgraph.optimize] Rename 'push-interval-{10,20}' strategies to 'expanded' and 'backstop' respectively, r=marco https://hg.mozilla.org/integration/autoland/rev/d8609e8ba6b0 Drop the 'remove_on_projects' feature from the Backstop optimization, r=marco
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Keywords: leave-open
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: