Closed Bug 1636271 Opened 7 months ago Closed 4 months ago

Optimize by replacement on mozilla-central pushes which are not merges

Categories

(Firefox Build System :: Task Configuration, enhancement)

Tracking

(firefox81 fixed)

RESOLVED FIXED
81 Branch

People

(Reporter: marco, Assigned: ahal)

References

(Blocks 1 open bug)

Attachments

(6 files)

When a mozilla-central push contains only commits that were already pushed to autoland, with no merges (which appears to be the most frequent case), we could optimize by replacement: reuse the tasks that already ran on the autoland push corresponding to the latest commit in the mozilla-central push.

For example, the mozilla-central push "52cf30bf74473e78c0baf8c4d6055910f67a0a37" only contains commits from autoland up to "52cf30bf74473e78c0baf8c4d6055910f67a0a37".
So any task that ran in the autoland push on 52cf30bf74473e78c0baf8c4d6055910f67a0a37 could be used to optimize by replacement in the corresponding mozilla-central push.
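
A rough sketch of the idea in Python, assuming we already know which tasks ran on each autoland push, keyed by that push's head revision. The function name, the dict shapes, the task labels, and the task ID are all hypothetical and are not taskgraph APIs.

```python
def find_replacements(central_head, autoland_pushes, central_labels):
    """Map mozilla-central task labels to task IDs that already ran on autoland.

    autoland_pushes maps an autoland push's head revision to a dict of
    {task label: task ID} for the tasks that ran on that push (assumed shape).
    """
    autoland_tasks = autoland_pushes.get(central_head)
    if autoland_tasks is None:
        # No autoland push ends at this revision (e.g. a merge or a direct landing).
        return {}
    return {
        label: autoland_tasks[label]
        for label in central_labels
        if label in autoland_tasks
    }


rev = "52cf30bf74473e78c0baf8c4d6055910f67a0a37"
autoland_pushes = {rev: {"test-linux64/opt-mochitest-1": "taskid-A"}}
central_labels = ["test-linux64/opt-mochitest-1", "build-win64-aarch64/opt"]
replacements = find_replacements(rev, autoland_pushes, central_labels)
print(replacements)
```

Tasks with no counterpart on the autoland push are simply left out of the result, so they would still run on mozilla-central.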

This would be easier if we had tier2/tier3 tasks that never ran on autoland :)

Component: General → Task Configuration
Product: Testing → Firefox Build System
Version: Version 3 → unspecified

(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #1)

This would be easier if we had tier2/tier3 tasks that never ran on autoland :)

It shouldn't be a problem: on mozilla-central, we'd only optimize by replacement the tasks that also ran on the corresponding autoland push.

This is a great idea. There was discussion on #ci-cost-reduction on Matrix about moving m-c only tasks over to autoland and then only running them on backstop pushes (or even longer intervals).

So to save all the duplicated resources on m-c, we have two approaches:

Approach A:

  1. Fix this bug
  2. Move m-c only tasks over to autoland (optional)

Approach B:

  1. WONTFIX this bug
  2. Move m-c only tasks over to autoland
  3. Don't run anything on m-c except nightly builds

I like Approach A better, as it will be less disruptive from a sheriffing standpoint and will still run tasks that somehow fell through the cracks on autoland. However, if this bug ends up being too difficult, we have Approach B in our back pocket.

Also, we land patches directly on m-c, which makes Approach A a bit more of a win. I would like to stop landing patches on m-c, but that is yet another problem to solve.

I started poking around here, assigning to myself for now at least.

Assignee: nobody → ahal
Status: NEW → ASSIGNED

There are a number of things to be careful of when implementing this (though I don't think any of these apply if we limit to tests):

  • we build with a different update channel on central vs. autoland (there may be other configuration changes)
  • we encode the commit the build was built from; I suspect we want to reference m-c for things we ship
  • if we want to implement something like Bug 1637544 on autoland (which would probably be a reasonably big win) we definitely don't want to be shipping builds built like that from autoland

Yes, I think anything needed to build a nightly (or anything else we ship) will still need to come from m-c. Was planning to limit this to tests to start, but maybe non-shippable builds could also use it later on. We'll need to see if there are any other differences.
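
As a sketch of that restriction, assuming each task is a plain dict with a `kind` and an optional `shippable` flag (both key names are made up for illustration, not the real task schema):

```python
def eligible_for_replacement(task, allow_builds=False):
    """Decide whether a mozilla-central task may reuse an autoland result."""
    if task["kind"] == "test":
        # Starting point: only tests are replaced.
        return True
    if allow_builds and task["kind"] == "build":
        # Possible later extension: builds, but never ones we ship.
        return not task.get("shippable", False)
    return False


tasks = [
    {"label": "test-a", "kind": "test"},
    {"label": "build-nightly", "kind": "build", "shippable": True},
    {"label": "build-debug", "kind": "build"},
]
tests_only = [t["label"] for t in tasks if eligible_for_replacement(t)]
with_builds = [t["label"] for t in tasks if eligible_for_replacement(t, allow_builds=True)]
```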

I can think of a few things:

  1. ccov builds/tests
  2. win/aarch64
  3. fenix/reference-browser android perf tests
  4. conditioned profile generation runs for perf tests
  5. wpt-backlog jobs
  6. probably other asan/tsan stuff, and some unique builds/tests

See Also: → 1638316

Bug 1639383 is another difference, hopefully we can make it go away.

See Also: → 1639383

This is a fairly specific use case. Let's just return the TC response directly
and let consumers use it as they see fit.
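
A minimal sketch of what such a general-purpose helper could look like. This is not the code from the patch: `fetch_page` is a hypothetical callable standing in for the HTTP request, and the response shape (`tasks`, `continuationToken`, `status.taskId`, `status.state`) is an assumption modeled on the Taskcluster queue's task group listing.

```python
def list_task_group_tasks(task_group_id, fetch_page):
    """Yield every task entry in a task group, following pagination.

    fetch_page(task_group_id, continuation_token) is a hypothetical callable
    that returns one decoded response page as a dict.
    """
    token = None
    while True:
        page = fetch_page(task_group_id, token)
        # Hand back each entry untouched so callers can filter as they see fit.
        for task in page.get("tasks", []):
            yield task
        token = page.get("continuationToken")
        if not token:
            break


# Fake two-page response standing in for the real service.
pages = {
    None: {
        "tasks": [{"status": {"taskId": "a", "state": "completed"}}],
        "continuationToken": "p2",
    },
    "p2": {"tasks": [{"status": {"taskId": "b", "state": "running"}}]},
}
tasks = list(list_task_group_tasks("group-1", lambda group, token: pages[token]))

# The old "incomplete tasks" behaviour becomes a filter on the caller's side.
incomplete = [t["status"]["taskId"] for t in tasks if t["status"]["state"] != "completed"]
```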

Attachment #9155339 - Attachment description: Bug 1636271 - [taskgraph] Make 'list_task_group_incomplete_tasks' more general purpose, r?tomprince → Bug 1636271 - [taskgraph] Create utility function for listing all tasks in a task group, r?tomprince

This creates a new set of optimization strategies
(taskgraph.optimize:project.autoland) to use with autoland. Among other things,
it also means there's no need for the 'test-try' optimization as the autoland
strategies are no longer the default behaviour.
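
The `module:attribute` form of `taskgraph.optimize:project.autoland` suggests a lookup along the following lines. This is only a sketch: `resolve_object` is a hypothetical helper, not necessarily how taskgraph resolves the parameter, and it is demonstrated against the standard library rather than taskgraph itself.

```python
import importlib
import os
from functools import reduce


def resolve_object(path):
    """Resolve a 'module:attr.subattr' string to the object it names."""
    module_name, _, attr_path = path.partition(":")
    obj = importlib.import_module(module_name)
    if attr_path:
        # Walk the dotted attribute path starting from the imported module.
        obj = reduce(getattr, attr_path.split("."), obj)
    return obj


join = resolve_object("os:path.join")
print(join is os.path.join)
```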

Depends on D79704

Pushed by ahalberstadt@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d3f93bd6b1f9
[taskgraph] Create utility function for listing all tasks in a task group, r=tomprince
https://hg.mozilla.org/integration/autoland/rev/4250f49877ba
[taskgraph] Move 'optimize-strategies' from try_task_config.json to a parameter, r=tomprince
https://hg.mozilla.org/integration/autoland/rev/4b0f13fcf941
[taskgraph] Set autoland optimizations via per-project parameter, r=tomprince

The above push passed with mach try auto but causes decision task failures on actual autoland. Not clear to me what the problem is, will need to investigate further tomorrow.

(In reply to Andrew Halberstadt [:ahal] from comment #18)

The above push passed with mach try auto but causes decision task failures on actual autoland. Not clear to me what the problem is, will need to investigate further tomorrow.

Forgive the fly by, but I've been poking around related code so I thought I'd have a look.

I'm not certain, but I suspect that because autoland currently uses the default target tasks, it may end up as None at https://searchfox.org/mozilla-central/source/taskcluster/taskgraph/decision.py#383. It's part of the options for https://searchfox.org/mozilla-central/source/taskcluster/mach_commands.py#173, and it probably ends up with a value of None and gets set by https://searchfox.org/mozilla-central/source/taskcluster/taskgraph/decision.py#291.

Ah yes, in hindsight the fact that every other entry in PER_PROJECT_PARAMETERS explicitly sets target_tasks_method should have been a clue.
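
A simplified model of the suspected failure, assuming unset options arrive as None and that projects without an entry get a fallback. The table contents and `build_parameters` are illustrative only and are not the real decision.py code.

```python
# Illustrative table; entries and values are assumptions, not the real one.
PER_PROJECT_PARAMETERS = {
    "mozilla-central": {"target_tasks_method": "default"},
    # New entry that only sets strategies and forgets target_tasks_method.
    "autoland": {"optimize_strategies": "taskgraph.optimize:project.autoland"},
}


def build_parameters(project, options, table=PER_PROJECT_PARAMETERS):
    # Unset command-line options arrive as None.
    parameters = {"target_tasks_method": options.get("target_tasks_method")}
    if project in table:
        parameters.update(table[project])
    else:
        # Projects without an entry fall back to the default method.
        parameters["target_tasks_method"] = "default"
    return parameters


broken = build_parameters("autoland", {})

# Fix: set target_tasks_method explicitly, like every other entry does.
fixed_table = dict(PER_PROJECT_PARAMETERS)
fixed_table["autoland"] = dict(fixed_table["autoland"], target_tasks_method="default")
fixed = build_parameters("autoland", {}, fixed_table)
```

In this model, adding the autoland entry is what removes the fallback, which would explain why the same change passed on try.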

Flags: needinfo?(ahal)

Not sure how to test this, as I'd need to fake the "project" value on try (and am unsure how to run the decision.py code path locally). That said, it seems pretty obvious this is the issue, so I'm comfortable re-landing.

Pushed by ahalberstadt@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7adf02fc1803
[taskgraph] Create utility function for listing all tasks in a task group, r=tomprince
https://hg.mozilla.org/integration/autoland/rev/6c1e36ffa940
[taskgraph] Move 'optimize-strategies' from try_task_config.json to a parameter, r=tomprince
https://hg.mozilla.org/integration/autoland/rev/06b2344e94dc
[taskgraph] Set autoland optimizations via per-project parameter, r=tomprince

Pushed by btara@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ce1870c92534
Fix flake8 issue arising from merge conflict CLOSED TREE