Backfill tasks with still-to-be-built dependencies build these multiple times if multiple task runs per push are requested - only 1 dependency build needed
Categories
(Firefox Build System :: Task Configuration, defect)
Tracking
(firefox103 fixed)
| | Tracking | Status |
|---|---|---|
| firefox103 | --- | fixed |
People
(Reporter: aryx, Assigned: ahal)
Attachments
(1 file)
See the Windows instr tasks for this Treeherder view after a backfill of the Windows AArch R1 reftest with 3 runs per push had been requested.
The dependencies need to be built only once and their artifacts can be used by all reftest runs of the push. The current state would be a waste of money.
Comment 1•4 years ago
Is this a Taskcluster issue, since it would be a dependency graph problem?
Comment 2•2 years ago
This can lead to some pretty bad backlogs on resource-constrained hardware pools. The current logic seems incredibly wasteful for when perf sheriffs are doing backfills. Can we find a way to prioritize fixing this?
Assignee
Comment 3•2 years ago
Ryan, do you know of a recent example of one of these backlogs happening?
I'm wondering if the new backfill action the perftest team set up might be the cause here, assuming you've noticed it more frequently in recent weeks.
(Also, the tasks in comment 0 have expired, so it would help to have a concrete task to use for debugging.)
Comment 4•2 years ago
There was discussion about it a couple times in #sheriffs last week. I don't know how to trace that back to a specific backfill job triggered by the perf sheriffs.
Reporter
Comment 6•2 years ago
In this case the backfills were for pushes from 2 weeks ago and had been requested by performance sheriff afinder (task). The issue only causes a noticeable backlog if worker pools with a very restricted pool size are used (like macOS or Android) - see the multiple runs for Linux and Windows.
There has been no subjective increase in backlogs.
Assignee
Comment 7•2 years ago
Thanks, I found this push which I think illustrates the problem most clearly:
https://treeherder.mozilla.org/jobs?repo=autoland&group_state=expanded&revision=5f7167a8686ef3c397b9df72eca7528fdd8acadb&searchStr=osx
Looks like this happens with retriggers too, and only for perf (or at least talos) tasks.
Reporter
Comment 8•2 years ago
Retriggers should not be affected because the dependencies are already available. These are backfills which get requested to run 10 times.
Assignee
Comment 9•2 years ago
Ah, yeah I see the problem. Got confused because they didn't have -bk in the symbol.
Assignee
Comment 10•2 years ago
Previously we were looping over 'times' and submitting N separate graphs, one
per iteration. This is inefficient because it means that we also rerun
dependencies that many times.
This patch fixes this by instead setting the task_duplicates attribute on
every task we are trying to backfill, thereby ensuring the dependencies only
run once.
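The difference between the two approaches can be illustrated with a toy model. This is only a sketch under stated assumptions, not the actual taskgraph code: the graph, the task labels, and both helper functions are hypothetical, and the only name taken from the patch description is the task_duplicates attribute, whose exact handling in taskgraph is assumed here to mean "run this task that many times".

```python
from collections import Counter

# Hypothetical toy graph: task label -> labels of its dependencies.
GRAPH = {
    "build-macosx64": [],
    "talos-macosx64": ["build-macosx64"],
}


def closure(graph, targets):
    """Return the targets plus all of their transitive dependencies."""
    seen = set()
    stack = list(targets)
    while stack:
        label = stack.pop()
        if label not in seen:
            seen.add(label)
            stack.extend(graph[label])
    return seen


def runs_with_separate_graphs(graph, targets, times):
    """Old behaviour (as described): submit one full graph per iteration."""
    runs = Counter()
    for _ in range(times):
        for label in closure(graph, targets):
            runs[label] += 1
    return runs


def runs_with_task_duplicates(graph, targets, times):
    """New behaviour (assumed semantics): one graph, duplicate only the targets."""
    attributes = {label: {} for label in closure(graph, targets)}
    for label in targets:
        attributes[label]["task_duplicates"] = times
    # Tasks without the attribute run exactly once.
    return Counter(
        {label: attrs.get("task_duplicates", 1) for label, attrs in attributes.items()}
    )


old = runs_with_separate_graphs(GRAPH, ["talos-macosx64"], times=10)
new = runs_with_task_duplicates(GRAPH, ["talos-macosx64"], times=10)
print(old["build-macosx64"], old["talos-macosx64"])  # 10 10
print(new["build-macosx64"], new["talos-macosx64"])  # 1 10
```

In this model the talos task runs 10 times either way, but the build it depends on drops from 10 runs to 1, which is the saving the bug asks for.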
Comment 11•2 years ago
Pushed by ahalberstadt@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/0abfb85ba5b4 [taskgraph] Ensure backfills with 'times' only apply to desired tasks, r=gabriel,taskgraph-reviewers
Comment 12•2 years ago
bugherder