Additional filters for mach try task blacklist to filter tasks unless --full is specified
Categories
(Developer Infrastructure :: Try, enhancement)
Tracking
(firefox77 fixed)
Tracking | Status | |
---|---|---|
firefox77 | --- | fixed |
People
(Reporter: egao, Assigned: egao)
References
Details
Attachments
(1 file)
I am proposing to insert additional filters into the blacklist that is applied to mach try
pushes, as implemented in Bug 1630350 and Bug 1627340.
For the most part, these tasks must intentionally be chosen for mach try fuzzy
.
However, if a user pushes with mach try chooser
+ select all, or mach try syntax
+ -b do
, these tasks are inadvertently scheduled quite easily.
Since in Bug 1630350 it was intentionally made such that mach try syntax
does not have a --full
option to bypass filtering, this means users intending to run tasks filtered by the blacklist would be directed to use the alternatives.
Proposed list:
build.*aarch64.*
build*.gcp.*
build-linux/opt
.*-rusttests
.*valgrind.*
Assignee | ||
Comment 1•4 years ago
|
||
:jmaher, :gbrown - I'd like to hear your thoughts about adding the above list (and possibly more) to the task filter blacklist to be applied by default to the mach selectors.
I was particularly annoyed that selecting syntax -b do -p all
would schedule builds for all those items, including the ones we have restricted from running like linux32.
I suspect if I run -u all
I would see a lot of unittests get scheduled, which may be fine additions to the list.
Comment 2•4 years ago
|
||
I think this is a great idea. A lot of these are things that we are already looking at restricting to every 10th push or as tier-2, so this is right in line. Also the plain builds might be good to add, same with mingw32 :)
Assignee | ||
Comment 3•4 years ago
|
||
Changes:
Add more filters to be applied by default to all mach try selectors.
In order to access tasks hidden by the filters, please use --full
with fuzzy/chooser selectors.
Updated•4 years ago
|
Updated•4 years ago
|
Comment 4•4 years ago
|
||
From a task efficiency / cost reduction perspective, this looks great.
On the other hand, a reasonable use of syntax -all is "I'm trying to land a tricky change that has unknown side effects and I want to see if my change breaks anything that runs on autoland or central". Anything that's left out is a potential surprise that will require a backout...I wonder if this change is going too far. But I'm not sure: What are the chances of breaking aarch64 builds, rust tests, valgrind, etc? Hard to say, I'm sure.
Comment 5•4 years ago
•
|
||
We should keep in mind that every task we add to this list makes developers lives harder because it increases the chances of them being backed out over a failure that they had no chance of detecting on try. I've fielded many complaints about this happening and developers always find it extremely frustrating.
I think there's a risk that if we add too much stuff to this list, it will actually have the opposite effect.. we could see it increase cost. This is because developers grow frustrated and will use --full
more often to select the tasks that caused them to get backed out last time. As a result, they'll end up over-selecting.
Historically, we've tried to limit this list to platforms that are resource constrained (like android-hw
) or things that are very expensive to run (like ccov
). But over time it has started to grow larger. I question whether all of the things in the list currently should be, and think in general we should keep the list as small as possible. My personal opinion is that it is already too big.
All of this isn't to say that maybe some of your suggestions actually do belong on the list. I haven't thought about your particular examples.
Comment 6•4 years ago
|
||
Philosophically speaking, the goal of autoland/central is to catch regressions. The goal of try is to help developers not get backed out. So if we think these tasks aren't providing good value, the conversation should be around removing them from autoland/central. The tasks available on try should just mirror those as closely as possible.
Comment 7•4 years ago
•
|
||
In my opinion being backed out is not a bad thing, and that people should expect to be backed out some percentage of the time, and that if they are running a bunch of tests to ensure that they are not being backed out, then they are likely running too many tests. I'm fairly certain that some of the optimizations and things that have been added to try syntax where added because people where running -t all
more than necessary.
Comment 8•4 years ago
|
||
What about pointing people at a defined preset suite like mach try --preset sample-suites
instead of using -t all
if they want nearly complete testing before landing on autoland?
Assignee | ||
Comment 9•4 years ago
•
|
||
(In reply to Andrew Halberstadt [:ahal] from comment #5)
We should keep in mind that every task we add to this list makes developers lives harder because it increases the chances of them being backed out over a failure that they had no chance of detecting on try. I've fielded many complaints about this happening and developers always find it extremely frustrating.
I understand their frustration, but I would like to point out that --full
flag exists to precisely permit developers to access the unfiltered list.
The problem I'm trying to address by adding to the blacklist is to prevent the situation where a user select all with mach try chooser
or mach try fuzzy
. Even worse, for mach try syntax
the user will in many cases select all
for a lot of things because figuring out what syntax works is too long.
Things like build-.*gcp
or valgrind
are low-value tasks whose consumer isn't even well defined yet, let alone exist in some cases.
I'm adding predominantly build tasks here, because this loops back to the topic discussed in the Smart Scheduling meeting this (2020/04/23) morning, that most of the build tasks aren't even clear why they run, who consumes them, or have any tests using the build as dependency.
Those are the situations I'm trying to catch with this extended blacklist. They aren't made permanently unavailable; just removed from the regular view unless developers explicitly know that they want to schedule such tasks.
Assignee | ||
Updated•4 years ago
|
Comment 10•4 years ago
|
||
most if not all the filters here are for jobs that run on limited hardware, only run on m-c as tier-2, or could be tier-2/low risk.
it does get risky when we have a lot of stuff in a blacklist in one part of code, and other filters elsewhere. Ideally all of these tasks would be tier2 or tier3 and we would only schedule tier1 by default without extra flags.
Comment 11•4 years ago
|
||
(In reply to Edwin Takahashi (:egao) from comment #9)
I understand their frustration, but I would like to point out that
--full
flag exists to precisely permit developers to access the unfiltered list.
True, but the hope is that using --full
is a rare edge case that is only there for power users who know what they are doing. The risk comes with devs developing a habit of using it even when not necessary.
The problem I'm trying to address by adding to the blacklist is to prevent the situation where a user select all with
mach try chooser
ormach try fuzzy
. Even worse, formach try syntax
the user will in many cases selectall
for a lot of things because figuring out what syntax works is too long.
Yes, this is a problem, and yes your solution would help improve it. Another way to solve this is to remove the all
alias to ./mach try syntax
and put a hard cap on the number of tasks in ./mach try fuzzy
. This will also annoy developers, but I think it might be more palatable for them.
Things like
build-.*gcp
orvalgrind
are low-value tasks whose consumer isn't even well defined yet, let alone exist in some cases.
Yes those might be good candidates for the list.
I'm adding predominantly build tasks here, because this loops back to the topic discussed in the Smart Scheduling meeting this (2020/04/23) morning, that most of the build tasks aren't even clear why they run, who consumes them, or have any tests using the build as dependency.
A task that is unclear / has no ownership will still cause a backout just the same as well maintained one. If it is truly unmaintained, it shouldn't be running on mozilla-central in the first place.
Those are the situations I'm trying to catch with this extended blacklist. They aren't made permanently unavailable; just removed from the regular view unless developers explicitly know that they want to schedule such tasks.
Understood. I'm not saying the list should be empty (valgrind
seems like a prime candidate for it for instance). I'm just pointing out the negative side of diverging try from mozilla-central. I don't know how to weigh the cost in CI against the cost in developer salary / morale, we don't have data for it.
Comment 12•4 years ago
|
||
(In reply to Tom Prince [:tomprince] from comment #7)
In my opinion being backed out is not a bad thing, and that people should expect to be backed out some percentage of the time, and that if they are running a bunch of tests to ensure that they are not being backed out, then they are likely running too many tests. I'm fairly certain that some of the optimizations and things that have been added to try syntax where added because people where running
-t all
more than necessary.
I actually strongly agree with this. But when you get backed out and then later find that the task you got backed out from can't even be run on try by default, it can be very frustrating. It's psychological more than anything.
Alternate solutions I'd prefer to this blacklist include removing all
from try syntax and putting a hard cap on try_task_config.json
tasks. Basically either force people to be thoughtful, or if not, then to use ./mach try auto
.
Assignee | ||
Comment 13•4 years ago
|
||
(In reply to Andrew Halberstadt [:ahal] from comment #12)
Alternate solutions I'd prefer to this blacklist include removing
all
from try syntax and putting a hard cap ontry_task_config.json
tasks. Basically either force people to be thoughtful, or if not, then to use./mach try auto
.
That's a really good alternate solution.
I'll spend some time reviewing what each of the existing blacklists catch, consider any of them for removal and implement your suggestions.
Comment 14•4 years ago
|
||
It's worth pointing out that in the past, we've sort of hesitated to implement a cap because we didn't have a good answer to the question to "How do I test a large / risky patch?". Once ./mach try auto is a bit better, we'll have that answer.
Comment 15•4 years ago
|
||
A cap on the number of tasks less than "one suite on all configurations" would be a problem for me/wptsync.
Do we have data on how often these rare tasks are used on try and how often they don't pass?
Assignee | ||
Comment 16•4 years ago
|
||
I'm going to hold off on the suggestions made by :ahal for the time being, as the alternatives aren't ready yet if those restrictions are put in place.
However, I will do the following in this bug:
- review the filter list to ensure we're filtering what we want
- incorporate build tasks into the list only
In a follow-up bug, once mach try auto
is production ready:
- investigate removing
-u all -t all
, or some combination of that - investigate placing an upper limit on
try fuzzy
ortry chooser
tasks
Since mach try auto
isn't ready yet, the last two items are just things for me to think about.
(In reply to James Graham [:jgraham] from comment #15)
A cap on the number of tasks less than "one suite on all configurations" would be a problem for me/wptsync.
Do we have data on how often these rare tasks are used on try and how often they don't pass?
If I maintain the ability to select -p all
but remove the ability to select -t all
, for example, then that should retain your ability to run all web-platform-tests upon sync from upstream.
Either way, no action will be taken for some time.
Comment 17•4 years ago
|
||
wptsync uses mach try fuzzy
to select the tasks, not try syntax. My concern is if we say you can only have 400 task total in a try push, or something, then we might hit that limit during wptsync where we really do need to run on all platforms that are running on central.
Comment 18•4 years ago
|
||
Pushed by egao@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/f5b0f10cff51 additional filters on mach try selectors r=jmaher
Comment 19•4 years ago
|
||
bugherder |
Updated•2 years ago
|
Description
•