Eliminate vanilla opt tests from trunk (except try)
Categories
(Testing :: General, enhancement)
Tracking
(Not tracked)
People
(Reporter: egao, Assigned: bhearsum)
References
(Blocks 1 open bug)
Details
Problem Statement
Mozilla currently runs both the vanilla opt
and the shippable/opt
tests on trunk.
This is true for the following platforms:
linux1804-64
window7-32
windows10-64
android-hw-p2-8-0-android-aarch64
This leads to a level of duplication for these platforms between the vanilla opt
and the shippable
variants.
Update 2020/06/25: the numbers used in this comment is not entirely accurate (incorrect methodologies used). :jmaher was able to arrive at a more accurate number.
Premise
The opt
builds behave almost like a subset of the shippable/opt
builds in that bugs that manifest in shippable/opt
should also manifest itself in opt
, with the exception of compiler bugs.
In the past, developer turnaround when running tests on tryserver or locally have been given as justification for keeping the opt
build and tests around. But there is no reason why opt
should run on non-try branches if that is the reason.
Looking at the current revision of test-platforms.yml
, we see that in all cases of platforms listed above, the shippable/opt
acts as a superset of the vanilla opt
:
linux1804-64/opt:
build-platform: linux64/opt
test-sets:
- awsy
- desktop-screenshot-capture
- linux1804-tests
- mochitest-headless
- web-platform-tests
- web-platform-tests-backlog
linux1804-64-shippable/opt:
build-platform: linux64-shippable/opt
test-sets:
- awsy
- desktop-screenshot-capture
- linux1804-tests
- marionette-headless
- mochitest-headless
- mochitest-webgpu
- web-platform-tests
- web-platform-tests-backlog
- web-platform-tests-wdspec-headless
So we likely will not be losing coverage by turning off the opt
variant.
Metrics
All values are from the Taskcluster Build & Test Suite Report.
These values are for mozilla-central
only.
2020/03/01 - 2020/06/01 | Extrapolated 2020 | |||
---|---|---|---|---|
platform | opt | shippable | opt | shippable |
linux1804 | $1619 | $414 | $6000 | $1400 |
windows7-32 | $4146 | $2087 | $16000 | $8300 |
windows10-64 | $4662 | $1908 | $17500 | $7800 |
android-hw-p2-8-0 | $2783 | $305* | $9300 | $6800 |
- android-hw data does not exist for 2020/03 and 2020/04, so values were extrapolated from the 2020/05 value of ~$580.
Proposal
Except for try
, let's restrict the test suites to run on either shippable/opt
or opt
, not both. This level of duplication is unnecessary.
The exception would be in try
where the faster turnaround time of opt
is welcome.
Q&A
Q1. Wouldn't developers encounter regressions when they test against opt
on try but the patch is run entirely against shippable/opt
?
A1. While I do not have hard data on regression to back this statement up, the impression I get from speaking to a couple of people is that regressions on opt
will also manifest itself on shippable/opt
if it was not addressed. (I may be paraphrasing incorrectly here.) In other words, if problem is fixed on opt
, it is also applicable to shippable/opt
.
The only regressions unique to shippable/opt
likely come from the compiler.
Q2. How much money would we save?
A2. Using the chart above, if opt
tests were discontinued for mozilla-central
only, we stand to save roughly ~$50,000. This is calculated by summing the extrapolated 2020 totals for the opt
variant.
Q3. What about autoland
?
A3. autoland
consumes a lot more than mozilla-central
since it runs a lot more frequently. Turning off the duplication there would yield potential savings of easily double the extrapolated 2020 values for mozilla-central
for each platform. I think a ballpark of $90,000-$100,000 is doable.
Q4. What about builds?
A4. Costs a lot...a lot...of money. Will be in a separate bug.
Comment 1•4 years ago
|
||
If I'm not mistaken, we are already not running opt on autoland, just shippable.
Comment 2•4 years ago
|
||
correct, it runs on central in parallel with shippable and then also runs as primary on try server- so the real duplication is on mozilla-central.
Comment 3•4 years ago
|
||
keys to watch out for here, opt includes:
- asan
- tsan
- ccov
- shippable
- opt
so we really want to see usage of opt only, which this raw bigquery gets us closer:
select workerType, CAST(sum(execution) as INT64) from taskclusteretl.derived_task_summary where date>'2020-06-20' and date<'2020-06-26' and project='mozilla-central' and kind='test' and collection='opt' and platform not like '%shippable%' and platform not like '%san%' and platform not like '%ccov%' group by workerType
this returns:
workerType f0_
t-win7-32 829631
t-win10-64 1683005
t-linux-large 1818292
t-win7-32-gpu 235886
t-linux-xlarge 769449
t-win10-64-gpu-s 341875
gecko-t-win10-64-ref-hw 505711
gecko-t-bitbar-gw-perf-p2 70796
gecko-t-bitbar-gw-unit-p2 19567
gecko-t-win64-aarch64-laptop 99231
I verified the hardware (*ref-hw, bitbar, win64-aarch64) are all running shippable, it is just a platform name syntax issue.
Comment 4•4 years ago
|
||
doing some estimates with math, I think we might be able to save $1K/month by removing opt specific jobs.
Assignee | ||
Comment 5•4 years ago
|
||
I can look at this. It's a bit trickier than what we did on autoland due to needing to keep support for opt
on try - but I'm sure we can find a way.
Reporter | ||
Comment 6•4 years ago
|
||
Thanks to :jmaher whom used better methodologies and pointed out that my use of Google Data Studio's values for platform/opt
lumped in all opt
type tests, so there was some maths that needed to be done.
While unfortunately it isn't the ~100k/year that I had for, ~12k/year is helpful at least.
Assignee | ||
Comment 7•4 years ago
|
||
(In reply to Edwin Takahashi (:egao) from comment #6)
Thanks to :jmaher whom used better methodologies and pointed out that my use of Google Data Studio's values for
platform/opt
lumped in allopt
type tests, so there was some maths that needed to be done.While unfortunately it isn't the ~100k/year that I had for, ~12k/year is helpful at least.
Yeah, you're not the first one to hit this. You can make this a bit easier by adding this join to your queries: JOIN (SELECT metadata.name AS name, taskId FROM taskclusteretl.task_definition) USING (taskId)
-- then name
can be queried, which matches exactly with what you see on Treeherder.
Assignee | ||
Comment 8•4 years ago
|
||
Can we kill opt on release branches too? ie: everywhere except try
Comment 9•4 years ago
|
||
is it on release branches? I thought it was only on m-c+ try
Assignee | ||
Comment 10•4 years ago
|
||
(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #9)
is it on release branches? I thought it was only on m-c+ try
Heh, maybe they're not! In that case, it sounds like the more generous phrasing here is "only run regular opt builds on try".
Comment 11•4 years ago
|
||
If we disable opt builds on central then IMO there is no point in keeping the configs around. Having them be "try-only" means that it's only a matter of time until they break and become useless. Either we care about them (and they should at least run on central as a sanity check) or we don't (and we should just remove them from the configs entirely).
Discussion on newsgroups has consistently shown that developers still find value in them, so to me it seems like we can't remove them on central (maybe it's fine to remove them on other release branches, but I don't know).
Reporter | ||
Comment 12•4 years ago
|
||
(In reply to Andrew Halberstadt [:ahal] from comment #11)
If we disable opt builds on central then IMO there is no point in keeping the configs around. Having them be "try-only" means that it's only a matter of time until they break and become useless. Either we care about them (and they should at least run on central as a sanity check) or we don't (and we should just remove them from the configs entirely).
Discussion on newsgroups has consistently shown that developers still find value in them, so to me it seems like we can't remove them on central (maybe it's fine to remove them on other release branches, but I don't know).
I created this original bug, after discussing with :ajones about the amount of duplication we have at a variety of levels.
My counterpoint is that developers likely find it useful because opt
builds quickly and being able to turn around results quickly is a net positive in try. However, for non-try branches, it does not make sense to continue running vanilla opt
because it does not lead to the released code/binary. Neither is it unique in the sense of catching regressions, because supposedly regressions would also be caught by shippable-opt
tests.
Isn't there precedents in something like this, where the configuration is not run on mozilla-central
but developers are permitted to run them elsewhere? Granted, those platforms are typically quite niche or up-and-coming, so there's that.
I think ultimately it makes no sense to run tests on mozilla-central that doesn't lead to released code; doesn't find unique regressions; and costs us to the tune of ~12k a year for no reason.
Comment 13•4 years ago
|
||
yes, developers have a need for opt builds for local building/testing and faster turnaround on try server. A shippable build is an opt build + a few extra stages, there is not much that will not be ignored by disabling opt builds/tests on central (we only run opt builds/tests on central, not anywhere else).
It is my understanding that we get zero value from running these builds/tests on mozilla-central. In the current landscape of trimming costs, this could have a slight risk of breakage on try, but it is a very low risk. We are already making optimizations with higher risk, and I would rather exhaust the list of low risk items.
Are there additional reasons to keep these running on m-c? If not, I would vote for doing this and reverting our decision when there is a real problem.
Assignee | ||
Comment 14•4 years ago
|
||
Callek and I were talking elsewhere about this bug and we came up with a couple of alternative ideas to accomplish this:
-
Combine the current
opt
builds and theinstrumented
builds. The latter are the ones that Joel referred to that are essentially the same as opt. This has the side benefit of letting us run unit tests against a faster build even for pushes that want perf tests. (Right now, if your push asks for a perf test, even unit tests will run against the shippable build - which means you wait longer for those results.) -
Always do shippable builds on try, but grab a profile from elsewhere (probably central - or maybe the most recent commit that the push is based on that we can find a profile for). This means that all unit tests and perf tests will always run on try against a pgo'ed build (arguably making the unit test results slightly more valid). We're not sure how valid it is to re-use profiles - so that may be a downside.
Comment 15•4 years ago
|
||
My comment was mostly directed at the builds themselves, I agree it makes less sense to run the tests there. But we should at least not break the builds. Combining opt
and the instrumented
builds seems like a good idea to me (though I have no idea how feasible it is).
As for tests, I think the approach we should take is to not put shippable
in the test labels. So instead of both:
test-linux1804-64-shippable/opt-mochitest-plain-e10s-1
test-linux1804-64/opt-mochitest-plain-e10s-1
we'd only have:
test-linux1804-64/opt-mochitest-plain-e10s-1
Then what build platform that depends on (shippable or not) could be decided in the configs/transforms on a per project basis. That way we would inherently only have one set of "opt-like" tests on any given project.
The downside to that approach is that there would be no "easy" way to run shippable tests on try. So to get around that we could craft a --enable-pgo
flag to |mach try| that turns that back on.
Comment 16•4 years ago
|
||
I'll also note that Tom Prince had the idea of hiding this abstraction away in the build tasks themselves, but that is outside my domain of expertise and seems a bit scarier with broader implications.
Assignee | ||
Comment 17•4 years ago
|
||
This is somewhat intertwined with bug 1650208, but I want to try to summarize our desired end state here. Implementation details aside, I think that we want the following:
project | build type | unit tests? | perf tests?
--------------------------------------------------------
autoland | opt | yes | no
| shippable* | no | yes*
mozilla-central | shippable | yes | yes
try | opt | yes* | no
| shippable | yes* | yes*
* Doesn't run by default, but can run
For try specifically, ahal's comment #15 about combining the opt/shippable tests is probably something we should do here, too - it should have a notable impact on try push cost by getting rid of duplicated unit tests tasks when perf tests + unit tests are selected.
I know there's some comments about possibly wanting to continue doing just opt builds on mozilla-central, but I'm having trouble finding justification for that when the instrumented builds are already running per push. My understanding (which could be wrong) is that it's exceedingly unlikely that we'll uncaught bustage for the regular opt builds because of that. Additionally, we've decided in bug 1650208 that we're going to run opt instead of shippable on autoland, so any merges from there have a virtually nil chance of being broken for opt builds.
Assignee | ||
Comment 18•4 years ago
|
||
(In reply to Andrew Halberstadt [:ahal] from comment #16)
I'll also note that Tom Prince had the idea of hiding this abstraction away in the build tasks themselves, but that is outside my domain of expertise and seems a bit scarier with broader implications.
I think this is probably not viable given bug 1650208 where we want to run opt by default as well as shippable for backstop pushes/backfills.
(In reply to bhearsum@mozilla.com (:bhearsum) from comment #14)
Callek and I were talking elsewhere about this bug and we came up with a couple of alternative ideas to accomplish this:
Combine the current
opt
builds and theinstrumented
builds. The latter are the ones that Joel referred to that are essentially the same as opt. This has the side benefit of letting us run unit tests against a faster build even for pushes that want perf tests. (Right now, if your push asks for a perf test, even unit tests will run against the shippable build - which means you wait longer for those results.)Always do shippable builds on try, but grab a profile from elsewhere (probably central - or maybe the most recent commit that the push is based on that we can find a profile for). This means that all unit tests and perf tests will always run on try against a pgo'ed build (arguably making the unit test results slightly more valid). We're not sure how valid it is to re-use profiles - so that may be a downside.
I think it's probably best to punt on either of these optimizations for now. Both of them are trying to solve the "unit tests on try pushes take longer when perf tests are selected" problem, which is an optimization for a small subset of try pushes.
Comment 19•4 years ago
|
||
Table makes sense to me (though I wouldn't call myself an authoritative source, especially w.r.t perf).
Reporter | ||
Comment 20•4 years ago
|
||
(In reply to bhearsum@mozilla.com (:bhearsum) from comment #17)
This is somewhat intertwined with bug 1650208, but I want to try to summarize our desired end state here. Implementation details aside, I think that we want the following:
project | build type | unit tests? | perf tests? -------------------------------------------------------- autoland | opt | yes | no | shippable* | no | yes* mozilla-central | shippable | yes | yes try | opt | yes* | no | shippable | yes* | yes* * Doesn't run by default, but can run
For try specifically, ahal's comment #15 about combining the opt/shippable tests is probably something we should do here, too - it should have a notable impact on try push cost by getting rid of duplicated unit tests tasks when perf tests + unit tests are selected.
I know there's some comments about possibly wanting to continue doing just opt builds on mozilla-central, but I'm having trouble finding justification for that when the instrumented builds are already running per push. My understanding (which could be wrong) is that it's exceedingly unlikely that we'll uncaught bustage for the regular opt builds because of that. Additionally, we've decided in bug 1650208 that we're going to run opt instead of shippable on autoland, so any merges from there have a virtually nil chance of being broken for opt builds.
This makes sense to me and it achieves the goal that outlined in the initial description. I am in favor.
Comment 21•4 years ago
|
||
(In reply to bhearsum@mozilla.com (:bhearsum) from comment #17)
This is somewhat intertwined with bug 1650208, but I want to try to summarize our desired end state here. Implementation details aside, I think that we want the following:
project | build type | unit tests? | perf tests? -------------------------------------------------------- autoland | opt | yes | no | shippable* | no | yes* mozilla-central | shippable | yes | yes try | opt | yes* | no | shippable | yes* | yes* * Doesn't run by default, but can run
For try specifically, ahal's comment #15 about combining the opt/shippable tests is probably something we should do here, too - it should have a notable impact on try push cost by getting rid of duplicated unit tests tasks when perf tests + unit tests are selected.
I know there's some comments about possibly wanting to continue doing just opt builds on mozilla-central, but I'm having trouble finding justification for that when the instrumented builds are already running per push. My understanding (which could be wrong) is that it's exceedingly unlikely that we'll uncaught bustage for the regular opt builds because of that. Additionally, we've decided in bug 1650208 that we're going to run opt instead of shippable on autoland, so any merges from there have a virtually nil chance of being broken for opt builds.
This works for me.
Bug 1636271 would make "opt" on mozilla-central effectively free (as it wouldn't really run on most m-c pushes), but not running "opt" at all on m-c is very likely easier to implement.
Assignee | ||
Comment 22•4 years ago
|
||
This got fixed as part of bug 1650208.
Description
•