1473048 - Don't run coverage builds for when "mach try all"

Reporter

Description

•

6 years ago

Should we run them on "mach try all"? Or are the results mostly equivalent to a debug build?

Flags: needinfo?(jmaher)

Joel Maher ( :jmaher ) (UTC -8)

Comment 1

•

6 years ago

the chunking/runtime is different- it seems overkill.  Also the builds will take considerably longer as we can do artifact builds for debug when someone specifies artifact and they are not doing c++/rust coding.

another thing is we have some coverage only failures, but that is minimal.  If we are considering replacing debug proper with coverage, that could be a fun conversation- remember we do things to in mozconfig:
https://searchfox.org/mozilla-central/source/browser/config/mozconfigs/linux64/code-coverage

ac_add_options --disable-install-strip
ac_add_options --disable-elf-hack
ac_add_options --disable-sandbox
ac_add_options --disable-profiling
ac_add_options --disable-warnings-as-errors
ac_add_options --enable-coverage


I am not sure if any of those would have us skip real errors that would be caught by a traditional debug build.

Flags: needinfo?(jmaher)

Greg Mierzwinski [:sparky]

Assignee

Comment 2

•

6 years ago

Here's another bug regarding this issue: https://bugzilla.mozilla.org/show_bug.cgi?id=1397722

Greg Mierzwinski [:sparky]

Assignee

Comment 3

•

6 years ago

And another one that was closed but has some messages from when it started again: https://bugzilla.mozilla.org/show_bug.cgi?id=1305242

Greg Mierzwinski [:sparky]

Assignee

Updated

•

6 years ago

Blocks: 1397722, 1305242

Joel Maher ( :jmaher ) (UTC -8)

Comment 4

•

6 years ago

:sparky- is this a bug you could work on sometime next week?

Flags: needinfo?(gmierz2)

Greg Mierzwinski [:sparky]

Assignee

Comment 5

•

6 years ago

Yup, I'll look into it this next week.

Assignee: nobody → gmierz2

Flags: needinfo?(gmierz2)

Greg Mierzwinski [:sparky]

Assignee

Comment 6

•

6 years ago

Here's another bug about this problem (resolved because we found another solution): https://bugzilla.mozilla.org/show_bug.cgi?id=1391483

Greg Mierzwinski [:sparky]

Assignee

Comment 7

•

6 years ago

This is an interesting issue. From what I see we have a couple problems to deal with here:

(1): Running '-p all' is the only way to schedule linux64-ccov tests with try option syntax.
(2): We can only schedule it with |mach try fuzzy|.
(3): Both seem to depend on the full-task-graph created by the decision taskgraph.

Working on a possible solution by using the build.py transforms here: https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/transforms/build.py

Greg Mierzwinski [:sparky]

Assignee

Comment 8

•

6 years ago

Using build.py almost worked: https://treeherder.mozilla.org/#/jobs?repo=try&revision=09e099ddbd4e8e2e0dcac0342dfc7150b8ef67ea

Through |mach try fuzzy| it was enabled because this command uses 'mozilla-central' as the default project. But through the try option syntax it was disabled since it uses 'try'. Now, because the gecko decision task uses 'try' as the project as well, the scheduled tasks were prevented from running.

Greg Mierzwinski [:sparky]

Assignee

Comment 9

•

6 years ago

I was hoping we could make a little hack to get around this but there's no hope there: https://treeherder.mozilla.org/#/jobs?repo=try&revision=c1eb90f94da8ea4d9dfdbd247e04d79f1aa30a71

I tried the patch above, where I disabled it based on the 'project' flag, but it didn't work because |mach try fuzzy| uses 'mozilla-central' to find the tasks, but the decision task on taskcluster then uses the 'try' flag which removes them.

:ahal, is this a bug in `try fuzzy` or expected behaviour? This doesn't block me, I am just curious about why it's different.

I found a way around this issue by using the `config` argument in the transforms to decide based on the `try_mode` flag in parameters.yml: https://dxr.mozilla.org/mozilla-central/source/taskcluster/docs/parameters.rst#82

Here's a working build (just need to add jsdcov there as well): https://treeherder.mozilla.org/#/jobs?repo=try&revision=818a95fb926763c2705a8867a186ee1c4c73de63 

I double checked if everything still works on mozilla-central and it does, or at least I think based on the target_tasks output by this command: 

mach taskgraph decision --base-repository https://hg.mozilla.org/mozilla-central --head-repository https://hg.mozilla.org/mozilla-central --head-ref 000 --head-rev d0d2e0f4b33c --project mozilla-central --message "try: -b do -p all -u none -t none" --owner me --level 1 --pushlog-id 1 --pushdate 11

Swapping `--project mozilla-central` with `--project try` shows that it still works with the 'mozilla-central' project but is completely disabled on try when try option syntax is used. I don't think that's how it's scheduled on mozilla-central but at least we can be sure that this change only prevents try option syntax from triggering any coverage builds (win, osx, linux, and android) on only try.

Flags: needinfo?(ahal)

Andrew Halberstadt [:ahal]

Comment 10

•

6 years ago

Sorry, can you clarify the question?

Fwiw, I think the solution to this bug should be entirely in taskcluster/taskgraph/try_option_syntax.py. Manipulating the task definition just to appease try syntax doesn't seem desirable. Unfortunately (or maybe fortunately) try_option_syntax.py is one giant hack so you'll have to do some experimenting. I think this is the if statement where you'll want to return False for ccov builds:
https://searchfox.org/mozilla-central/source/taskcluster/taskgraph/try_option_syntax.py#601

Flags: needinfo?(ahal)

Andrew Halberstadt [:ahal]

Comment 11

•

6 years ago

Here's where that task_matches() function gets called:
https://searchfox.org/mozilla-central/source/taskcluster/taskgraph/target_tasks.py#110

So any task that returns True there will get scheduled on the try push.

Greg Mierzwinski [:sparky]

Assignee

Comment 12

•

6 years ago

The 'project' entry in the parameters file or object when running |mach try fuzzy| is 'mozilla-central' but when running the regular decision task (D) on try it uses 'try' as the value for the 'project' entry. Is there a reason for the difference?

I think that's around the area where the bug first started when windows 8 was added: https://bug1391483.bmoattachments.org/attachment.cgi?id=8899135

So like you said it is a big hack to play with that so I'd rather use that as a last option. I found a better way of fixing this problem without changing the task definitions: https://hg.mozilla.org/try/rev/e708f408284dcbb87dd75e078f6c70f29f3ee1bc

Or is that the method you are suggesting I don't use?

Greg Mierzwinski [:sparky]

Assignee

Comment 13

•

6 years ago

And if you're wondering, I think the 'mozilla-central' project parameter I mention comes from here: https://dxr.mozilla.org/mozilla-central/source/tools/tryselect/tasks.py?q=path%3Atryselect+project&redirect_type=single#50

Andrew Halberstadt [:ahal]

Comment 14

•

6 years ago

(In reply to Greg Mierzwinski [:sparky] from comment #12)
> The 'project' entry in the parameters file or object when running |mach try
> fuzzy| is 'mozilla-central' but when running the regular decision task (D)
> on try it uses 'try' as the value for the 'project' entry. Is there a reason
> for the difference?

Yes, this is expected. When generating the taskgraph with |mach try fuzzy| the end goal is to prevent a backout. So by default we want to select from the same set of tasks that run on mozilla-central (which is typically all the tasks that might get you backed out). If we included the set of tasks that were 'try-only', then we'd either be wasting resources (as they can't cause a backout anyway) or causing confusion (as they are very error prone).


> So like you said it is a big hack to play with that so I'd rather use that
> as a last option. I found a better way of fixing this problem without
> changing the task definitions:
> https://hg.mozilla.org/try/rev/e708f408284dcbb87dd75e078f6c70f29f3ee1bc
> 
> Or is that the method you are suggesting I don't use?

Yes, that is what I meant not to do :) (although it's not like it's a huge deal or anything).

IMO the rules for how try syntax behaves should live in try_option_syntax.py as much as possible. While it is hacky, at least it keeps the hack in one place, rather than spreading the hack throughout the task generation code. It's also a bit of a case of the "tail wagging the dog". Try syntax should adapt to changes in the taskgraph. We shouldn't adapt the taskgraph to modify try syntax.

All that being said, I think fixing this in try_option_syntax.py isn't terribly difficult. Reading the code, I think you just have to add a new clause similar to this one:
https://searchfox.org/mozilla-central/source/taskcluster/taskgraph/try_option_syntax.py#549

Greg Mierzwinski [:sparky]

Assignee

Comment 15

•

6 years ago

Thanks for the explanation! It's a neat way of accomplishing that backout purpose.

Ok sounds good, I'll implement it that way. I'm more worried that any changes in that function will break something else other than ccov, but I think you're right and that it won't be as bad as I expect.

Greg Mierzwinski [:sparky]

Assignee

Comment 16

•

6 years ago

Attached file Bug 1473048 - Prevent code coverage builds from running when using try option syntax. r?ahal — Details

This patch completely disables *ccov, and *jsdcov builds and tests when scheduling them through try option syntax as these build variations use a lot of resources and are rarely needed to be scheduled. The only way to schedule code coverage from now on will be with the |mach try fuzzy| tool.

Greg Mierzwinski [:sparky]

Assignee

Comment 17

•

6 years ago

Here's a try run for the patch showing that it's still possible to use |mach try fuzzy| after these changes: https://treeherder.mozilla.org/#/jobs?repo=try&revision=83bb87afa8a95b78f8e1f1411f332c33fa26f10d

I checked that it's ignored in the try option syntax by using the following command (changing the message):

mach taskgraph decision --base-repository https://hg.mozilla.org/mozilla-central --head-repository https://hg.mozilla.org/mozilla-central --head-ref 000 --head-rev d0d2e0f4b33c --project try --message "try: -b do -p all -u all -t none" --owner me --level 1 --pushlog-id 1 --pushdate 11

Andrew Halberstadt [:ahal]

Comment 18

•

6 years ago

Comment on attachment 9002920 [details]
Bug 1473048 - Prevent code coverage builds from running when using try option syntax. r?ahal

Andrew Halberstadt [:ahal] has approved the revision.

Attachment #9002920 - Flags: review+

Pulsebot

Comment 19

•

6 years ago

Pushed by ahalberstadt@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/bdd6d0de763d
Prevent code coverage builds from running when using try option syntax. r=ahal

Cosmin Sabou [:CosminS]

Comment 20

•

6 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/bdd6d0de763d

Status: NEW → RESOLVED

Closed: 6 years ago

status-firefox63: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla63

Chris Manchester (limited bugmail, email directly)

Updated

•

6 years ago

Comment 21

•

6 years ago

bugherder uplift

https://hg.mozilla.org/releases/mozilla-esr60/rev/07aa5059a687

status-firefox-esr60: --- → fixed