[taskcluster]'s commit_parser.py can schedule the same job multiple times

RESOLVED FIXED

Status

Taskcluster
Integration
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: ahal, Assigned: dustin)

Tracking

Details

MozReview Requests

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(1 attachment)

(Reporter)

Description

2 years ago
From bug 1213314 comment 9:
Somehow this change causes mochitest-browser-chrome, mochitest-devtools-chrome and mochitest-webgl to get scheduled twice on try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b84d34d7166a&filter-searchStr=mochitest

You can test it out locally by running:
> ./mach taskcluster-graph --project=try --message='try: -b o -p linux64 -u all -t none' --owner ahalberstadt@mozilla.com --extend-graph --print-names-only
both with and without this change.

From bug 1213314 comment 10:
Apologies, the duplicate scheduling was actually not regressed by this change, it existed before. This change just makes it easier to specify a try syntax that triggers the bug.

For example, even without this patch if you run a try push with the syntax:
try: -b o -p linux64-mulet -u mochitest,mochitest -t none

All the mulet mochitests will get scheduled twice. The fix probably involves using a set instead of a list.

Updated

2 years ago
Component: General → Integration
Assignee: nobody → dustin
Created attachment 8679512 [details]
MozReview Request: Bug 1218841: only return unique tests from parse_tests_chunks; r?ahal

Bug 1218841: only return unique tests from parse_tests_chunks; r?ahal
Attachment #8679512 - Flags: review?(ahalberstadt)
(Reporter)

Comment 2

2 years ago
Comment on attachment 8679512 [details]
MozReview Request: Bug 1218841: only return unique tests from parse_tests_chunks; r?ahal

https://reviewboard.mozilla.org/r/23453/#review20909

I've been working on this a bit too (sorry should have grabbed the bug), and this doesn't quite do what we want. Consider these two try syntax strings:

A) try: -b d -p linux64 -u mochitests,mochitest-browser-chrome-1 -t none
In this case, {'test': 'mochitest-browser-chrome', 'only_chunks': '1'} will overwrite {'test': 'mochitest-browser-chrome'} and the final results will be every non mochitest-browser-chrome job plus mochitest-browser-chrome-1.

B) try: -b d -p linux64 -u mochitest-browser-chrome-1,mochitests -t none
In this case the reverse is true, and the final results will be every mochitest job, including all mochitest-browser-chrome jobs.

(it's worth noting that python makes no guarantees about dictionary ordering, so whether you get behaviour A or B could be totally random).

Buildbot's try_parser will guarantee behaviour B. But I'm sort of wondering if behaviour A isn't actually a little more useful. I think we have three options:
1. Guarantee both try syntaxes result in behaviour A
2. Guarantee both try syntaxes result in behaviour B
3. Guarantee jobs specified later on in the string, overwrite jobs specified earlier on in the string (possibly by using an OrderedDict at the end)

Let me know if that makes sense or not.
Attachment #8679512 - Flags: review?(ahalberstadt)
Fixing this would require re-writing the chunk handling completely -- keep all of the chunks split out until the end, and *then* scan the names for chunk suffixes and build the test list that way.

I see this as a different issue from the one in this bug -- I did scope bug 1213314 down to just expanding aliases, which should include uniquifying, but doesn't include replicating Buildbot's behavior when multiple overlapping test-suite definitions are given.

Do we have to solve this now?
(Reporter)

Updated

2 years ago
Attachment #8679512 - Flags: review+
(Reporter)

Comment 4

2 years ago
Comment on attachment 8679512 [details]
MozReview Request: Bug 1218841: only return unique tests from parse_tests_chunks; r?ahal

https://reviewboard.mozilla.org/r/23453/#review20915

No you're right, we don't have to solve this now. Also looking at your solution again, it does give us the guaratnee in number 3. So I think that's good enough for now.

Fwiw, I don't want to mimic buildbot for no good reason. I was just saying the order the jobs are listed in probably shouldn't impact the results.
I suppose the best determinant of how big an issue this is would be, how often does this happen in try pushes?  Which we could figure out from sampling the pushlog.  If 40% of try pushes will change their behavior and either burn a lot more resources or fail to provide results the user actually wanted, then we should fix it.  But if it's 2%, users can adjust :)
Oh, heh, this doesn't pass its own tests.  I added some sorted(..) to the tests and landed it.
(Reporter)

Updated

2 years ago
Blocks: 1218791
https://hg.mozilla.org/mozilla-central/rev/9419ab5864b7
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.