Closed Bug 1157310 Opened 5 years ago Closed 3 years ago

make it possible to selectively disable schedulers

Categories

(Release Engineering :: General, defect)

defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bhearsum, Unassigned)

References

Details

(Whiteboard: [bbb])

Attachments

(2 files, 1 obsolete file)

When we start transitioning scheduling to Taskcluster we need to be able to do so on a somewhat piecemeal basis. I think the most logical unit to break this down by is by Schedulers. In order to prevent double triggering of jobs, before we add scheduling for something in Taskcluster we need to remove scheduling for it from Buildbot, which means we need support in misc.py for disabling various schedulers.
This needs more testing still, and I need to cover the unit test/talos cases, but I want to make sure that this approach of disabling them before they're added makes sense. The approach I was considering was removing them later (maybe in scheduler_master.cfg), but that seems more painful because it requires you to know the names of the schedulers you want to remove...
Attachment #8596060 - Flags: feedback?(jlund)
Comment on attachment 8596060 [details] [diff] [review]
support disabling some types of schedulers

Review of attachment 8596060 [details] [diff] [review]:
-----------------------------------------------------------------

I think this approach will work. My instinct was I thought the builders have a back reference to the scheduler but looking at the code, I guess that's not the case. So we will need to reconfig each time we want to play with these individual config items. Over irc, we discussed how to deal with the test/talos scheduling and toyed with the idea of disabling the sendchange from the build job at the mozharness level. This would allow us to 1) not rely on reconfigs and 2) not touch the misc.py more than we have to.

Though, depending on how fast we roll this out, we can't do everything from mozharness as mh desktop builds are still riding trains and android is not ported yet. Even still, I think disabling the sendchange at the factory.py level is easier than dealing with the schedulers (granted both would require a reconfig).

f+ on the approach. I haven't evaluated line by line critically
Attachment #8596060 - Flags: feedback?(jlund) → feedback+
(In reply to Jordan Lund (:jlund) [PTO until May 13th] from comment #2)
> Comment on attachment 8596060 [details] [diff] [review]
> support disabling some types of schedulers
> 
> Review of attachment 8596060 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I think this approach will work. My instinct was I thought the builders have
> a back reference to the scheduler but looking at the code, I guess that's
> not the case.

Yeah, they connect by names, not directly through objects.

> So we will need to reconfig each time we want to play with
> these individual config items. Over irc, we discussed how to deal with the
> test/talos scheduling and toyed with the idea of disabling the sendchange
> from the build job at the mozharness level. This would allow us to 1) not
> rely on reconfigs and 2) not touch the misc.py more than we have to.
> 
> Though, depending on how fast we roll this out, we can't do everything from
> mozharness as mh desktop builds are still riding trains and android is not
> ported yet. Even still, I think disabling the sendchange at the factory.py
> level is easier than dealing with the schedulers (granted both would require
> a reconfig).

I'll probably dig into the sendchanges and unittest/talos stuff in a second patch, since it's so different than the build side. I don't think that reconfigs should be a deciding factor here though - these aren't going to be fiddled with often, it should be a one and done for each particular scheduler, and it's unlikely we'll be doing many batches in the same day.
Here's my tested version of this patch. I generated scheduler graphs once with a pristine buildbot-configs, and once with a patched version that set all of the new flags to false. Here's a bit of verification I did on the results:
➜  tmp  diff -ur graphs-{before,after}
Only in graphs-before: alder-firefox scheduler.svg
Only in graphs-before: alder-mobile scheduler.svg
Only in graphs-before: alder nightly scheduler.svg
Only in graphs-before: alder periodic scheduler.svg
Only in graphs-before: b2g_alder-b2g scheduler.svg
Only in graphs-before: b2g_alder nightly scheduler.svg
Only in graphs-before: b2g_alder periodic scheduler.svg
Only in graphs-before: spidermonkey_tier_1__alder-arm-sim scheduler.svg
Only in graphs-before: spidermonkey_tier_1__alder-compacting scheduler.svg
Only in graphs-before: spidermonkey_tier_1__alder-plaindebug scheduler.svg
Only in graphs-before: spidermonkey_tier_1__alder-plain scheduler.svg
Only in graphs-before: spidermonkey_tier_1__alder-rootanalysis scheduler.svg
Only in graphs-before: weekly-alder scheduler.svg
➜  tmp  find graphs-after -iname "*alder*"
➜  tmp  

As you can see, there are no Alder schedulers in the second version, and all other schedulers remain unchanged as far as existence/upstreams/downstreams/names.

I was going to send this back to Jordan for review, but he's off now...feel free to redirect if there's someone better suited to reviewing this.
Attachment #8600962 - Flags: review?(catlee)
Blocks: 1157242
Attachment #8600962 - Flags: review?(catlee) → review+
Attachment #8600962 - Flags: checked-in+
Attachment #8596060 - Attachment is obsolete: true
This one was shockingly easy -- all of the test schedulers are at the bottom of the big test loop, and a couple at the end. I tested it by setting the new setting for Alder, and it removed ALL of its test schedulers!
Attachment #8606540 - Flags: review?(jlund)
Attachment #8606540 - Flags: review?(jlund) → review+
Attachment #8606540 - Flags: checked-in+
Unassigning myself because this has gone as far as I need it to for now. We still need to support removal of release schedulers and maybe a couple of misc. ones (like weekly bundles).
Assignee: bhearsum → nobody
No longer blocks: bbb
Whiteboard: [bbb]
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Component: General Automation → General
You need to log in before you can comment on or make changes to this bug.