Closed Bug 1231781 Opened 9 years ago Closed 7 years ago

Multiple priority support to help constrained pools

Categories

(Taskcluster :: Services, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: catlee, Assigned: jonasfj)

References

Details

Attachments

(1 file)

56 bytes, text/x-github-pull-request
jhford
: review+
bstack
: review+
Details | Review
To cope with situations where we have constrained capacity (e.g. hardware test machines), we need multiple priority levels.

There's also a need to be able to increase the priority of tasks.

At Mozlando, Jonas and I discussed one option which is to allow a fixed number of named priorities in the tasks. e.g. "extralow", "low", "medium", "high", "superhigh".

On the queue side, new queues in Azure would be created whenever a task with a new priority is seen. If the queues are empty for a while (like a week), they can be removed.

The taskcluster queue currently hands out a list of endpoints for the workers to poll in order. New queues for higher priorities would be picked up within a reasonable amount of time (~20 minutes).

Some priorities should be protected by scopes, especially on worker types where we have constraints.
We'll change the scope for priority to:
  queue:task-priority:<provisionerId>/<workerType>/<priority>

So we can prevent some priorities from being used on aws workers, as too many priority levels just means more azure queues to poll..
Assignee: nobody → jhford
Blocks: 1243024
Summary: Multiple priority support → Multiple priority support to help constrained pools
Blocks: 1080265
No longer blocks: 1243024
Attached file github PR
Load-test is still to be done, but please start on the review.
Assignee: jhford → jopsen
Status: NEW → ASSIGNED
Attachment #8786059 - Flags: review?(jhford)
Attachment #8786059 - Flags: review?(bstack)
Note: for anyone blocking on this, we have both high and normal priority, please start using these now :)
I did a first pass of this and it looks good, but I'd like to look it over again.
Flags: needinfo?(jhford)
Attachment #8786059 - Flags: review?(bstack) → review+
Jonas, can you summarize what was done here and the tradeoffs made right now for priority support.  Please loop catlee into that as well since I know he was initially interested in it.
Flags: needinfo?(jopsen)
summary:
 * We have added queue.claimWork, so workers don't have poll azure queues directly anymore
 * We've load tested queue.claimWork, but we won't fully trust that it sticks until we have
   moved all workers to use it (long-polling may be infeasible we won't know until 1k hosts are polling).

Next steps:
 1) Move docker-worker to use queue.claimWork
 2) If (1) works out fine, move all other workers to queue.claimWork (somewhat gradually)
 3) When most of the workerTypes with many nodes have been moved to queue.claimWork
    we can add 5 priority levels.

Note: Motivation for going the queue.claimWork direction was that we could both reduce load on azure,
complexity in works and get some decent performance with long-polling. This all assumes our serves
won't get overloaded from all the long-polling. Which time will tell.
Flags: needinfo?(jopsen)
Comment on attachment 8786059 [details] [review]
github PR

I've gone over the code again and I'm not seeing any red flags.  R+ from me, looks great!
Flags: needinfo?(jhford)
Attachment #8786059 - Flags: review?(jhford) → review+
That PR has landed.  I believe the docker-worker changes are still being rolled out. This bug enables us to have more than the long-existing (but as far as I can tell, unused) two.
The two priorities are used by releasetasks.  Releasetasks doesn't need *more* than two priorities, so as far as I can tell this is not on the list of dependencies for turning off Buildbot.  I'd love to be correct.  Catlee, if this is important to the migration, what does it block?
Flags: needinfo?(catlee)
I'm dumb -- the answer is that our constrained testing pools will need these priorities to differentiate e.g., try from integration.
Flags: needinfo?(catlee)
(In reply to Dustin J. Mitchell [:dustin] from comment #9)
> The two priorities are used by releasetasks.  Releasetasks doesn't need
> *more* than two priorities, so as far as I can tell this is not on the list
> of dependencies for turning off Buildbot.  I'd love to be correct.  Catlee,
> if this is important to the migration, what does it block?

During today's release work, we're finding that on constrained pools, we need to be able to set

release > esr > beta > aurora > central > everything else

which would suggest at least 5-6 priority levels.
https://github.com/mozilla-releng/build-buildbot-configs/blob/7b71825/mozilla/master_common.py#L25
If we want idle time jobs like fuzzing, that's another, lower priority.
Depends on: 1364266
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Component: Queue → Services
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: