taskcluster: scale out firefoxci deployment
Categories
(Cloud Services :: Operations: Taskcluster, task)
Tracking
(Not tracked)
People
(Reporter: miles, Assigned: brian)
Current numbers of Heroku dynos per service:
queue.web: 25
queue.claimResolver: 4
queue.deadlineResolver: 3
queue.dependencyResolver: 4
auth.web: 10
index.web: 4
secrets.web: 2
Each dyno has roughly 1 CPU and 512MB of memory. These numbers should map roughly onto k8s pod counts and resource requests, though we'd rather be overprovisioned than underprovisioned.
There is a push tomorrow that we'd like to be scaled out for. It's OK if this is done manually for now.
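As a sketch, one of the services above translated into a Kubernetes Deployment might look like the following. The name, image, and labels are illustrative; only the replica count and the per-pod resource requests come from the numbers in this description:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: queue-web                # illustrative name
spec:
  replicas: 25                   # matches the queue.web dyno count above
  selector:
    matchLabels:
      app: queue-web
  template:
    metadata:
      labels:
        app: queue-web
    spec:
      containers:
        - name: web
          image: example/taskcluster-queue:latest   # illustrative image
          resources:
            requests:
              cpu: "1"           # roughly one dyno's CPU
              memory: 512Mi      # roughly one dyno's memory
```

The other services would follow the same shape with their own replica counts.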
Updated•6 years ago
Comment 1•6 years ago (Assignee)
I assume everything not in that list is using 1 dyno.
Are all Taskcluster services uniformly configured to reserve 1 CPU and 512MB of RAM in Heroku?
Is there a way edunham or I could see historical resource utilization per service, to better fine-tune our initial requests? If not, that's fine; we can just adjust downward after launch.
We don't have any resource limits configured in k8s, only requests, so nothing is going to get throttled artificially. We just have to watch for overloaded nodes in the short term, and address overprovisioning causing us to run too many nodes in the long term.
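A requests-only setup like the one described here looks like this in a container spec (service name and values are illustrative). The `requests` block only informs scheduling; with no `limits` block, the kubelet does not throttle CPU or enforce a memory ceiling for the container:

```yaml
resources:
  requests:            # used by the scheduler for bin-packing only
    cpu: "0.9"
    memory: 512Mi
  # no "limits" block: nothing is artificially throttled,
  # but a node can become overloaded if actual usage exceeds requests
```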
Comment 2•6 years ago (Assignee)
Miles, do the resource changes at https://paste.mozilla.org/YowdwOJ0 look good to you? If so, I can apply them, along with https://github.com/mozilla-services/cloudops-infra/pull/1558, in the morning.
I've reserved 0.9 CPU instead of 1 because the node pool is currently made up of n1-standard-2 instances, which have only 1.94 allocatable CPUs (https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-architecture). After tomorrow's push is done, I think we should consider creating a new node pool with a larger instance type, possibly with a higher CPU-to-memory ratio.
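The 0.9 vs 1.0 choice comes down to bin-packing: how many pods' CPU requests fit on one node. A minimal sketch of that arithmetic, using the 1.94 allocatable CPUs quoted above:

```python
# Sketch: pods-per-node bin-packing on an n1-standard-2.
# The node advertises 2 vCPUs, but only ~1.94 are allocatable
# after system reservations (value from the comment above).

def pods_per_node(allocatable_cpu: float, cpu_request: float) -> int:
    """Number of pods whose CPU requests fit on one node."""
    return int(allocatable_cpu // cpu_request)

# With a full 1.0-CPU request only one pod fits per node;
# dropping the request to 0.9 lets the scheduler place two.
print(pods_per_node(1.94, 1.0))  # -> 1
print(pods_per_node(1.94, 0.9))  # -> 2
```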
Comment 3•6 years ago (Assignee)
This is done. We now have 0.9 CPU and 500MB of RAM reserved for each service, plus the additional replicas described in the original request.
I'll create a bug for us to follow up in a couple of weeks and tweak the requests and replica counts.
From the Taskcluster dev side, my expectation is that once things settle down you can start to generate HPAs for these seven services, so we won't need to scale up and down manually.
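For reference, a CPU-based HPA for one of these services might look like the following sketch (`autoscaling/v2` API; the target Deployment name and the replica/utilization thresholds are illustrative, not from this bug):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: queue-web            # illustrative
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: queue-web
  minReplicas: 5
  maxReplicas: 30
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 75   # scale out when avg CPU > 75% of requests
```

CPU utilization targets are computed against the pods' requests, so the 0.9-CPU request above directly sets the scaling baseline.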