Identity jobs occupy most of our CI resources

RESOLVED FIXED

Status

Infrastructure & Operations
WebOps: Other
RESOLVED FIXED
6 years ago
5 years ago

People

(Reporter: lonnen, Assigned: jakem)

Tracking

Details

(Whiteboard: [push interrupt])

(Reporter)

Description

6 years ago
Identity is running 8 CI builds plus 1 DNS check, both on SCM change and every 10 minutes. Each build takes 2.5-7 minutes, and together they occupy 8 of our 10 workers on ci.mozilla.org for that time.

Does identity need to run both on SCM change and every 10 minutes?
If so, can we expand the number of workers?

I'd cc someone from identity but I don't know who to pull in.
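
For a rough sense of scale, the figures above can be turned into a back-of-the-envelope estimate. This is only a sketch: it assumes each build holds exactly one worker for its full duration, counts only the 10-minute cron runs (not the SCM-triggered ones), and leaves out the DNS check. The function name `busy_fraction` is made up for illustration.

```python
def busy_fraction(jobs, build_minutes, workers, interval_minutes):
    """Share of all available worker-minutes used by one round of cron builds.

    Assumes every job occupies exactly one worker for build_minutes
    and that one round is triggered per interval.
    """
    return (jobs * build_minutes) / (workers * interval_minutes)


# Figures from the description: 8 builds, 2.5-7 minutes each,
# triggered every 10 minutes on a pool of 10 workers.
low = busy_fraction(jobs=8, build_minutes=2.5, workers=10, interval_minutes=10)
high = busy_fraction(jobs=8, build_minutes=7.0, workers=10, interval_minutes=10)
print(f"cron builds alone use {low:.0%} to {high:.0%} of the pool")
```

Changing `workers` or `interval_minutes` shows how a larger pool or a slower cron would change the estimate.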
(Reporter)

Comment 1

6 years ago
Jared, I see from bug 777506 that you set the identity jobs to run on a 10 minute cron to test for DNS problems. Can that be turned off?
If it's okay, I'd like to leave the cron running aggressively for at most another week. We're hardening our Selenium tests, and running them more frequently is helping us catch weird intermittent errors.

Is it possible to temporarily turn up the number of executor threads in the Jenkins pool?
Assignee: server-ops → server-ops-devservices
Component: Server Operations → Server Operations: Developer Services
QA Contact: jdow → shyam
After talking it over with Lloyd, I'm turning the cron down to hourly to gather data over the next week at a slightly less crazy pace.
We can totally bump up the workers on this. We didn't need more than 10 before, but the hardware behind ci.m.o is beefy enough to support another 10 workers and beyond, pretty easily. I started with 10 about a year ago, and a lot more projects are using ci.m.o now.
Assignee: server-ops-devservices → server-ops-webops
Component: Server Operations: Developer Services → Server Operations: Web Operations
QA Contact: shyam → cshields
(Assignee)

Comment 5

6 years ago
I found this setting in the Jenkins interface and bumped it up from 10 to 20.

Additionally, it seems that changing this setting in the interface doesn't completely apply the change. It requires a bit of extra effort:

https://issues.jenkins-ci.org/browse/JENKINS-3092

That bug is fixed, but I guess the fix is not in our particular installation. The workaround (as copied from that bug) is:

"
you could even do this in your current version of hudson by entering
this in the script console after saving the config change:
hudson.model.Hudson.instance.updateComputerList()
"

I have done this, and all 20 executors are now shown in the sidebar.
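
For anyone who would rather script this than paste it into the console by hand, here is a minimal sketch. It assumes the installation exposes the script console's `/scriptText` endpoint (worth checking on older versions), and it only builds the request without sending it. The host name is a placeholder, `build_script_request` is a made-up helper name, and authentication and any CSRF crumb handling are left out.

```python
import urllib.parse
import urllib.request

# The one-liner quoted from JENKINS-3092 above.
REFRESH_SCRIPT = "hudson.model.Hudson.instance.updateComputerList()"


def build_script_request(base_url, script=REFRESH_SCRIPT):
    """Build (but do not send) a POST that submits a script to the console.

    Assumes the server accepts a form field named "script" at /scriptText.
    """
    data = urllib.parse.urlencode({"script": script}).encode("utf-8")
    url = base_url.rstrip("/") + "/scriptText"
    return urllib.request.Request(url, data=data, method="POST")


# Placeholder host, not the real CI server.
req = build_script_request("https://ci.example.org/")
print(req.get_method(), req.full_url)
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen` with whatever credentials the server requires.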
Assignee: server-ops-webops → nmaul
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
(Assignee)

Updated

6 years ago
Whiteboard: [push interrupt]
So, since we have all these resources and our tests are still failing intermittently, I'm cranking the cron back up to 10 minutes to gather some extra data.
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations