Investigate Jenkins' resourcing troubles

RESOLVED FIXED

Status

Infrastructure & Operations
WebOps: IT-Managed Tools
P4
normal
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: lonnen, Assigned: solarce)

Tracking

Details

(Whiteboard: [triaged 20121019])

(Reporter)

Description

5 years ago
Jenkins is complaining that it is under-resourced, and we've had some hanging builds lately.

from https://ci.mozilla.org/manage:
"Threads are not keeping up with the demands. Check if your polling is hanging, and/or increase the number of threads if necessary."

from https://ci.mozilla.org/descriptor/hudson.triggers.SCMTrigger/:
"There are more SCM polling activities scheduled than handled, so the threads are not keeping up with the demands. Check if your polling is hanging, and/or increase the number of threads if necessary."

If I understand things right, most of our jobs poll SCM. If we can't increase Jenkins' resources, we might convert more of our jobs to listen for push notifications from GitHub.
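As a rough illustration of the push-notification idea: assuming the Jenkins Git plugin is installed, a post-receive hook or small webhook relay can request its `notifyCommit` endpoint, which asks Jenkins to poll only the jobs that use the given repository, right when a push happens. The sketch below only builds that URL. The repository URL and branch names are hypothetical, and newer plugin versions may also require an access token, which is not shown here.

```python
from urllib.parse import urlencode


def notify_commit_url(jenkins_base, repo_url, branches=None):
    """Build the URL for the Git plugin's notifyCommit endpoint.

    jenkins_base: root URL of the Jenkins instance.
    repo_url: repository URL exactly as configured in the Jenkins jobs.
    branches: optional list of branch names to restrict the notification to.
    """
    params = {"url": repo_url}
    if branches:
        # The endpoint takes a comma-separated list of branch names.
        params["branches"] = ",".join(branches)
    return jenkins_base.rstrip("/") + "/git/notifyCommit?" + urlencode(params)


# Hypothetical repository; substitute the URL the jobs actually use.
EXAMPLE_URL = notify_commit_url(
    "https://ci.mozilla.org/",
    "https://github.com/example/repo.git",
    branches=["master"],
)
```

With something like this in place, the per-job polling schedule can be made very infrequent, since the hook triggers a poll only when there is something new to build.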

Updated

5 years ago
Whiteboard: [pending triage]

Comment 1

5 years ago
I think this is likely a bit better now, due to improvements in the socorro build processes which result in far less disk I/O on jenkins1.dmz.phx1.

jenkins1.dmz.phx1 has 37GB RAM, but is currently 14GB into swap. There's enough free memory plus cache to empty out swap, so we could do that and tweak swappiness, which might help some.
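A minimal sketch of the "is there enough free memory plus cache to empty swap" check, assuming a Linux host and the usual `/proc/meminfo` field names (`MemFree`, `Cached`, `SwapTotal`, `SwapFree`, all in kB). The sample figures are hypothetical and only chosen to resemble the situation described; they are not measurements from jenkins1.dmz.phx1.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def swap_fits_in_ram(info):
    """Return (swap_used_kb, reclaimable_kb, fits) for a parsed meminfo dict."""
    swap_used = info["SwapTotal"] - info["SwapFree"]
    # Free memory plus page cache is a rough estimate of what could absorb swap.
    reclaimable = info["MemFree"] + info["Cached"]
    return swap_used, reclaimable, swap_used <= reclaimable


# Hypothetical sample: 14 GiB of swap in use, 4 GiB free, 12 GiB cached.
SAMPLE = """MemTotal:       38797312 kB
MemFree:         4194304 kB
Cached:         12582912 kB
SwapTotal:      16777216 kB
SwapFree:        2097152 kB
"""

used_kb, reclaimable_kb, fits = swap_fits_in_ram(parse_meminfo(SAMPLE))
```

On a live host the text would come from reading `/proc/meminfo`. If the check passes, cycling swap with `swapoff -a` followed by `swapon -a` and lowering the `vm.swappiness` sysctl is the usual way to do what is proposed here.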

A better solution is probably to set up some Jenkins slaves and move some of the workload over to them.
Priority: -- → P4
Whiteboard: [pending triage] → [triaged 20121019]
(Assignee)

Updated

5 years ago
Depends on: 803599
(Reporter)

Comment 2

5 years ago
@jakem, the bug you caught in my build scripts was introduced after this bug was filed. I'm sure it wasn't helping, but I don't think I/O is the problem.
(Assignee)

Updated

5 years ago
Component: Server Operations: Web Operations → WebOps: IT-Managed Tools
Product: mozilla.org → Infrastructure & Operations
(Assignee)

Comment 3

5 years ago
Job cleanup, disk cleanup, and Jenkins tuning have improved this enough that it is no longer an issue.
Assignee: server-ops-webops → bburton
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED