Jenkins is complaining that it is under-resourced, and we've had some hanging builds lately. From https://ci.mozilla.org/manage: "Threads are not keeping up with the demands. Check if your polling is hanging, and/or increase the number of threads if necessary." From https://ci.mozilla.org/descriptor/hudson.triggers.SCMTrigger/: "There are more SCM polling activities scheduled than handled, so the threads are not keeping up with the demands. Check if your polling is hanging, and/or increase the number of threads if necessary." If I understand things right, most of our jobs poll SCM. If we can't increase Jenkins' resources, we might convert more of our jobs to listen for push notifications from GitHub.
I think this is likely a bit better now, due to improvements in the Socorro build processes, which result in far less disk I/O on jenkins1.dmz.phx1. jenkins1.dmz.phx1 has 37GB RAM but is currently 14GB into swap. There's enough free memory plus cache to empty out swap, so we could do that and tweak swappiness, which might help some. A better solution is probably to set up some Jenkins slaves and move some of the workload over to them.
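For anyone wanting to sanity-check the "enough free mem + cache to empty out swap" reasoning before touching the box, here is a minimal sketch. It assumes a Linux host exposing the standard `/proc/meminfo` fields; the function names and the sample figures are hypothetical and are not measurements from jenkins1.dmz.phx1.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def swap_fits_in_ram(info):
    """Return (fits, swap_used_kb, reclaimable_kb).

    Treats MemFree + Buffers + Cached as the memory available to absorb
    swapped-out pages. This is a rough estimate, not an exact guarantee.
    """
    swap_used = info["SwapTotal"] - info["SwapFree"]
    reclaimable = info["MemFree"] + info.get("Buffers", 0) + info.get("Cached", 0)
    return swap_used <= reclaimable, swap_used, reclaimable


# Hypothetical sample input: 14 GiB of swap in use, 17 GiB free or cached.
SAMPLE = """\
MemTotal:       38797312 kB
MemFree:         4194304 kB
Buffers:         1048576 kB
Cached:         12582912 kB
SwapTotal:      16777216 kB
SwapFree:        2097152 kB
"""

fits, swap_used, reclaimable = swap_fits_in_ram(parse_meminfo(SAMPLE))
print(fits, swap_used, reclaimable)
```

On the real host the input would be the contents of `/proc/meminfo` rather than `SAMPLE`. If the check passes, swap is usually emptied by cycling it with `swapoff -a` followed by `swapon -a`, and swappiness is controlled by the `vm.swappiness` sysctl.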
Priority: -- → P4
Whiteboard: [pending triage] → [triaged 20121019]
@jakem the bug you caught in my build scripts was introduced after this bug was filed. I'm sure it wasn't helping, but I don't think I/O is the problem.
Component: Server Operations: Web Operations → WebOps: IT-Managed Tools
Product: mozilla.org → Infrastructure & Operations
Job cleanup, disk cleanup, and Jenkins tuning have improved this enough that it is no longer an issue.
Assignee: server-ops-webops → bburton
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED