Closed
Bug 1295993
Opened 9 years ago
Closed 9 years ago
Large new items queues on several buildbot masters
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: aselagea, Unassigned)
References
Details
Noticed several alerts like the one below in #buildduty.
nagios-releng> Wed 07:42:28 PDT [4109] buildbot-master70.bb.releng.use1.mozilla.com:Command Queue is CRITICAL: 642 new items:oldest item is 1653s old (http://m.mozilla.org/Command+Queue)
The masters seem to claim the jobs, but they are not being processed.
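For triage, the interesting fields in an alert like the one above are the host, the number of new items, and the age of the oldest item. A minimal sketch of pulling those out, assuming every alert follows the format of the single line quoted here (the `QueueAlert` type, `ALERT_RE` pattern, and `parse_alert` helper are hypothetical names, not part of any releng tooling):

```python
import re
from typing import NamedTuple, Optional


class QueueAlert(NamedTuple):
    host: str
    state: str
    new_items: int
    oldest_age_s: int


# Pattern modelled on the one alert quoted in this bug; other alerts may differ.
ALERT_RE = re.compile(
    r"(?P<host>[\w.-]+):Command Queue is (?P<state>[A-Z]+): "
    r"(?P<count>\d+) new items:oldest item is (?P<age>\d+)s old"
)


def parse_alert(line: str) -> Optional[QueueAlert]:
    """Return the parsed alert, or None if the line does not match."""
    m = ALERT_RE.search(line)
    if m is None:
        return None
    return QueueAlert(m["host"], m["state"], int(m["count"]), int(m["age"]))


line = (
    "nagios-releng> Wed 07:42:28 PDT [4109] "
    "buildbot-master70.bb.releng.use1.mozilla.com:Command Queue is CRITICAL: "
    "642 new items:oldest item is 1653s old (http://m.mozilla.org/Command+Queue)"
)
alert = parse_alert(line)
print(alert)
```

Collecting these per host would make it easier to see which masters are backing up and how quickly, rather than reading alerts one at a time in the channel.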
Comment 1•9 years ago (Reporter)
Masters affected at this point: bm70, bm73, bm77, bm94.
Disabled them in slavealloc and did a graceful shutdown. Waiting for them to finish running the current jobs.
Comment 2•9 years ago
So buildbot-master74 is not a problem, and it is a Windows build master. However, the trees are closed for bug 1295950, so I'm not sure whether there were simply no jobs queued for them and they were somehow redirected to the other masters.
http://nagios1.private.releng.scl3.mozilla.com/releng-scl3/cgi-bin/status.cgi?navbarsearch=1&host=buildbot-master74
I wonder if the root cause is that bug 1295950 caused a huge number of retries and the masters simply couldn't keep up.
Comment 3•9 years ago (Reporter)
All four buildbot masters finished their graceful shutdown, so I rebooted them and re-enabled them in slavealloc.
Comment 4•9 years ago
The queues are no longer a problem. bm70, bm73, and bm77 haven't run jobs since they were rebooted; however, the trees have been closed most of the day for bug 1295950. bm94 has run a job since it was rebooted.
Comment 5•9 years ago (Reporter)
The masters look good at the moment; they have all started running jobs.
Updated•9 years ago
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Updated•7 years ago
Product: Release Engineering → Infrastructure & Operations
Updated•5 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard