Closed Bug 929823 (panda-0843) Opened 11 years ago Closed 11 years ago

panda-0843 problem tracking

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task, P1)

ARM
Android

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Callek, Unassigned)

References

Details

(Whiteboard: [buildduty][buildslaves][capacity])

No description provided.
How about rather than panda-recovery, we give it panda-extreme-unction? This panda, while disabled, mind you, has made people wait a total of 62 hours since October 18th while it spends 20 minutes at a time failing to do anything and then sets RETRY.

We do have some way of actually disabling a panda, don't we?
Severity: normal → critical
Priority: P3 → P1
It has now run mochitest-8 on the merge to beta fifteen times in the last five and a half hours.
Disabled in slavealloc.
Callek, this panda is still taking jobs even though it is disabled in slavealloc:
https://secure.pub.build.mozilla.org/builddata/reports/slave_health/slave.html?name=panda-0843

I thought slavealloc was supposed to work for pandas now?
Flags: needinfo?(bugspam.Callek)
(In reply to Ed Morley [:edmorley UTC+1] from comment #4)
> Callek, this panda is still taking jobs even though it is disabled in
> slavealloc:
> https://secure.pub.build.mozilla.org/builddata/reports/slave_health/slave.html?name=panda-0843
> 
> I thought slavealloc was supposed to work for pandas now?

Ugh, apparently buildbot has been running for this panda since July 1st! Which means it didn't get the change that allowed it to automatically shut down after every job, so it wasn't using any new code for slavealloc.

I killed all buildbot jobs on this foopy, since they were all running for that long. (Which could explain some of the retry counts for these pandas.)
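
For anyone landing here later: a minimal sketch of how one might spot stale buildbot processes on a foopy. The hostname and cutoff date below are made up, and it assumes the buildslave runs as a twistd process visible to ps on a Linux host:

import datetime
import subprocess

FOOPY = "foopy99.example.com"            # hypothetical hostname
CUTOFF = datetime.datetime(2013, 10, 1)  # flag anything started before this

# lstart gives a fixed-width, parseable start timestamp for each process.
out = subprocess.check_output(
    ["ssh", FOOPY, "ps", "-ww", "-eo", "pid=,lstart=,args="],
    text=True,
)
for line in out.splitlines():
    if "twistd" not in line:
        continue  # only buildbot's twistd processes are of interest
    pid, rest = line.split(None, 1)
    # lstart is always 24 chars, e.g. "Mon Jul  1 03:14:15 2013"
    started = datetime.datetime.strptime(rest[:24], "%a %b %d %H:%M:%S %Y")
    if started < CUTOFF:
        print(pid, started.date(), rest[25:])

Anything it prints has been running since before the slavealloc-aware code landed and is a candidate for the same kill-and-restart treatment.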
Flags: needinfo?(bugspam.Callek)
(In reply to Justin Wood (:Callek) from comment #5)
> Ugh, apparently buildbot has been running for this panda since July 1st!
> Which means it didn't get the change that allowed it to automatically shut
> down after every job, so it wasn't using any new code for slavealloc.
> 
> I killed all buildbot jobs on this foopy, since they were all running for
> that long. (Which could explain some of the retry counts for these pandas.)

Great - thank you :-)
Could we check that the same hasn't occurred on any of the other foopies?
Flags: needinfo?(bugspam.Callek)
At first glance, at least the following foopies seem to be exhibiting the same problem (found by selecting a handful of disabled tegras at random and checking whether they are still taking jobs and which master they are on):
foopy89
foopy91
foopy92
foopy94
foopy96
foopy97
foopy98

Think we may need to check them all, sadly.
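
A minimal sketch of sweeping them all, assuming ssh access and that the short hostnames above resolve (a real sweep would use the full foopy inventory):

import subprocess

# Foopies flagged above; extend the list to sweep every foopy.
FOOPIES = ["foopy89", "foopy91", "foopy92", "foopy94",
           "foopy96", "foopy97", "foopy98"]

for host in FOOPIES:
    # [t]wistd keeps grep from matching its own command line.
    cmd = "ps -ww -eo lstart=,args= | grep [t]wistd"
    result = subprocess.run(["ssh", host, cmd],
                            capture_output=True, text=True)
    if result.stdout.strip():
        print(host)
        print(result.stdout)

Any host that prints output has buildbot processes whose start times should be eyeballed against when the auto-shutdown change landed.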
Moving discussion to bug 888835.
Flags: needinfo?(bugspam.Callek)
Sending this slave to recovery
-->Automated message.
recovered by "panda-recovery" bug 902657
Severity: critical → normal
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard