Bug 843744 (panda-0081)

Decommission panda-0081

RESOLVED FIXED

Status

P3
normal
RESOLVED FIXED
6 years ago
4 years ago

People

(Reporter: kmoir, Unassigned)

Tracking

Details

(Whiteboard: [buildduty][buildslaves][capacity])

(Reporter)

Description

6 years ago
DC ops is looking at this panda, as it's very flaky.

See bug 817103 for more details
(Reporter)

Comment 1

6 years ago
Also forgot to mention that I ran chmod og-r on the panda-0081 directory on foopy85 so it would stay out of commission.
(Reporter)

Comment 2

6 years ago
echo "bug 843744" > /builds/panda-0081/disabled.flg is a better way to disable a panda.
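For anyone scripting this, the flag-file approach can be sketched in Python. This is only a sketch: the function names are hypothetical, and it assumes (from the command above) that the flag lives at <device directory>/disabled.flg and that its presence is what marks the device as disabled.

```python
from pathlib import Path


def disable_device(device_dir: Path, reason: str) -> Path:
    """Write disabled.flg into the device directory, recording why.

    Mirrors `echo "<reason>" > <device_dir>/disabled.flg`; the helper
    name is hypothetical and not part of any existing tool.
    """
    flag = Path(device_dir) / "disabled.flg"
    flag.write_text(reason + "\n")  # echo also appends a trailing newline
    return flag


def is_disabled(device_dir: Path) -> bool:
    """Assumption: the mere presence of the flag file disables the device."""
    return (Path(device_dir) / "disabled.flg").exists()
```

Usage would look like disable_device(Path("/builds/panda-0081"), "bug 843744"), which leaves the bug number in the file so the next person can see why the device is off.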
The issue that first caused this panda to fail (5 months ago) might have been an issue with the automation process and not really a problem with the panda board itself.  Since I really didn't find anything wrong with this panda board, I issued a reimage via mozpool.
Resolving all panda bugs linked from Bug 817103 that are not in troubleshoot or failed_pxe* state in lifeguard.
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
pdu reboot didn't help, needs recovery
Status: RESOLVED → REOPENED
Depends on: 902657
Resolution: FIXED → ---
Sending this slave to recovery
-->Automated message.
Depends on: 948669

Comment 7

5 years ago
panda-0081 - selftest.py[INFO]: test_preseed_file_integrity[FAILED] boot.scr : 5a5c34aa07d2d8f23e1b69347d49bacf205041dd != 6261fdd19a45db13e6503c5010e3917dbb13eeed

fix - replaced SD card with correct preseed.
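The failure above is a digest mismatch on boot.scr. Both values are 40 hex characters, which is the length of a SHA-1 digest, so a check of this kind can be sketched as below. This is not the actual selftest.py code: the function names are hypothetical, SHA-1 is an assumption based on the digest length, and printing the actual digest before the expected one is also an assumption.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_integrity(path, expected_hex: str) -> str:
    """Compare a file's digest with the expected one and report the result.

    The "<actual> != <expected>" ordering is an assumption, not taken
    from the real self-test.
    """
    actual = sha1_of_file(path)
    name = Path(path).name
    if actual == expected_hex.lower():
        return f"{name} : OK"
    return f"{name} : {actual} != {expected_hex}"
```

A mismatch like the one reported means the file on the SD card is not the file the preseed is supposed to contain, which is consistent with the fix of swapping in a card with the correct preseed.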
Back in production
Status: REOPENED → RESOLVED
Last Resolved: 6 years ago → 5 years ago
Resolution: --- → FIXED
Attempting SSH reboot...Failed.
Attempting reboot via Mozpool...Failed.
Filed IT bug for reboot (bug 1067672)
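The automated message above follows an escalation pattern: try the cheapest reboot method first, fall back to the next, and involve a human only when everything fails. A minimal sketch of that pattern follows. The function and the callables passed to it are hypothetical stand-ins, not the real reboot tooling.

```python
from typing import Callable, List, Tuple


def reboot_with_fallback(
    methods: List[Tuple[str, Callable[[], bool]]],
) -> Tuple[bool, List[str]]:
    """Try each reboot method in order and stop at the first success.

    Each entry is (description, callable returning True on success).
    Returns (succeeded, log lines). An exception counts as a failure.
    """
    log: List[str] = []
    for name, attempt in methods:
        try:
            ok = bool(attempt())
        except Exception:
            ok = False
        log.append(f"Attempting {name}...{'Succeeded' if ok else 'Failed'}.")
        if ok:
            return True, log
    # Nothing worked: the caller should escalate, e.g. by filing an IT bug.
    log.append("All automated reboot methods failed; escalation needed.")
    return False, log
```

Called with [("SSH reboot", ssh_reboot), ("reboot via Mozpool", mozpool_reboot)], where both callables are placeholders, two failures would produce the same two "Attempting ...Failed." lines seen in this bug before the escalation step.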
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Didn't recover, hasn't taken a job for 26 days.
QA Contact: armenzg → bugspam.Callek
Depends on: 1072405
Panda was added to the proper foopy and is now taking jobs.
Status: REOPENED → RESOLVED
Last Resolved: 5 years ago → 4 years ago
Resolution: --- → FIXED
Failing every other job, disabled in slavealloc.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Tools repo updated on foopy so that this panda can be properly rebooted again.
Status: REOPENED → RESOLVED
Last Resolved: 4 years ago → 4 years ago
Resolution: --- → FIXED
Since then, 2 green, 1 orange, 17 retry, 0 jobs for the last 4 days. Not sure this was a good choice for backfill. Disabled.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Comment 15

4 years ago
replaced SD card, panda passed self test
Reenabled and rebooted.
Status: REOPENED → RESOLVED
Last Resolved: 4 years ago → 4 years ago
Resolution: --- → FIXED
Attempting SSH reboot...Failed.
Attempting reboot via Mozpool...Failed.
Filed IT bug for reboot (bug 1082738)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attempting SSH reboot...Failed.
Attempting reboot via Mozpool...Failed.
Filed IT bug for reboot (bug 1083006)

Comment 19

4 years ago
replaced SD card, panda passed self test.

panda-0081	ready (request 2313518)
Score for the 2014-10-10 SD card: 11 retries, 2 failures, 5 passes.
Score for the 2014-10-20 SD card: 3 retries, 2 failures, 0 passes (and I'm already calling that a complete score since it hasn't taken a job for three days now).

It ain't the SD card. Disabled in slavealloc so we can skip another pointless card replacement the next time it fails to reboot.
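To put numbers on the two scores above, here is a small sketch that turns each tally into a pass rate. Counting retries as non-passes is an assumption about how to read the scores, and the function name is hypothetical.

```python
from typing import Optional


def pass_rate(passes: int, failures: int, retries: int) -> Optional[float]:
    """Fraction of jobs that passed; None if the device took no jobs.

    Assumption: a retry counts against the device just like a failure.
    """
    total = passes + failures + retries
    if total == 0:
        return None
    return passes / total


# Tallies copied from the comment above.
card_2014_10_10 = pass_rate(passes=5, failures=2, retries=11)  # 5 of 18 jobs
card_2014_10_20 = pass_rate(passes=0, failures=2, retries=3)   # 0 of 5 jobs
```

That works out to roughly 28% for the first card and 0% for the second, so swapping the card did not improve the pass rate.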
Let's decomm this one.
Assignee: nobody → server-ops-dcops
Component: Buildduty → Server Operations: DCOps
Product: Release Engineering → mozilla.org
QA Contact: bugspam.Callek → dmoore
Summary: panda-0081 problem tracking → Decommission panda-0081

Updated

4 years ago
colo-trip: --- → scl3

Updated

4 years ago
Assignee: server-ops-dcops → nobody
Component: Server Operations: DCOps → Server Operations: MOC

Comment 22

4 years ago
Hey MOC team can you remove this host from nagios before I physically decomm it?  Thanks.
(In reply to Vinh Hua [:vinh] from comment #22)
> Hey MOC team can you remove this host from nagios before I physically decomm
> it?  Thanks.

Pandas are commented out in nagios. Can you update the status of the bug?
Flags: needinfo?(vhua)
Assignee: nobody → server-ops
Component: Server Operations: MOC → Server Operations
Assignee: server-ops → server-ops-dcops
Component: Server Operations → DCOps
Product: mozilla.org → Infrastructure & Operations

Comment 24

4 years ago
Panda-0081 has been physically decomm'd.
Status: REOPENED → RESOLVED
Last Resolved: 4 years ago → 4 years ago
Flags: needinfo?(vhua)
Resolution: --- → FIXED